
Olivier Codol

@oliviercodol.bsky.social

Post-doc in RL, motor control, neuroscience at Mila & U de Montréal in the Lajoie & Perich labs

95 followers · 59 following · 26 posts

oliviercodol.bsky.social · Oct 10, 2024 8:31pm

Sure! Thanks for checking beforehand, appreciate it :)

oliviercodol.bsky.social · Oct 9, 2024 2:29am

The objective function itself is something I tried to steer away from, which is why it is identical across my models: fixing it abstracts it away from the conclusions. I agree the “intuition” has limits; even a baby moves with purpose eventually, but learning continues!

oliviercodol.bsky.social · Oct 7, 2024 5:04pm

We close up by discussing some of our views on learning and plasticity in cortical structures in the brain. Happy to chat more with anyone thinking about these questions!

oliviercodol.bsky.social · Oct 7, 2024 5:04pm

Finally, we show that the neural representations produced by RL have stabilization properties when fine-tuning to new environmental dynamics. Unlike supervised learning, this leads to representational reorganization that mirrors cortical plasticity in monkeys.

oliviercodol.bsky.social · Oct 7, 2024 5:04pm

We tested these results in a biomechanically simplified setting & found that this completely breaks down, underlining that these different neural "solutions" actually depend on more complex output-state maps. This sheds new light on the idea of "universal solutions" for neural networks.

oliviercodol.bsky.social · Oct 7, 2024 5:03pm

We found that models trained with RL aligned much better to monkey data in a matched reaching task. This was true with both a crude geometrical metric (CCA) and a dynamics metric (DSA), across tasks, datasets, and monkeys.

oliviercodol.bsky.social · Oct 7, 2024 5:03pm

A long-standing question in psych and neuro is what serves as a dominant "teaching signal" over which to optimize when learning new skills. Instead of approaching this behaviourally, we compared monkey neural recordings to modelling predictions under the same objective function using MotorNet.

Reposted by Olivier Codol

dudman.bsky.social · Sep 12, 2024 9:06pm

Inspired by similar lines of thinking (different flavors of RL algorithm converge to the same optimal policy), we want to watch the networks learn and see how closely that matches the learning process observed in the brain. I.e. www.nature.com/articles/s41...

Mesolimbic dopamine adapts the rate of learning from action - Nature

Analysis of data collected from mice learning a trace conditioning paradigm shows that phasic dopamine activity in the brain can regulate direct learning of behavioural policies, and dopamine sets...

oliviercodol.bsky.social · Sep 10, 2024 5:00am

Editor gatekeeping was ongoing beforehand though, in any journal. I don’t see the new eLife system tuning it up; if anything, it’s more obvious because it is the sole remaining gatekeeping mechanism.

oliviercodol.bsky.social · Jan 5, 2024 4:01pm

Overall we hope the changes go in the right direction to further enable smooth research efforts. Please do leave feedback, suggestions, and reports if you find bugs or would like to see some features implemented. Happy new year!
