John David Pressman
@jdp.extropian.net
LLM developer, alignment-accelerationist, Fedorovist ancestor simulator, Dreamtime enjoyer. All posts public domain under CC0 1.0.
302 followers · 166 following · 481 posts
JD @jdp.extropian.net

The hippocampus doesn't just encode associative relations; it also conditions memory formation on reward signals. This implies that a "learning" terminal reward is some kind of signal to the hippocampus, and I would imagine it's there to mark where in-context learning happened. pubmed.ncbi.nlm.nih.gov/21851992/

A neoHebbian framework for episodic memory; role of dopamine-dependent late LTP - PubMed

According to the Hebb rule, the change in the strength of a synapse depends only on the local interaction of presynaptic and postsynaptic events. Studies at many types of synapses indicate that the ea...

JD @jdp.extropian.net

That is, why have a "learning" reward type separate from wanting and liking? If the purpose of these terminal rewards is to tag memories for inclusion in the hippocampus then it would make sense to have a specific reward signal for when you manage to locally figure out a pattern so it can be stored.

Excerpt from "Qualia Formalism and a Symmetry Theory of Valence" discussing how the brain has three distinct reward types: wanting, liking, and learning.
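As a loose computational analogue (my own toy sketch, not from the excerpt or any paper), a "learning" reward separate from wanting and liking looks a lot like curiosity-style intrinsic rewards in RL: pay out when prediction error drops, i.e. when a pattern gets figured out, so that moment can be tagged for storage.

```python
# Hypothetical sketch: model a "learning" terminal reward as the
# reduction in prediction error. All names here are illustrative.

def learning_reward(prev_error: float, new_error: float) -> float:
    """Pay out only when the world model actually improved."""
    return max(0.0, prev_error - new_error)

# Toy world model: predict the next observation with a running mean.
observations = [2.0] * 20          # a perfectly regular "pattern"
estimate = 0.0
rewards = []
prev_error = None
for t, obs in enumerate(observations, start=1):
    error = abs(obs - estimate)    # prediction error before updating
    estimate += (obs - estimate) / t
    if prev_error is not None:
        rewards.append(learning_reward(prev_error, error))
    prev_error = error

# The reward concentrates at the moment the pattern is figured out,
# then drops to zero once there is nothing left to learn.
print(rewards[0], sum(rewards[1:]))  # -> 2.0 0.0
```

In this framing the reward spike marks exactly the timestep where in-context learning happened, which is the kind of tag a hippocampal gate could use.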
JD @jdp.extropian.net

I don't have any mechanistic evidence but I will note that the human brain having a "learning" terminal reward makes a lot more sense in the context of hippocampal reward gating if it's there to fish out in-context learned patterns from daily experience. bsky.app/profile/jdp....

JD @jdp.extropian.net

I think these visualizations are referenced/reproduced in the book Silence on the Wire by Michal Zalewski.

JD @jdp.extropian.net

No I suspect humans bootstrap text from understanding other modalities. There's a reason we teach children with picture books.

JD @jdp.extropian.net

For what it's worth, a great deal of why LLMs confabulate is that they don't have robust memories. It's not that they "predict the next token": next-token prediction is basically an IQ test (e.g. Raven's Progressive Matrices), and any cognitive process can be framed that way. It's that they basically have dementia.

JD @jdp.extropian.net

What I found particularly interesting in that article was how it explicitly enumerates the different kinds of play children can engage in. It seems that a playground for AI agents would also want to be designed around an explicit list of possible affordances for different things the agent can do.

An excerpt from playgroundideas.org's 10 Principles Of Playground Design. It lists the kinds of play children can engage in, with examples: Active Play (e.g. running), Sensory Play (e.g. touching textures), Creative Play (e.g. drawing), Imaginative Play (e.g. playing house), Social Play (e.g. talking), and Reflective Play (e.g. daydreaming).
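To make the design idea concrete, here is a minimal sketch of an agent playground organized around an explicit affordance list. The category names mirror the play kinds from the excerpt; the agent actions under each are hypothetical inventions of mine, not from any existing system.

```python
# Hypothetical sketch: an AI-agent playground as an explicit map from
# play categories (from the playgroundideas.org list) to affordances.
# Every action name below is illustrative, not a real API.

AFFORDANCES = {
    "active":      ["navigate_maze", "race_against_timer"],
    "sensory":     ["inspect_raw_bytes", "diff_two_images"],
    "creative":    ["write_short_program", "compose_text"],
    "imaginative": ["roleplay_persona", "simulate_scenario"],
    "social":      ["message_other_agent", "negotiate_trade"],
    "reflective":  ["summarize_own_logs", "critique_past_action"],
}

def available_actions(categories):
    """Flatten the affordances the environment exposes this episode."""
    return [a for c in categories for a in AFFORDANCES[c]]

print(available_actions(["creative", "reflective"]))
# -> ['write_short_program', 'compose_text',
#     'summarize_own_logs', 'critique_past_action']
```

Enumerating affordances up front, rather than leaving the action space implicit, is the direct analogue of the playground-design principle the excerpt describes.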