Michael Kopp
@mkk20.bsky.social
Are memory and compute really different sides of the same coin in ANNs or NNNs?
8 followers · 27 following · 15 posts
MK @mkk20.bsky.social

The thing is that Theorems 4 and 5 of arxiv.org/abs/2007.13505 ensure that mHN-type clustering is not a potentially flaky snake-oil move but is mathematically robust. As the examples on real-world data show, this type of pooling is effective.

Modern Hopfield Networks and Attention for Immune Repertoire Classification

A central mechanism in machine learning is to identify, store, and recognize patterns. How to learn, access, and retrieve such patterns is crucial in Hopfield networks and the more recent transformer ...
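For reference, the retrieval update behind this kind of pooling (from the modern Hopfield network line of work, e.g. arXiv:2008.02217, which the linked paper builds on) can be sketched as:

```latex
% Stored patterns are the columns of X, \xi is the query (state),
% and \beta is the inverse temperature.
\xi^{\mathrm{new}} = X \,\mathrm{softmax}\!\left(\beta\, X^{\top} \xi\right)
```

Those papers show this update converges to fixed points that are either single stored patterns or metastable averages over similar patterns, which is why the pooling behaves like a soft clustering rather than an ad-hoc heuristic.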

MK @mkk20.bsky.social

Nice question, thank you. I would suspect that classical clustering algorithms will be less effective than, say, "HopfieldPoolingLayers". The ultimate example of how effective mHNs are at MIL-type problems is this: arxiv.org/abs/2007.13505

Modern Hopfield Networks and Attention for Immune Repertoire Classification

A central mechanism in machine learning is to identify, store, and recognize patterns. How to learn, access, and retrieve such patterns is crucial in Hopfield networks and the more recent transformer ...
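To make the pooling idea concrete, here is a minimal numpy sketch of Hopfield/attention-style pooling over a bag of instances (the MIL setting). It is a simplification, not the hopfield-layers library API: the learned query `q` and inverse temperature `beta` are assumed, and in a trained layer both would be parameters.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def hopfield_pooling(X, q, beta=1.0):
    """Pool a bag of instance embeddings X (n, d) into one vector
    via Hopfield/attention-style retrieval with a query q (d,)."""
    scores = beta * (X @ q)      # similarity of each instance to the query
    attn = softmax(scores)       # association weights over the bag
    return attn @ X              # (d,) weighted pooled representation

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 4))      # a bag of 5 instances, embedding dim 4
q = rng.normal(size=4)           # hypothetical learned query
pooled = hopfield_pooling(X, q, beta=2.0)
```

Low `beta` averages over the whole bag; high `beta` retrieves essentially the single instance most similar to the query, which is the soft-clustering behaviour the theorems are about.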

MK @mkk20.bsky.social

I mean ...

MK @mkk20.bsky.social

Did you hear about the tech industry banning the phrase "Irish Stew" from being a valid password? Apparently it isn't Stroganoff!

MK @mkk20.bsky.social

Yes, I would agree. Self-attention or cross-attention are the ultimate content-associative memories, and xLSTM's memory mechanism is great for extrapolation (Fig. 7). Also, in the past, I think OpenAI Five used self-attention layers feeding into an LSTM at each time step, effectively validating your hunch.
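The attention-into-recurrence combination can be sketched in a few lines of numpy. This is a toy illustration, not the OpenAI Five architecture: a simple tanh RNN cell stands in for the LSTM, and the attention is a single unparameterised head.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(E):
    """Single-head self-attention over entity embeddings E (n, d)."""
    d = E.shape[1]
    A = softmax(E @ E.T / np.sqrt(d))   # (n, n) attention weights
    return A @ E                         # (n, d) contextualised entities

def step(h, E, W_h, W_x):
    """One time step: attend over entities, mean-pool the result,
    then update the recurrent state (tanh RNN cell as LSTM stand-in)."""
    ctx = self_attention(E).mean(axis=0)   # (d,) pooled observation
    return np.tanh(h @ W_h + ctx @ W_x)    # (k,) new recurrent state

rng = np.random.default_rng(1)
d, k = 8, 4                                # entity dim, state dim (assumed)
W_h = rng.normal(size=(k, k)) * 0.1
W_x = rng.normal(size=(d, k)) * 0.1
h = np.zeros(k)
for t in range(3):                         # roll out a few time steps
    E = rng.normal(size=(5, d))            # per-step entity observations
    h = step(h, E, W_h, W_x)
```

The division of labour is the point: attention gives content-based access to the current set of entities, while the recurrent state integrates over time.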

MK @mkk20.bsky.social

Thanks. Any comments welcome.
