The point is that Theorems 4 and 5 of arxiv.org/abs/2007.13505 ensure that mHN-type clustering is not a potentially flaky snake-oil move but mathematically robust. And as the examples on real-world data show, this type of pooling is effective.
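For reference, the retrieval update those results concern is the core formula of that paper: the state is pulled toward a convex combination of the stored patterns, xi <- X softmax(beta * X^T xi). Here is a minimal numpy sketch of it; the function name, beta value, and sizes are my own illustrative choices, not code from the paper.

```python
import numpy as np

def retrieve(X, xi, beta=8.0, steps=1):
    """Modern Hopfield retrieval update: xi <- X softmax(beta * X^T xi).

    X:  d x N matrix whose columns are the stored patterns.
    xi: d-dimensional query (state) vector.
    """
    for _ in range(steps):
        a = beta * X.T @ xi            # similarity of the query to each pattern
        p = np.exp(a - a.max())
        p /= p.sum()                   # softmax over stored patterns
        xi = X @ p                     # convex combination of the patterns
    return xi

rng = np.random.default_rng(0)
X = rng.standard_normal((64, 10))               # 10 random patterns in R^64
noisy = X[:, 3] + 0.1 * rng.standard_normal(64)
out = retrieve(X, noisy)
print(np.allclose(out, X[:, 3], atol=1e-1))     # query snaps back to pattern 3
```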
Nice question, thank you. I would suspect that classical clustering algorithms will be less effective than, say, "HopfieldPoolingLayers". The ultimate example of how effective mHNs are at MIL-type problems is this: arxiv.org/abs/2007.13505
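To make the MIL connection concrete, here is a hand-rolled PyTorch sketch of the mechanism such a pooling layer implements: a learned query attends over all instances in a bag and compresses them into one embedding. The class name and sizes are illustrative, not the actual library API.

```python
import torch
import torch.nn as nn

class HopfieldStylePooling(nn.Module):
    """Pools a bag of instance embeddings into a single vector via a
    learned query that attends over the bag (illustrative sketch)."""

    def __init__(self, dim: int, beta: float = 1.0):
        super().__init__()
        self.query = nn.Parameter(torch.randn(dim))  # learned state pattern
        self.key = nn.Linear(dim, dim, bias=False)
        self.value = nn.Linear(dim, dim, bias=False)
        self.beta = beta

    def forward(self, bag: torch.Tensor) -> torch.Tensor:
        # bag: (batch, n_instances, dim)
        k, v = self.key(bag), self.value(bag)
        scores = self.beta * (k @ self.query)        # (batch, n_instances)
        attn = scores.softmax(dim=-1).unsqueeze(-1)  # weight per instance
        return (attn * v).sum(dim=1)                 # (batch, dim) bag embedding

# Toy MIL bags: 4 bags of 20 instances, 32-dim features.
pool = HopfieldStylePooling(dim=32)
bags = torch.randn(4, 20, 32)
print(pool(bags).shape)  # torch.Size([4, 32])
```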
I mean ...
Did you hear about the tech industry banning the phrase "Irish Stew" from being a valid password? Apparently it isn't Stroganoff!
Yes, I would agree. Self-attention and cross-attention are the ultimate content-associative memories, and xLSTM's memory mechanism is great for extrapolation (Fig. 7). Also, in the past, I think OpenAI Five used self-attention layers feeding into an LSTM at each time step, effectively validating your hunch.
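A toy sketch of that wiring, attention over a set of entities at each step, with the pooled summary driving a recurrent cell. All names, shapes, and the mean-pooling step are made up for illustration; this is not OpenAI Five's actual architecture.

```python
import torch
import torch.nn as nn

class AttnThenLSTM(nn.Module):
    """At every time step, run self-attention over entity embeddings,
    then feed a pooled summary into an LSTM cell (illustrative sketch)."""

    def __init__(self, dim: int, hidden: int):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, num_heads=4, batch_first=True)
        self.cell = nn.LSTMCell(dim, hidden)

    def forward(self, entities_per_step: torch.Tensor) -> torch.Tensor:
        # entities_per_step: (time, batch, n_entities, dim)
        t, b = entities_per_step.shape[:2]
        h = c = torch.zeros(b, self.cell.hidden_size)
        for step in range(t):
            e = entities_per_step[step]                  # (batch, n_entities, dim)
            mixed, _ = self.attn(e, e, e)                # self-attention over entities
            h, c = self.cell(mixed.mean(dim=1), (h, c))  # pooled summary into LSTM
        return h                                         # final recurrent state

model = AttnThenLSTM(dim=32, hidden=64)
obs = torch.randn(8, 2, 5, 32)  # 8 steps, batch 2, 5 entities, 32-dim
print(model(obs).shape)         # torch.Size([2, 64])
```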
Thanks. Any comments welcome.