Assessing the Performance of a Hebbian Learning Rule in Semi-Supervised Learning
Summary
Imagine you’re trying to teach a young child to identify different objects,
like apples, bananas, or shoes. If you show them only a few pictures of each
item, they might struggle to recognize these objects in various forms. But, as
you show them more pictures and examples, they get better at identifying them.
Now, consider computer models that work similarly. They need examples (or
data) to learn and recognize patterns.
This study investigated how different computer models learned to recognize
two types of patterns: handwritten numbers (from the MNIST dataset) and
different clothing items, like shirts and shoes (from the Fashion MNIST dataset).
These datasets are like the picture books you’d show a child, but for computers.
The main models studied were the Recurrent Bayesian Confidence Propagation Neural
Network (R-BCPNN) and the Feedforward BCPNN (FF-BCPNN). "Recurrent" means the
model has a memory of sorts: it can recall past information, much like how our
brains remember past experiences. "Feedforward", on the other hand, lacks this
memory feature. Both were benchmarked against other, similar models.
The study found that the R-BCPNN was especially good at recognizing
patterns even when there were few labeled examples to learn from. It's like
a child seeing many pictures of apples, only a few of which carry the text "apple".
This is crucial in real-world scenarios where we often don't have thousands of
labeled examples to teach our models. In such situations, the R-BCPNN shines,
showing its potential in "semi-supervised learning". This term means that the
model learns from both labeled examples (like a picture of an apple labeled
"apple") and unlabeled ones (just the picture, with no name).
The models learned in two stages. First, they used unsupervised learning to build
representations of all the samples, meaning they never received the labels
corresponding to those samples. The idea was that, even without labels, the models
could group similar samples together by representing them in a new way. Then, the
models used these new representations when training on a small number of labeled
samples.
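
As a rough illustration of this two-stage procedure, the sketch below uses
off-the-shelf components in place of the BCPNN models: PCA stands in for the
unsupervised representation learner, logistic regression for the supervised
read-out, and a small digits dataset for MNIST. It only shows the structure of
the pipeline, not the actual models from the study.

    # Minimal sketch of the two-stage semi-supervised pipeline described above.
    # PCA and logistic regression are stand-ins for the BCPNN layers (assumption,
    # for illustration only).
    import numpy as np
    from sklearn.datasets import load_digits
    from sklearn.decomposition import PCA
    from sklearn.linear_model import LogisticRegression

    # Load a small digit dataset (a stand-in for MNIST).
    X, y = load_digits(return_X_y=True)

    # Pretend only a small fraction of the samples come with labels.
    rng = np.random.default_rng(0)
    labeled_idx = rng.choice(len(X), size=100, replace=False)

    # Stage 1: unsupervised representation learning on ALL samples (no labels used).
    encoder = PCA(n_components=32).fit(X)
    Z = encoder.transform(X)

    # Stage 2: supervised training of a classifier on the few labeled representations.
    clf = LogisticRegression(max_iter=1000).fit(Z[labeled_idx], y[labeled_idx])

    # Evaluate on the full set (illustration only; a proper held-out split is needed).
    print("accuracy:", clf.score(Z, y))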
An initial hypothesis was that classification could be improved by choosing
samples for labeling from frequently recognized patterns (so-called 'popular'
attractors). It's like assuming a child would identify an apple better if shown
the most common apple pictures. However, labeling these kinds of samples did not
increase classification accuracy.
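
The sample-selection idea can be sketched in a similar spirit. Here, k-means
cluster sizes serve as a loose stand-in for how often the recurrent network
settles into a given attractor; the study itself used the BCPNN attractor
dynamics, so this is only an analogy.

    # Rough sketch of the 'popular attractor' labeling idea, using k-means cluster
    # sizes as a loose proxy for attractor popularity (assumption, for illustration).
    import numpy as np
    from sklearn.cluster import KMeans
    from sklearn.datasets import load_digits

    X, _ = load_digits(return_X_y=True)

    # Group samples into candidate "attractors".
    km = KMeans(n_clusters=20, n_init=10, random_state=0).fit(X)

    # Count how many samples fall into each cluster and rank by popularity.
    counts = np.bincount(km.labels_, minlength=20)
    popular = np.argsort(counts)[::-1]

    # Spend a fixed labeling budget on samples from the most popular clusters
    # (the study found this did not improve accuracy over other selections).
    budget = 100
    selected = []
    for c in popular:
        members = np.where(km.labels_ == c)[0]
        selected.extend(members[: max(1, budget // 20)])
    selected = np.array(selected[:budget])
    print("selected", len(selected), "samples for labeling")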
In conclusion, this study delved into the world of computer models
and their learning capabilities. It underscored the importance of "memory"
in these models, especially when labeled data is limited. It also opened doors to
understanding the human brain better and to designing models that might one day
replicate our learning processes. The world of computer learning is vast, and
this study lays a foundation for further research on biologically plausible
computer models in semi-supervised learning.