This page is part of a multi-part series on Model-Agnostic Meta-Learning. If you are already familiar with the topic, use the menu on the left side to jump straight to the part that is of interest for you. Otherwise, we suggest you start here.
If you tried the exercise above, you have most likely achieved a very high accuracy score. Even though you have probably never seen some of the characters before, you can classify them given only a single example, perhaps without realizing that what you can do off the top of your head would be quite impressive for an average deep neural network.
In this article, we give an interactive introduction to model-agnostic meta-learning (MAML).
It is well known in the machine learning community that models must be trained with a large number of examples before meaningful predictions can be made for unseen data.
However, we do not always have enough data available to cater to this need: a sufficient amount of data may be expensive or even impossible to acquire.
Nevertheless, there are good reasons to believe that this is not an inherent limitation of learning itself. Humans are known to excel at generalizing after seeing only a few samples.
The MAML method we present in this article emerged prominently from research in two fields, each of which addresses one of the challenges outlined above. While introducing these fields, we will also equip you with the most important terms and concepts needed throughout the rest of the article.
While one sample is clearly not enough for a model without prior knowledge, we can pretrain models on tasks that we assume to be similar to the target tasks. At its core, the idea is to derive an inductive bias from a set of problem classes in order to perform better on other, newly encountered problem classes. This similarity assumption allows the model to collect meta-knowledge that is obtained not from a single task but from a distribution of tasks. Learning this meta-knowledge is called "meta-learning".
Achieving rapid convergence of machine learning models on only a few samples is known as "few-shot learning". If you are presented with \(N\) samples per class and are expected to learn a classification problem with \(M\) classes, we speak of an \(M\)-way-\(N\)-shot problem.
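To make the terminology concrete, the following is a minimal sketch of how a single \(M\)-way-\(N\)-shot episode could be assembled. It uses NumPy and purely synthetic data as a stand-in for a real dataset; the function and variable names are illustrative and not taken from any particular library.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for a labeled dataset: 20 classes with 10 samples each,
# every sample a flattened 105x105 grey-scale image (Omniglot-sized, for illustration).
dataset = {label: rng.random((10, 105 * 105)) for label in range(20)}

def sample_episode(dataset, m_way=5, n_shot=1, n_query=5):
    """Draw one M-way-N-shot episode: N support samples for each of M randomly
    chosen classes, plus a query set used to evaluate the adapted model."""
    classes = rng.choice(list(dataset.keys()), size=m_way, replace=False)
    support, query = [], []
    for episode_label, cls in enumerate(classes):
        order = rng.permutation(len(dataset[cls]))
        support += [(dataset[cls][i], episode_label) for i in order[:n_shot]]
        query += [(dataset[cls][i], episode_label) for i in order[n_shot:n_shot + n_query]]
    return support, query

support, query = sample_episode(dataset, m_way=5, n_shot=1)
print(len(support), len(query))  # 5 support samples, 25 query samples
```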
The small exercise from the beginning, which we offer either as a \(20\)- or a \(5\)-way-\(1\)-shot problem, is a prominent example of a few-shot learning task; its symbols are taken from the Omniglot dataset.
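If you would like to experiment with the data yourself, the dataset can be obtained, for example, through torchvision. The snippet below is a minimal sketch assuming torchvision is installed; it only downloads the data and inspects one sample.

```python
from torchvision import datasets, transforms

# Download the Omniglot "background" split, which is commonly used for
# meta-training; background=False selects the evaluation alphabets instead.
omniglot = datasets.Omniglot(
    root="./data",
    background=True,
    transform=transforms.ToTensor(),
    download=True,
)

image, character_class = omniglot[0]
print(image.shape, character_class)  # a 1x105x105 tensor and its character-class index
```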
Having set the scene, we can now dig into MAML and its variants. Continue reading on the next page to find out why MAML is called "model-agnostic" or go straight to an explanation of MAML.