Add Transfer Learning section
This commit is contained in:
parent 832e21ee41
commit 50d36011a5

@@ -1613,12 +1613,69 @@ top-1 accuracy of 75.2\% and 67.4\% respectively.
 
 \section{Transfer Learning}
 \label{sec:background-transfer-learning}
 
-Give a definition of transfer learning and explain how it is
-done. Compare fine-tuning just the last layers vs. propagating changes
-through the whole network. What are advantages to transfer learning?
-Are there any disadvantages?
+Transfer learning refers to the application of a learning algorithm to
+a target domain by utilizing knowledge already learned from a
+different source domain \cite{zhuang2021}. The learned representations
+from the source domain are thus \emph{transferred} to solve a related
+problem in another domain. Transfer learning works because
+semantically meaningful information an algorithm has learned from a
+(large) data set is often meaningful in other contexts as well, even
+though the \emph{new problem} is not exactly the one for which the
+original model had been trained. An analogy can be drawn to sports in
+day-to-day human life: intuitively, skills learned in soccer such as
+ball control, improved endurance and strategic thinking are often
+also useful in other ball sports, and someone who is adept at certain
+kinds of sports will likely be able to pick up similar ones much
+faster.
 
-Estimated 2 pages for this section.
+In mathematical terms, \textcite{pan2010} define transfer learning as:
+
+\begin{quote}
+  Given a source domain $\mathcal{D}_{S}$ and learning task
+  $\mathcal{T}_{S}$, a target domain $\mathcal{D}_{T}$ and learning
+  task $\mathcal{T}_{T}$, transfer learning aims to help improve the
+  learning of the target predictive function $f_{T}(\cdot)$ in
+  $\mathcal{D}_{T}$ using the knowledge in $\mathcal{D}_{S}$ and
+  $\mathcal{T}_{S}$, where $\mathcal{D}_{S}\neq\mathcal{D}_{T}$, or
+  $\mathcal{T}_{S}\neq\mathcal{T}_{T}$. \cite[p.~1347]{pan2010}
+\end{quote}
+
+In the machine learning world, collecting and labeling data for
+training a model is often time-consuming, expensive and sometimes not
+possible at all. Deep learning based models especially require
+substantial amounts of data to be able to robustly classify images or
+solve other tasks. Semi-supervised or unsupervised learning
+approaches (see section~\ref{sec:theory-ml}) can partially mitigate
+this problem, but having accurate ground truth data is usually a
+requirement nonetheless. Through the publication of large labeled
+data sets such as those of the \glspl{ilsvrc}, a basis for
+(pre-)training exists from which a model can be optimized for
+downstream tasks.
+
+Transfer learning is not a panacea, however. Care has to be taken to
+only use models which have been pretrained on a source domain that is
+similar to the target domain in terms of feature space. While this
+may seem to be an easy task, it is often not known in advance whether
+transfer learning is the correct approach. Furthermore, choosing
+whether to replace and retrain only the fully-connected layers at the
+end of a pretrained model or to fine-tune all parameters introduces
+at least one additional hyperparameter. These decisions have to be
+made by comparing the source domain with the target domain, by
+considering how much data and how many computational resources are
+available for the target domain, and by observing which layers are
+responsible for which features. Since earlier layers usually contain
+low-level and later layers high-level information, resetting the
+weights of the last few layers or replacing them with different ones
+entirely is also an option.
+
+To summarize, while transfer learning is an effective tool and likely
+a major factor in the proliferation of deep learning based models,
+not all domains are suited for it. The additional decisions which
+have to be made as a result of using transfer learning can introduce
+more complexity than would otherwise be necessary for a particular
+problem. It does, however, allow researchers to get started quickly
+and to iterate faster, because popular network architectures
+pretrained on ImageNet are integrated into the major machine learning
+frameworks. Transfer learning is used extensively in this work to
+train a classifier as well as an object detection model.
 
 \section{Hyperparameter Optimization}
 \label{sec:background-hypopt}
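As a gloss on the quoted definition (kept outside the diff, since it is not part of the commit): in \textcite{pan2010}'s notation a domain and a task are themselves pairs, which is worth recalling when reading the condition $\mathcal{D}_{S}\neq\mathcal{D}_{T}$ or $\mathcal{T}_{S}\neq\mathcal{T}_{T}$.

```latex
% Notation behind the quoted definition (after Pan and Yang, 2010):
% a domain is a feature space plus a marginal distribution over it,
% and a task is a label space plus the predictive function to learn.
\mathcal{D} = \{\mathcal{X}, P(X)\}, \qquad
\mathcal{T} = \{\mathcal{Y}, f(\cdot)\}
```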
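The choice the added section describes, between retraining only the final fully-connected layers and fine-tuning the whole network, can be made concrete with a short sketch. This is a minimal, hypothetical example (not taken from the commit) using torchvision's ResNet-50 weights pretrained on ImageNet; the class count and learning rates are placeholder assumptions.

```python
# Hypothetical sketch of the two transfer learning regimes discussed in
# the section above, using a torchvision ResNet-50 pretrained on ImageNet.
import torch
import torch.nn as nn
from torchvision import models

num_classes = 10  # assumed number of classes in the target domain

# Load a backbone whose weights were learned on the (source) ImageNet domain.
model = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V2)

# Variant 1: feature extraction. Freeze all pretrained weights and train
# only a newly initialized fully-connected head for the target task.
for param in model.parameters():
    param.requires_grad = False
model.fc = nn.Linear(model.fc.in_features, num_classes)  # new head, trainable
optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-3)  # assumed lr

# Variant 2: full fine-tuning. Propagate gradients through the whole
# network, typically with a much smaller learning rate so the pretrained
# low-level features in the early layers are not destroyed.
for param in model.parameters():
    param.requires_grad = True
optimizer = torch.optim.Adam(model.parameters(), lr=1e-5)  # assumed lr
```

An intermediate option, matching the section's remark about resetting or replacing only the last few layers, would be to unfreeze just the final residual blocks (e.g. `model.layer4`) while keeping the earlier, low-level feature extractors frozen.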