diff --git a/thesis/thesis.tex b/thesis/thesis.tex
index 1d494a2..cb766e8 100644
--- a/thesis/thesis.tex
+++ b/thesis/thesis.tex
@@ -1525,11 +1525,43 @@ section~\ref{sec:methods-classification}.
 \subsubsection{DenseNet}
 \label{sssec:theory-densenet}
+The authors of DenseNet \cite{huang2017} go one step further than
+ResNets by connecting every layer to all subsequent layers within a
+block. In a plain feed-forward network, each layer is only connected
+to its immediate predecessor and successor. Residual connections add
+a shortcut around a layer, but these \emph{shortcut connections} only
+bridge short sections of the chain, so information from early layers
+still has to pass through many intermediate transformations before it
+reaches the later ones. In a DenseNet, every layer within a dense
+block instead receives the feature maps of all preceding layers as
+input. Whereas ResNets merge the shortcut into the following layer
+via element-wise addition, DenseNets concatenate the feature maps of
+all preceding layers along the channel dimension: if $H_\ell$ denotes
+the transformation applied by layer $\ell$, a dense layer computes
+$x_\ell = H_\ell([x_0, x_1, \dots, x_{\ell-1}])$, where $[\cdot]$
+denotes concatenation, instead of the residual formulation
+$x_\ell = H_\ell(x_{\ell-1}) + x_{\ell-1}$. The number of feature
+maps each layer adds therefore has to be kept low so that subsequent
+layers can still process their inputs; otherwise, the last layers of
+each dense block would receive too many channels, which increases the
+computational complexity.
+
+The authors construct their network from multiple dense blocks which
+are connected via transition layers, each consisting of a batch
+normalization layer, a one by one convolution and a two by two
+pooling layer that reduces the spatial resolution for the next dense
+block. Within a dense block, each layer applies batch normalization,
+a \gls{relu} activation and a three by three convolution. In order to
+keep the number of feature maps low, the authors introduce a
+\emph{growth rate} $k$ as a hyperparameter: every layer adds $k$
+feature maps, so layer $\ell$ of a block receives
+$k_0 + k(\ell - 1)$ input channels, where $k_0$ is the number of
+channels of the block input. The growth rate can be as low as $k=4$
+and still allow the network to learn highly relevant representations.
+
+In their experiments, the authors evaluate different combinations of
+dense blocks and growth rates on ImageNet. Their DenseNet-161
+($k=48$) achieves a top-5 error rate of 6.15\% with single-crop
+testing and 5.3\% with multi-crop testing. Their DenseNet-BC variant
+requires only about a third of the parameters of a ResNet-101 to
+achieve the same test error on the CIFAR-10 dataset.
+
 \subsubsection{MobileNet v3}
 \label{sssec:theory-mobilenet-v3}
-
-
 \section{Transfer Learning}
 \label{sec:background-transfer-learning}