Add design section in prototype design
parent 1820d695f4
commit bfc9488602
@@ -130,6 +130,7 @@ Challenge}
\newacronym{se}{SE}{Squeeze-Excitation}
\newacronym{bn}{BN}{Batch Normalization}
\newacronym{uav}{UAV}{Unmanned Aerial Vehicle}
\newacronym{csi}{CSI}{Camera Serial Interface}

\begin{document}
@@ -1962,11 +1963,6 @@ from them.
\label{chap:design}

\begin{enumerate}
\item Describe the architecture of the prototype (two-stage approach
and how it is implemented with an object detector and
classifier). How the individual stages are connected (the object
detector generates cutouts which are passed to the classifier). Periodic
image capture and inference on the Jetson Nano.
\item Closely examine the used models (YOLOv7 and ResNet) regarding
their structure as well as their unique features. Additionally, list the
augmentations which were done during training of the object
@@ -2013,10 +2009,6 @@ recall values of 70\%.
\section{Design}
\label{sec:design}

Reference the methods section (Section~\ref{sec:methods}) to explain
the two-stage structure of the approach. Reference the description of
the processing loop on the prototype in Figure~\ref{fig:setup}.
\begin{figure}
\centering
\includegraphics[width=0.8\textwidth]{graphics/setup.pdf}
@@ -2034,7 +2026,59 @@ loop on the prototype in Figure~\ref{fig:setup}.
\label{fig:setup}
\end{figure}

Estimated 1 page for this section.
Figure~\ref{fig:setup} shows the overall processing loop which happens
on the device. The camera is directly attached to the Nvidia Jetson
Nano via a \gls{csi} cable. Since the cable is quite rigid, the camera
must be mounted on a small \emph{stand} such as a tripod. Images
coming in from the camera are then passed to the object detection
model running on the Nvidia Jetson Nano. The model detects all plants
in the image and returns the coordinates of a bounding box per
plant. These coordinates are used to \emph{cut out} each plant from
the original image. Each cutout is then passed to the second model,
also running on the Jetson Nano, which determines whether the plant is
water-stressed or not. The percentage values of this prediction are
mapped to a scale from one to ten, where ten indicates that the plant
is in a very dire state. This number is made available via a
\gls{rest} endpoint together with additional information such as the
current time and how long it has been since the state was last better
than three. The endpoint publishes this information for every plant
which has been detected.
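
A minimal sketch of how the per-plant state could be assembled and
published is shown below. The mapping function, the endpoint path, the
field names, and the use of Flask are assumptions made for
illustration and not the prototype's actual implementation.

\begin{verbatim}
# Sketch (assumed, not the actual prototype code): map the classifier's
# stress probability to the 1-10 scale and publish it per plant.
import time
from flask import Flask, jsonify

app = Flask(__name__)
plant_states = {}  # plant_id -> {"scale": int, "last_below_three": float}

def to_scale(stress_probability):
    # Map the stress probability (0.0-1.0) to 1..10, where 10 means
    # the plant is in a very dire state.
    return min(10, max(1, round(stress_probability * 10)))

def update_state(plant_id, stress_probability):
    scale = to_scale(stress_probability)
    state = plant_states.setdefault(
        plant_id, {"scale": scale, "last_below_three": time.time()})
    state["scale"] = scale
    if scale < 3:  # the state is currently "better than three"
        state["last_below_three"] = time.time()

@app.route("/plants")
def plants():
    now = time.time()
    return jsonify([
        {"plant": plant_id,
         "scale": state["scale"],
         "time": now,
         "seconds_since_better_than_three": now - state["last_below_three"]}
        for plant_id, state in plant_states.items()])
\end{verbatim}
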
The water stress prediction itself consists of two stages. First,
plants are detected and, second, each individual plant is
classified. This two-stage approach lends itself well to a two-stage
model structure. Since the first stage is an object detection task, we
employ an object detection model and pass the individual plant images
to a second model---the classifier.
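
A compact sketch of how the two stages could be connected is given
below. The functions detect_plants and classify_cutout are assumed
placeholders standing in for the actual YOLOv7 and ResNet inference
calls; they are not the prototype's real interface.

\begin{verbatim}
# Sketch (assumed, not the actual prototype code): stage one yields
# bounding boxes, stage two classifies the corresponding cutouts.
from PIL import Image

def detect_plants(image):
    # Placeholder for YOLOv7 inference; returns (x_min, y_min, x_max,
    # y_max) boxes, here a single dummy box covering the whole image.
    return [(0, 0, image.width, image.height)]

def classify_cutout(cutout):
    # Placeholder for ResNet inference; returns the stress probability,
    # here a fixed dummy value.
    return 0.5

def predict_water_stress(image_path):
    image = Image.open(image_path)
    results = []
    for box in detect_plants(image):           # stage 1: detection
        cutout = image.crop(box)                # cut out the plant
        probability = classify_cutout(cutout)   # stage 2: classification
        results.append((box, probability))
    return results
\end{verbatim}
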
While most object detection models could be trained to distinguish
water-stressed from healthy plants, the reason for this two-stage
design lies in the availability of data. To our knowledge, there are
no sufficiently large data sets available which contain labeling
information for water-stressed and healthy plants. Instead, most data
sets only classify common objects such as plane, person, car, bicycle,
and so forth (e.g., \gls{coco}~\cite{lin2015}). However, the classes
\emph{plant} and \emph{houseplant} are present in most data sets and
provide the basis for our object detection model. The size of these
data sets allows us to train the object detection model with a large
number of samples, which would have been infeasible to label on our
own. The classifier is then trained with a smaller data set which only
comprises individual plants and their associated classification
(\emph{stressed} or \emph{healthy}).

Both data sets (object detection and classification) only allow us to
train and validate each model separately. A third data set is needed
to evaluate the detection/classification pipeline as a whole. To this
end, we construct our own data set in which all plants in each image
are labeled with bounding boxes as well as the classes \emph{stressed}
or \emph{healthy}. This data set is small in comparison to the one
with which the object detection model is trained, but suffices because
it is only used for evaluation. Labeling each sample in the evaluation
data set manually is still a laborious task, which is why each image
is \emph{preannotated} by the already existing object detection and
classification models. The task of labeling thus becomes a task of
manually correcting the annotations which have been generated by the
models.
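
A possible preannotation step is sketched below. The YOLO-style text
format, the directory layout, and the reuse of the detect_plants and
classify_cutout placeholders from the earlier sketch are assumptions
made for illustration, not the exact tooling used for the prototype.

\begin{verbatim}
# Sketch (assumed): preannotate evaluation images with the existing
# models so that labeling becomes a matter of correcting predictions.
from pathlib import Path
from PIL import Image

CLASS_IDS = {"healthy": 0, "stressed": 1}

def preannotate(image_dir, annotation_dir):
    Path(annotation_dir).mkdir(parents=True, exist_ok=True)
    for image_path in sorted(Path(image_dir).glob("*.jpg")):
        image = Image.open(image_path)
        width, height = image.size
        lines = []
        for (x_min, y_min, x_max, y_max) in detect_plants(image):
            cutout = image.crop((x_min, y_min, x_max, y_max))
            label = "stressed" if classify_cutout(cutout) >= 0.5 else "healthy"
            # Normalized center coordinates and box size (YOLO convention).
            x_c = (x_min + x_max) / 2 / width
            y_c = (y_min + y_max) / 2 / height
            w = (x_max - x_min) / width
            h = (y_max - y_min) / height
            lines.append(f"{CLASS_IDS[label]} {x_c:.6f} {y_c:.6f} {w:.6f} {h:.6f}")
        out_file = Path(annotation_dir) / (image_path.stem + ".txt")
        out_file.write_text("\n".join(lines))
\end{verbatim}
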
\section{Selected Methods}
\label{sec:selected-methods}