Add Semantic Modeling section

Tobias Eidelpes 2021-10-23 20:41:09 +02:00
parent 7c55d2f024
commit 266b60704e

sim.tex | 29 ++++++++++++++++++++++++++++-

@@ -481,7 +481,34 @@ and the CT are successfully used in music recognition and speech recognition
with the \emph{mel-frequency cepstrum coefficients} (MFCC). The CT is used in
MPEG7 for \emph{color histogram encoding} and for texture computation.
\section{Semantic Modeling 200 words}
\section{Semantic Modeling}
The limits of similarity modeling become apparent when audio or visual
information provides only partial input for detecting higher-level semantics.
One example of such higher-level semantics is the detection of emotion in
videos. Simple questions, such as whether a particular person or object is
visible, whether the person interacts with someone, and whether that happens
multiple times throughout the video, are not as semantically complex as
recognizing emotion. The latter usually requires knowledge of the context in
which the event takes place.

\emph{Factor analysis} and \emph{latent semantic indexing} exploit the fact
that some information is encoded redundantly across multiple clusters within a
feature space. By extracting these factors, which are shared across multiple
groups, a small number of dimensions suffices to explain much of the variation
in the feature vectors.
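
A minimal sketch of this idea in Python, using the truncated singular value
decomposition from scikit-learn (the core operation behind latent semantic
indexing); the toy data matrix and the choice of two components are
illustrative assumptions, not values from the text:

\begin{verbatim}
import numpy as np
from sklearn.decomposition import TruncatedSVD

# Toy feature matrix: 6 samples x 5 features. Two latent factors
# generate most of the structure, so information is shared across
# groups of features. (Values are illustrative only.)
X = np.array([
    [2.0, 1.9, 0.1, 0.2, 1.0],
    [2.1, 2.0, 0.0, 0.1, 1.1],
    [1.8, 1.7, 0.2, 0.3, 0.9],
    [0.1, 0.2, 2.0, 1.9, 1.0],
    [0.2, 0.1, 2.1, 2.2, 1.1],
    [0.0, 0.3, 1.9, 2.0, 0.9],
])

# Extract two latent factors; each sample is then described by its
# coordinates along these factors instead of all five raw features.
svd = TruncatedSVD(n_components=2, random_state=0)
Z = svd.fit_transform(X)

print(Z.shape)                        # (6, 2)
print(svd.explained_variance_ratio_)  # two factors explain most variance
\end{verbatim}

Samples that load on the same factor end up close together in the reduced
space, even when they differ in individual raw features.
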
The building blocks of feature engineering are localization, correlation,
quantization, and aggregation. Localization is the process of decomposing an
input signal into shorter signals that can be analyzed individually. In the
correlation step, the output is compared to pre-existing knowledge. Sometimes
the information is then quantized in different ways and finally aggregated to
pick out the most important factors.
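
As a concrete illustration, the following Python sketch runs all four building
blocks on a synthetic signal; the window length, the template bank, and the
quantile bins are arbitrary assumptions made here for demonstration:

\begin{verbatim}
import numpy as np

rng = np.random.default_rng(0)

# Input: a long one-dimensional signal (synthetic stand-in for audio).
signal = rng.standard_normal(8000)

# 1. Localization: cut the long signal into short analysis windows.
win = 256
windows = signal[: len(signal) // win * win].reshape(-1, win)

# 2. Correlation: compare each window against pre-existing knowledge,
#    here a bank of two reference templates (hypothetical patterns).
templates = rng.standard_normal((2, win))
scores = windows @ templates.T            # (n_windows, n_templates)

# 3. Quantization: map the continuous scores onto a few discrete levels.
bins = np.quantile(scores, [0.25, 0.5, 0.75])
levels = np.digitize(scores, bins)        # values in {0, 1, 2, 3}

# 4. Aggregation: summarize the quantized values into one descriptor,
#    here a histogram over all windows for each template.
descriptor = np.stack([np.bincount(levels[:, t], minlength=4)
                       for t in range(levels.shape[1])])
print(descriptor)                         # one 4-bin histogram per template
\end{verbatim}

The final descriptor is much shorter than the input signal and can be compared
directly between different inputs.
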
Multiple criteria exist to evaluate which features are good. One such
criterion, called stability, has already been discussed: clusters of samples
should be tightly coupled internally and strongly separated from other
clusters. Another criterion is that the features can be computed quickly.
Features should also be easily interpretable, which is not always obvious once
a transformation has been applied. Finally, features that generalize well to
different situations regardless of context are robust.
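
The stability criterion can be quantified, for instance, with the silhouette
coefficient; the Python sketch below applies scikit-learn's implementation to
two synthetic clusters whose locations and spread are assumptions chosen for
illustration:

\begin{verbatim}
import numpy as np
from sklearn.metrics import silhouette_score

rng = np.random.default_rng(0)

# Two synthetic feature clusters: tightly coupled internally and
# strongly separated from each other, i.e. a stable feature space.
a = rng.normal(loc=0.0, scale=0.3, size=(50, 2))
b = rng.normal(loc=5.0, scale=0.3, size=(50, 2))
X = np.vstack([a, b])
labels = np.array([0] * 50 + [1] * 50)

# Silhouette coefficient in [-1, 1]: values close to 1 indicate
# compact, well-separated clusters; values near 0 indicate overlap.
print(silhouette_score(X, labels))
\end{verbatim}

Moving the two clusters closer together, or increasing their spread, drives
the score toward zero, mirroring the loss of stability.
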
\section{Learning over Time 600 words}