Add Semantic Modeling section

2021-10-23 20:41:09 +02:00 · 2021-10-23 20:41:09 +02:00 · 266b60704e
commit 266b60704e
parent 7c55d2f024
1 changed file with 28 additions and 1 deletion
--- a/sim.tex
+++ b/sim.tex
@@ -481,7 +481,34 @@ and the CT are successfully used in music recognition and speech recognition
 with the \emph{mel-frequency cepstrum coefficients} (MFCC). The CT is used in
 MPEG7 for \emph{color histogram encoding} and for texture computation.
-\section{Semantic Modeling 200 words}
+\section{Semantic Modeling}
 The limits of similarity modeling become apparent when audio or visual
 information provides only part of the input needed to detect higher semantics,
 for example emotion in videos. Simple questions, such as whether a particular
 person or object is visible, whether the person interacts with someone, and
 whether that happens multiple times throughout the video, are not as
 semantically complex as recognizing emotion. The latter usually requires
 knowledge of the context in which the event takes place.
 \emph{Factor analysis} and \emph{latent semantic indexing} exploit the fact
 that some information is encoded in multiple clusters within a feature space.
 By extracting the factors that are shared across multiple groups, it is
 possible to explain many components of the feature vectors with few factors.
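 As a minimal sketch of this idea (the notation is an assumption and is not
 taken from the text), latent semantic indexing can be written as a truncated
 singular value decomposition. Let $X$ be an $m \times n$ matrix whose columns
 are the feature vectors of $n$ samples. Then
 \[
   X \approx X_k = U_k \Sigma_k V_k^{\top},
 \]
 where the columns of $U_k$ are the $k$ leading left singular vectors (the
 latent factors), $\Sigma_k$ is the diagonal matrix of the $k$ largest singular
 values, and the columns of $\Sigma_k V_k^{\top}$ give the coordinates of each
 sample in the $k$-dimensional factor space. Choosing $k$ much smaller than $m$
 keeps only the factors that account for most of the variation in $X$.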
 The building blocks of feature engineering are localization, correlation,
 quantization and aggregation. Localization is the process of splitting an
 input signal into shorter signals which can be analyzed. The output is then
 compared to pre-existing knowledge in the correlation step. Sometimes the
 information is then quantized in different ways and aggregated to pick the
 most important factors.
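 The four steps can be sketched for a one-dimensional signal as follows. This
 is an illustrative formalization under assumed notation, not a definition
 from the text: $x$ is the input signal, $w$ a window of length $N$, $H$ the
 hop size, $t$ a hypothetical reference template and $q$ a quantizer.
 \[
   x_i[n] = x[n + iH]\, w[n], \qquad
   c_i = \sum_{n=0}^{N-1} x_i[n]\, t[n], \qquad
   \hat{c}_i = q(c_i), \qquad
   f = \frac{1}{M} \sum_{i=0}^{M-1} \hat{c}_i .
 \]
 Localization produces the $M$ windowed segments $x_i$, correlation compares
 each segment to the template, quantization maps $c_i$ to a discrete level, and
 aggregation condenses the result into a single feature $f$. The mean is only
 one possible choice here; a histogram or a maximum would be alternatives.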
 To evaluate which features are good, multiple metrics exist. One such metric,
 stability, has already been discussed: clusters of samples are tightly grouped
 and strongly separated from other clusters. Another metric is how fast the
 features can be computed. Features should also be easily interpretable, which
 is not always the case once some transformation has been applied. If the
 features generalize well to different situations regardless of context, they
 are robust.
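 One possible way to make the stability criterion measurable is the ratio of
 between-cluster to within-cluster scatter. This is an assumption made for
 illustration; the text does not fix a formula. For clusters $C_1, \dots, C_K$
 with means $\mu_j$ and overall mean $\mu$,
 \[
   S = \frac{\sum_{j=1}^{K} |C_j| \, \| \mu_j - \mu \|^2}
            {\sum_{j=1}^{K} \sum_{x \in C_j} \| x - \mu_j \|^2},
 \]
 so that larger values of $S$ indicate clusters that are tight and well
 separated.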
 \section{Learning over Time 600 words}