Add Semantic Modeling section
parent
7c55d2f024
commit
266b60704e
29
sim.tex
@@ -481,7 +481,34 @@ and the CT are successfully used in music recognition and speech recognition
with the \emph{mel-frequency cepstrum coefficients} (MFCC). The CT is used in
MPEG7 for \emph{color histogram encoding} and for texture computation.

\section{Semantic Modeling}

The limits of similarity modeling become apparent when audio or visual
information provides only part of the input needed to detect higher-level
semantics, such as emotion in videos. Simple questions, such as whether a
particular person or object is visible, whether the person interacts with
someone, and whether that happens multiple times throughout the video, are not
as semantically complex as recognizing emotion. The latter usually requires
knowledge of the context in which the event takes place.

\emph{Factor analysis} or \emph{latent semantic indexing} exploits the fact that
some information is encoded in multiple clusters within a feature space. By
extracting the underlying factors, which are similar across multiple groups, it
is possible to explain many components of the feature vectors.

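As a minimal sketch of this idea, assume that the feature vectors of $n$ samples
are collected as the columns of a matrix $X$ with $m$ rows. This notation is
illustrative and not taken from the source material. Latent semantic indexing is
commonly formulated as a truncated singular value decomposition:
\begin{equation}
  X \approx U_k \Sigma_k V_k^{\top}, \qquad k \ll \min(m, n),
\end{equation}
where the $k$ columns of $U_k$ play the role of the shared factors, the diagonal
matrix $\Sigma_k$ holds the $k$ largest singular values, and each column of
$\Sigma_k V_k^{\top}$ describes one sample in terms of those factors.
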
The building blocks of feature engineering are localization, correlation,
quantization and aggregation. Localization splits an input signal into shorter
signals which can be analyzed. In the correlation step, the output is compared
to pre-existing knowledge. Sometimes the information is then quantized in
different ways and aggregated to pick the most important factors.

Multiple metrics exist to evaluate which features are good. One such metric,
stability, has already been discussed: clusters of samples are tightly coupled
and strongly separated from other clusters. Another metric is how quickly the
features can be computed. Features should also be easily interpretable, which is
not always the case once a transformation has been applied. If the features
generalize well to different situations regardless of context, they are robust.

\section{Learning over Time 600 words}
