In this case, the training
process is run with clear audio data and the system is tested with degraded
audio data. First, the acoustic recognizer is tested alone; second, the visual
parameters are added to the input. Figure C.8 summarizes these
results. The horizontal line represents the score obtained using only the
four visual parameters (V). The bottom curve represents the scores obtained
with audio-only data during both the training and the test session (A). The
top curve represents the scores obtained with audio-visual data during the
training and the test session (AV). The audio-visual recognition rate (AV)
is lower than the recognition rate using visual parameters only (V) to the
left of the vertical dashed line. This might be due to the unbalanced
weight of the audio and visual data in our HMM vectors: 26 A + 4 V
parameters. Only to the right of the dashed line does the audio-visual
recognition rate (AV) exceed the visual-only rate (V). In all
cases, AV scores are higher than A scores. These first results are
unsatisfactory, since visual information is not processed at its best by
the HMMs.
Figure C.8: Test scores of the audio (A) and audio-visual (AV)
recognizers after training the HMMs in clear acoustic conditions
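
The imbalance mentioned above (26 audio + 4 visual parameters per HMM vector) can be made concrete with a small sketch. Only the two dimensions come from the text. The diagonal-Gaussian scoring, the function names, and the stream weights `w_audio` and `w_visual` are assumptions introduced here for illustration; stream weighting is one common way to rebalance such vectors and is not claimed to be the method used in these experiments.

```python
import math

N_AUDIO = 26   # audio parameters per frame (from the text)
N_VISUAL = 4   # visual parameters per frame (from the text)


def joint_vector(audio, visual):
    """Concatenate one audio frame and one visual frame into an AV observation."""
    if len(audio) != N_AUDIO or len(visual) != N_VISUAL:
        raise ValueError("unexpected frame size")
    return list(audio) + list(visual)


def diag_gauss_loglik(x, mean, var):
    """Log-likelihood of x under a diagonal-covariance Gaussian (assumed model)."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


def stream_weighted_loglik(obs, mean, var, w_audio=1.0, w_visual=1.0):
    """Hypothetical stream weighting: score each stream separately, then scale.

    With w_audio == w_visual == 1 this equals the plain joint log-likelihood,
    in which the 26 audio terms outnumber the 4 visual terms.
    """
    log_a = diag_gauss_loglik(obs[:N_AUDIO], mean[:N_AUDIO], var[:N_AUDIO])
    log_v = diag_gauss_loglik(obs[N_AUDIO:], mean[N_AUDIO:], var[N_AUDIO:])
    return w_audio * log_a + w_visual * log_v


# Fraction of the joint vector carried by the visual stream: 4 / 30.
visual_share = N_VISUAL / (N_AUDIO + N_VISUAL)
```

In the unweighted case the visual stream contributes only 4 of the 30 terms in each state's score, which is consistent with the explanation offered for the AV curve falling below the V line when the audio is strongly degraded.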
Esprit Project 8579/MIAMI (Schomaker et al., '95)
Thu May 18 16:00:17 MET DST 1995