Results with clear acoustic training

In this case, the training process is run with clear audio data and the system is tested with degraded audio data. The acoustic recognizer is tested first; the visual parameters are then added to the input. Figure C.8 summarizes these results. The horizontal line represents the score obtained using only the four visual parameters (V). The bottom curve represents the scores obtained with audio-only data in both the training and the test sessions (A). The top curve represents the scores obtained with audio-visual data in both the training and the test sessions (AV). To the left of the vertical dashed line, the audio-visual recognition rate (AV) is lower than the recognition rate obtained with the visual parameters alone (V). This might be due to the unbalanced weight of the audio and visual data in our HMM vectors: 26 audio (A) parameters against 4 visual (V) parameters. Only to the right of the dashed line does the audio-visual recognition rate (AV) exceed the visual-only rate (V). In all cases, AV scores are higher than A scores. These first results are unsatisfactory, since the HMMs do not make the best use of the visual information.

Figure C.8: Test scores of the audio (A) and audio-visual (AV) recognizers after training the HMMs in clear acoustic conditions
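
The imbalance between the 26 audio and 4 visual parameters can be made concrete with a small sketch. It assumes that each HMM observation vector is formed by simply concatenating the audio and visual parameters frame by frame, which the text does not state explicitly. The names `fuse_frame` and `visual_share` are hypothetical and serve only as an illustration.

```python
N_AUDIO = 26   # audio parameters per frame (figure taken from the text)
N_VISUAL = 4   # visual parameters per frame (figure taken from the text)


def fuse_frame(audio, visual):
    """Concatenate one audio frame and one visual frame.

    Assumption: plain concatenation; the report does not describe
    how the audio-visual vectors are actually assembled.
    """
    if len(audio) != N_AUDIO or len(visual) != N_VISUAL:
        raise ValueError("expected 26 audio and 4 visual parameters")
    return list(audio) + list(visual)


def visual_share(n_audio=N_AUDIO, n_visual=N_VISUAL):
    """Fraction of the vector's dimensions that carry visual data."""
    return n_visual / (n_audio + n_visual)


# Dummy values, used only to show the resulting vector size.
frame = fuse_frame([0.0] * N_AUDIO, [1.0] * N_VISUAL)
print(len(frame))                # 30
print(round(visual_share(), 3))  # 0.133
```

Under this assumption, only 4 of the 30 dimensions (about 13%) come from the visual channel, which is consistent with the suggestion that the audio parameters may dominate the model when the audio is degraded.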

Esprit Project 8579/MIAMI (Schomaker et al., '95)
Thu May 18 16:00:17 MET DST 1995