
The Cortex: Psychology of Binaural Hearing

Most models of the subcortical auditory system assume a bottom-up, signal-driven process up to their output, the running binaural-activity pattern. The cortex, consequently, takes this pattern as an input. The evaluation of the binaural-activity pattern can be conceived as a top-down, hypothesis-driven process. According to this line of thinking, cortical centers set up hypotheses, e.g., in terms of expected patterns, and then try to confirm these hypotheses with appropriate means, e.g., with task-specific pattern-recognition procedures. When setting up hypotheses, the cortex reflects on cognition, namely, on knowledge and awareness of the current situation and the world in general. Further, it takes into account input from other senses, such as visual or tactile information. After forming hypotheses, higher nervous stages may feed back to more peripheral modules to prompt and control optimum hypothesis testing. They may, for example, induce movement of the ears-and-head array or influence the spectral decomposition process in the subcortical auditory system.
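The hypothesis-driven evaluation described above can be illustrated with a small sketch. It is not a model taken from this report: the representation of the binaural-activity pattern as a short list of numbers, the cosine-similarity test, the threshold, and all names (`similarity`, `evaluate_hypotheses`, the hypothesis labels and priors) are assumptions chosen for illustration only. Expected patterns play the role of hypotheses, and the priors stand in for knowledge of the situation or cues from other senses.

```python
import math


def similarity(pattern, template):
    """Cosine similarity between an observed pattern and an expected one."""
    dot = sum(p * t for p, t in zip(pattern, template))
    norm = math.sqrt(sum(p * p for p in pattern)) * math.sqrt(
        sum(t * t for t in template)
    )
    return dot / norm if norm > 0 else 0.0


def evaluate_hypotheses(pattern, hypotheses, priors, threshold=0.8):
    """Test hypotheses in order of prior plausibility.

    Returns (name, score) of the first hypothesis whose expected pattern
    matches the observed one closely enough, or (None, 0.0) if none does.
    The threshold is an arbitrary illustrative value.
    """
    ordered = sorted(hypotheses, key=lambda n: priors.get(n, 0.0), reverse=True)
    for name in ordered:
        score = similarity(pattern, hypotheses[name])
        if score >= threshold:
            return name, score
    return None, 0.0


# Hypothetical activity over five lateral positions (left ... right).
observed = [0.1, 0.2, 0.9, 0.3, 0.1]

# Hypothetical expected patterns set up "top-down".
expected = {
    "source_left": [0.9, 0.3, 0.1, 0.0, 0.0],
    "source_center": [0.0, 0.2, 1.0, 0.2, 0.0],
    "source_right": [0.0, 0.0, 0.1, 0.3, 0.9],
}

# Hypothetical priors, e.g., a visual cue that suggests a source on the left.
context_priors = {"source_left": 0.5, "source_center": 0.3, "source_right": 0.2}

if __name__ == "__main__":
    print(evaluate_hypotheses(observed, expected, context_priors))
```

In this toy setting the most plausible hypothesis is tested first and rejected, after which the next one is confirmed. Feedback to peripheral stages, such as head movements, is not covered by the sketch.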

The following two examples illustrate the structure of the problems that arise at this point from a technological point of view. First, in a ``cocktail-party'' situation a human listener can follow one talker and then immediately switch attention to another. A signal-processing hearing aid should be able to do the same thing, deliberately controlled by its user. Second, a measuring instrument to evaluate the acoustic quality of concert halls will certainly take into account psychoacoustic descriptors like auditory spaciousness, reverberance, auditory transparency, etc. However, the general impression of space and quality that a listener develops in a room may be codetermined by visual cues, by the specific kind of performance, by the listener's attitude, and by factors like fashion or taste, among other things.

There is no doubt that the involvement of the cortex in the evaluation process adds a considerable amount of ``subjectivity'' to binaural hearing, which poses serious problems for Binaural Technology. Engineers, like most scientists, are trained to deal with the object as being independent of the observer (assumption of ``objectivity'') and prefer to neglect phenomena that cannot be measured or assessed in a strictly ``objective'' way. They further tend to believe that any problem can be understood by splitting it up into parts and analyzing these parts separately. At the cortical level, however, we deal with percepts, i.e., objects that do not exist as separate entities, but as part of a subject-object (perceiver-percept) relationship. It should also be noted that listeners normally listen in a ``gestalt'' mode, i.e., they perceive globally rather than segmentally. An analysis of the common engineering type may thus completely miss relevant features.

Perceiver and percept interact and may both vary considerably during the process of perception. For example, the auditory events may change when listeners focus on specific components such as the sound of a particular instrument in an orchestra. Further, the attitude of perceivers towards their percepts may vary in the course of an experimental series, thus leading to response modification.

Figure A.3: Schematic of a subject in a listening experiment. Perception as well as judgement are variable, as modeled by the assumption of response-moderating factors.

A simple psychological model of the auditory perception and judgement process, shown in Figure A.3, will now be used to elaborate on the variance of listeners' auditory events in a given acoustic setting and the variance of their respective responses. The schematic symbolizes a subject in a listening experiment. Sound waves impinge upon the two ears, are preprocessed and guided to higher centers of the central nervous system, where they give rise to the formation of an auditory event in the subject's perceptual space. The auditory event is a percept of the listener being tested, i.e., only he/she has direct access to it. The rest of the world is informed about the occurrence of the said percept only if the subject responds in such a way as to allow conclusions to be drawn from the response about the percept (indirect access). In formal experiments the subject will usually be instructed to respond in a specified way, for example by formal judgement on specific attributes of the auditory event. If the response is a quantitative descriptor of perceptual attributes, we may speak of measurement. Consequently, in listening experiments, subjects can serve as an instrument for the measurement of their own perception, i.e., as both the object of measurement and the ``meter''. The schematic in Figure A.3 features a second input into both the auditory-perception and the judgement blocks, where ``response-moderating factors'' are fed in to introduce variance to the perception and judgement processes.
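The two-block structure of this model can be made concrete with a toy simulation. The sketch below is an illustration only, not part of the model in Figure A.3: the additive form of the response-moderating factors, the Gaussian trial-to-trial variation, the use of a sound level in dB as the stimulus, and all names (`Listener`, `run_series`, the bias and noise parameters) are assumptions. The auditory event stays internal to the simulated subject, and only the response is returned to the outside.

```python
import random
import statistics
from dataclasses import dataclass


@dataclass
class Listener:
    """Toy subject with a perception block and a judgement block.

    Each block has a second, response-moderating input, modeled here
    (as an assumption) by a constant bias plus Gaussian variation.
    """

    perception_bias: float = 0.0
    judgement_bias: float = 0.0
    perception_noise: float = 0.0  # standard deviation per trial
    judgement_noise: float = 0.0   # standard deviation per trial

    def perceive(self, level_db, rng):
        # Auditory event: accessible to the subject only.
        return level_db + self.perception_bias + rng.gauss(0.0, self.perception_noise)

    def judge(self, event, rng):
        # Response: the only quantity an experimenter can observe.
        return event + self.judgement_bias + rng.gauss(0.0, self.judgement_noise)

    def respond(self, level_db, rng):
        return self.judge(self.perceive(level_db, rng), rng)


def run_series(listener, level_db, trials, seed=0):
    """Present the same stimulus repeatedly and collect the responses."""
    rng = random.Random(seed)
    return [listener.respond(level_db, rng) for _ in range(trials)]


if __name__ == "__main__":
    neutral = Listener(perception_noise=1.0, judgement_noise=1.0)
    biased = Listener(perception_bias=2.0, judgement_bias=3.0,
                      perception_noise=1.0, judgement_noise=1.0)
    for label, subject in (("neutral", neutral), ("biased", biased)):
        responses = run_series(subject, level_db=70.0, trials=200)
        print(label,
              round(statistics.mean(responses), 1),
              round(statistics.stdev(responses), 1))
```

Under these assumptions an identical acoustic signal yields responses that scatter from trial to trial and that shift with the moderating factors. From the responses alone, an observer cannot tell whether a shift originated in the perception block or in the judgement block.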

Following this line of thinking, an important task of auditory psychology is to identify such response-moderating factors and to clarify their role in binaural listening. Many of these factors represent conventional knowledge or experience from related fields of perceptual acoustics, e.g., noise- and speech-quality evaluation. It is well known that the judgements of listeners on auditory events may depend on the cognitive ``image'' which the listeners have with respect to the sound sources involved (source-related factors). It may happen, for instance, that the auditory events evoked by sources that are considered aggressive (e.g., trucks) are judged louder than those from other sources (e.g., passenger cars), given the same acoustical signals. The ``image'' of the source in the listeners' minds may be based, among other things, on cues from other senses (e.g., visual) and/or on prior knowledge. Situative factors are a further determinant in this context, i.e., subjects judge an auditory event bearing in mind the complete (multi-modal) situation in which it occurs. Another set of factors is given by the individual characteristics of each listener (personal factors), for example his/her subjective attitude towards a specific sound phenomenon, an attitude that may even change in the course of an experiment. Response-moderating factors that draw upon cognition tend to be especially effective when the sounds listened to transmit specific information, i.e., act as carriers of meaning. This is obvious in the case of speech sounds, but it also holds in other cases. The sound of a running automobile engine, for instance, may signal to the driver that the engine is operating normally.

The fact that response-moderating factors act not only on judgements but also on the process of perception itself may seem less obvious at first glance, but is, nevertheless, also conventional wisdom. We all know that people in a complex sound situation have a tendency to miss what they do not pay attention to and/or do not expect to hear. There is psychoacoustical evidence that, e.g., the spectral selectivity of the cochlea is influenced by attention. At this point, the ability to switch at will between a global and an analytic mode of listening should also be noted. It is commonly accepted amongst psychologists that percepts are the result of both the actual sensory input at a given time and of expectation.

If we want to build sophisticated Binaural-Technology equipment for complex tasks, there is no doubt that psychological effects have to be taken into account. Let us consider, as an example, a binaural-surveillance system for acoustic monitoring of a factory floor. Such a system must know the relevance and meaning of many classes of signals and must pay selective attention to very specific ones when an abnormal situation has been detected. A system for the evaluation of acoustic qualities of spaces for musical performances must detect and consider a range of different shades of binaural signals, depending on the kind and purpose of the performances. It might even have to take into account the taste of the local audience or that of the most influential local music reviewer. An intelligent binaural hearing aid should know, to a certain extent, which components of the incoming acoustic signals are relevant to its user, e.g., track a talker who has just uttered the user's name.

As a consequence, we shall see in the future of Binaural Technology that psychological models will be exploited and implemented technologically, though perhaps not, for a while, in the form of massively parallel biological computing as in the cortex. There are already discussions about, and early examples of, combinations of expert systems and other knowledge-based systems with artificial heads, auditory displays and auditory-system models. When we think of applications like complex human/machine interfaces, multi-media systems, interactive virtual environments, and teleoperation systems, it becomes obvious that conventional Binaural Technology must be combined with, or integrated into, systems that are able to make decisions and control actions in an intelligent way. With this view in mind, it is clear that Binaural Technology is still in an early stage of development. There are many relevant technological challenges and business opportunities ahead.


Esprit Project 8579/MIAMI (Schomaker et al., '95)
Thu May 18 16:00:17 MET DST 1995