** written by:** Peter van Hengel

** date:** May 10 2001

Coppyright: Sound Intelligence

** Introduction**

The cochlea model performs a frequency
analysis of the incoming sound (as given by the resampler). This frequency analysis
could be compared to a short term FFT or wavelet analysis, but has substantial
advantages over these. The most important advantage is that in the cochlea analysis
*continuity* in both time and frequency can be guaranteed.

The code implemented here is based
on a model of the human inner ear (=*cochlea*) developed in the early 1980's
in Groningen. In the real cochlea the incoming sound is split into its contributing
frequency components by a membrane (the *basilar membrane*
or BM) which has varying mechanical properties along its length. Figure
1 shows the position of the cochlea and various related structures in
the human head, as well as a cross-section of the cochlea, showing the
position of the BM.

**Figure 1a:** Overview of the human hearing organ.

**Figure 1b:**
Cross-section of the cochlea (at the plane indicated in the inset). 1:
scala media, 2: scala vestibuli, 3: scala tympani, 4: spiral ganglion,
5: auditory nerve. The **BM** separates the scala media from scala tympani.

The variation in mechanical properties along the BM causes high frequencies
to be expressed at the start of the cochlea (*base*) and lower frequencies
further towards the end (*apex*).
This leads to a place-frequency-relation where every frequency (in the
range of hearing) has a corresponding position in the cochlea that will
react maximally to that frequency. Figure 2 shows the BM with some
frequencies at the corresponding locations.

**Figure 2:** Relation between BM position and frequency.

For further details
on the human cochlea see any book on hearing (e.g. [1], [2], or the www-site
"*Promenade
round the Cochlea*").

** Transmission line**

The "Groningen" model
is a so-called *transmission line model *of the cochlea, as opposed to
the more common *filterbank model*. In a *filterbank model*
the incoming sound is processed by a (large) number of separate
filters, each representing a section of the real cochlea. These filters
will in general have overlapping frequency responses, but the motion in
one filter does not affect the motion in any of the other filters. This
implies that continuity in the spatial (=frequency) direction can not
be guaranteed.

In the *transmission line* version a numerical
implementation is constructed of the equations of motion governing the BM. This
leads to a set of coupled oscillators (from now on the term section will be
used).

** Note**

If the mechanics of the cochlea are assumed linear, a transmission line model can be converted into a filterbank. This is done by using the impulse response of each of the sections of the transmission line as filter coefficients in a set of non-recursive filters. The filterbank is then implemented as a set of overlap-and-add convolution filters (In [3] this implementation is used).

In order to be able to introduce nonlinearity in later stages, and to keep modeling as close to the human system as possible, the transmission line model is used.

Details about the transmission line model can be found in e.g. [4], [5]. For details about the filterbank implementation see [3].

** Equations governing the transmission line model**

The equation of motion of each of the sections of the transmission line model is:

[C.1]

In this equation *y* denotes the displacement of the section and *p* the
local driving pressure. These values do, of course, depend on the position *x*, but this dependence is omitted from the formulae in order to improve readability. Dots are used to denote time-derivatives.

In equation C.1 *m* is the mass (per
unit area) of the BM, *d*(*x*) is the damping (per unit area), which
depends on the position *x* on the BM. The stiffness (per unit area) *s*(*x*)
also depends on the position. This dependence is used to introduce the
place-frequency relation mentioned before. A formula describing the
place-frequency relation was given by Greenwood (1991) [6], which is
used here. This place-frequency relation can be used to compute the
stiffness as a function of position from .

The equation given by Greenwood [6] gives the characteristic frequency for every position (the frequency that reacts maximally). It is used in the model as a (free) resonance frequency, which is slightly higher, depending on other settings of the model. With the standard settings of the model this difference is small (less than 1%). The assumption that the Greenwood map gives the resonance frequency allows easy computation of the stiffness from . See [5] for more details about the difference between characteristic and resonance frequency.

The damping as function of position is computed from the stiffness by assuming a quality factor of resonance *Q*, using . In the original model the *Q*-factor
was set to a constant value of 20 (based on experimental data). In
later years van den Raadt and Duifhuis discovered that this leads to
unwanted reflection of energy at the apex of the model [7]. Duifhuis
suggested a reduction of *Q* to 0.5 at the apex to remove all unwanted reflections [8]. In this implementation *Q* is reduced (or its reverse, d,
increased), not only to reduce reflections, but also to limit group
delay (see section on group delay in the documentation on
initialisation).

A second equation is needed to describe the coupling of the sections in longitudinal (=*x*)
direction. This equation describes the mechanics of the incompressible
(non-viscous) fluid inside the cochlea, surrounding the BM:

_{}

[C.2]

where a'^2 is a constant determined by the density of the cochlear fluid, the cross-sectional area of the cochlear channels and the width of the BM.

These equations can be combined to one single equation by eliminating *p*:

_{}

[C.3]

Although it is possible to solve this equation analytically under certain conditions, in general a numerical approach is taken to arrive at a solution. More details about the derivation of these equations and the possible solution methods can be found in [9].

** Numerical implementation**

The standard way to
solve equation C.3 numerically involves the separation of the time
dependence and the spatial dependence. This allows the use of a
separate method for the spatial integration and the time integration.
Therefore the equation is rewritten as proposed by Prof. dr ir H.W.
Hoogstraten (R*u*G, Mathematics, *unpublished*) by defining *j*, *g* and *v* as:

_{} |
[C.4a] |

_{} |
[C.4b] |

_{} |
[C.4c] |

with which we can rewrite equation C.3 to the set:

_{} |
[C.5a] |

_{} |
[C.5b] |

_{} |
[C.5c] |

Equations C.5b and C.5c directly follow from C4.c and C.4b. C.5a can be obtained from equation C.3, by using C.4a and C.4b, and =/*m*.

** Dimensionalisation**

The resulting set of equations contains
a number of variables for which units of measure need to be chosen. Although
it would make sense to choose SI units for all quantities, this does not always
lead to efficient code. Therefore scaling units are introduced that fit the
range of expected values for each of the variables. These scaling units have
to be incorporated into the equations, a process known as *dimensionalisation*
of the equations. As unit of time the millisecond was chosen. For the unit
of distance the situation is somewhat complicated. The geometry of the cochlea
can be scaled with its length, but motions of the BM are of a much smaller scale.
Therefore the nm is chosen as a "dynamic" unit of distance, the cochlear length as "static" unit of distance.

The process of dimensionalisation is described in more detail in the work of Marc van den Raadt [10].

** Spatial integration**

Equation C.5a only contains spatial derivatives and needs to be integrated in the spatial direction. Therefore it needs to be discretised. An equidistant numerical grid is used, leading to BM sections with equal size . The spatial derivative is discretised as . Equation C.5a thus becomes:

_{}

[C.6]

This forms a system of equations that can be
written as a matrix equation: *A**j*=*r*, where *A* is a tridiagonal matrix. The matrix *A* is
known (and constant over time). The righthand-side *r* contains the velocity and displacement that have to be computed using equations C.5b and C.5c.

This equation can be solved, provided that *A* and *r*
are known, by a method named *Gauss-elimination*. This method can be found in any
book on numerical mathematics, e.g. [11], [12]. (In the case of the matrix A no pivoting is required and
a unique solution is guaranteed, because the matrix is diagonally dominant, see [10]
pp 28).

The solution procedure consists of eliminating the upper subdiagonal, starting at the bottom, moving up. This elimination is done by subtracting row n+1, multiplied by the proper fraction, from the row at n. The same procedure could be repeated for the lower subdiagonal, moving down, but a simpler method can be used. Moving down, the elements on the lower subdiagonal all refer to known values, so the solution can be computed directly.

**Note:**

In 2000 Olof Schuring used a method developed at the Mathematics Department of the R*u*G
for parallellization of a Gauss-elimination method, in order to
distribute the cochlea code over various computer processors. Although
this method can be used effectively for the cochlea model, the gain in
computational speed will not be much higher than a factor 5, because of
the overhead due to the parallelization scheme. Details can be found in
[13].

Since the matrix *A*
is invariant in time, the coefficients needed to elemininate the upper
subdiagonal elements, as well as the new diagonal elements need to be
computed only once. Therefore the computation of these coefficients can
be performed in the initialisation (see documentation on
initialisation).

** Boundary conditions**

Equation C.6 holds for sections in the interior of the cochlea. The situation at both ends, the helicotrema at the apex and the connection to the outside world through the middel ear at the base, is somewhat different.

The helicotrema is modelled as a hole in the BM, which means that the BM does not cover the entire length of the cochlea, but stops at a certain distance from the apex. This distance is taken to be equal to the height of a cochlear channel (which is half the diameter of the entire cochlea). The fluid flow through this hole is modelled as the fluid flow in the rest of the cochlea (mass only, due to incompressibility and non-viscous nature of the cochlear fluid), with the addition of a damping. This damping is introduced to represent resistance the fluid experiences due to the flow through a narrow hole.

The middle ear is modelled as an ideal transformer, scaling up pressures and scaling down velocities by a factor . This transformer is coupled to the outside world, which is represented by the acoustic impedance *Z_a*
of air (a damping), through a mass and stiffness. The combination of
stiffness, damping and mass of the middle ear models the transfer
characteristic of the real middle ear in a crude fashion. This transfer
is maximal around 2 kHz, with a very low *Q*-value (based on experimental data on middle ear transfer functions).

The exact shape of the equations at these boundaries is derived in [10]. Here a so-called electric analogon is used to derive the equations of motion. No special attention will be paid to this technique here, but because cochlea models are often represented as electric schematics the schematics of the "Groningen" model are given in figure 2. The resulting equations for the two boundaries are:

_{}

[C.7a]

_{}

[C.7b]

In the case of the middle ear connection equation C.5c also changes, to:

_{}

[C.8]

see [vdR pp 29-32] for more details. Note that the middle ear is modelled with a separate section (section number 0). For the helicotrema such an extra section is not required, so sections 1 to N are BM sections.

** Time integration**

In order to obtain values for velocity and displacement at each time(step) a time integration technique is required.

** Note**

The values of displacement and velocity are required for the righthand-side of equation C.5a (and C.6). But the value of phi is required for the solution of equation C.5c. Therefore the solution of the system of equations can only be found by simultaneous application of spatial and temporal integration.

The easiest and most computationally
efficient method for time integration would be a direct (or explicit or Euler's)
method (see [12] pp300). In such a direct method the first order derivative of *y* (between timesteps *n* and *n+1*) is approximated as (identically for *v*)(superscripts
are used for time-indices, in order to see the difference with spatial
indices). Equations C.5b and C.5c are written as:

_{}

[C.9a]

_{}

[C.9b]

However, computational efficiency is not the only requirement
to be met by a time integration method. A very important consideration is numerical
stability. Numerical stability is often defined as a region of the complex plane.
Solutions of the original set of differential equations of the shape _{}are stable
(that is converge to 0 as *t* goes to infinity) if *l* has a real part that is negative. The *difference* equation (the numerical approximation of the differential equation) also has solutions of the form _{}, which are stable if _{}falls within
a so-called stability region in the complex plane. Each numerical (time integration) method has its own stability region.

We know that in the cochlea sections move sinusoïdally,
which means l has a considerable imaginary part for these solutions. Therefore the stability of solutions
with a (sizeable) imaginary part is of great importance. Direct methods
provide poor stability for solutions with a sizeable imaginary part and little effective damping (negative
real part of *l*). Therefore a different (more complex) time integration
method was originally chosen for the cochlea model: a four-step *Runge Kutta
method*.
This four-step method combines good stability with a high accuracy
(small numerical error per time step). Its stability region is shown in
figure 3, along with that of a direct method.

Runge-Kutta, however, has the disadvantage that it requires four evaluations per time step: one at the start to obtain a prediction of the situation half-way between the start and the finish of the time step, one to correct this prediction, one to obtain a prediction of the situation at the finish based on the corrected prediction half-way, and finally a correction on this prediction. (The Runge Kutta method is a prediction-correction scheme).

A full explanation of the Runge-Kutta method falls outside the scope of this document. For further details the reader is referred to a book on numerical mathematics, see e.g. [12] pp 366-380. Here only the resulting equations are given.

The four-step Runge Kutta has the general form:

_{}

[C.10]

for the first order differential equation:

_{}

For our system of coupled equations this changes to:

_{}

[C.11]

The conversion from equations C.10 to C.11 is not straightforward, but further details fall outside the scope of this document.

It can be seen from this formula
that stimulus values are needed at times _{}. This is
why the input frequency of the cochlea model is double the internal sampling frequency _{}, which equals the output sampling frequency.

(**N.B.** the term sampling frequency in relation to the cochlea model will
refer to the *internal* sampling at
_{}
, which is half the input sampling frequency.)

As output the values of the velocity
are chosen. The rationale for this is that experimental evidence seems to indicate that the velocity of the BM
is the driving force for the excitation of the sensory inner hair cells. (This
does not remove the need for computation of the displacement, since this variable
is required to compute _{}).
In order to increase the spatial resolution in the model, the second
order spatial derivative of the velocity is taken. Although there is no
clear evidence of such a process taking place in the real cochlea, it
could be linked to either (mechanical) bending of the BM, or some form
of lateral inhibition or suppression. This lateral suppression could
take place at the level of the BM mechanics, through the outer hair
cells, or in the auditory nerve or higher brain centers. Although
experimental data show no clear evidence of lateral suppression, there
are strong indications that it may play a role in hearing.

The cochlea code consists of two parts:

**Initialisation**

In order to reduce computational load coefficients needed for Gauss elimination or Runge Kutta integration are computed before the actual (time) run starts, whenever possible. This process is quite complex, since it attempts to provide the user with easily understandable physical parameters and limit computation complexity within the main loop as far as possible. Therefore this part is described seperately (see document cochlea initialisation).

**Integration**

This is the actual main loop of the program, consisting of the four stages of the Runge Kutta method, with at each stage a Gauss elimination of the matrix equation.

References:

[1] Noteboom, S.G. and Cohen, A. (1976), "Spreken en verstaan"

[2] Moore, B.C.J. (*ed*) (1995), "Hearing"

[3] Andringa, T.C., *PhD thesis* (*in preparation*)

[4] Duifhuis, H., Hoogstraten, H.W., van Netten, S.M., Diependaal, R.J.
and Bialek, W. (1985), "Modelling the cochlear partition with coupled
Van der Pol oscillators" in: *Peripheral Auditory Mechanisms*, edited by J.B. Allen, J.L. Hall, A.E. Hubbard, S.T. Neely and A. Tubis (Springer, New York) pp. 290-297.

[5] van Hengel, P.W.J. (1996), "Emissions from cochlear modelling" *PhD thesis*

[6] Greenwood, D.D. (1991), "Critical bandwidth and consonance in
relation to cochlear frequency-position coordinates" Hear. Res. **54**, 164-208.

[7] van den Raadt, M.P.M.G. and Duifhuis, H. (1993), "Different
boundary conditions in a one-dimensional time domain cochlea model" in *Biophysics of Hair Cell Sensory Systems*, edited by H. Duifhuis, J.W. Horst, P. van Dijk and S.M. van Netten (World Scientific, Singapore) pp. 103

[8] Duifhuis, H. (2001), "..."

[9] de Boer, E. (1980), "Auditory Physics. Physical principals in hering theory. I" Physics Reports **62**, 87-174.

[10] van den Raadt, M.P.M.G. (1991), "Formulering 1-Dimensionaal
COCHLEA MODEL in termen van een NETWERK (naar H. DUIFHUIS)
[MACROMECHANISCH]" *internal publication Department of Biophysics RuG*

[11] Kreyszig, E. (1988), "Advanced Engineering Mathematics"

[12] Atkinson, K.E. (1978), "An Introduction to Numerical Analysis"

[13] Schuring, O.J. (2000), "Enhancement of a model of the human cochlea" *internal publication Department of Biophyiscs RuG*