Binaural sound source localization - Dual delay-line algorithm

Note: sample code to this algorithm can be found here.

Another - mathematically much more challenging - implementation of the Jeffress model is the dual delay-line algorithm. It performs coincidence detection through comparing the shifted phases of the input signals in the frequency domain.

The following figure shows the structure of a dual delay-line (click to enlarge). The triangular elements are delays while the circular elements are the coincidence detectors. Note that the structure is shown for only one frequency (index m).

Dual delay-line structure.

Dual delay-line structure.

Let x_{Ln}(k) and x_{Rn}(k) be the k-th sample of the timeframe at time n of the left and right audio channels, respectively (each timeframe has a length of M samples). The first step in the dual delay-line algorithm is to transform the time domain signals into the frequency domain with the help of a short-time Fourier transform:

\begin{eqnarray} x_{Ln}(k) & \Longleftrightarrow & X_{Ln}(m), \\ x_{Rn}(k) & \Longleftrightarrow & X_{Rn}(m), \\ & & k=0,\ldots,M-1;\; m=0,\ldots,M/2-1. \end{eqnarray}

Here, X_{Ln}(m) and X_{Rn}(m) are the complex values of the Fourier transform of x_{Ln}(k) and x_{Rn}(k) at time n. The actual frequency f_m (in Hz) corresponding to each Fourier index m can be determined by computing f_m=mf_s/M, with f_s being the sampling frequency.

To compute the delay values \tau_i necessary at each coincidence detector element, the azimuth space from -90° to +90° is partitioned into I sectors of equal size. Rearranging the equation which relates time difference to azimuth, implementing the azimuth range partitioning and dividing by 2 (the value of the delay at both inputs of the coincidence detector need to add up to the actual delay) yields:

\tau_i= \frac{1}{2}\frac{b}{c}\sin(\frac{i}{I-1}\pi-\frac{\pi}{2}),\quad i=0,\ldots,I-1.

Computing the delayed signals is quite easy. As we are in the frequency domain, a time delay simply corresponds to a phase shift. The phase shift \Delta\phi is related to the time difference \Delta t through following equation:

\Delta\phi=2\pi f\Delta t,

where f is frequency.

Applying this to our Fourier-transformed signals yields:

\begin{eqnarray} X_{Ln}^{(i)}(m) & = & X_{Ln}(m)e^{-j2\pi f_m\tau_i}, \\ X_{Rn}^{(i)}(m) & = & X_{Rn}(m)e^{j2\pi f_m\tau_i}, \\ & & i=0,\ldots,I-1;\; m=0,\ldots,M/2-1, \end{eqnarray}

where the superscript (i) denotes the coincidence detector index and j is the imaginary unit, i.e. j^2=-1.

Each coincidence detector subtracts both its inputs and computes the magnitude of the resulting number, i.e. the result of this operation is a real number:

\begin{eqnarray} \Delta X^{(i)}_n(m) & = & |X_{Ln}^{(i)}(m)-X_{Rn}^{(i)}(m)|,\\ & & i=0,\ldots,I-1;\; m=0,\ldots,M/2-1. \end{eqnarray}

The following figure shows a 3D coincidence map, which is the result of plotting the output of the coincidence detector elements for every frequency index m and every coincidence detector index i. For simplicity, m and i have been replaced by the corresponding frequency and azimuth values in the figure. The input signals used were two impulses with a time delay between left and right corresponding to an azimuth of approximately -24.5°.

3D coincidence map.

3D coincidence map \Delta X^{(i)}_n(m).

Note that the correct azimuth corresponds to the minimum in the coincidence map which is consistent over all frequencies. Other minima are caused by phase ambiguities.

The 3D coincidence map is integrated over time in order to get a more robust estimate:

\begin{eqnarray} P_n(i,m) & = & \sum^n_{k=0}{\beta^{n-k}\Delta X^{(i)}_n(m)},\\ & & i=0,\ldots,I-1;\; m=0,\ldots,M/2-1, \end{eqnarray}

with \beta\le 1 a time constant which controls the influence of past information on the current estimate.

The final localization function is obtained by integrating the time-averaged 3D coincidence map over frequency:

H_n(i)=\sum_{m=0}^{M/2-1}{P_n(i,m)},\; i=0,\ldots,I-1.

The following picture shows the result of integrating the 3D coincidence map from the previous picture over frequency. The correct minimum at -24.5° is now clearly evident and the phase ambiguities only register as a ripple in the localization function.

Localization estimate.

Localization estimate H_n(i).

The final step of the dual delay-line algorithm is the determination of the index i of the minimum of H_n(i). From this, the azimuth to the source in degrees can directly be computed:

Dual delay-line structure.

Dual delay-line structure.

3D coincidence map.

3D coincidence map \Delta X^{(i)}_n(m).

Localization estimate.

Localization estimate H_n(i).