Time delay estimation from HRTFs and HRIRs
2006
Abstract
Time delay, the propagation time for an acoustic wave emitted from a sound source to reach the ear drum of a listener, and Interaural Time Delay (ITD), the difference between the arrival times of the acoustic waves at the two ears, are very important cues for perceiving the position of a sound source. They are therefore essential for forming effective virtual sound in 3-D sound systems. Time delay and ITD are contained both in Head-Related Transfer Functions (HRTFs) and in Head-Related Impulse Responses (HRIRs). However, it is not easy to estimate time delay and ITD accurately from HRTFs and HRIRs due to the coarse time resolution and the pinna effect. In this work, we compare the performance of several typical methods for time delay estimation and introduce an HRIR interpolation method to improve the accuracy of estimation.
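Because a delay read off a sampled HRIR is quantized to the sampling period, interpolating (upsampling) the HRIR refines a peak-based estimate. The sketch below illustrates this idea with FFT-based resampling; the function name and the choice of resampling method are illustrative assumptions, not necessarily the interpolation method of the paper.

```python
import numpy as np
from scipy.signal import resample

def delay_from_peak(hrir, fs, upsample=8):
    """Estimate propagation delay (seconds) from the HRIR maximum peak,
    refined by band-limited (FFT-based) upsampling of the impulse response."""
    fine = resample(hrir, len(hrir) * upsample)  # periodic sinc interpolation
    return np.argmax(np.abs(fine)) / (fs * upsample)

# Synthetic check: a unit impulse delayed by 30 samples at 44.1 kHz
fs = 44100
h = np.zeros(256)
h[30] = 1.0
tau = delay_from_peak(h, fs)  # ≈ 30 / 44100 s
```

With a real measured HRIR the true onset generally falls between sample instants, which is where the upsampling actually buys sub-sample resolution.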
![Figure 1. Left ear time delay (LTD) on the horizontal plane](https://figures.academia-assets.com/81553925/figure_001.jpg)

The direct sound wave arrives first, reaching an ear directly. Therefore, we can obtain the time delay from the position of the maximum peak of the HRIR (TD_max peak). Strictly speaking, however, the maximum peak of the HRIR occurs when the total acoustic energy, including not only the direct sound waves but also the waves reflected by surrounding structures, reaches its maximum. As a result, the earliest arriving wave precedes the maximum peak of the HRIR, so many researchers take the time corresponding to 12% (or 15%) of the maximum peak value of the HRIR as the time delay (TD_% peak) [Duda et al., 1998]. The HRTF phase contains information regarding the distance between a sound source and an ear. Therefore, we can regard the group delay, corresponding to the overall slope of the unwrapped HRTF phase, as the propagation time between a sound source and an ear. Thus, we can obtain the time delay by fitting a linear phase to the unwrapped HRTF phase via the least-squares method (TD_phase) [Tohyama et al., 1995]. Figure 1 shows the left ear time delay (LTD) of the B&K HATS on the horizontal plane obtained with the above methods. The distance from the head center to the loudspeaker was 1 m and the sampling frequency was 44,100 Hz. The results show that LTD_phase is larger than LTD_max peak and LTD_% peak at azimuths from 0° to 70°. In this region the reflection from the tragus is dominant; in other words, the peak due to the tragus reflection immediately follows the first peak generated by the direct sound waves, so the overall slope of the HRTF phase steepens at these azimuth angles. However, LTD_phase and LTD_max peak are almost the same at azimuths from 90° to 180°. This can be interpreted as the distinction between the direct waves and the tragus-reflected waves becoming ambiguous, because the direct path between the sound source and the ear is obscured by the pinna or the head.

Figure 2 shows the phases for the B&K HATS and the Sphere-Head-Related-Transfer-Function (SHRTF) after elimination of the linear trend derived from the maximum peaks of the HRIR and the Sphere-Head-Related-Impulse-Response (SHRIR). The SHRTF, which corresponds to the HRTF of a rigid sphere without pinnae, was first obtained analytically by Lord Rayleigh at the end of the 19th century [Strutt, 1945; Duda et al., 1998; Strutt, 1904]. After elimination of the linear trend via the maximum peak of the SHRIR, the overall slope of the SHRTF almost disappears, whereas that of the HRTF still remains at some azimuths.
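The three estimators discussed above can be sketched as follows. This is a minimal illustration, not the paper's implementation: function names are invented here, and the 0.15 onset fraction is one of the two values (12% or 15%) the text mentions.

```python
import numpy as np

def td_max_peak(hrir, fs):
    """Delay from the sample index of the HRIR's maximum magnitude."""
    return np.argmax(np.abs(hrir)) / fs

def td_threshold(hrir, fs, frac=0.15):
    """Delay from the first sample whose magnitude exceeds frac * max|HRIR|
    (onset detection in the spirit of Duda et al., 1998)."""
    mag = np.abs(hrir)
    return np.argmax(mag >= frac * mag.max()) / fs

def td_phase(hrir, fs):
    """Delay from a least-squares linear fit to the unwrapped HRTF phase:
    for a pure delay tau, phase(f) = -2*pi*f*tau, so the fitted slope a
    gives tau = -a / (2*pi)."""
    H = np.fft.rfft(hrir)
    f = np.fft.rfftfreq(len(hrir), d=1.0 / fs)
    phase = np.unwrap(np.angle(H))
    a, b = np.polyfit(f, phase, 1)  # phase ≈ a*f + b
    return -a / (2.0 * np.pi)

# Sanity check on a pure 40-sample delay, where all three must agree
fs = 44100
h = np.zeros(512)
h[40] = 1.0
```

For a real HRIR the three estimates diverge exactly as Figure 1 shows: reflections after the onset pull td_max_peak and td_phase away from the onset-based td_threshold.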




Related papers
2017 IEEE 3rd VR Workshop on Sonic Interactions for Virtual Environments (SIVE)
Sound in Virtual Reality (VR) has been explored through a variety of algorithms that try to enhance the illusion of presence, improving sound localization and spatialization in the virtual environment. As new systems are developed, different models are applied, and there is still a need to evaluate and understand the main advantages of each of these approaches. In this study, a performance comparison of two methods for real-time 3D binaural sound tested preference and quality of presence for headphones in a VR experience. Both the mathematically based HRTF and the convolution-based measured HRTF from the MIT KEMAR show a general similarity in the participants' sense of localization, depth, and presence. Nevertheless, the tests also indicate a preference in elevation perception for the convolution-based measured HRTF. Further experiments with new tools, techniques, contexts, and guidelines are therefore required to highlight the importance of and differences between these two methods and other implementations.
Acoustical Science and Technology, 2003
This paper proposes a new method for the interpolation of Head-Related Transfer Functions (HRTFs) applied to the generation of 3-D binaural sound, especially when dealing with moving sound sources indoors. The method combines a modified linear interpolation strategy with a representation of the auditory space based on spatial characteristic functions (SCFs), previously known from the literature. The main idea here is to associate the low complexity the SCF-based representation yields in the multi-source case with the inherent simplicity of the linear interpolation. Complexity issues are discussed. The performance of the proposed method is evaluated against the direct bilinear interpolation of HRTFs, using Spatial Frequency Response Surfaces (SFRSs).
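As a baseline for interpolation schemes like the one above, plain linear interpolation between two measured HRIRs can be sketched as follows. This is an illustrative stand-in, not the SCF-based method of the paper, and the function name and data are hypothetical. In practice the onset delays are usually removed first so that the peaks align; otherwise the weighted sum comb-filters.

```python
import numpy as np

def interp_hrir(h_a, h_b, theta_a, theta_b, theta):
    """Linearly interpolate two (delay-aligned) HRIRs measured at azimuths
    theta_a and theta_b for an intermediate azimuth theta_a <= theta <= theta_b."""
    w = (theta - theta_a) / (theta_b - theta_a)
    return (1.0 - w) * h_a + w * h_b

# Midpoint between two toy responses
h_a = np.array([1.0, 0.0])
h_b = np.array([0.0, 1.0])
out = interp_hrir(h_a, h_b, 0.0, 10.0, 5.0)  # → [0.5 0.5]
```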
2000
The generation of virtual auditory space (VAS) requires that the sound presented, say over headphones, is filtered in a manner that replicates the normal filtering of the external auditory periphery (the "outer ears"). The sound pressure transformation from a point in space to the eardrum is referred to as the Head-Related Transfer Function (HRTF). HRTFs are measured at discrete points in space, while space itself is continuous. We describe the acoustic and psychophysical errors associated with a method of HRTF interpolation that employs a spherical thin-plate spline. Errors in the reconstructed HRTFs were dependent on the number of locations in the interpolation set and increased markedly for interpolation sets with fewer than 150 locations (sparse sets). Auditory localization performance began to deteriorate for interpolation sets with fewer than 150 locations, and the localization errors principally followed the cone of confusion. These results indicate that high-fidelity continuous VAS can be generated from HRTFs recorded at as few as 150 discrete locations.
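To give a flavor of spline interpolation over measurement directions: the sketch below uses SciPy's thin-plate-spline radial basis interpolator on unit vectors in 3-D. This is only a rough stand-in for the true spherical thin-plate spline of the paper, and the toy "HRTF feature" being interpolated is invented for the example.

```python
import numpy as np
from scipy.interpolate import RBFInterpolator

# Hypothetical measurement directions: 200 random unit vectors
rng = np.random.default_rng(0)
dirs = rng.normal(size=(200, 3))
dirs /= np.linalg.norm(dirs, axis=1, keepdims=True)

# Toy scalar "HRTF feature" varying smoothly with direction
vals = np.sin(3 * dirs[:, 0]) + 0.5 * dirs[:, 2]

# Thin-plate-spline RBF fit over the sampled directions
interp = RBFInterpolator(dirs, vals, kernel="thin_plate_spline")

q = np.array([[0.0, 1.0, 0.0]])  # unmeasured query direction
estimate = interp(q)[0]           # true value at q is 0.0
```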
Accurate localization of sound in 3-D space is based on variations in the spectrum of sound sources. These variations arise mainly from reflection and diffraction effects caused by the pinnae and are described through a set of Head-Related Transfer Functions (HRTFs) that are unique for each azimuth and elevation angle. A virtual sound source can be rendered in the desired location by filtering with the corresponding HRTF for each ear. Previous work on HRTF modeling has mainly focused on methods that attempt to model each transfer function individually. These methods are generally computationally complex and cannot be used for real-time spatial rendering of multiple moving sources. In this work we provide an alternative approach, which uses a multiple-input single-output state-space system to create a combined model of the HRTFs for all directions. This method exploits the similarities among the different HRTFs to achieve a significant reduction in the model size with a minimum loss of accuracy.
Computer Music Journal, 1995
About the Cover Cylindrical Surface Plot of the Head-Related Transfer Function: Magnitude Response as a Function of Frequency over Azimuth Angle on a Radial Axis, by William Martens of E-mu/Creative Technology Center. The cover photograph shows a visualization of the magnitude response (gain) of the head-related transfer function (HRTF) measured at the eardrum position of the anthropomorphic mannequin KEMAR. HRTFs were measured for 19 loudspeaker directions circling the side of the head facing the loudspeaker placed at ear level. The surface was constructed by interpolating the gain within each of 50 log-spaced frequency bands for the 19 HRTFs using a bicubic spline. The lowest band was centered on 55 Hz, the highest on 21,331 Hz. The distance of the surface from the origin and the color indicates the gain at a particular frequency and azimuth, which ranges from blue-black at the lowest gain (-43.9 dB) to a desaturated yellow at the peak gain (14.6 dB).
Proceedings of 1995 Workshop on Applications of Signal Processing to Audio and Accoustics, 1995
In order to achieve realistic synthesized 3-dimensional acoustic fields over headphones, low-order approximations of head-related transfer functions (HRTFs) are desirable not only because of the reduction in computational complexity, but also because of the potential for allowing listeners to modify the low-order approximation parameters in order to generate interpolated HRTFs that optimize the source localization percept. By fitting the directional component of an HRTF, commonly known as the directional transfer function (DTF) [1], it is possible to achieve low-order systems for the purpose of interpolating HRTFs even if the number of parameters required to approximate the entire HRTF is relatively large.
Acoustical Science and Technology, 2012
2004
One of the fundamental limitations on the fidelity of interactive virtual audio display systems is the delay that occurs between the time a listener changes his or her head position and the time the display changes its audio output to reflect the corresponding change in the relative location of the sound source. In this experiment, we examined the impact that six different head-tracker latency values (12, 20, 38, 73, 145 and 243 ms) had on the localization of broadband sound sources in the horizontal plane. In the first part of the experiment, listeners were allowed to take all the time they needed to point their heads in the direction of a continuous sound source and press a response switch. In the second part of the experiment, the stimuli were gated to one of eight different durations (64, 125, 250, 375, 500, 750, 1000 and 2000 ms) and the listeners were required to make their head-pointing responses within two seconds after the onset of the stimulus. In the open-ended respons...

References (9)
- Brungart, D., and Rabinowitz, W., 1999, "Auditory localization of nearby sources. Head-related transfer functions," J. Acoust. Soc. Am. Vol. 106, No. 3, pp. 1465-1479.
- Duda, R., and Martens, W., 1998, "Range dependence of the response of a spherical head model," J. Acoust. Soc. Am. Vol. 104, No. 5, pp. 3048-3058.
- Kistler, D., and Wightman, F., 1992, "A model of head-related transfer functions based on principal components analysis and minimum-phase reconstruction," J. Acoust. Soc. Am. Vol. 91, No. 3, pp. 1637-1647.
- Kulkarni, A., Isabelle, S. K., and Colburn, H. S., 1995, "On the minimum-phase approximation of head-related transfer functions," Proc. IEEE ASSP workshop on ASPAA, pp. 84-87.
- Rayleigh, L., 1907, "On our perception of sound direction," Phil. Mag. Vol. 13, pp. 214-232.
- Shin, K., and Park, Y., 2004, "Near field HRTF measurement and analysis to reproduce the virtual sound field," Proc. Fall Conf. Acoust. Soc. Kor., pp. 335-338.
- Strutt, J. W., 1904, "On the acoustic shadow of a sphere," Phil. Trans. R. Soc. London, Ser. A, Vol. 203, pp. 87-89.
- Strutt, J. W., 1945, The theory of sound, Dover, New York, vol. 1 and 2.
- Tohyama, M., Suzuki, H., and Ando, Y., 1995, The nature and technology of acoustic space, Academic Press, San Diego, pp. 97-103.