
A Compressed Encoding Scheme for Approximate TDOA Estimation

2018, 2018 26th European Signal Processing Conference (EUSIPCO)

https://doi.org/10.23919/EUSIPCO.2018.8553197

Abstract

Accurate estimation of Time Differences of Arrival (TDOAs) is necessary for accurate sound source localization. The problem has traditionally been solved with methods such as Generalized Cross-Correlation, which use the entire signal to estimate TDOAs. This poses a problem in distributed sensor networks where the amount of data that each sensor can transmit to a fusion center is limited, as in underwater scenarios or other challenging environments. Inspired by approaches from computer vision, in this paper we identify Scale-Invariant Feature Transform (SIFT) keypoints in the signal spectrogram and perform cross-correlation using only the information available at those keypoints. We test our algorithm under different noise and reverberation conditions, with different speech signals and source locations. We show that our algorithm can estimate TDOAs and the source location within an acceptable error range at a compression ratio of 40:1.
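For orientation, the traditional full-signal baseline mentioned in the abstract, Generalized Cross-Correlation, can be sketched as follows. This is a minimal illustration with the common PHAT weighting, not the paper's compressed scheme: the function name `gcc_phat`, its parameters, and the choice of PHAT weighting are assumptions made for this example, and the SIFT keypoint step is not shown.

```python
import numpy as np

def gcc_phat(x, y, fs, max_tau=None):
    """Estimate the delay of y relative to x, in seconds, using GCC-PHAT.

    Hypothetical helper for illustration only; a positive result means
    that y lags behind x.
    """
    n = len(x) + len(y)              # zero-pad so circular correlation does not wrap
    X = np.fft.rfft(x, n=n)
    Y = np.fft.rfft(y, n=n)
    cross = Y * np.conj(X)           # cross-power spectrum
    cross /= np.abs(cross) + 1e-12   # PHAT weighting: keep phase, drop magnitude
    cc = np.fft.irfft(cross, n=n)    # generalized cross-correlation

    max_shift = n // 2
    if max_tau is not None:
        # Restrict the search to physically plausible delays.
        max_shift = max(1, min(int(fs * max_tau), max_shift))

    # Reorder so that lags run from -max_shift to +max_shift.
    cc = np.concatenate((cc[-max_shift:], cc[:max_shift + 1]))
    lag = int(np.argmax(np.abs(cc))) - max_shift
    return lag / fs
```

Note that both complete signals must be available at one place for this computation, which is the transmission cost the abstract identifies as the bottleneck in bandwidth-limited sensor networks.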
