Informed audio source separation: A comparative study

Abstract

The goal of source separation algorithms is to recover the constituent sources, or audio objects, from their mixture. However, blind algorithms still do not yield estimates of sufficient quality for many practical uses. Informed Source Separation (ISS) makes separation robust in settings where the audio objects are known during a so-called encoding stage. During that stage, a small amount of side information is computed and transmitted with the mixture. At the decoding stage, when the sources are no longer available, the mixture is processed using the side information to recover the audio objects. This greatly improves the quality of the estimates, at the cost of an additional bitrate that depends on the size of the side information. In this study, we compare six state-of-the-art methods in terms of quality versus bitrate, and show that good separation performance can be attained at competitive bitrates.
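The encoding and decoding stages described above can be illustrated with a minimal sketch. It is not one of the six methods compared in the study; it assumes a single-channel mixture, non-overlapping frames, and side information made of coarsely quantized log-power spectrograms that drive a Wiener-like mask at the decoder. The names `FRAME`, `STEP_DB`, `encode`, `decode`, and `sdr_db` are hypothetical and chosen for illustration only.

```python
import cmath
import math
import random

FRAME = 16     # hypothetical frame length in samples; frames do not overlap
STEP_DB = 3.0  # hypothetical quantizer step (dB) for the side information


def dft(x):
    """Naive discrete Fourier transform of one frame."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(spec):
    """Naive inverse DFT, keeping the real part."""
    n = len(spec)
    return [(sum(spec[k] * cmath.exp(2j * math.pi * k * t / n)
                 for k in range(n)) / n).real for t in range(n)]


def frames(x):
    """Split a signal into consecutive frames of FRAME samples."""
    return [x[i:i + FRAME] for i in range(0, len(x), FRAME)]


def encode(sources):
    """Encoding stage: the sources are known. The side information is one
    integer (quantized log-power) per source, frame, and frequency bin."""
    return [[[round(10 * math.log10(abs(c) ** 2 + 1e-12) / STEP_DB)
              for c in dft(f)] for f in frames(s)] for s in sources]


def decode(mixture, side_info):
    """Decoding stage: only the mixture and the side information are used."""
    estimates = [[] for _ in side_info]
    for i, f in enumerate(frames(mixture)):
        mix_spec = dft(f)
        # Dequantize the power of every source in this frame.
        power = [[10 ** (q * STEP_DB / 10) for q in src[i]] for src in side_info]
        total = [sum(p[k] for p in power) for k in range(FRAME)]
        for j, p in enumerate(power):
            # Wiener-like mask: share of the mixture attributed to source j.
            masked = [mix_spec[k] * p[k] / total[k] for k in range(FRAME)]
            estimates[j].extend(idft(masked))
    return estimates


def sdr_db(reference, estimate):
    """Signal-to-distortion ratio in dB (plain energy ratio)."""
    num = sum(r * r for r in reference)
    den = sum((r - e) ** 2 for r, e in zip(reference, estimate)) + 1e-12
    return 10 * math.log10(num / den)


# Toy data (assumed for illustration): a tone mixed with white noise.
random.seed(0)
n = 32 * FRAME
tone = [math.sin(2 * math.pi * 2 * t / FRAME) for t in range(n)]
noise = [random.gauss(0.0, 0.5) for _ in range(n)]
mixture = [a + b for a, b in zip(tone, noise)]

side_info = encode([tone, noise])               # sources available here
est_tone, est_noise = decode(mixture, side_info)  # sources not available here

print("tone  SDR (dB):", sdr_db(tone, mixture), "->", sdr_db(tone, est_tone))
print("noise SDR (dB):", sdr_db(noise, mixture), "->", sdr_db(noise, est_noise))
```

In this sketch, a smaller `STEP_DB` gives a finer mask but more distinct integer values to transmit, which is the quality-versus-bitrate trade-off the abstract refers to. A practical system would additionally entropy-code or embed the side information, which is omitted here.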
