Spectrogram Feature Losses for Music Source Separation

2019, 2019 27th European Signal Processing Conference (EUSIPCO)

https://doi.org/10.23919/EUSIPCO.2019.8903019

Abstract

In this paper we study deep learning-based music source separation, and explore using an alternative loss to the standard spectrogram pixel-level L2 loss for model training. Our main contribution is in demonstrating that adding a high-level feature loss term, extracted from the spectrograms using a VGG net, can improve separation quality vis-à-vis a pure pixel-level loss. We show this improvement in the context of the MMDenseNet, a state-of-the-art deep learning model for this task, for the extraction of drums and vocal sounds from songs in the musdb18 database, covering a broad range of Western music genres. We believe that this finding can be generalized and applied to broader machine learning-based systems in the audio domain.
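
The idea of combining a pixel-level L2 loss with a feature loss can be sketched roughly as follows. This is only an illustration under stated assumptions, not the paper's implementation: the abstract does not say which VGG layers are used or how the two terms are weighted, so `toy_features` (a bank of fixed random filters followed by a ReLU) stands in for the pretrained VGG net, and `feature_weight` and all function names are hypothetical.

```python
import numpy as np


def pixel_l2_loss(pred, target):
    """Mean squared error between two magnitude spectrograms."""
    return float(np.mean((pred - target) ** 2))


def toy_features(spec, kernels):
    """Stand-in for a pretrained feature network (NOT VGG).

    Applies a bank of fixed k x k filters to a 2-D spectrogram
    (valid cross-correlation) and passes the result through a ReLU.
    """
    k = kernels.shape[-1]
    # windows has shape (H - k + 1, W - k + 1, k, k)
    windows = np.lib.stride_tricks.sliding_window_view(spec, (k, k))
    feats = np.einsum("hwij,cij->chw", windows, kernels)
    return np.maximum(feats, 0.0)


def combined_loss(pred, target, kernels, feature_weight=0.1):
    """Pixel-level L2 loss plus a weighted L2 loss in feature space.

    feature_weight is an assumed hyperparameter; the abstract gives no value.
    """
    pixel = pixel_l2_loss(pred, target)
    feat_diff = toy_features(pred, kernels) - toy_features(target, kernels)
    feature = float(np.mean(feat_diff ** 2))
    return pixel + feature_weight * feature


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    target = rng.random((16, 16))             # toy "ground-truth" spectrogram
    pred = target + 0.1 * rng.standard_normal((16, 16))
    kernels = rng.standard_normal((4, 3, 3))  # 4 fixed 3x3 filters
    print("pixel only:", pixel_l2_loss(pred, target))
    print("combined:  ", combined_loss(pred, target, kernels))
```

In a real training setup, the stand-in would be replaced by activations from a pretrained network with frozen weights, and the loss would be written in an automatic-differentiation framework so that gradients flow back to the separation model.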
