Spectrogram Feature Losses for Music Source Separation
2019, 2019 27th European Signal Processing Conference (EUSIPCO)
https://doi.org/10.23919/EUSIPCO.2019.8903019
Abstract
In this paper we study deep learning-based music source separation and explore an alternative to the standard spectrogram pixel-level L2 loss for model training. Our main contribution is demonstrating that adding a high-level feature loss term, extracted from the spectrograms using a VGG net, can improve separation quality vis-à-vis a pure pixel-level loss. We show this improvement in the context of MMDenseNet, a state-of-the-art deep learning model for this task, for the extraction of drums and vocals from songs in the musdb18 database, which covers a broad range of Western music genres. We believe that this finding can be generalized and applied to broader machine learning-based systems in the audio domain.
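The idea of combining a pixel-level L2 term with a feature-level term can be illustrated with a minimal NumPy sketch. This is not the paper's implementation: the abstract specifies a VGG net as the feature extractor, whereas `toy_features` below is a hypothetical stand-in (a small random filter bank followed by a ReLU), and the `weight` value of 0.1 is an illustrative assumption, not a setting taken from the paper.

```python
import numpy as np

def pixel_l2(est, ref):
    """Pixel-level loss: mean squared error between two spectrograms."""
    return float(np.mean((est - ref) ** 2))

def toy_features(spec, kernels):
    """Hypothetical stand-in for one VGG layer (assumption, not the paper's net).

    Applies a bank of k x k filters as a valid 2-D correlation, then a ReLU.
    spec: (H, W) array; kernels: (C, k, k) array; returns (C, H-k+1, W-k+1).
    """
    k = kernels.shape[1]
    # windows has shape (H-k+1, W-k+1, k, k)
    windows = np.lib.stride_tricks.sliding_window_view(spec, (k, k))
    maps = np.einsum("hwij,cij->chw", windows, kernels)
    return np.maximum(maps, 0.0)

def combined_loss(est, ref, kernels, weight=0.1):
    """Pixel-level L2 plus a weighted L2 distance in feature space.

    `weight` is an illustrative value, not one reported in the paper.
    """
    feat_est = toy_features(est, kernels)
    feat_ref = toy_features(ref, kernels)
    feature_l2 = float(np.mean((feat_est - feat_ref) ** 2))
    return pixel_l2(est, ref) + weight * feature_l2

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    target = rng.random((64, 32))                 # toy "magnitude spectrogram"
    estimate = target + 0.05 * rng.standard_normal(target.shape)
    filters = rng.standard_normal((8, 3, 3))      # fixed random filter bank
    print("pixel-only loss:", pixel_l2(estimate, target))
    print("combined loss:  ", combined_loss(estimate, target, filters))
```

In a real training setup the feature extractor would be a fixed, pretrained network and the loss would be computed in an automatic-differentiation framework so that gradients reach the separation model; the sketch only shows how the two terms are combined.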