An essential but complicated task in the audio production process is the selection of microphones that are suitable for a particular source. A microphone is often chosen based on price or common practices, rather than whether the microphone actually works best in that particular situation. In this paper we perceptually assess six microphone types for recording a female singer. Listening tests using a pairwise and multiple stimuli approach are conducted to identify the order of preference of these microphone types. The results of this comparison are discussed, and the performance of each approach is assessed.
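As a rough illustration of how an order of preference can be derived from pairwise trials, the sketch below counts how often each item was preferred. This is not the analysis used in the paper; the function name `rank_by_pairwise_wins`, the microphone labels and the trial outcomes are all hypothetical.

```python
from collections import Counter


def rank_by_pairwise_wins(trials, items):
    """Order items by how often they were preferred in pairwise trials.

    trials: iterable of (winner, loser) tuples, one per comparison.
    items:  all items under test, so items with zero wins still appear.
    """
    wins = Counter({item: 0 for item in items})
    for winner, _loser in trials:
        wins[winner] += 1
    # Sort by descending win count; ties fall back to alphabetical order.
    return sorted(items, key=lambda item: (-wins[item], item))


# Hypothetical data: labels and outcomes are made up for illustration.
mics = ["A", "B", "C"]
trials = [("A", "B"), ("A", "C"), ("B", "C"), ("A", "B")]
ranking = rank_by_pairwise_wins(trials, mics)
```

A simple win count ignores how consistent the listeners were; a real study would typically also test whether the differences between items are statistically significant.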
Artificial reverberation is an important music production tool with a strong but poorly understood perceptual impact. A literature review of the relevant works concerned with the perception of musical reverberation is provided, and the use of artificial reverberation in multi-source mixes is studied. The perceived amount of total artificial reverberation in a mixture is predicted using relative reverb loudness and early decay time, as extracted from the newly proposed Equivalent Impulse Response. Results indicate that both features have a significant impact on the perception of a mix and that they are closely related to the upper and lower bounds of the desired amount of reverberation in a mixture.
The Web Audio Evaluation Tool is an open-source, browser-based framework for creating and conducting listening tests. It allows remote deployment, GUI-guided setup, and analysis in the browser. While currently being used for listening tests in various fields, it was initially developed specifically for the study of music production practices. In this work, we highlight some of the features that facilitate evaluation of such content.
The Open Multitrack Testbed is an online repository of multitrack audio accessible to the public, with rich metadata annotation, a semantic database and search functionality. Two years after it first went live, the dataset is the largest and most diverse available, and still growing. An overview of the available content, some prominent features, and example uses in the field of intelligent music production are discussed.
Perceptual listening tests are commonplace in audio research and a vital form of evaluation. While a large number of tools exist to run such tests, many feature just one test type, are platform dependent, run on proprietary software, or require considerable configuration and programming. Using Web Audio, the Web Audio Evaluation Tool (WAET) addresses these concerns with a single toolbox that can be configured to run many different tests in a web browser, without requiring proprietary software or computer programming knowledge. In this paper the role of the Web Audio API in giving WAET its key functionalities is shown. The paper also highlights less common features available to web-based tools, such as easy remote testing and in-browser analytics.
Mixing music is a highly complex and important part of the music production process, with a variety of creative and technical challenges, few of which have established solutions. Consequently, several approaches are viable for each given recording, and evaluation of differences in music production practices is therefore highly subjective. However, the study of perception of music production processes reveals that there is some degree of consensus on which mixes or specific parameter settings are preferred over others. In this paper, we give an overview of prior work based on a dataset consisting of songs mixed by at least eight different mixing engineers, with extensive perceptual evaluation in the form of preference ratings and free-form comments. In contrast with most previous work in the area, we investigate realistic mixes as opposed to considering a specific process in isolation, which disregards the cross-adaptive nature of the mixing process. Furthermore, detailed perceptual evaluation of each mix makes it possible to distinguish whether the complete song or specific components thereof received a treatment that was perceived as positive or negative. Finally, with access to the original, raw audio as well as the exact parameter settings used on each processor, thorough analysis of the mix is possible.
The mix is an essential part of the music production process, which has an important but poorly understood impact on the perception of a record. Little is known about which aspects are the most important, and how to acquire such information. In this work we collect, annotate and analyse over 1400 reviews by trained listeners on 98 mixes. We assess which instruments, types of processing and mix properties are most apparent when comparing mixes, and explore which challenges arise when interpreting these comments. The benefits of using such unstructured data are discussed and a methodology for analysing it is proposed.
Subgrouping is an important part of the mix engineering workflow that facilitates the process of manipulating a number of audio tracks simultaneously. We statistically analyse the subgrouping practices of mix engineers in order to establish the relationship between subgrouping and mix preference. We investigate the number of subgroups (relative and absolute), the type of audio processing and the subgrouping strategy in 72 mixes of nine songs, by 16 mix engineers. We analyse the subgrouping setup for each mix of a particular song and also each mix by a particular mixing engineer. We show that subjective preference for a mix strongly correlates with the number of subgroups, and to a lesser extent which types of audio processing are applied to the subgroups.
We present an intelligent approach to multitrack dynamic range compression where all parameters are configured automatically based on side-chain feature extraction from the input signals. We describe a method-of-adjustment experiment that explores how audio engineers set the ratio and threshold. We use multiple linear regression to model the relationship between different features and the experimental results. Parameter automations incorporate control assumptions based on this experiment and those derived from mixing literature and analysis. Subjective evaluation of the intelligent system is provided in the form of a multiple stimulus listening test where the system is compared against a no-compression mix, two human mixes, and an alternative approach. Results showed that mixes devised by our system are able to compete with or outperform manual mixes by semi-professionals under a variety of subjective criteria.
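To make the regression step concrete, the following is a minimal sketch of multiple linear regression in plain Python, fitting a target (for example a compressor threshold chosen by an engineer) to several signal features. The feature values, the target values and the function name `fit_linear_model` are hypothetical and are not taken from the paper.

```python
def fit_linear_model(features, targets):
    """Least-squares fit of targets ~ intercept + weights . features.

    Solves the normal equations (X^T X) b = X^T y by Gauss-Jordan
    elimination with partial pivoting. Returns [intercept, w1, w2, ...].
    """
    # Prepend a constant 1.0 to every row so the model has an intercept.
    rows = [[1.0] + [float(v) for v in f] for f in features]
    n = len(rows[0])
    # Build the augmented matrix [X^T X | X^T y].
    aug = [
        [sum(r[i] * r[j] for r in rows) for j in range(n)]
        + [sum(r[i] * y for r, y in zip(rows, targets))]
        for i in range(n)
    ]
    for col in range(n):
        # Partial pivoting: move the largest remaining entry onto the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        if abs(aug[col][col]) < 1e-12:
            raise ValueError("features are collinear; model is not identifiable")
        aug[col] = [v / aug[col][col] for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [aug[i][n] for i in range(n)]


# Hypothetical data: two features per track and one chosen setting each.
# The targets follow y = 2 + 3*x1 - 1*x2 exactly, so the fit recovers it.
features = [(0, 0), (1, 0), (0, 1), (1, 1), (2, 1)]
targets = [2.0, 5.0, 1.0, 4.0, 7.0]
coefficients = fit_linear_model(features, targets)
```

In practice a numerical library would be used instead of hand-written elimination, and the fitted coefficients would be checked for significance before being used to automate a parameter.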
Perceptual evaluation tests where subjects assess certain qualities of different audio fragments are an integral part of audio and music research. These require specialised software, usually custom-made, to collect large amounts of data using meticulously designed interfaces with carefully formulated questions, and play back audio with rapid switching between different samples. New functionality in HTML5 included in the Web Audio API allows for increasingly powerful media applications in a platform independent environment. The advantage of a web application is easy deployment on any platform, without requiring any other application, enabling multiple tests to be easily conducted across locations. In this paper we propose a tool supporting a wide variety of easily configurable, multi-stimulus perceptual audio evaluation tests over the web with multiple test interfaces, pre- and post-test surveys, custom configuration, collection of test metrics and other features. Test design and setup do not require a programming background, and results are gathered automatically using web friendly formats for easy storing of results on a server.
15th International Society for Music Information Retrieval Conference in Taipei, Taiwan, Oct 27, 2014
Mixing multitrack music is an expert task where characteristics of the individual elements and their sum are manipulated in terms of balance, timbre and positioning, to resolve technical issues and to meet the creative vision of the artist or engineer. In this paper we conduct a mixing experiment where eight songs are each mixed by eight different engineers. We consider a range of features describing the dynamic, spatial and spectral characteristics of each track, and perform a multidimensional analysis of variance to assess whether the instrument, song and/or engineer is the determining factor that explains the resulting variance, trend, or consistency in mixing methodology.
A number of assumed mixing rules from literature are discussed in the light of this data, and implications regarding the automation of various mixing processes are explored.
Part of the data used in this work is published in a new online multitrack dataset (multitrack.eecs.qmul.ac.uk) through which public domain recordings, mixes, and mix settings (DAW projects) can be shared.
15th International Society for Music Information Retrieval Conference, Taipei, Oct 27, 2014
www.semanticaudio.co.uk
We present an overview of the Semantic Audio Feature Extraction (SAFE) Project, a novel data collection architecture for the extraction and retrieval of semantic descriptions of musical timbre, deployed within the digital audio workstation. By embedding the data capture system into the music production workflow, we are able to maximise the return of semantically annotated music production data, whilst mitigating against issues such as musical and environmental bias. Users of the plug-ins are able to submit semantic descriptions of their own music, whilst utilising the continually growing collaborative dataset of musical descriptors. In order to provide more contextually representative timbral transformations, the dataset is partitioned using metadata, captured within the application.
137th Convention of the Audio Engineering Society, Los Angeles, Oct 10, 2014
multitrack.eecs.qmul.ac.uk
We introduce the Open Multitrack Testbed, an online repository of multitrack audio, mixes or processed versions thereof, and corresponding mix settings or process parameters such as DAW files. Multitrack audio is a much sought after resource for audio researchers, students, and content producers, and while some online resources exist, few are large and reusable and none allow querying audio fulfilling specific criteria. The test bed we present contains a semantic database of metadata corresponding with the songs and individual tracks, enabling users to retrieve all pop songs featuring an accordion, or all tracks recorded in reverberant spaces. The open character is made possible by requiring the contributions, mainly from educational institutions and individuals, to have a Creative Commons license.
We present a toolbox for multi-stimulus perceptual evaluation of audio samples. Different from MUSHRA (typical for evaluating audio codecs), the audio samples under test are represented by sliders on a single axis, encouraging careful rating, relative to adjacent samples, where both reference and anchor are optional. Intended as a more flexible, versatile test design environment, subjects can rate the same samples on different scales simultaneously, with separate comment boxes for each sample, an arbitrary rating scale, and various randomisation options. Other tools include a pairwise evaluation tool and a loudness equalisation stage. We discuss some notable experiences and considerations based on various studies where these tools were used. We have found this test design to be highly effective when perceptually evaluating qualities pertaining to music and audio production.
53rd Conference of the Audio Engineering Society, Jan 28, 2014
An adaptive amplitude distortion effect for audio production is introduced. We consider a versatile model of multiband waveshaping with a limited set of intuitive parameters. We propose a method for adaptive scaling of the characteristic amplitude distortion transfer function. A number of methods to automate other distortion parameters are explored, such as automatic balance between clean and distorted signals for single band and multiband distortion, make-up gain and adaptive anti-aliasing. Automation formulas are generalised for several other common transfer functions. A formal perceptual evaluation of the automation algorithms is conducted, validating the approach and identifying shortcomings and particularities. Finally, we discuss implementational aspects and possible future directions.
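One plausible reading of adaptive scaling and clean/distorted balance is sketched below for a single band: the input is normalised by its own peak before a tanh waveshaper is applied, so the amount of distortion does not depend on the input level. The abstract does not specify the transfer function or the scaling rule, so the tanh curve, the peak-based scaling and the parameters `drive` and `mix` are assumptions made for illustration only.

```python
import math


def adaptive_waveshape(samples, drive=4.0, mix=0.5):
    """Single-band waveshaper whose curve is scaled to the input peak.

    samples: sequence of floats (one block of audio).
    drive:   steepness of the assumed tanh transfer function (> 0).
    mix:     0.0 returns the clean signal, 1.0 the fully distorted one.
    """
    peak = max((abs(s) for s in samples), default=0.0)
    if peak == 0.0:
        return list(samples)  # silence (or an empty block) passes through
    out = []
    for s in samples:
        normalised = s / peak  # now within [-1, 1] regardless of level
        # Divide by tanh(drive) so that an input at the peak maps to the peak.
        shaped = math.tanh(drive * normalised) / math.tanh(drive)
        out.append((1.0 - mix) * s + mix * peak * shaped)
    return out


# A quiet, made-up block: the output peak matches the input peak.
block = [0.0, 0.1, -0.2, 0.05]
distorted = adaptive_waveshape(block)
```

Because the curve is normalised to the block peak, scaling the input by a constant scales the output by the same constant, which is the level independence the sketch aims for. A multiband version would apply this per band after a crossover filter, which is omitted here.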
Papers by Brecht De Man