This paper addresses the topic of note, cut and strike detection in Irish traditional music (ITM). In order to do this we first evaluate state-of-the-art onset detection methods for identifying note boundaries. Our method utilises the results from manually and automatically segmented flute recordings. We then demonstrate how this information may be utilised for the detection of notes and single-note articulations idiomatic of this genre for the purposes of player style identification. Results for manually annotated onsets achieve 86%, 70% and 74% accuracies for note, cut and strike classification respectively. Results for automatically segmented recordings are considerably lower; therefore, we perform an analysis of the onset detection results per event class to establish which musical patterns contain the most errors.
The paper presents a novel approach to real-time temporal alignment of motion sequences, called On-line Predictive Warping (OPW), and considers potential uses in interactive applications. The approach combines the least-cost alignment of motions used in dynamic time warping (DTW) with the short-term predictions of smoothing algorithms, in an iterative step-through approach. It allows a recorded motion sequence to be warped to align it with a user's motion as it is being captured. The paper demonstrates the potential feasibility of the approach to support applications in MR and VR, allowing virtual characters to perform and interact with users and live actors in a variety of rehearsal, training, visualisation and performance scenarios.
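To illustrate the least-cost alignment that the abstract says OPW builds on, here is a minimal sketch of classic offline DTW on two 1-D sequences. It is not the OPW method itself: the on-line, predictive part is not described in enough detail here to reproduce, and the function name `dtw_align` and the absolute-difference distance are assumptions made for illustration.

```python
def dtw_align(a, b):
    """Classic offline DTW (illustrative only, not OPW).

    Returns the total alignment cost and the warping path as a list of
    (index_in_a, index_in_b) pairs.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = least cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])  # assumed local distance
            cost[i][j] = d + min(cost[i - 1][j],      # step in a only
                                 cost[i][j - 1],      # step in b only
                                 cost[i - 1][j - 1])  # step in both
    # Backtrack from the end to recover the least-cost path.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        candidates = [
            (cost[i - 1][j - 1], i - 1, j - 1),
            (cost[i - 1][j], i - 1, j),
            (cost[i][j - 1], i, j - 1),
        ]
        _, i, j = min(candidates)
    path.reverse()
    return cost[n][m], path


total, path = dtw_align([1, 2, 3], [1, 2, 2, 3])
```

In this toy example the repeated value in the second sequence is absorbed by mapping two of its samples onto one sample of the first. This offline form needs both sequences in full, which is the limitation a real-time approach has to work around.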
This work presents a system for providing ubiquitous visual information to actors in a virtual studio, blue screen, or other mixed reality environment. The Scanning Mirror Projector (ScaMP) system uses steerable projection and head pose estimation to form a new type of ever-present display that projects visual information of a virtual world to the gaze direction of the user. This system allows improved interaction between the real and virtual environments.
A model is presented for the analysis and synthesis of low-frequency human-like pitch deviation, as a replacement for existing modulation techniques in singing voice synthesis systems. Fundamental frequency (f0) measurements are taken from vocalists producing a selected range of utterances without vibrato, and trends in the data are observed. A probabilistic function that provides natural-sounding low-frequency f0 modulation to synthesized singing voices is presented, and its perceptual relevance is evaluated with subjective listening tests.
Multitrack Mixing Using a Model of Loudness and Partial Loudness
A method for generating a mix of multi-track recordings using an auditory model has been developed. The proposed method is based on the concept that a balanced mix is one in which the loudness of all instruments is equal. A sophisticated psychoacoustic loudness model is used to measure the loudness of each track both in quiet and when mixed with any combination of the remaining tracks. Such measures are used to control the track gains in a time-varying manner. Finally, we demonstrate how model predictions of partial loudness can be used to counteract energetic masking for any track, allowing the user to achieve better channel intelligibility in complex music mixtures.
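The balancing idea can be sketched in a few lines, under a strong simplifying assumption: RMS level stands in for the psychoacoustic loudness model, and a single static gain per track replaces the time-varying control described above. The function names and the choice of the mean level as the target are hypothetical; masking and partial loudness are not modelled.

```python
import math


def rms(samples):
    """Root-mean-square level of one track (a crude stand-in for loudness)."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def equal_level_gains(tracks):
    """Return one gain per track so that all non-silent tracks share one RMS level.

    The target is the mean RMS of the non-silent tracks (an assumption);
    silent tracks are left unchanged with a gain of 1.0.
    """
    levels = [rms(t) for t in tracks]
    active = [lv for lv in levels if lv > 0]
    if not active:
        return [1.0] * len(tracks)
    target = sum(active) / len(active)
    return [target / lv if lv > 0 else 1.0 for lv in levels]


# Usage: two tracks at different levels are brought to a common level.
tracks = [[0.5, -0.5, 0.5, -0.5], [0.1, -0.1, 0.1, -0.1]]
gains = equal_level_gains(tracks)
balanced = [[g * s for s in t] for g, t in zip(gains, tracks)]
```

A perceptual model would replace `rms` with a loudness estimate and recompute the gains frame by frame.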
The measurement of perceived loudness is a difficult yet important task with a multitude of applications such as loudness alignment of complex stimuli and loudness restoration for the hearing impaired. Although computational hearing models exist, few are able to accurately predict the binaural loudness of everyday sounds. Such models demand excessive processing power, making real-time loudness metering problematic. In this work, the dynamic auditory loudness models of Glasberg and Moore (J. Audio Eng. Soc., 2002) and Chen and Hu (IEEE ICASSP, 2012) are presented, extended and realised as binaural loudness meters. The performance bottlenecks are identified and alleviated by reducing the complexity of the excitation transformation stages. The effects of three parameters (hop size, spectral compression and filter spacing) on model predictions are analysed and discussed within the context of features used by scientists and engineers to quantify and monitor the perceived loudness of...
Audio Engineering Society Convention Paper 8693 Presented at the 133rd Convention
This Convention paper was selected based on a submitted abstract and 750-word precis that have been peer reviewed by at least two qualified anonymous reviewers. The complete manuscript was not peer reviewed. This convention paper has been reproduced from the author's advance manuscript without editing, corrections, or consideration by the Review Board. The AES takes no responsibility for the contents. Additional papers may be obtained by sending request
Meeting the educational needs of students currently requires moving toward collaborative electronic and mobile learning systems that parallel the vision of Web 2.0. However, factors such as data freedom, brokerage, interconnectivity and the Internet of Things add to a vision for Web 3.0 that will require consideration in the development of future campus-based, distance and vocational study. Education can therefore be expected, in future, to require deeper technological connections between students and learning environments, based on significant use of sensors, mobile devices, cloud computing and rich-media visualization. We discuss challenges associated with such a futuristic campus context, including how learning materials and environments may be enriched by it. As an additional novel element, the potential for much of that enrichment to be realized through development by students, within the curriculum, is also considered. We will conclude that much of the technology require...
A model is presented for the analysis and synthesis of low-frequency human-like pitch deviation, as a replacement for existing modulation techniques in singing voice synthesis systems. We build on research from [1] in order to find the features of drift under selected conditions in singing. Fundamental frequency (f0) measurements are taken from vocalists producing a selected range of utterances without vibrato, and trends in the data are observed. Finally, we present a function that provides more natural low-frequency f0 modulation to synthesized singing voices.
The measurement of perceived loudness is a difficult yet important task with a multitude of applications such as loudness alignment of complex stimuli and loudness restoration for the hearing impaired. Although computational hearing models exist, few are able to accurately predict the binaural loudness of everyday sounds. Such models demand excessive processing power, making real-time loudness metering problematic. In this work, the dynamic auditory loudness models of Glasberg and Moore (J. Audio Eng. Soc., 2002) and Chen and Hu (IEEE ICASSP, 2012) are presented, extended and realised as binaural loudness meters. The performance bottlenecks are identified and alleviated by reducing the complexity of the excitation transformation stages. The effects of three parameters (hop size, spectral compression and filter spacing) on model predictions are analysed and discussed within the context of features used by scientists and engineers to quantify and monitor the perceived loudness of music...
Papers by Cham Athwal