Key research themes
1. How can auditory models improve the accuracy of pitch segmentation and transcription in singing sequences?
This research area develops and evaluates auditory model-based transcription systems that convert a sung sequence into discrete (pitch, duration) pairs. Accurate segmentation and transcription are critical for applications such as Query-by-Humming (QBH) systems, where matching a sung query against a music database depends on precise note boundary detection and pitch estimation. The main challenges are keeping segmentation errors low and accommodating the variability between singing with and without lyrics.
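To make the target representation concrete, the sketch below turns a frame-wise fundamental-frequency track into (pitch, duration) pairs. It is not an auditory model: it is a deliberately simple stand-in that rounds each voiced frame to the nearest MIDI note and merges runs of equal pitch. The function names, the 10 ms hop, and the 50 ms minimum note length are illustrative assumptions, not values taken from the research summarized above.

```python
import math


def hz_to_midi(f_hz):
    """Convert a frequency in Hz to a (fractional) MIDI note number, A4 = 440 Hz = 69."""
    return 69.0 + 12.0 * math.log2(f_hz / 440.0)


def segment_notes(f0_hz, hop_s=0.01, min_dur_s=0.05):
    """Group a frame-wise F0 track into (midi_pitch, duration_s) pairs.

    f0_hz     -- one F0 value per frame, in Hz; 0 marks an unvoiced frame
    hop_s     -- assumed time between frames, in seconds
    min_dur_s -- assumed minimum note length; shorter runs are discarded
    """
    min_frames = max(1, round(min_dur_s / hop_s))
    notes = []
    current = None  # rounded MIDI pitch of the open run, None while unvoiced
    length = 0      # number of frames in the open run

    # A trailing unvoiced sentinel frame forces the last run to be closed.
    for f in list(f0_hz) + [0.0]:
        pitch = round(hz_to_midi(f)) if f > 0 else None
        if pitch == current:
            length += 1
            continue
        if current is not None and length >= min_frames:
            notes.append((current, length * hop_s))
        current, length = pitch, 1
    return notes
```

Calling `segment_notes` on a track with ten frames at 440 Hz, a short pause, and eight frames at 493.88 Hz yields two notes, MIDI 69 and MIDI 71. Rounding to the nearest semitone splits a note whenever vibrato or a glide crosses a semitone boundary, which is exactly the kind of segmentation error the auditory-model systems aim to avoid.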
2. What advanced signal processing techniques enable robust and high-resolution pitch tracking in complex and noisy audio signals?
This theme investigates advances in pitch estimation algorithms that offer higher time-frequency resolution, greater noise robustness, and effective multi-pitch tracking. The techniques draw on mathematical transforms, empirical mode decomposition, canonical correlation analysis, and statistical modeling to disambiguate pitch information in acoustically rich or degraded signals. The emphasis is on continuous pitch estimation and harmonic models as ways to improve tracking accuracy, which is essential for applications such as speech synthesis, music transcription, and robot audition.
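For orientation, here is a minimal sketch of the classical baseline that such methods improve on: a single-frame autocorrelation pitch estimator. It is not one of the techniques named above (no empirical mode decomposition, canonical correlation analysis, or statistical modeling), and the function name, search range, and voicing threshold are assumptions chosen for illustration.

```python
def estimate_f0_autocorr(frame, sample_rate, fmin=80.0, fmax=500.0,
                         voicing_threshold=0.3):
    """Estimate the F0 of one frame from its normalized autocorrelation.

    Returns the estimate in Hz, or None if no lag in the search range
    reaches the (assumed) voicing threshold.
    """
    n = len(frame)
    if n == 0:
        return None
    mean = sum(frame) / n
    x = [s - mean for s in frame]          # remove DC offset
    energy = sum(v * v for v in x)
    if energy == 0.0:
        return None                        # silent frame

    lag_min = max(1, int(sample_rate / fmax))
    lag_max = min(int(sample_rate / fmin), n - 1)

    best_lag, best_score = None, voicing_threshold
    for lag in range(lag_min, lag_max + 1):
        # Biased autocorrelation, normalized by the lag-0 energy.
        score = sum(x[i] * x[i + lag] for i in range(n - lag)) / energy
        if score > best_score:
            best_lag, best_score = lag, score

    return sample_rate / best_lag if best_lag is not None else None
```

Because the lag is an integer number of samples, the frequency resolution of this estimator is coarse at high pitches, and the estimate degrades quickly in noise or when several sources overlap. Those are the limitations that higher-resolution transforms and harmonic models are meant to address.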
3. How can integrated acoustic and music language models enable multi-pitch detection and voice assignment in polyphonic vocal music?
Research within this theme explores systems that combine probabilistic acoustic models with musicological language models to detect multiple concurrent pitches and, at the same time, assign each detected pitch to an individual voice or singer in polyphonic a cappella recordings. This integration addresses the difficulty of detecting pitches among overlapping harmonic sources and enables voice separation based on voice-leading rules and temporal continuity. The resulting methods support the transcription and analysis of complex vocal ensembles such as chorales and quartets.
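The voice-assignment step can be illustrated with a much simpler rule than a full music language model: give each detected pitch to the voice whose previous pitch is closest, so that every voice moves as little as possible from frame to frame. The sketch below is a toy version of that temporal-continuity idea, assuming the multi-pitch detector has already produced a list of MIDI pitches per frame. The function name `assign_voices` and the need for a starting pitch per voice are assumptions made for this example.

```python
from itertools import permutations


def assign_voices(frames, initial_pitches):
    """Assign detected pitches to voices by minimizing total pitch movement.

    frames          -- list of frames, each a list of detected MIDI pitches
    initial_pitches -- assumed starting pitch for each voice (e.g. S, A, T, B)

    Returns one row per frame with one entry per voice (None = voice silent).
    """
    last = list(initial_pitches)
    n_voices = len(last)
    assigned = []

    for pitches in frames:
        if len(pitches) > n_voices:
            raise ValueError("more detected pitches than voices")

        # Exhaustive search over voice choices; fine for a handful of voices.
        best_cost, best_perm = None, ()
        for perm in permutations(range(n_voices), len(pitches)):
            cost = sum(abs(p - last[v]) for p, v in zip(pitches, perm))
            if best_cost is None or cost < best_cost:
                best_cost, best_perm = cost, perm

        row = [None] * n_voices
        for p, v in zip(pitches, best_perm):
            row[v] = p
            last[v] = p  # a silent voice keeps its previous pitch as reference
        assigned.append(row)

    return assigned
```

For a four-voice example starting at `[72, 67, 60, 48]`, the unordered frame `[60, 72, 48, 67]` is mapped back to `[72, 67, 60, 48]`, and a three-pitch frame `[71, 48, 62]` leaves the second voice silent. A probabilistic language model would replace the absolute-distance cost with learned transition probabilities and could also penalize voice crossing, which this sketch does not do.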