Key research themes
1. How have automatic speech recognition (ASR) systems evolved methodologically to address speech variability and improve recognition accuracy?
This theme traces the methodological progression of ASR systems from early pattern-matching techniques to probabilistic models and neural networks. Central challenges include intra- and inter-speaker variability, continuous speech, and environmental noise. Understanding these developments is crucial for improving ASR accuracy and robustness in diverse real-world settings.
2. What roles do multisensory inputs and motor theories play in advancing models of human speech perception?
This theme investigates how speech perception research integrates auditory, visual, and tactile modalities, and how motor theories of perception address the 'lack of invariance' problem, that is, the absence of a one-to-one mapping between acoustic cues and perceived phonemes. Multisensory approaches consider how visual cues (e.g., lip movements) and somatosensory feedback contribute to phonetic interpretation, helping to resolve ambiguity and improve recognition, with implications for models of both human and machine perception.
3. How can open-access clinical speech corpora facilitate reproducible research and the development of AI speech technologies for atypical speech populations?
This theme explores the creation, accessibility, and utility of large clinical speech datasets that support reproducibility, comparative research, clinical training, and AI development for populations with speech sound disorders. Such corpora enable standardized evaluation and algorithm training and support education in speech processing, in particular by addressing the challenge of representing children and individuals with speech impairments in training data.