Academia.edu

Voice Production

61 papers
15 followers
About this topic
Voice production is the physiological and acoustic process by which humans generate sound through the vocal folds in the larynx, modulating airflow and pressure to create speech and other vocalizations. It encompasses the study of the mechanics of phonation, resonance, and articulation in the context of communication and performance.

Key research themes

1. How can speech synthesis systems be adapted to support both speech and singing voice production from neutral speech corpora?

This research area focuses on developing text-to-speech (TTS) frameworks that extend beyond conventional speech synthesis to incorporate singing voice production without requiring dedicated singing databases. The motivation lies in the cost, feasibility, and flexibility challenges of recording supplementary singing corpora, especially when the original speaker is unavailable or unable to sing well. The key insight is integrating speech-to-singing (STS) conversion within unit selection or corpus-based TTS systems using neutral speech databases, enabling synthesis of expressive vocal outputs for applications like storytelling, assistive devices, and immersive experiences.

Key finding: Introduced a unit selection-based TTS and singing (US-TTS&S) framework that integrates speech-to-singing conversion to generate both speech and singing from a single neutral speech corpus. The system was validated objectively...
Key finding: Developed an expressive speech synthesizer tailored for military training applications using corpus-based concatenative synthesis with samples classified by speaking style. The system exhibited versatile, high-quality...
Key finding: Provided an overview and design of TTS synthesizers using concatenative and formant synthesis approaches, highlighting unit selection and diphone synthesis. Emphasized the trade-offs between database size, naturalness, and...
Key finding: Integrated GlórCáil voice analysis-synthesis system into a DNN-based TTS framework to manipulate glottal source and vocal tract parameters globally, enabling control over speaker identity (gender, age) and affective coloring....

2. What computational and vocal models facilitate the control and learning of expressive vocal intonation and prosody, including for language learning and voice training?

This theme explores computational synthesis techniques and interactive training methods designed to improve vocal expressiveness, particularly intonation patterns and prosodic features critical for natural speech and singing. The focus includes how speech synthesis models are manipulated for expressive control and how novel interfaces support second language (L2) speakers in mastering challenging intonation, as well as models for professional voice training to optimize vocal and prosodic quality. These approaches provide actionable methods for enhancing voice performance through controlled vocal synthesis and targeted training.

Key finding: Demonstrated that real-time hand-gesture controlled vocal synthesis (Performative Vocal Synthesis, PVS) enables L2 learners (French speakers learning English intonation) to produce more comprehensible categorical intonation...
Key finding: Designed and experimentally validated a vocal training program to improve vocal and prosodic elements (breathing, articulation, loudness, pitch, jitter, speech rate, pauses, stress) in journalism students. Post-training...
Key finding: Provided a computational framework for manipulating glottal and vocal tract parameters to generate variations in affective expression and speaker identity within synthetic speech, revealing that global parameter shifts can...

3. How can physical and computational models of vocal fold physiology and acoustics improve understanding and simulation of voice production?

This theme surveys synthetic vocal fold models and numerical approaches that accurately represent the biomechanics and aerodynamics of phonation to better simulate human voice production. It includes the design of self-oscillating vocal fold models, quantification of vocal fold geometry, and stabilized finite element methods for wave equations in moving vocal tracts. These advances help elucidate the complex coupling of tissue vibration, airflow, and acoustics, yielding insights for synthesis, voice therapy, and model-based voice production research.

Key finding: Provided a comprehensive review of two principal classes of synthetic self-oscillating vocal fold models—membranous (e.g., water-filled latex tubes) and elastic solid (e.g., multi-layered ultrasoft silicone)—detailing their...
Key finding: Quantified 3D medial surface geometry of porcine vocal folds using microCT before and after freezing, finding ~5% non-uniform expansion due to freezing. Demonstrated qualitative similarity of porcine vocal fold geometry to...
Key finding: Proposed a subgrid scale stabilized finite element method (FEM) to solve the mixed form wave equation within an arbitrary Lagrangian-Eulerian (ALE) framework, addressing inf-sup compatibility and high-frequency oscillations...
Key finding: Analyzed how inclusion of a finite relaxation length for the flow to transition to one-dimensionality downstream of the glottis affects low-order vocal fold voice production models. Demonstrated that shorter relaxation...

All papers in Voice Production

At present, two important questions about voice remain unanswered: When voice quality changes, what physiological alteration caused this change, and if a change to the voice production system occurs, what change in perceived quality can...
Voice analysis can be used for many purposes, for example for diagnosis or for preventing damage to the voice. The voice can easily be damaged, especially by excessive or incorrect use. This thesis describes the basics...
by B. Pickup and 1 more
The influence of asymmetric vocal fold stiffness on voice production was evaluated using life-sized, self-oscillating vocal fold models with an idealized geometry based on the human vocal folds. The models were fabricated using flexible,...
Objectives: Decreasing the closing speed of the vocal folds can reduce loudness and energy in the higher frequency harmonics, resulting in reduced voice quality. Our aim was to study the correlation between higher frequencies and the...
The aim of the study was an anthropometric analysis of the values of selected cranial and facial indexes in vocal students and a comparison of these values with the standards for the same ethnic and age group of non-singing students....
Objectives/Hypothesis: The posterior cricoarytenoid (PCA) muscle is the sole abductor of the glottis and serves important functions during respiration, phonation, cough, and sniff. The present study examines vocal fold abduction dynamics...
Objectives/Hypothesis: Evaluate the effects of asymmetric superior laryngeal nerve stimulation on the vibratory phase, laryngeal posture, and acoustics.
Register shift between the chest and falsetto register is generally studied in the higher-than-speaking pitch range. However, a similar difference can also be produced at speaking pitch level. The shift from breathy "falsetto" phonation...
Realistic mathematical modeling of voice production has been recently boosted by applications to different fields like bioprosthetics, quality speech synthesis and pathological diagnosis. In this work, we revisit a two-mass model of the...
The effect of subglottic stenosis on vocal fold vibration is investigated. An idealized stenosis is defined, parameterized, and incorporated into a two-dimensional, fully coupled finite element model of the vocal folds and laryngeal...