Academia.edu

Blizzard challenge

15 papers
16 followers
About this topic
The Blizzard Challenge is an annual competition in the field of speech synthesis, where participants develop systems to generate natural-sounding speech from text. It evaluates the quality of synthesized speech using standardized datasets and metrics, fostering advancements in voice technology and promoting research collaboration within the speech processing community.

Key research themes

1. How can Blizzard Challenge entries optimize speech synthesis quality with limited linguistic resources and data?

This research area investigates methodologies and technological innovations employed by multiple Blizzard Challenge participants to build high-quality text-to-speech (TTS) synthesis systems utilizing minimal annotated data, scarce linguistic expertise, or low-resource languages. It aims to address key challenges in speech synthesis development for less-resourced languages or scenarios with limited supervised data, by leveraging automatic data processing, unsupervised learning, novel linguistic units, and minimal expert intervention strategies.

Key finding: The SIMPLE 4 ALL consortium introduced a pipeline for speech synthesis construction requiring minimal expert supervision, exemplified by automatic segmentation and alignment of audiobook recordings as ‘found’ data,...
Key finding: This work reports a unit-selection concatenative TTS system developed for Tamil using only one hour of provided speech data. The system used forced alignment through HTK to generate phoneme-level segmentation from given data,...
Key finding: Though primarily focused on mathematical card shuffling, this paper’s underlying methodological approach of modeling stochastic processes with limited sampling parallels techniques relevant to evaluating randomness in unit...
Key finding: CMU's entry prioritized using a carefully selected subset of the large database by pruning less reliable segments, focusing on news and Arctic subsets to maximize quality while mitigating labeling errors. By employing a...
Key finding: Lessac Technologies introduced 'Lessemes', an extensive symbolic phonetic representation that includes prosodic and coarticulatory information allowing precise unit clustering beyond traditional phonetic sets. Their hybrid...

2. What challenge types do players prefer in gaming, and how can a validated challenge inventory inform game design?

This research theme centers on understanding the multidimensional nature of 'challenge' in video games, delineating distinct challenge types, and measuring player preferences to inform both academic understanding and practical game development. Developing a psychometrically validated inventory quantifies key challenge types, enabling correlation with player motivations and genre preferences. The aim is to provide a measurement tool that enables nuanced, player-centric game design and research.

3. How does motivated play in MMORPGs like World of Warcraft relate to positive and negative player experiences?

This area focuses on the complex interplay of player motivations and their experiential outcomes in persistent, immersive multiplayer online role-playing games (MMORPGs) such as World of Warcraft (WoW). Utilizing ethnographic methods and mixed qualitative and quantitative analyses, the research seeks to disentangle how core motivational dimensions—Achievement, Social, and Immersion—correlate with both wellbeing-enhancing and distressing game experiences. Understanding these dynamics informs psychological theory on gaming and has implications for game design and player welfare.

Key finding: Through mixed ethnographic and survey methodologies within WoW, this study extends Yee's 3-factor motivation framework by showing that Achievement motivation frequently correlates with problematic or addictive play, while...

All papers in Blizzard challenge

This paper describes I2R's submission to the Blizzard Challenge 2010 speech synthesis evaluation. This is our third participation in the challenge. In this paper, we will describe our main approaches to building the required voices. We...
The present paper reports on the DFKI entry to the Blizzard challenge 2008. The main difference of our system compared to last year is a new join model inspired by last year's iFlytek paper; the effect seems small, but measurable in the...
The paper describes the Blizzard Challenge 2009 participation of MARY TTS, an open-source TTS system using a unit selection voice. We briefly outline the new language support framework we provide so that people can add support for their...
Lessac Technologies has developed a technology for concatenated speech synthesis based on a novel approach for describing speech in which expressivity, voice quality, and speaking style are fundamental. The main aspect of our system is...
BACKGROUND: The necessity of setting up high-resolution models is essential to timely forecast dangerous meteorological phenomena. OBJECTIVE: This study presents a verification of the numerical Weather Research and Forecasting...
This paper describes NTNU’s entry for the Blizzard Challenge 2010. Our system is a conceptually simple variation of an HMM-based unit selection system, which uses diphones as the basic unit and employs a combined selection of units and...
by Minghui Dong and 1 more
Synthesized speech can be largely degraded in noise, resulting in compromised speech quality. In this paper, we propose a unit selection based speech synthesis system for better speech quality under poor channel conditions. First, the...
This paper reports I2R's submission to the Blizzard Challenge 2008. This is our first participation in the Blizzard Challenge. In this paper, we describe the approach that we used to build the three required voices. We introduced the...
The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and...
The present paper outlines the Vergina speech database, which was developed in support of research and development of corpus-based unit selection and statistical parametric speech synthesis systems for Modern Greek language. In the...
Text to Speech plays a vital role in imparting information to the general population who have difficulty reading text but can understand spoken language. In Bhutan, many people fall in this category in adopting the national language...
An extreme Koshava episode (EKE) from 30 January to 4 February 2014 has been studied. Koshava is a local windstorm in Southeast Europe. EKE was characterized by wind gusts above 45 m s⁻¹ and deep snowdrifts. Strong Eurasian anticyclone...
This paper gives an overview of the UCD Blizzard Challenge 2011 entry. The entry is a unit selection synthesiser that uses hidden Markov models for prosodic modelling. The evaluation consisted of s ...
In this paper, we investigate which unit size (syllable, half-phone, or quarter-phone) to use for speech synthesis in a multilingual screen reader, both for phonetic languages such as Telugu and for a non-phonetic language, English. Perceptual...
The Blizzard challenge 2014 was the tenth annual Blizzard challenge organized by the following group of institutions: IIIT Hyderabad, IIT Madras, DAIICT, SSN College of Engineering, IIT Mandi and IIT Guwahati with support and...
In this paper, we describe an efficient method of de-identification of speech such that the transformation from the source speech is furthest away from the source features, yet fully intelligible. We have designed a speaker ID...
The L&H RealSpeak Laboratory TTS (RSLab) system is a corpus based speech synthesis system comprising components that deal with linguistic processing, prosody prediction, segment selection, concatenation and modification. In this paper we...
This paper describes the CereVoice text-to-speech system developed by Cereproc Ltd, and its use for the generation of the test sentences for the Albayzin 2008 TTS evaluation. Also, the building procedure of a Cerevoice-compatible voice...
This paper presents a new analytic method that can be used for analyzing perceptual relevance of unit selection costs and/or their sub-components as well as for tuning of unit selection weights. The proposed method is leveraged to...
The purpose of the present paper is to examine the relationships between target and concatenation costs and the quality (with focus on naturalness) of generated speech. Several synthetic phrases were examined by listeners with the aim to...
This paper presents the development of Croatian speech synthesis systems. Three voices were built using the same recorded speech corpus. Two of these voices were built with the Festival speech synthesis system, using the clustering unit...
One problem with speech synthesis impeding high quality is the occurrence of audible discontinuities at segment boundaries. Formant jumps across concatenation points suggest the problem to be due to spectral differences. The problem is...
This paper describes I2R's submission to the Blizzard Challenge 2009. This is our second time participating in this challenge. In this paper, we will describe our main approach to building the required voices. We will introduce the...
This paper describes a special version of IVONA Text-To-Speech for a GB English voice designed and developed by IVO Software for The Blizzard Challenge 2009. The architecture of this system is based on an improved IVONA Text-To-Speech...
A general-purpose isiZulu text-to-speech (TTS) system was developed, based on the "Multisyn" unit-selection approach supported by the Festival TTS toolkit. The development involved a number of challenges related to the interface between...
One approach to the generation of natural-sounding synthesized speech waveforms is to select and concatenate units from a large speech database. Units (in the current work, phonemes) are selected to produce a natural realisation of a...
This paper describes simple designing methods of corpus-based visual speech synthesis. Our approach needs only a synchronous real image and speech database. Visual speech is synthesized by concatenating real image segments and speech...
Diphone Backoff mechanisms in text-to-speech provide a means of ensuring that synthesis of the text takes place, even if some of the diphones in the text are missing in the speech database. This paper describes an automatic method for...
This paper proposes a new text-to-speech synthesis technique for producing continuous, natural sounding speech of a specific speaker. The synthesis technique is based on selecting short speech frames from a phoneme-labeled speech...