Blizzard challenge

15 papers

16 followers

About this topic

The Blizzard Challenge is an annual competition in the field of speech synthesis, where participants develop systems to generate natural-sounding speech from text. It evaluates the quality of synthesized speech using standardized datasets and metrics, fostering advancements in voice technology and promoting research collaboration within the speech processing community.

Key research themes

1. How can Blizzard Challenge entries optimize speech synthesis quality with limited linguistic resources and data?

This research area investigates methodologies and technological innovations employed by multiple Blizzard Challenge participants to build high-quality text-to-speech (TTS) synthesis systems utilizing minimal annotated data, scarce linguistic expertise, or low-resource languages. It aims to address key challenges in speech synthesis development for less-resourced languages or scenarios with limited supervised data, by leveraging automatic data processing, unsupervised learning, novel linguistic units, and minimal expert intervention strategies.

The Simple4All entry to the Blizzard Challenge 2014

by J. Montero

2022

Key finding: The Simple4All consortium introduced a pipeline for speech synthesis construction requiring minimal expert supervision, exemplified by automatic segmentation and alignment of audiobook recordings as ‘found’ data,...

MILE TTS for Tamil for Blizzard Challenge 2014

by Ramakrishnan Angarai Ganesan and

2014

Key finding: This work reports a unit-selection concatenative TTS system developed for Tamil using only one hour of provided speech data. The system used forced alignment through HTK to generate phoneme-level segmentation from given data,...

You betcha it's random: riffle shuffling in cards games – when is enough, enough?

by Kyle Caudle

2025, Teaching Statistics

Key finding: Though primarily focused on mathematical card shuffling, this paper’s underlying methodological approach of modeling stochastic processes with limited sampling parallels techniques relevant to evaluating randomness in unit...

CMU Blizzard 2008: Optimally using a large database for unit selection synthesis.

by Kishore Prahallad

2011, Blizzard Challenge …

Key finding: CMU's entry prioritized using a carefully selected subset of the large database by pruning less reliable segments, focusing on news and Arctic subsets to maximize quality while mitigating labeling errors. By employing a...

The Lessac Technologies Hybrid Concatenated System for Blizzard Challenge 2013

by Reiner Wilhelms-Tricarico

2024

Key finding: Lessac Technologies introduced 'Lessemes', an extensive symbolic phonetic representation that includes prosodic and coarticulatory information allowing precise unit clustering beyond traditional phonetic sets. Their hybrid...

2. What challenge types do players prefer in gaming, and how can a validated challenge inventory inform game design?

This research theme centers on understanding the multidimensional nature of 'challenge' in video games, delineating distinct challenge types, and measuring player preferences to inform both academic understanding and practical game development. Developing a psychometrically validated inventory quantifies key challenge types, enabling correlation with player motivations and genre preferences. The aim is to provide a measurement tool that enables nuanced, player-centric game design and research.

3. How does motivated play in MMORPGs like World of Warcraft relate to positive and negative player experiences?

This area focuses on the complex interplay of player motivations and their experiential outcomes in persistent, immersive multiplayer online role-playing games (MMORPGs) such as World of Warcraft (WoW). Utilizing ethnographic methods and mixed qualitative and quantitative analyses, the research seeks to disentangle how core motivational dimensions—Achievement, Social, and Immersion—correlate with both wellbeing-enhancing and distressing game experiences. Understanding these dynamics informs psychological theory on gaming and has implications for game design and player welfare.

Restorative Magical Adventure or Warcrack?: Motivated MMO Play and the Pleasures and Perils of Online Experience

by Jeffrey G Snodgrass and

2015, Games and Culture

Key finding: Through mixed ethnographic and survey methodologies within WoW, this study extends Yee's 3-factor motivation framework by showing that Achievement motivation frequently correlates with problematic or addictive play, while...

All papers in Blizzard challenge

I2R Text-to-Speech System for Blizzard Challenge 2010

by Minghui Dong

2025

This paper describes I2R's submission to the Blizzard Challenge 2010 speech synthesis evaluation. This is our third participation in the challenge. In this paper, we will describe our main approaches to building the required voices. We...

The MARY TTS entry in the Blizzard Challenge 2008

by Sathish Pammi

2024

The present paper reports on the DFKI entry to the Blizzard Challenge 2008. The main difference of our system compared to last year is a new join model inspired by last year's iFlytek paper; the effect seems small, but measurable in the...

Multilingual MARY TTS participation in the Blizzard Challenge 2009

by Sathish Pammi

2024

The paper describes the Blizzard Challenge 2009 participation of MARY TTS, an open-source TTS system using a unit selection voice. We briefly outline the new language support framework we provide so that people can add support for their...

The Lessac Technologies Hybrid Concatenated System for Blizzard Challenge 2013

by Reiner Wilhelms-Tricarico

2024

Lessac Technologies has developed a technology for concatenated speech synthesis based on a novel approach for describing speech in which expressivity, voice quality, and speaking style are fundamental. The main aspect of our system is...

Verification of temperature, wind and precipitation fields for the high-resolution WRF NMM model over the complex terrain of Montenegro

by Angel Marcev

2023, Technology and Health Care

BACKGROUND: The necessity of setting up high-resolution models is essential to timely forecast dangerous meteorological phenomena. OBJECTIVE: This study presents a verification of the numerical Weather Research and Forecasting...

Fig. 1. The domain of the WRF NMM model (red square) and the geographical location of the stations on the map of Montenegro.

Fig. 3. Monthly variations of (yellow) mean difference, (green) mean absolute difference and (brown) root mean square difference (°C) of the NMM WRF model forecast against 2 m temperature observations.

Fig. 4. Heatmap of mean difference. Shades of blue means underestimation of model while the red shade shows overestimation of modelled temperature.

Fig. 5. Correlation coefficient (CC) of the WRF NMM temperature forecast against temperature observations for forecasting periods of 24, 48 and 72 hours.

Fig. 6. Variations of mean difference (dark grey), mean absolute difference (grey) and root mean square difference (light grey) of the NMM WRF model forecast against 10 m wind observations during a day. The x axis is time in hours (UTC).

Fig. 7. Variations of (dark blue) mean difference, (blue) mean absolute difference and (light blue) root mean square difference (m s⁻¹) of the NMM WRF model forecast against 10 m wind observations.

Stations (coordinates; WMO number; model elevation; station elevation where legible): Bar [42.1 N; 19.083 E] {WMO: 13461; model_el: 4 m; el: 6 m}; Tivat [42.404 N; 18.723 E] {WMO: 13457; model_el: 73 m; el: …}; Golubovci [42.359 N; 19.252 E] {WMO: 13462; model_el: 54 m; el: …}; Niksic [42.766 N; 18.952 E] {WMO: 13459; model_el: 651 m; el: …}; Pljevlja [43.35 N; 19.35 E] {WMO: 13363; model_el: 846 m; el: …}; Zabljak [43.15 N; 19.12 E] {WMO: 13361; model_el: 1423 …}

Fig. 8. Binary contingency tables (it is best to remove them, leaving only the results).

The NTNU Concatenative Speech Synthesizer

by Torbjørn Svendsen

2023

This paper describes NTNU’s entry for the Blizzard Challenge 2010. Our system is a conceptually simple variation of an HMM-based unit selection system, which uses diphones as the basic unit and employs a combined selection of units and...

Unit selection based speech synthesis for poor channel condition

by Minghui Dong and

2023, Interspeech 2009

Synthesized speech can be largely degraded in noise, resulting in compromised speech quality. In this paper, we propose a unit selection based speech synthesis system for better speech quality under poor channel conditions. First, the...

I2R’s Submission to Blizzard Challenge 2008

by Minghui Dong

2023

This paper reports I2R's submission to the Blizzard Challenge 2008. This is our first participation in the Blizzard Challenge. In this paper, we describe the approach that we used to build the three required voices. We introduced the...

Fundamentals of Meteorology

by Vlado Spiridonov

2023

The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and...

Vergina: A Modern Greek Speech Database for Speech Synthesis

by Todor Ganchev

2022

The present paper outlines the Vergina speech database, which was developed in support of research and development of corpus-based unit selection and statistical parametric speech synthesis systems for Modern Greek language. In the...

Developing a Text to Speech System for Dzongkha

by Yeshi Wangchuk

2022, Computer Engineering and Intelligent Systems

Text to Speech plays a vital role in imparting information to the general population who have difficulty reading text but can understand spoken language. In Bhutan, many people fall in this category in adopting the national language...

Investigation of an extreme Koshava wind episode of 30 January-4 February 2014

by Mladjen Ćurić

2022, Atmospheric Science Letters

An extreme Koshava episode (EKE) from 30 January to 4 February 2014 has been studied. Koshava is a local windstorm in Southeast Europe. EKE was characterized by wind gusts above 45 m s⁻¹ and deep snowdrifts. Strong Eurasian anticyclone...

UCD Blizzard Challenge 2011 entry

by Zeeshan Ahmed

2022

This paper gives an overview of the UCD Blizzard Challenge 2011 entry. The entry is a unit selection synthesiser that uses hidden Markov models for prosodic modelling. The evaluation consisted of s ...

Building sleek synthesizers for multi-lingual screen reader

by Kishore Prahallad

2022

In this paper, we investigate the unit size (syllable, half-phone or quarter-phone) to be used for speech synthesis in a multi-lingual screen reader for phonetic languages such as Telugu and for a non-phonetic language, English. Perceptual...

The Blizzard Challenge 2014

by Swaran Lata

2022

The Blizzard Challenge 2014 was the tenth annual Blizzard Challenge, organized by the following group of institutions: IIIT Hyderabad, IIT Madras, DAIICT, SSN College of Engineering, IIT Mandi and IIT Guwahati with support and...

Figure 1: Similarity and Naturalness results on RD for IH1.1 (Assamese)

Figure 2: Similarity and Naturalness results on RD for IH1.2 (Gujarati)

Figure 3: Similarity and Naturalness results on RD for IH1.3 (Hindi)

Figure 4: Similarity and Naturalness results on RD for IH1.4 (Rajasthani)

Figure 5: Similarity and Naturalness results on RD for IH1.5 (Tamil)

Figure 6: Similarity and Naturalness results on RD for IH1.6 (Telugu)

Figure 7: Similarity and Naturalness results on SUS for IH1.1 (Assamese)

Figure 8: Similarity and Naturalness results on SUS for IH1.2 (Gujarati)

Figure 9: Similarity and Naturalness results on SUS for IH1.3 (Hindi)

Figure 10: Similarity and Naturalness results on SUS for IH1.4 (Rajasthani)

Figure 11: Similarity and Naturalness results on SUS for IH1.5 (Tamil)

Figure 12: Similarity and Naturalness results on SUS for IH1.6 (Telugu)

Figure 13: Intelligibility results on SUS for IH1.1 (Assamese)

Figure 14: Intelligibility results on SUS for IH1.2 (Gujarati)

Figure 15: Intelligibility results on SUS for IH1.3 (Hindi)

Figure 16: Intelligibility results on SUS for IH1.4 (Rajasthani)

Figure 17: Intelligibility results on SUS for IH1.5 (Tamil)

Figure 18: Intelligibility results on SUS for IH1.6 (Telugu)

Figure 19: Similarity and Naturalness results on ML for IH2.1 (Assamese)

Figure 20: Similarity and Naturalness results on ML for IH2.2 (Gujarati)

Figure 21: Similarity and Naturalness results on ML for IH2.3 (Hindi)

Figure 22: Similarity and Naturalness results on ML for IH2.4 (Rajasthani)

Figure 23: Similarity and Naturalness results on ML for IH2.5 (Tamil)

Figure 24: Similarity and Naturalness results on ML for IH2.6 (Telugu)

Table 1: Participants in Blizzard challenge 2014

De-Identification of Speech

by Bharat Kandoi

2022

In this paper, we describe an efficient method of de-identification of speech such that the transformation from the source speech is furthest away from the source features, yet fully intelligible. We have designed a speaker ID...

Segment selection in the L&H RealSpeak Laboratory TTS system

by Justin Fackrell

2021

The L&H RealSpeak Laboratory TTS (RSLab) system is a corpus based speech synthesis system comprising components that deal with linguistic processing, prosody prediction, segment selection, concatenation and modification. In this paper we...

The CereVoice Speech Synthesiser

by Eva Bofias

2021

This paper describes the CereVoice text-to-speech system developed by CereProc Ltd, and its use for the generation of the test sentences for the Albayzin 2008 TTS evaluation. Also, the building procedure of a CereVoice-compatible voice...

The Blizzard Challenge 2014

by Anandaswarup Vadapalli

2021

Is Unit Selection Aware of Audible Artifacts?

by Daniel Tihelka

2021

This paper presents a new analytic method that can be used for analyzing perceptual relevance of unit selection costs and/or their sub-components as well as for tuning of unit selection weights. The proposed method is leveraged to...

Quality Deterioration Factors in Unit Selection Speech Synthesis

by Daniel Tihelka

2021, Lecture Notes in Computer Science

The purpose of the present paper is to examine the relationships between target and concatenation costs and the quality (with focus on naturalness) of generated speech. Several synthetic phrases were examined by listeners with the aim to...

Quality Deterioration Factors in Unit Selection Speech Synthesis

by Daniel Tihelka

2021, TSD, Text, Speech and Dialogue

The purpose of the present paper is to examine the relationships between target and concatenation costs and the quality (with focus on naturalness) of generated speech. Several synthetic phrases were examined by listeners with the aim...

Development of Croatian unit selection and statistical parametric speech synthesis

by Ivo Ipsic

2021, 2011 Proceedings of the 34th International Convention Mipro

This paper presents the development of Croatian speech synthesis systems. Three voices were built using the same recorded speech corpus. Two of these voices were built with the Festival speech synthesis system, using the clustering unit...

A solution to the reduction of concatenation artefacts in speech synthesis

by Kim Koppen

2021

One problem with speech synthesis impeding high quality is the occurrence of audible discontinuities at segment boundaries. Formant jumps across concatenation points suggest the problem to be due to spectral differences. The problem is...

I2R Text-to-Speech System for Blizzard Challenge 2009

by Minghui Dong

2021

This paper describes I2R's submission to the Blizzard Challenge 2009. This is our second time participating in this challenge. In this paper, we will describe our main approach to building the required voices. We will introduce the...

Vergina: A Modern Greek Speech Database for Speech Synthesis

by Alexandros Lazaridis

2021

The IVO Software Blizzard Challenge 2009 Entry: Improving IVONA Text-To-Speech

by Lukasz Osowski

2021

This paper describes a special version of IVONA Text-To-Speech for a GB English voice designed and developed by IVO Software for The Blizzard Challenge 2009. The architecture of this system is based on an improved IVONA Text-To-Speech...

The IVO Software Blizzard Challenge 2009 Entry: Improving IVONA Text-To-Speech

by Lukasz Osowski

2021, Blizzard Challenge Workshop, Edinburgh, …

Vergina: A Modern Greek Speech Database for Speech Synthesis

by Todor D Ganchev

2021

A general-purpose IsiZulu speech synthesizer

by Marelie Davel

2021, South African Journal of African Languages

A general-purpose isiZulu text-to-speech (TTS) system was developed, based on the "Multisyn" unit-selection approach supported by the Festival TTS toolkit. The development involved a number of challenges related to the interface between...

Investigation of an extreme Koshava wind episode of 30 January-4 February 2014

by Ilija Jovicic

2021, Atmospheric Science Letters

An extreme Koshava episode (EKE) from 30 January to 4 February 2014 has been studied. Koshava is a local windstorm in Southeast Europe. EKE was characterized by wind gusts above 45 m s⁻¹ and deep snowdrifts. Strong Eurasian anticyclone...

Figure 1. Schematics of a typical Koshava flow. The across-mountain pressure gradient between the Wallachia Valley and the Pannonian Plain drives the flow through mountain passes and gorges. Weather stations are indicated with red dots.

Figure 2. Past records of the extreme Koshava gusts at Vršac station (blue diamonds) and Belgrade station (red stars).

Figure 3. The MSLP in mb (contours) overlying the surface air temperature map. The EKE started on 30 January (a) and lasted until 4 February (b).

Figure 4. (a) The SKI before and during the EKE. (b) Daily mean (blue line with circles) and maximum (orange line with triangles) Koshava speeds during the EKE at the BG station (primary y-axis). The grey line with squares represents the height of the maximum Koshava speed (logarithmic secondary y-axis).

Figure 5. Emagrams for BG (a) and VR (b). Parameters in box as follows: K, K index in °C; TT, total totals index in °C; PW, precipitable water for the entire sounding in centimetres; Temp, temperature on ground in °C; Dewp, dewpoint on ground in °C; Thetae, equivalent potential temperature in K; LI, lifted index in °C; CAPE, convective available potential energy in J kg⁻¹; CIN, convective inhibition in J kg⁻¹; EH, environmental helicity in m² s⁻²; SREH, storm relative environmental helicity in m² s⁻²; StrmDir, storm direction in degrees; StrmSpd, storm speed in m s⁻¹. For further explanation of the parameters see Doswell III and Schultz (2006).

Figure 6. Comparison of simulated mean hourly wind speed (Y), gust (V), and wind direction (D) time series against observations (magenta lines with squares) at the BG, NS, VR, and VG stations. The utilized models are summarized in Table 1: NMM (green lines with stars), NMMB_1 (blue lines with triangles), NMMB_2 (red lines with diamonds), NMM_QNSE (cyan lines with dots) and IFS (black lines with circles). See Table 2 for the verification statistics.

Figure 7. NMM forecasts of the mean daily wind speeds during the EKE. Model start at 30 January (00h UTC).

Table 1. Overview of physical packages, initial and boundary conditions, and computational domain characteristics used in numerical modelling of the EKE.

Table 2. Bias (i.e. absolute difference), mean absolute difference (MAD), root mean square difference (RMSD) and correlation coefficient (CC) between forecasts and observations during the EKE. The corresponding time series are portrayed in Figure 6.

Unit Selection In a Concatenative Speech Synthesis System Using a Large Speech Database

by Muthamizh Selvan

2021, … , Speech, and Signal Processing, 1996. …

One approach to the generation of natural-sounding synthesized speech waveforms is to select and concatenate units from a large speech database. Units (in the current work, phonemes) are selected to produce a natural realisation of a...

Segment selection in the L&H Realspeak laboratory TTS system

by Geert Coorman

2021, Proc. ICSLP

The next element in the synthesizer is a polyphone concatenator, which joins the polyphone waveforms with minimal audible distortion.

The calculation of the masking functions m, can be made very efficient by using a piecewise linear function, shown in Figure 3.

The Romanian speech synthesis (RSS) corpus: Building a high quality HMM-based speech synthesis system using a high sampling rate

by Adriana Stan

2017, Speech Communication

Vergina: A Modern Greek Speech Database for Speech Synthesis

by Theodoros Kostoulas

2016

Figure 1: Structural information of the WCL-1 text corpus

In the first step, a large amount of textual material, approximately 5 million words, was collected from articles in newspapers (approximately 2.2 million words) and periodicals (approximately 1.4 million words) as well as from excerpts from the literature (approximately 1.4 million words). The entire text corpus consists of ...

Figure 1 and Tables 1 and 2 show structural information of the Vergina speech database. In particular, in Figure 1 the number of words per sentence is presented. In Table 1 the twenty most frequent words of the database are presented along with the number of their occurrences and the pronunciation of the words. In Table 2 the twenty ...

Furthermore, the design of the database was guided by the needs of building a Greek corpus-based unit-selection voice operating with phone sized units as well as by the needs of a HMM-based voice. Even though perfect quality open-domain synthesis is not yet possible (Kominek and Black, 2003), an attempt was made not to restrict the database to a specific narrow domain. This was achieved by designing the contents in such a way that a number of dissimilar domains are covered in the recordings. To implement this intention, we included in the database texts collected from different domains and sources such as newspapers, periodicals, and literature. For that purpose the prompt sentences were designed through the following steps: (i) selecting a source text corpus to represent the target domains, (ii) analyzing the source text corpus to obtain the unit statistics and finally (iii) selecting appropriate prompt sentences from the source text.

The above mentioned steps resulted in a set of approximately 3,000 sentences. This set corresponds to approximately 23,500 words (8,000 unique words) and to approximately 60,000 and 127,000 syllables and phones respectively.

Table 4: Mean duration, standard deviation and number of occurrences of the phones of Vergina speech database

After producing the phonetic transcription for each speech waveform of the database we estimated the phonetic boundary positions by time-aligning the phone sequences with HMM phone models. In order to accurately estimate the phonetic transition positions we used the hybrid-HMM method of Mporas et al. (2008). In this method, for each phone an HMM model is initially constructed by embedded training (Young et al., 2006) of the corresponding HMMs. The resulting initial set of HMM models is time-aligned against the phonetic sequences in order to produce a first estimation of the phonetic boundaries. These boundaries are in turn used to train isolated-unit models (Young et al., 2006), which in turn are time-aligned to produce a refined, i.e. more ...

In this work, we outlined the Vergina speech database, which was recently developed at the Wire Communications Laboratory of the University of Patras. The design, development and annotation of the database ...

Simple designing methods of corpus-based visual speech synthesis

by Hiromichi Kawanami

2016

This paper describes simple designing methods of corpus-based visual speech synthesis. Our approach needs only a synchronous real image and speech database. Visual speech is synthesized by concatenating real image segments and speech...

Development of Croatian unit selection and statistical parametric speech synthesis

by Ivo Ipsic

2016

Halfphones: a backoff mechanism for Diphone Unit Selection Synthesis

by Marelie Davel

2016

Diphone Backoff mechanisms in text-to-speech provide a means of ensuring that synthesis of the text takes place, even if some of the diphones in the text are missing in the speech database. This paper describes an automatic method for...

Speech synthesis for a specific speaker based on a labeled speech database

by Dan Chazan

2016, Proceedings of the 12th IAPR International Conference on Pattern Recognition (Cat. No.94CH3440-5)

This paper proposes a new text-to-speech synthesis technique, for producing continuous, natural sounding speech of a specific speaker. The synthesis technique is based on selecting short speech frames from a phoneme-labeled speech...

Simple designing methods of corpus-based visual speech synthesis

by Kiyohiro Shikano

2016, Annual Conference of the International Speech Communication Association

This paper describes simple designing methods of corpus-based visual speech synthesis. Our approach needs only a synchronous real image and speech database. Visual speech is synthesized by concatenating real image segments and speech...
