Speech Translation Research Papers

Evaluating Multilingual Speech Translation under Realistic Conditions with Resegmentation and Terminology

by Mona Diab

2025, Proceedings of the 20th International Conference on Spoken Language Translation (IWSLT 2023)

We present the ACL 60/60 evaluation sets for multilingual translation of ACL 2022 technical presentations into 10 target languages. This dataset enables further research into multilingual speech translation under realistic recording... more

A multicriteria comparison of end-to-end and cascade speech-to-text translation models

by beei iaes

2025, Bulletin of Electrical Engineering and Informatics

This paper presents a thorough examination of two prominent speech-to-text translation (STT) models: the end-to-end (E2E) model and the cascade model. STT is a critical technology in today's multilingual society, facilitating... more

Evaluation of alternatives on speech to sign language translation

by Rubén San Segundo

2025, Interspeech 2007

This paper evaluates different approaches on speech to sign language machine translation. The framework of the application focuses on assisting deaf people to apply for the passport or related information. In this context, the main aim is... more

Adapting a speech into sign language translation system to a new domain

by Rubén San Segundo

2025, Interspeech 2013

This paper presents a methodology for adapting an advanced communication system for deaf people in a new domain. This methodology is a user-centered design approach consisting of four main steps: requirement analysis, parallel corpus... more

LSESpeak: A spoken language generator for Deaf people

by Rubén San Segundo

2025, Expert Systems with Applications

This paper describes the development of LSESpeak, a spoken Spanish generator for Deaf people. This system integrates two main tools: a sign language into speech translation system and an SMS (Short Message Service) into speech translation... more

An introduction to the Chinese speech recognition front-end of the NICT/ATR multi-lingual speech translation system

by jinsong zhang

2025, Tsinghua Science and Technology

This paper introduces several important features of the Chinese large vocabulary continuous speech recognition system in the NICT/ATR multi-lingual speech-to-speech translation system. The features include: (1) a flexible way to derive an... more

Implementing SRI's Pashto speech-to-speech translation system on a smart phone

by smart phone

2025, 2010 IEEE Spoken Language Technology Workshop

We describe our recent effort implementing SRI's UMPC-based Pashto speech-to-speech (S2S) translation system on a smart phone running the Android operating system. In order to maintain very low latencies of system response on... more

Initial Considerations in Building a Speech-to-Speech Translation System for the Slovenian-English Language Pair

by Simon Dobrisek

2025

The paper presents the design concept of the VoiceTRAN Communicator that integrates speech recognition, machine translation and text-to-speech synthesis using the DARPA Galaxy architecture. The aim of the project is to build a robust... more

Diagnóstico Integral en salud y sus determinantes sociales: Villa Hidalgo, San Luis Potosí

by Elysse Bautista-González

2025, Congreso de Investigación en Salud Pública

To produce a comprehensive health diagnosis of workers at two plants focused on aluminum and plastic production in Villa Hidalgo, San Luis Potosí, in order to guide the design of public and private interventions... more

Text-to-speech synthesis for embedded speech communicators

by Jerneja Gros

2025, International Conference on Artificial Intelligence

We present a scalable corpus-based concatenation text-to-speech (TTS) system, which can be used in a variety of applications, ranging from server-based systems to embedded applications. For embedded applications, limited memory and... more

Labeling of Prosodic Events in Slovenian Speech Database GOPOLIS

by Jerneja Gros

2025, Language Resources and Evaluation

The paper describes prosodic annotation procedures of the GOPOLIS Slovenian speech data database and methods for automatic classification of different prosodic events. Several statistical parameters concerning duration and loudness of... more

A Voice-Driven Web Browser for Blind People

by Jerneja Gros

2025, Lecture Notes in Computer Science

A small self-voicing Web browser designed for blind users is presented. The Web browser was built from the GTK Web browser Dillo, which is a free software project in terms of the GNU general public license. Additional functionality has... more

CTC-based Compression for Direct Speech Translation

by Mauro Cettolo

2025

Previous studies demonstrated that a dynamic phone-informed compression of the input audio is beneficial for speech translation (ST). However, they required a dedicated model for phone recognition and did not test this solution for direct... more

Efficient handling of N-gram language models for statistical machine translation

by Mauro Cettolo

2025

Statistical machine translation, as well as other areas of human language processing, have recently pushed toward the use of large scale n-gram language models. This paper presents efficient algorithmic and architectural solutions which... more

Beyond Voice Activity Detection: Hybrid Audio Segmentation for Direct Speech Translation

by Mauro Cettolo

2025, arXiv (Cornell University)

The audio segmentation mismatch between training data and those seen at run-time is a major problem in direct speech translation. Indeed, while systems are usually trained on manually segmented corpora, in real use cases they are often... more

A Comparison of Transformer and Recurrent Neural Networks on Multilingual Neural Machine Translation

by Mauro Cettolo

2025, International Conference on Computational Linguistics

Recently, neural machine translation (NMT) has been extended to multilinguality, that is to handle more than one translation direction with a single system. Multilingual NMT showed competitive performance against pure bilingual systems.... more

The IWSLT 2016 Evaluation Campaign

by Mauro Cettolo

2025

The IWSLT 2016 Evaluation Campaign featured two tasks: the translation of talks and the translation of video conference conversations. While the first task extends previously offered tasks with talks from a different source, the second... more

Overview of the IWSLT 2017 Evaluation Campaign

by Mauro Cettolo

2025

The IWSLT 2017 evaluation campaign has organised three tasks. The Multilingual task, which is about training machine translation systems handling many-to-many language directions, including so-called zero-shot directions. The Dialogue... more

The IWSLT 2015 Evaluation Campaign

by Mauro Cettolo

2025

The IWSLT 2015 Evaluation Campaign featured three tracks: automatic speech recognition (ASR), spoken language translation (SLT), and machine translation (MT). For ASR we offered two tasks, on English and German, while for SLT and MT a... more

The IWSLT Evaluation Campaign: Challenges, Achievements, Future Directions

by Mauro Cettolo

2025

Evaluation campaigns are the most successful modality for promoting the assessment of the state of the art of a field on a specific task. Within the field of Machine Translation (MT), the International Workshop on Spoken Language... more

Findings of the IWSLT 2023 Evaluation Campaign

by Mauro Cettolo

2025, Proceedings of the 20th International Conference on Spoken Language Translation (IWSLT 2023)

This paper reports on the shared tasks organized by the 20th IWSLT Conference. The shared tasks address 9 scientific challenges in spoken language translation: simultaneous and offline translation, automatic subtitling and dubbing,... more

Direct Speech Translation for Automatic Subtitling

by Mauro Cettolo

2025, arXiv (Cornell University)

Automatic subtitling is the task of automatically translating the speech of an audiovisual product into short pieces of timed text, in other words, subtitles and their corresponding timestamps. The generated subtitles need to conform to... more

KIT’s Multilingual Speech Translation System for IWSLT 2023

by Thai Nguyen

2025, Proceedings of the 20th International Conference on Spoken Language Translation (IWSLT 2023)

Many existing speech translation benchmarks focus on native-English speech in high-quality recording conditions, which often do not match the conditions in real-life use-cases. In this paper, we describe our speech translation system for... more

A Multi-Modal English-Italian Parallel Corpus for End-to-End Speech-to-Text Machine Translation

by Sara Stymne

2025

We discuss a set of methods for the creation of IESTAC: an English-Italian speech and text parallel corpus designed for the training of end-to-end speech-to-text machine translation models and publicly released as part of this work. We... more

Kant: Knowledge-Based, Accurate Natural Language Translation

by Teruko Mitamura

2025

KANT is an interlingual MT system for multi-lingual translation of technical documents, written using a controlled vocabulary and grammar. KANT is comprised of a set of software modules (parser, interpreter, mapper, generator) which work... more

Factored translation models for improving a speech into sign language translation system

by Juan M Montero

2025, Interspeech 2011

This paper proposes the use of Factored Translation Models (FTMs) for improving a Speech into Sign Language Translation System. These FTMs allow incorporating syntactic-semantic information during the translation process. This new... more

A Spanish speech to sign language translation system for assisting deaf-mute people

by Juan M Montero

2025, Interspeech 2006

This paper describes the first experiments of a speech to sign language translation system in a real domain. The developed system is focused on the sentences spoken by an officer when assisting people in applying for, or renewing the... more

Speech to sign language translation system for Spanish

by Juan M Montero

2025, Speech Communication

This paper describes the development of and the first experiments in a Spanish to sign language translation system in a real domain. The developed system focuses on the sentences spoken by an official when assisting people applying for,... more

Combining Spectral and Self-Supervised Features for Low Resource Speech Recognition and Translation

by Jonathan Amith

2025, Interspeech 2022

Self-Supervised Learning (SSL) models have been successfully applied in various deep learning-based speech tasks, particularly those with a limited amount of data. However, the quality of SSL representations depends highly on the... more

Leveraging End-to-End ASR for Endangered Language Documentation: An Empirical Study on Yolóxochitl Mixtec

by Jonathan Amith

2025, Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume

"Transcription bottlenecks", created by a shortage of effective human transcribers, are one of the main challenges to endangered language (EL) documentation. Automatic speech recognition (ASR) has been suggested as a tool to overcome such... more

The Althingi ASR System

by Inga Rún Helgadóttir

2025, Interspeech 2019

All performed speeches in the Icelandic parliament, Althingi, are transcribed and published. An automatic speech recognition system (ASR) has been developed to reduce the manual work involved. To our knowledge, this is the first open... more

DynaSpeak

by Horacio Franco

2025, Proceedings of the second international conference on Human Language Technology Research -

We introduce SRI's new speech recognition engine, DynaSpeak™, which is characterized by its scalability and flexibility, high recognition accuracy, memory and speed efficiency, adaptation capability, efficient grammar optimization,... more

Activity analysis of sign language video for mobile telecommunication

by Eve Riskin

2025

In presenting this dissertation in partial fulfillment of the requirements for the doctoral degree at the University of Washington, I agree that the Library shall make its copies freely available for inspection. I further agree that... more

On the use of prosody in a speech-to-speech translator

by Guenther Goerz

2025, 5th European Conference on Speech Communication and Technology (Eurospeech 1997)

Translation Shift of a Transitivity System in Obama and Trump’s Inauguration Speech

by Taufik Nur Hidayat

2025, LiNGUA: Jurnal Ilmu Bahasa dan Sastra

The purpose of the study is to describe the process shift caused by applying certain translation techniques. The research used descriptive qualitative method by applying purposive sampling technique. The source of the data was text of... more

Wav2Vec2 Application in Automated Email Formatting Using Real-Time Speech Recognition

by WARSE The World Academy of Research in Science and Engineering

2025, International Journal of Advanced Trends in Computer Science and Engineering

Automatic Speech Recognition (ASR) has experienced remarkable progress, transitioning from rule-based systems to deep learning methodologies that enhance interactions between humans and machines. Earlier ASR systems depended on Hidden... more

Un nouveau système indépendant de rejet multi-seuils pour la reconnaissance de mots manuscrits

by Laurent Guichard

2025

In handwritten word recognition, the ability to reject words that do not belong to the lexicon or that are ambiguous is essential to making a recognition system reliable under real-world conditions. In this... more

Speech and Translation Technologies

by Mark Seligman

2025, Cambridge University Press eBooks

Cross-language communication in healthcare is urgently needed. Daily and nightly throughout the world, thousands of conversations are required between caregivers (doctors, nurses, administrators, volunteers, and others) and patients or family... more

Design and Development of Speech Corpora for Air Traffic Control Training

by Lubos Smidl

2025

The paper describes the process of creation of domain-specific speech corpora containing air traffic control (ATC) communication prompts. Since the ATC domain is highly specific both from the acoustic point-of-view (significant level of... more

La robustesse de la traduction neuronale : les systèmes de traduction automatique neuronale à l' épreuve de la reproductibilité de l'expérience

by Lichao Zhu

2025, HAL (Le Centre pour la Communication Scientifique Directe)

Accelerating Hakka Speech Recognition Research and Development Using the Whisper Model

by Chen-Chi Chang

2025, Proceedings of the 35th Conference on Computational Linguistics and Speech Processing (ROCLING 2023)

Language preservation has become increasingly urgent with globalization and rapid technological advancement. Minority languages, such as Hakka, are particularly vulnerable. This study aims to expedite the research and development of Hakka... more

ELiRF at MediaEval 2013: Similar Segments in Social Speech Task

by Ferran Pla

2025

This paper describes the Natural Language Engineering and Pattern Recognition group (ELiRF) approaches and results towards the Similar Segments of Social Speech Task of MediaEval 2013. The task involves finding segments similar to a... more

Speech Recognition Technologies Design Challenges and Real World Applications

by editor ijircst

2025, International Journal of Innovative Research in Computer Science and Technology (IJIRCST)

This paper presents an automated speech recognition (ASR) system that transcribes audio from YouTube videos into accurate text using OpenAI's Whisper model. Leveraging tools such as yt_dlp, FFmpeg, and PyTorch, the system creates a robust... more

The NESPOLE! Speech-to-Speech Translation System

by Michael Bett

2025, Lecture Notes in Computer Science

Multilingual Speech Recognition

by Michael Bett

2025, Verbmobil: Foundations of Speech-to-Speech Translation

The speech-to-speech translation system Verbmobil requires a multilingual setting. This consists of recognition engines in the three languages German, English and Japanese that run in one common framework together with a language... more

Enhancing the usability and performance of NESPOLE!

by Michael Bett

2025, Proceedings of the second international conference on Human Language Technology Research -

Not only Translation Quality: Evaluating the NESPOLE! Speech-to-Speech Translation System along other Viewpoints

by Michael Bett

2025

Performance and usability of real-world speech-to-speech translation systems, like the one developed within the Nespole! project, are affected by several aspects that go beyond the pure translation quality provided by the Human Language... more

Advances in meeting recognition

by Michael Bett

2025, Proceedings of the first international conference on Human language technology research - HLT '01

Advanced Voice Cloning and Transcription Using Deep Learning: Implementation of NVIDIA Tacotron2 for High-Fidelity Speech Synthesis

by Journal of Computer Science IJCSIS

2025, International Journal of Computer Science and Information Security (IJCSIS), Vol. 23, No. 2, March-April

State of the art voice cloning methodologies employed conventional concatenative and parametric synthesis techniques, effective and efficient though they have remained, producing mechanical, if somewhat constrained speech. Advancements in... more
