
Sequence-to-Sequence Models

12 papers
3 followers
About this topic
Sequence-to-sequence models are a class of neural network architectures designed to transform input sequences into output sequences, commonly used in tasks such as machine translation and text summarization. They typically consist of an encoder that processes the input and a decoder that generates the output, enabling the handling of variable-length sequences.
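To make the encoder-decoder idea concrete, here is a minimal sketch in PyTorch. It is illustrative only: the model class, layer sizes, and vocabulary sizes are assumptions, not taken from any paper indexed under this topic.

```python
# Minimal encoder-decoder (seq2seq) sketch in PyTorch; names and sizes are illustrative.
import torch
import torch.nn as nn

class Seq2Seq(nn.Module):
    def __init__(self, src_vocab, tgt_vocab, emb=128, hidden=256):
        super().__init__()
        self.src_emb = nn.Embedding(src_vocab, emb)
        self.tgt_emb = nn.Embedding(tgt_vocab, emb)
        self.encoder = nn.GRU(emb, hidden, batch_first=True)
        self.decoder = nn.GRU(emb, hidden, batch_first=True)
        self.out = nn.Linear(hidden, tgt_vocab)

    def forward(self, src, tgt_in):
        # The encoder compresses a variable-length source into a fixed-size state.
        _, state = self.encoder(self.src_emb(src))
        # The decoder generates the output conditioned on that state
        # (teacher-forced here on the shifted target tokens tgt_in).
        dec_out, _ = self.decoder(self.tgt_emb(tgt_in), state)
        return self.out(dec_out)  # (batch, tgt_len, tgt_vocab) logits

model = Seq2Seq(src_vocab=1000, tgt_vocab=1000)
src = torch.randint(0, 1000, (2, 7))      # source and target may differ in length
tgt_in = torch.randint(0, 1000, (2, 5))
logits = model(src, tgt_in)               # -> torch.Size([2, 5, 1000])
```

Attention-based and Transformer variants replace the single fixed-size state with per-position access to all encoder states, which is what the themes below build on.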

Key research themes

1. How can sequence-to-sequence models be enhanced to optimize generation quality and address exposure bias?

Sequence-to-sequence (seq2seq) models for text generation often suffer from exposure bias: they are trained on ground-truth prefixes but generate from their own predictions at test time. Additionally, standard training optimizes word-level likelihood, which does not directly correlate with sequence-level evaluation metrics like BLEU or ROUGE. This theme investigates methods that directly optimize sequence-level objectives, integrate reinforcement learning techniques, and introduce novel training algorithms to mitigate exposure bias and improve generation quality (a minimal training-objective sketch follows the findings below).

Key finding: This paper introduces MIXER, a sequence-level training algorithm that combines cross-entropy and REINFORCE to optimize non-differentiable metrics like BLEU directly, addressing exposure bias by using model predictions during...
Key finding: TwinNet trains an auxiliary backward RNN to generate sequences in reverse, encouraging the forward RNN states to predict corresponding backward states. This regularization guides the forward model to capture long-term...
Key finding: This work proposes a modification of seq2seq models for incremental output generation by conditioning output predictions on partial input sequences and previously generated partial outputs via a transducer RNN over blocks...
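A rough sketch of the idea behind sequence-level training such as MIXER follows: mix the usual cross-entropy loss with a REINFORCE term whose reward is a sequence metric. This is a simplification under stated assumptions, not the papers' exact algorithms: for brevity it samples from teacher-forced logits (MIXER proper decodes autoregressively and anneals from cross-entropy to REINFORCE), and reward_fn is a placeholder for a metric such as sentence-level BLEU.

```python
# Hedged sketch: cross-entropy mixed with a REINFORCE-style sequence-level term.
import torch
import torch.nn.functional as F

def mixed_loss(logits, targets, reward_fn, alpha=0.5):
    # logits: (batch, T, vocab); targets: (batch, T) gold token ids.
    ce = F.cross_entropy(logits.reshape(-1, logits.size(-1)), targets.reshape(-1))

    # Sample tokens from the model's own distribution; REINFORCE lets the
    # gradient flow through the log-probability of those samples.
    probs = logits.softmax(-1)
    samples = torch.distributions.Categorical(probs).sample()          # (batch, T)
    logp = probs.gather(-1, samples.unsqueeze(-1)).squeeze(-1).log().sum(-1)

    # Sequence-level reward (e.g. BLEU against the reference), assumed to be
    # a detached (batch,) tensor; subtracting the batch mean is a simple
    # variance-reducing baseline.
    reward = reward_fn(samples, targets)
    pg = -((reward - reward.mean()) * logp).mean()

    return alpha * ce + (1.0 - alpha) * pg
```

In MIXER the weight on the REINFORCE term grows over training (starting from pure cross-entropy), so the model is gradually exposed to its own predictions.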

2. What architectural advances in sequence-to-sequence modeling enable better long-range dependency modeling and scalability?

Capturing long-term dependencies and training at scale are central challenges in sequence modeling. This research area focuses on Transformer-based architectures employing self-attention mechanisms that eliminate recurrence and enable better parallelization. Improvements include deeper network designs, auxiliary loss functions to improve convergence, and hybrid attention mechanisms combining hard and soft attention to efficiently model sparse and global dependencies (a minimal self-attention sketch follows the findings below).

Key finding: This study shows that a 64-layer Transformer model with causal self-attention and auxiliary losses at intermediate layers and positions outperforms traditional truncated backpropagation-through-time LSTMs for character-level...
Key finding: ReSAN integrates a novel parallelizable hard attention mechanism (reinforced sequence sampling - RSS) with soft self-attention, wherein hard attention selects important tokens for soft attention to process. The soft attention...
Key finding: This survey synthesizes advances in Transformer-based sequence-to-sequence models applied to NLP tasks, highlighting their use of self-attention to model long-range dependencies and overcome RNN limitations such as vanishing...
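The common core of these architectures is causal (masked) self-attention. Below is a minimal single-head sketch in PyTorch; multi-head projections, layer stacking, and the auxiliary losses discussed above are omitted, and all dimensions are illustrative.

```python
# Minimal single-head causal self-attention sketch (PyTorch); sizes are illustrative.
import math
import torch

def causal_self_attention(x, Wq, Wk, Wv):
    # x: (batch, T, d). Project the whole sequence in parallel (no recurrence).
    q, k, v = x @ Wq, x @ Wk, x @ Wv
    scores = q @ k.transpose(-2, -1) / math.sqrt(q.size(-1))   # (batch, T, T)
    # Causal mask: position t may only attend to positions <= t.
    T = x.size(1)
    mask = torch.triu(torch.ones(T, T, dtype=torch.bool), diagonal=1)
    scores = scores.masked_fill(mask, float("-inf"))
    return scores.softmax(-1) @ v                              # (batch, T, d)

d = 64
x = torch.randn(2, 10, d)
out = causal_self_attention(x, torch.randn(d, d), torch.randn(d, d), torch.randn(d, d))
```

Because every position is computed in one matrix product rather than a step-by-step recurrence, training parallelizes across the sequence, which is what makes the very deep (e.g. 64-layer) models above practical.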

3. How can sequence-to-sequence models be designed and trained to facilitate efficient decoding while satisfying complex constraints?

While seq2seq models excel at generating sequences, standard autoregressive decoding is inherently sequential and slow, limiting real-time applications and constrained generation scenarios. This theme explores approaches that introduce discrete latent variables to enable more parallel decoding, frameworks supporting modular, extensible model development for scalability, and novel decoding algorithms inspired by heuristic search that enforce lexical or logical constraints during generation (a simplified constrained-decoding sketch follows the findings below).

Key finding: Proposes the Latent Transformer model, which auto-encodes target sequences into shorter sequences of discrete latent variables that are generated autoregressively and then decoded in parallel. Introduces decomposed vector...
Key finding: Lingvo is a TensorFlow-based research framework providing modular, extensible building blocks and centralized experiment configurations, allowing flexible sequence-to-sequence model development. It supports production-scale...
Key finding: This paper compares deep learning models (GRU, LSTM, CNN) with a physics-based residual Kalman filter (RKF) for dynamic load identification under limited data and structural uncertainty scenarios. While deep networks excel in...
Key finding: Introduces OptiGAN, combining GANs with reinforcement learning using policy gradients to optimize desired goal metrics in sequence generation, such as BLEU for text or the McGrew score for trajectories. The hybrid approach...
Key finding: NEUROLOGIC A* integrates heuristic future-cost estimation inspired by A* search into beam search decoding to enforce complex lexical constraints in sequence generation. By incorporating lookahead heuristics for constraint...
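As a concrete illustration of heuristic-search-guided decoding, here is a simplified beam search that ranks hypotheses by model log-probability plus a crude future-cost estimate for unmet lexical constraints. This is a loose stand-in for the NEUROLOGIC A* idea, not the paper's algorithm; step_fn, the constraint penalty of 0.5, and the toy model are all assumptions.

```python
# Hedged sketch: beam search with an A*-style heuristic for lexical constraints.
import heapq
import math

def constrained_beam_search(step_fn, bos, eos, constraints, beam=4, max_len=20):
    # step_fn(prefix) -> list of (token, log_prob) continuations.
    # constraints: set of token ids that must appear in the output.
    def f(cand):
        # f = g + h: accumulated model log-prob plus a penalty for every
        # constraint the hypothesis has not yet satisfied.
        prefix, g = cand
        return g - 0.5 * len(constraints - set(prefix))

    beams = [([bos], 0.0)]
    for _ in range(max_len):
        candidates = []
        for prefix, g in beams:
            if prefix[-1] == eos:
                candidates.append((prefix, g))   # carry finished hypotheses along
                continue
            for tok, lp in step_fn(prefix):
                candidates.append((prefix + [tok], g + lp))
        beams = heapq.nlargest(beam, candidates, key=f)
    finished = [b for b in beams if not (constraints - set(b[0]))]
    return max(finished or beams, key=f)

# Toy usage with a uniform dummy model (purely illustrative).
vocab = [1, 2, 3]
step = lambda prefix: [(t, math.log(1.0 / len(vocab))) for t in vocab]
print(constrained_beam_search(step, bos=0, eos=3, constraints={2}))
```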

All papers in Sequence-to-Sequence Models

This study delves into the relatively unexplored domain of natural language processing for the Kazakh language, a language with limited computational resources. The paper dissects the effectiveness of diffusion models and transformers in...
Paraphrasing is an important aspect of language competence; however, EFL learners have long had difficulty paraphrasing in their writing owing to their limited language proficiency. Therefore, automatic paraphrase suggestion systems can...
The dynamic structural load identification capabilities of the gated recurrent unit, long short-term memory, and convolutional neural networks are examined herein. The examination is on realistic small-dataset training conditions and on a...
Grapheme-to-phoneme models are key components in automatic speech recognition and text-to-speech systems. With low-resource language pairs that do not have available and well-developed pronunciation lexicons, grapheme-to-phoneme models are...
Classifying or categorizing texts is the process by which documents are classified into groups by subject, title, author, etc. This paper undertakes a systematic review of the latest research in the field of the classification of Arabic...
The article considers a lemmatiser that is developed specifically for Old Church Slavonic (OCS). The introduction underlines the problem of the lack of lemmatisers that might deal with different datasets of the OCS. The review gives a...
This paper describes our contribution to two challenges in data-driven lemmatization. We approach lemmatization in the framework of a two-stage process, where first lemma candidates are generated and afterwards a ranker chooses the most...
Language identification (LI) in textual documents is the process of automatically detecting the language contained in a document based on its content. Existing language identification techniques presume that a document contains text in...
We present LemmaTag, a featureless neural network architecture that jointly generates part-of-speech tags and lemmas for sentences by using bidirectional RNNs with character-level and word-level embeddings. We demonstrate that both tasks...
Automatic image captioning is the ongoing effort of generating syntactically well-formed textual descriptions of an image in natural language, in context, and validating their accuracy. The encoder-decoder structure used throughout existing...
Nowadays, due to the vast number of camera-equipped devices, large amounts of image and video data are being generated, bringing information that can address many real-world problems [16]. Deep learning based Visual...
Computer vision and natural language processing (NLP) are two active machine learning research areas. However, the integration of these two areas gives rise to a new interdisciplinary field, which is currently attracting more attention of...
In this paper, a self-attention based neural network architecture to address human activity recognition is proposed. The dataset used was collected using a smartphone. The contribution of this paper is the use of a multi-layer multi-head...
Techniques for generating and recognizing paraphrases, i.e., semantically equivalent expressions, play an important role in a wide range of natural language processing tasks. In the last decade, the task of automatic acquisition of...
Natural language processing for historical material almost inevitably runs into the problematic combination of large variation (leading to domain adaptation-like problems) and low resources (problematic for the standard statistical...
Video captioning is the process of summarising the content, events, and actions of a video into a short textual form, which can be helpful in many research areas such as video-guided machine translation, video sentiment analysis, and providing...
Transliteration is the task of translating text from a source script to a target script provided that the language of the text remains the same. In this work, we perform transliteration on the less explored Devanagari to Roman Hindi...
The current situation regarding the existence of natural language processing (NLP) resources and tools for Corsican reveals their virtual non-existence. Our inventory contains only a few rare digital resources, lexical or corpus...
With the development of today's society, demand for applications using digital cameras grows year by year. However, analyzing large amounts of video data poses one of the most challenging issues. In addition to storing the data...
This paper describes a study on the impact of the original signal (text, speech, visual scene, event) of a text pair on the task of both manual and automatic sub-sentential paraphrase acquisition. A corpus of 2,500 annotated sentences in...
In this paper, we report efforts towards the acquisition and construction of a bilingual parallel corpus between French and Wolof, a Niger-Congo language belonging to the Northern branch of the Atlantic group. The corpus is constructed as...
We introduce a composite deep neural network architecture for supervised and language independent context sensitive lemmatization. The proposed method treats the task as identifying the correct edit tree representing the transformation...
We analyze the performance of encoder-decoder neural models and compare them with well-known established methods. The latter represent different classes of traditional approaches that are applied to the monotone sequence-to-sequence tasks...
If two sentences have the same meaning, it should follow that they are equivalent in their inferential properties, i.e., each sentence should textually entail the other. However, many paraphrase datasets currently in widespread use rely...
This paper presents the process of compiling a model-agnostic similarity gold standard for evaluating Danish word embeddings based on human judgments made by 42 native speakers of Danish. Word embeddings resemble semantic similarity...
We present a novel technique for zero-shot paraphrase generation. The key contribution is an end-to-end multilingual paraphrasing model that is trained using translated parallel corpora to generate paraphrases into “meaning spaces” –...
Natural Language Processing is now dominated by deep learning models. Baseline is a library to facilitate reproducible research and fast model development for NLP with deep learning. It provides easily extensible implementations and...
We study the role of the second language in bilingual word embeddings in monolingual semantic evaluation tasks. We find strongly and weakly positive correlations between down-stream task performance and second language similarity to the...
The efficacy of external language model (LM) integration with existing end-to-end (E2E) automatic speech recognition (ASR) systems can be improved significantly using the internal language model estimation (ILME) method [1]. In this...
CryptDB was built to provide validated and realistic protection in the case of compromised databases or curious database administrators. CryptDB operates over encrypted data while executing SQL queries. The key concept of the SQL-aware...
This paper presents the submission by the Charles University-University of Malta team to the SIGMORPHON 2019 Shared Task on Morphological Analysis and Lemmatization in context. We present a lemmatization model based on previous work on...
We investigate the influence that document context exerts on human acceptability judgements for English sentences, via two sets of experiments. The first compares ratings for sentences presented on their own with ratings for the same set...
The concept of using two neural networks to translate one sequence into another, presented by Google in 2014, has led to revolutionary results in translation between the input sequence as the source language and the output sequence as...
We develop a written survey on sequence-to-sequence learning with neural networks and their models. The primary aim of this report is to enhance knowledge of the sequence-to-sequence neural network and to locate the best way to...
Historical text normalization often relies on small training datasets. Recent work has shown that multi-task learning can lead to significant improvements by exploiting synergies with related datasets, but there has been no systematic...
This paper presents a paraphrase acquisition method that uncovers and exploits generalities underlying paraphrases: paraphrase patterns are first induced and then used to collect novel instances. Unlike existing methods, ours uses both...
Health management has become a primary problem as new kinds of diseases and complex symptoms are introduced to a rapidly growing modern society. Building a better and smarter healthcare infrastructure is one of the ultimate goals of a...
Vectorization extracts data from strings through Natural Language Processing using different approaches; one of the best approaches used in vectorization is word2vec. To make the vectorized data secure, we must apply a security...
Lemmatisation, which is one of the most important stages of text preprocessing, consists in grouping the inflected forms of a word together so they can be analysed as a single item, identified by the word’s lemma, or dictionary form. It...
Colloquialism in the Philippines has been prominently used in day-to-day conversations. Its vast emergence is evident especially on social media platforms but poses issues in terms of understandability to certain groups. For this...
We present symbolic and neural approaches for Arabic paraphrasing that yield high paraphrasing accuracy. This is the first work on sentence-level paraphrase generation for Arabic and the first using neural models to generate paraphrased...