Academia.edu

Language Models

372 papers
36 followers
About this topic
Language models are computational algorithms designed to understand, generate, and manipulate human language by predicting the likelihood of sequences of words. They utilize statistical methods and neural networks to analyze linguistic patterns, enabling applications in natural language processing, machine translation, and text generation.
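To make "predicting the likelihood of sequences of words" concrete, here is a minimal sketch of a bigram language model with add-one smoothing. It is an illustration only, not a method taken from any paper listed here; the toy corpus and the helper names `train_bigram` and `log_prob` are assumptions made for the example.

```python
from collections import Counter
import math


def train_bigram(sentences):
    """Count unigrams and bigrams over whitespace-tokenized sentences."""
    unigrams, bigrams = Counter(), Counter()
    for s in sentences:
        # <s> and </s> are hypothetical sentence-boundary markers.
        tokens = ["<s>"] + s.lower().split() + ["</s>"]
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))
    return unigrams, bigrams


def log_prob(sentence, unigrams, bigrams):
    """Log-probability of a sentence via the chain rule over bigrams.

    Uses add-one (Laplace) smoothing so unseen word pairs still get
    a small non-zero probability.
    """
    vocab_size = len(unigrams)
    tokens = ["<s>"] + sentence.lower().split() + ["</s>"]
    total = 0.0
    for prev, word in zip(tokens, tokens[1:]):
        p = (bigrams[(prev, word)] + 1) / (unigrams[prev] + vocab_size)
        total += math.log(p)
    return total


# Toy corpus, invented for illustration.
corpus = ["the cat sat on the mat", "the dog sat on the rug"]
uni, bi = train_bigram(corpus)

seen = log_prob("the cat sat on the rug", uni, bi)
shuffled = log_prob("rug the on sat cat the", uni, bi)
print(seen > shuffled)
```

The model scores the fluent word order higher than the shuffled one because every adjacent word pair in the first sentence occurs in the corpus. Neural language models replace the count table with a learned function but are scored the same way, by summing per-token log-probabilities.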
Large Language Models (LLMs) are increasingly trained in elastic, multi-tenant cloud infrastructures that span data centers, regions, and heterogeneous accelerators. While distributed training has matured in scale and efficiency, its... more
Building language models (LMs), especially small and medium ones, remains more art than science. While large LMs often improve by sheer scale, it is still unclear why many design choices work. For small LMs, this uncertainty is more... more
Adversarial inputs have recently raised serious security concerns for deep learning (DL) techniques. The main motivation for strengthening natural language processing (NLP) models is to learn attacks and secure the models against adversarial text.... more
Large language models (LLMs) have been shown to be able to perform new tasks based on a few demonstrations or natural language instructions. While these capabilities have led to widespread adoption, most LLMs are developed by... more
This working note introduces semantic drift as a hidden failure mode in large language models. While accuracy measures facts and coherence measures form, fidelity measures whether meaning survives. Drift occurs when intent and nuance... more
This document presents the development and implementation of an advanced personalized travel planning system that integrates multiple Generative Artificial Intelligence techniques, including Large Language Models (LLMs), Retrieval... more
The pervasive deployment of Deep Learning models has recently prompted apprehensions regarding their ecological footprint, owing to the exorbitant levels of energy consumption necessitated by the training and inference processes. The term... more
While theoretical linguists and cognitive scientists alike have contested the contribution of Large Language Models (LLMs) to linguistic theory, small cognitively-inspired Language Models (BabyLMs) have emerged as a complementary research... more
This article explores the phenomenon of Lucid Dream, a unique state in the behavior of a large language model, where token generation is temporarily suppressed while internal computational activity is maintained. Based on the architectural... more
Curriculum Learning has been a popular strategy to improve the cognitive plausibility of Small-Scale Language Models (SSLMs) in the BabyLM Challenge. However, it has not led to considerable improvements over noncurriculum models. We... more
In this study, we investigate the zero-shot and zero-shot chain-of-thought reasoning capabilities of advanced language models GPT-4, Claude and Mistral on the Joint Admissions and Matriculation Board (JAMB) Mathematics and Physics... more
Despite several decades of research in document analysis, recognition of unconstrained handwritten documents is still considered a challenging task. Previous research in this area has shown that word recognizers perform adequately on... more
Large Language Models (LLMs) have demonstrated significant capabilities in answering questions using techniques such as Chain of Thought (CoT) and Retrieval-Augmented Generation (RAG). CoT enables step-by-step reasoning to improve... more
The field of artificial intelligence (AI) is evolving at an extraordinary pace, and among its most transformative innovations are Large Language Models (LLMs). These models—powering chatbots, search engines, code generators, and more—are... more
The World Wide Web (WWW) is a mine of information for most people. Due to the huge amount of information and documents available on the internet, the process of retrieving the documents that are most relevant to user needs becomes a tremendous... more
This research addresses Natural Language Processing (NLP) tokenization challenges for transitional Chinese, which lacks adequate digital resources. The project used a collection of articles from the Shenbao, a newspaper from this... more
Corpus of English PhD theses collected by the EThOS service of the British Library (475,383 documents). Meaningful metadata: ethosid, identifier of the record within the EThOS digital library; title, title of the thesis; creator, author of... more
Large Language Models (LLMs) have catalyzed a paradigm shift in Natural Language Processing (NLP). From the introduction of the Transformer architecture to the development of massive generative models such as GPT-3.5, LLaMA2-7B, and PaLM,... more
First of all, I of course thank Nadine Vigouroux, to whom I owe so much. Not only for her rigorous supervision, but above all for her humanity and her trust. I could fill the rest of this document with praise for her... more
This paper describes a test set designed to analyse the translation of dislocations from Persian, to be used for testing neural machine translation models. We first tested the accuracy of the two Universal dependency treebanks for Persian... more
A new weighting scheme for vector space model is presented to improve retrieval effectiveness for an information retrieval system. In addition, a dimension compression method is introduced to reduce the computational cost of the weighting... more
Hindi and Urdu are variants of the same language, but while Hindi is written in the Devnagri script from left to right, Urdu is written in a script derived from a Persian modification of Arabic script written from right to left. The... more
The quadratic memory complexity of transformers prevents long document summarization in low computational resource scenarios. State-of-the-art models need to apply input truncation, thus discarding and ignoring potential summary-relevant... more
Although current state-of-the-art Transformer-based solutions have succeeded in a wide range of single-document NLP tasks, they still struggle to address multi-input tasks such as multi-document summarization. Many solutions truncate the... more
Direct integration of translation model (TM) probabilities into a language model (LM) with the purpose of improving automatic speech recognition (ASR) of spoken translations typically requires a number of complex operations for each... more
The main objective of this paper is to examine the readability statistics of a corpus of Malaysian short stories in English with reference to a corpus of established canonical short stories written by native speakers. The short stories... more
In this paper our goal is to perform an open-ended exploration of the program repair search space. Our idea is to collect the largest number of test-suite adequate patches, independently of whether they are fully correct or overfitting.... more
Most of the relevance feedback algorithms only use document terms as feedback (local features) in order to update the query and re-rank the documents to show to the user. This approach is limited by the terms of those documents without... more
Diabetes, a disease with a significant global impact on health, poses considerable challenges in its diagnosis and treatment. This article addresses the need to improve accessibility to accurate information about... more
This article explores the revolution of generative translation in the era of artificial intelligence (AI), analyzing both its theoretical foundations and its practical implications. After defining generative translation in the era of... more
This study evaluates the effectiveness of recurrent neural networks (RNNs) and transformer-based models for predicting the air quality index (AQI). The research compares traditional RNN models, including... more
The global crisis of language endangerment meets a technological turning point as Generative AI (GenAI) and Large Language Models (LLMs) unlock new frontiers in automating corpus creation, transcription, translation, and tutoring.... more
Functional decline is one of the serious syndromes experienced by older adults. Its early assessment is critical to preventing its symptoms. Some Comprehensive Geriatric Assessment (CGA) questionnaires, chosen amongst others, can be... more
The growing need for accountability of the people behind AI systems can be addressed by leveraging processes in three fields of study: ethics, law, and computer science. While these fields are often considered in isolation, they rely on... more
Task 3 of the 2013 ShARe/CLEF eHealth Evaluation Lab simulated web searches for health information by patients. The web searches were designed to be connected to hospital discharge summaries from the patient's Electronic Medical Record... more
Automated lexicon acquisition from corpora represents one way that large datasets can be leveraged to provide resources for a variety of NLP tasks. Our work applies techniques popularized in sentiment lexicon acquisition and topic... more
This paper presents a performance analysis of the neural model of visual attention presented by Kelvin Xu, et al. in 2016. The model was trained and tested with a new Spanish translated version of the Flickr8k dataset. This is the first... more
This paper presents results of dependency parsing of Old French, a language which is poorly standardized at the lexical level, and which displays a relatively free word order. The work is carried out on five distinct sample texts... more
In this paper, we describe the development of unit selection voice for Tamil language. We describe the build process and address the issue of speech segmentation using HMM based techniques. We report the comparison of automatically... more
The main objective of an assessment is to measure students' learning abilities and to strengthen those abilities through correction in line with their knowledge. Question generation plays a vital role in assessment. The creation of the... more
Probabilistic finite-state machines are used today in a variety of areas in pattern recognition, or in fields to which pattern recognition is linked. In part I of this paper, we surveyed these objects and studied their properties. In this... more
In this paper, we set out to present an original rule-learning algorithm for symbolic natural language processing (NLP), designed to learn the rules of extraction of keywords marked in its training sentences. What really sets our... more
The aim of this thesis is to give an introduction to Natural Language Understanding. Many tools and language models are described along this work in order to teach a machine the ability to analyze and understand human speech. In the last... more
We present the HistCorp collection, a freely available open platform aiming at the distribution of a wide range of historical corpora and other useful resources and tools for researchers and scholars interested in the study of historical... more
This paper presents the application of morpheme-based and factored language models in an Amharic speech recognition task. Since using morphemes in both acoustic and language models results, mostly, in performance degradation due to... more