Linguistic Computing Research Papers

Words, Meaning and Vocabulary: An Introduction to Modern English Lexicology

2025, Words, Meaning and Vocabulary: An Introduction to Modern English Lexicology

Chapter 1: 1.1 The word and its associative field; 1.2 Syntagmatic and paradigmatic relations; 1.3 (a) Lexical fields in the total vocabulary; 1.3 (b) Example of a lexical field. Chapter 6: 6.1 The vocabulary of English according to the OED...

John J Parry 1946 "The Revival of Cornish"

by William Parry

2025, PMLA

"The Revival of Cornish: An Dasserghyans Kernewek" by John J. Parry (1889–1954). With unpublished comments, corrections and updates by "Caradar" (A. S. D. Smith, 1883–1950), letter dated May 27, 1946. Cornish, a Celtic language of...

Twentieth Century Cornish Lexicography and Language Revival

by Jon Mills

2024

Computer Assisted Lemmatisation of a Cornish Text Corpus for Lexicographical Purposes

by Jon Mills

2024

Reconstructive Phonology and Contrastive Lexicology: Problems with the Gerlyver Kernewek Kemmyn

by Jon Mills

2024, University of Exeter Press eBooks

this is the case, orthographic practice has a lot in common with Romance languages in general. Where Cornish has borrowed from English, Cornish spelling frequently resembles that found in the works of Chaucer. But there are differences...

Review of Web Dictionary of Ukrainian Feminine Personal Nouns

by Hanna Barabakh

2024

Analyzing Homographs in Acehnese and English Languages

by Puan Tursina

2024, English Education Journal

The aims of this study were to determine the number of homographs in Acehnese and English and to give examples of them. A qualitative approach was used to conduct the study. Both an Acehnese dictionary and the Oxford dictionary were...

Cornish or Klingon? The Standardization of the Cornish Language

by Bernard Deacon

2023, Cornish Studies

There are estimated to be upwards of 6,000 languages in the world today, although a disturbingly high proportion of these are under threat of extinction. All these languages have their histories. And then there are those languages that have been invented over the last century and a half. The 'fastest growing language in the galaxy' has been claimed to be Klingon, invented for an alien race who were first heard speaking it in Star Trek: The Motion Picture in 1979. In 1984 Marc Okrand, a linguist, invented its grammar, vocabulary and orthography. Since then Trekkies have enthusiastically attempted to learn this language, to the extent that in 1999 over 600 people could claim to be speakers, while the Klingon Language Institute had over 1,000 members. Klingon, the product of a globalised American TV culture, would seem to be hundreds of light years away from Cornish, a language with a long and respectable history. But is it? In this brief review of the attempted standardisation of revived Cornish my argument is that, in its twentieth-century revival, Cornish had many resemblances to invented languages such as Klingon. Because Klingon is a language fit for aliens, its promoters deliberately revel in its inhuman irregularities. However, this is unusual for deliberately invented languages. The website of the Klingon Language Institute is significantly hosted by the Logical Language Group. Its language, Lojban, claims to possess an unambiguous grammar, phonetic spelling and the 'unambiguous resolution of sounds into words'. Unlike historic languages, with their messy irregularities and other human foibles, languages such as Lojban or the older Esperanto (which dates from 1887) are 'easy to learn'. Those familiar with the dialects of revived Cornish will have heard similar claims.
Historically, Cornish became a distinct language somewhere in the latter part of the first millennium when the dialect of the British language spoken in south-western Britain began to diverge from that of the Welsh. The history of this language can then be traced to the death of its last speakers around 1800. What is written and spoken now is revived Cornish, a resuscitated version that has no unbroken chain back to the historical language. Indeed, at least one observer has argued that the gap between the historic language and revived Cornish is so wide we should describe revived Cornish as 'pseudo

Suffix -mente Adverbs in DAELE, A Spanish Learners' Dictionary

by Sergi Torner

2023, International Journal of Lexicography

This paper looks at how Spanish -mente adverbs are shown in DAELE, an electronic dictionary for advanced-level students of Spanish, currently being developed at the Universitat Pompeu Fabra (Barcelona). Since a learners' dictionary is a...

Adapting an English morphological analyzer for French

by Evelyne Tzoukermann

2023

A word-based morphological analyzer and a dictionary for recognizing inflected forms of French words have been built by adapting the UDICT system. We describe the adaptations, emphasizing mechanisms developed to handle French verbs. This...

8 Orthographic regularization in Early Modern English printed books: Grapheme distribution and vowel length indication

by Hanna Rutkowska

2023

The present study is an attempt at assessing the level of consistency in the orthographic systems of selected sixteenth and seventeenth-century printers and at tracing the influence that normative writings could have potentially exerted...

Descriptive and normative aspects of lexicographic decision-making: the borderline cases

by Lars Trap-Jensen

2023

The paper deals with that all-too-familiar situation where the lexicographer must make a decision whether to include a particular item in the dictionary or not, and if so then in what form. Some borderline cases are investigated where...

The Death and Subsequent Revival of the Cornish Language

by Kensa Broadhurst

2023

Cornish is the vernacular language of Cornwall, the most south-western part of Great Britain. It is widely believed the language died out in the eighteenth century with the death of Dolly Pentreath, the so-called last speaker of the...

On a vocabulary data base

by Marc Eisinger

2023, Computers and The Humanities

Word Manager and CALL: structured access to the lexicon as a tool for enriching learners’ vocabulary

by Cornelia Tschichold

2023, ReCALL

Morphology consists of inflection and word formation. In foreign language teaching it occurs mainly in the form of inflectional paradigms. While this is certainly an important part of mastering a foreign language, an adequate use of...

Vowel Sound Disambiguation for Intelligible Korean Speech Synthesis

by Jong C. Park

2023, Proceedings of the 19th Asia-Pacific …

Proceedings of PACLIC 19, the 19th Asia-Pacific Conference on Language, Information and Computation. ... Vowel Sound Disambiguation for Intelligible Korean Speech Synthesis ... Ho-Joon Lee Computer Science Division EECS department, KAIST...

The first and third arrows in Figure 14 indicate the long vowel sound of ‘과(kwa)’ in ‘그는 이야기를 과장하여 말한다 (kunun iyakilul kwacanghaye malhanta: He tends to overstate things)’ and ‘벌(pel)’ in ‘죽은 벌을 받았다 (cwukun pelul patassta: I got a dead bee)’, respectively, and the second and fourth arrows point out the short vowel sound of ‘호(ho)’ in ‘호주에 가다 (hocwuey kata: I go to Australia)’ and ‘배(pay)’ in ‘배 한 척을 잃어버렸다 (pay han chekul ilhepelyessta: I lost one ship)’, respectively.

... framework. We have investigated possible syntactic clues for vowel sound disambiguation, such as parts-of-speech information, the possibility of conjunction with a suffix, the syntactic relationship with a predicate phrase, case information, unconjugated adjectives, numerals, numerical adjectives, and numerical adjectives with related nouns.

The Word Class Adjective in English Business Magazines Online

by Borislav Marušić

2023, Journal of Language and Cultural Education

The aim of this paper is to research the word class adjective in one sequence of the ESP: Business English, more precisely English business magazines online. It is an empirical study on the corpus taken from a variety of business...

Strategies in tracing linguistic variation in a corpus of Old Irish texts (CorPH)

by Nora White

2023, Corpus studies of language through time

This article introduces Corpus PalaeoHibernicum (CorPH), a corpus currently consisting of 78 texts in Early Irish (c. 7th–10th cent.) created by the ERC-funded Chronologicon Hibernicum (ChronHib) project by bringing together pre-existing...

Lexical vs. Dictionary Databases: Design Choices of the MorDebe System

by Maarten Janssen

2023

Many lexical databases are modelled simply as digital versions of paper dictionaries. However, for many purposes the demands on a lexical database are different from those on a dictionary database. Therefore, the MorDebe database system...

Computer-aided inflection for lexicography controlled lexica

by Maarten Janssen

2023, Electronic Lexicography in the 21st Century: New Applications for New Users. Proceedings of eLex 2011, Bled, 10-12 November 2011, pp. 96-105

This article describes the design of a computational system for the development and maintenance of inflected lexica, developed as part of the Open Source Lexical Information Network (OSLIN). The system is built as a tool for...

LEXICON AND THE NATURE OF WORDS

by Emmanuel Erondu

2023, Emmanuel Erondu

This paper explains the roles of the lexicon and the lexicographer in relation to the nature of words.

Strategies in tracing linguistic variation in a corpus of Old Irish texts (CorPH)

by Elliott Lash

2023, Corpus studies of language through time

This article introduces Corpus PalaeoHibernicum (CorPH), a corpus currently consisting of 78 texts in Early Irish (c. 7th–10th cent.) created by the ERC-funded Chronologicon Hibernicum (ChronHib) project by bringing together pre-existing...

Nora Sánchez: Accounting Dictionary, English-Spanish, Spanish-English, Spanish-Spanish. Hoboken, New Jersey, USA: John Wiley & Sons, 2003

by Anne Lise Laursen

2023, HERMES - Journal of Language and Communication in Business

Coping with an expanding vocabulary: the lexicographical contribution to Welsh

by Andrew Hawke

2023, International Journal of Lexicography

The Welsh language, as a lesser-used language with English as an immediate neighbour, has inevitably borrowed much of its vocabulary from that language (or its precursors) as well as inheriting a considerable vocabulary from Latin via...

8 Orthographic regularization in Early Modern English printed books: Grapheme distribution and vowel length indication

by Hanna Rutkowska

2023, Current Trends in Historical Sociolinguistics

The present study is an attempt at assessing the level of consistency in the orthographic systems of selected sixteenth and seventeenth-century printers and at tracing the influence that normative writings could have potentially exerted...

The Long Journey from the Core to the Real Size of Large LDBs

by Elena Paskaleva

2023

Large Lexical Data Bases are one of the earliest applications of NLP. The initial stage of their rise, with the admiration for the automation of lexicographic work itself, came to an end long ago. In the following stages Lexical Data Bases...

The long journey from the core to the real size of a large LDB

by Elena Paskaleva

2023, Proceedings of ACL …

Large Lexical Data Bases are one of the earliest applications of NLP. The initial stage of their rise, with the admiration for the automation of lexicographic work itself, came to an end long ago. In the following stages Lexical Data Bases...

Towards an integrated environment for Spanish document verification and composition

by Luis Sopeña

2022, Proceedings of the third conference on European chapter of the Association for Computational Linguistics

Languages other than English have received little attention as far as the application of natural language processing techniques to text composition is concerned. The present paper describes briefly work under development aiming at the...

Towards an integrated environment for Spanish document verification and composition

by Celia Villar

2022, Proceedings of the third conference on European chapter of the Association for Computational Linguistics

Languages other than English have received little attention as far as the application of natural language processing techniques to text composition is concerned. The present paper describes briefly work under development aiming at the...

Māori Vocabulary: A Study of Some High Frequency Homonyms

by Kelly Keane-Tuala

2022

The problem addressed in this thesis concerns the accuracy of Māori language vocabulary counts, e.g. Boyce (2006), where Māori was found to use a very small vocabulary in comparison with e.g. English. As Boyce (2006, ii) acknowledges, this...

Word – morpheme balance in dictionary-making

by Abraham Solomonick

2022

Lexicography should be based on the dual dependency of nearly every dictionary entry: word dependence and morpheme dependence. The overt or implied assumption that lexicography deals only with words and their combinations is, therefore,...

Analyzing Homographs in Acehnese and English Languages

by Puan Tursina

2022, English Education Journal

The aims of this study were to determine the number of homographs in Acehnese and English and to give examples of them. A qualitative approach was used to conduct the study. Both an Acehnese dictionary and the Oxford dictionary were...

The Spanish Travel Subjective Lexicon (STSL)

by Liliana Ibeth Barbosa Santillan

2022, Proceedings of the 10th International Conference on Terminology and Artificial Intelligence (TIA 2013), 28-30 October 2013, Paris

This paper presents a proposal for a recognition model for the appraisal value of sentences. It is based on splitting the text into independent sentences (full stops) and then analysing the appraisal elements contained in each sentence...

The Cornish Bible of John Trevisa

by Erik Grigg

2022

An examination of the evidence that a medieval Cornish Bible written by John Trevisa once existed.

Quantitative analysis of cetirizine dihydrochloride by HPLC (high performance liquid chromatography) and q-NMR (quantitative nuclear magnetic resonance) techniques

by Michael Villanueva

2022

, to a Polish father and English mother, Sarah Frances Field Sommerville, and brought up speaking English, Polish and French. As an Anglican clergyman, writer and historian he contributed to the Cornish Revival in the early twentieth...

Analysis of inflectional suffixes in the selected poems of Luis G. Dato: A Phonology and Morphology Study

by Lowie Jade Alojado and

2022

The derivational morphology in learners' English narrative compositions was the main focus of this research. This research aims to unravel the different inflectional suffixes used in the selected poems of Luis G. Dato driven from the...

THE DIACHRONIC PERSPECTIVE ON THE MORPHOLOGY OF COMPOUND ADJECTIVES

by Marijana Prodanović

2022

This paper focuses on a diachronic study of compound adjectives found in the Old and Middle English texts of the Helsinki Corpus. The compound adjectives of both periods are analysed, and further classified into types on the basis of the...

Lexicon Based Critical Tokenisation: An Algorithm

by Jon Mills

2022

In some languages, spaces and punctuation marks are used to delimit word boundaries. This is the case with Cornish. However, there is considerable inconsistency of segmentation to be found within the Corpus of Cornish. The individual texts...

The Long Journey from the Core to the Real Size of Large LDBs

by Mariana Damova

2022

Large Lexical Data Bases are one of the earliest applications of NLP. The initial stage of their rise, with the admiration for the automation of lexicographic work itself, came to an end long ago. In the following stages Lexical Data Bases...

The Long Journey from the Core to the Real Size of a Large LDB

by Milena Slavcheva

2022

Large Lexical Data Bases are one of the earliest applications of NLP. The initial stage of their rise, with the admiration for the automation of lexicographic work itself, came to an end long ago. In the following stages Lexical Data Bases...

Historical Corpus and Historical Dictionary: Merging Two Ongoing Projects of Old French by Integrating their Editing Systems

by Sabine Tittel

2022

To combine corpus data with dictionary data has two advantages: (i) It embeds the vocabulary of the corpus texts within the overall system of the language, and it semantically disambiguates the texts. (ii) The corpus data enrich the...

Nora Sánchez: Accounting Dictionary, English-Spanish, Spanish-English, Spanish-Spanish. Hoboken, New Jersey, USA: John Wiley & Sons, 2003

by Anne Lise Laursen

2022, HERMES - Journal of Language and Communication in Business

Screffva: A Lexicographer's Workbench

by Jon Mills

2022

This paper describes the implementation of Screffva, a computer system written in Prolog that employs a parallel corpus for the automatic generation of bilingual dictionary entries. Screffva provides a lemmatised interface between a...

Phonologically motivated orthographic variation in Modern Uyghur: the voicing of h

by Michael Fiddler

2022, Proceedings of the Workshop on Turkic and Languages in Contact with Turkic

In this paper, I present data from three corpora of written Uyghur showing that the conventionally voiceless letter h, which occurs in words of Arab-Persian etymology, sometimes patterns as voiced in stem-final environments where it is a...

Variants and Homographs: Eternal Problem of Dictionary Makers

by Jaroslava Hlaváčová

2022

We discuss two types of asymmetry between wordforms and their (morphological) characteristics, namely (morphological) variants and homographs. We introduce a concept of multiple lemma that allows for unique identification of wordform...

Fig. 1. Relations among basic concepts.

Lexeme is a set of lexical units that share the same paradigm. We are aware that this term in particular is simplified, but it is sufficient for dictionaries that contain all necessary information about words and are, at the same time, easy to use.

Implementation of Multiple Lemmas. In the morphological dictionary of Czech [7], the wordforms are not listed separately; they are clustered according to their lemmas. The lemma represents the whole paradigm. However, the multiple lemma cannot represent the extended paradigm straightforwardly because a set cannot serve as a unique identifier. Thus, we keep all lemma variants separately but connect them with pointers (see Fig. 2).

Fig. 2. Schema of implementation of multiple lemma.

Fig. 3. Schema of variants and homographs. Parts in ellipses concern polysemy.

The basic difference between the two concepts is illustrated in the schemas in Fig. 3. For variants, the shape of the schema resembles the letter A, while for homographs it is the letter Y. The polysemy appears only at the syntactic (if applicable) or semantic levels of the schema (see the right schema). It is not surprising that these schemas resemble those introduced in [8], where they illustrate synonymy and homonymy as relations between separate layers of language description.

Microsoft and Dictionary Makers: Defining Partnerships

by Maya Fruchtman

2022

A generative grammar approach for the morphologic and morphosyntactic analysis of Italian

by Marina Russo

2022, Proceedings of the third conference on European …

A GENERATIVE GRAMMAR APPROACH FOR THE MORPHOLOGIC AND MORPHOSYNTACTIC ANALYSIS OF ITALIAN Marina Russo ... singular plural passa-porto (pass-port) passa-porti porta-cenere (ash-tray) porta-cenere cava-tappi (cork-screw) cava-tappi rule 1...

Figure 4. Parse tree for the word MURAGLIONE

Figure 5. Parse tree for the word TRASPORTATORE

Figure 6. Parse tree for the word RIDANDOGLIELO

Figure 7. The rules for the plural of Compound Nouns

Figure 9. Compound tenses of verbs

The grammar for the comparative and superlative forms of adjectives is applied any time the analyzer finds the words piu’ (more) or meno (less) followed by a qualifying adjective. In this way it is possible to recognize and to distinguish expressions like piu’ interessante (more interesting) and il piu’ interessante (the most interesting). Note that English uses more and most to make clear the distinction between the comparative and the superlative form of the adjective.

Figure 11. The grammar for COMPOUND NUMBERs

English/Veneto resource poor machine translation with STILVEN

by Rodolfo Delmonte

2022, … of the International Symposium on Data …

The paper reports ongoing work on the implementation of a system for automatic translation from English to Veneto and vice versa. The system does not have parallel texts to work on because of the near absence of such manual...

Table 1. Italian/Veneto [x] grapheme mismatch

As can be seen, /x/ may correspond to Italian /s/, /tch/, /dg/ and /dz/ as far as sounds are concerned, and to [s, c, gi, ge, Z, zz] as far as graphemes are concerned. The same happens with Veneto /s/, as shown in Table 2 below, where it may correspond again to /s/, /z/, /tch/ and /sc/, and to the graphemes [ss, c, ZZ, Z, sci, sce].

It is just this mechanism that will allow the system to find appropriate antecedents for unexpressed subject pronouns which will automatically instantiate features like number and gender.

Fig. 1. Anaphoric Processes in GETARUNS

In the system, three levels are indicated: Clause level, i.e. simple sentences; Utterance level, i.e. complex sentences; Discourse level, i.e. intersententially. Our system computes semantic structures in a sentence-by-sentence fashion, and any information useful to carry out anaphoric processes needs to be made available to the following portion of text, and eventually to the Semantic Evaluation that computes entailment. We will comment on a number of significant examples to clarify the way in which our system operates.

... resolution. One such system is shown in Fig. 1 below, where we highlight the architecture and main processes taking place at the anaphora level. First of all, the subdivision of the system into two levels: Clause level (intrasentential pronominal phenomena), where all pronominal expressions contained in modifiers, adjuncts or complement clauses receive their antecedent locally. Possessive pronouns, pronouns contained in relative clauses and complement clauses choose preferentially their antecedents from the list of higher level referring expressions. Not so for those pronouns contained in matrix clauses. In particular the ones in subject position are to be coreferred in the discourse. This requires the system to be equipped with a History List of all referring expressions to be used when needed.

Adapting a Welsh Terminology Tool to Develop a Cornish Dictionary

by Delyth Prys

2022

Cornish and Welsh are closely related Celtic languages and this paper provides a brief description of a recent project to publish an online bilingual English/Cornish dictionary, the Gerlyver Kernewek, based on similar work previously...
