Academia.edu

Local Grammar Graphs

12 papers
1 follower
About this topic
Local Grammar Graphs are structured representations that capture the syntactic and semantic relationships within a specific linguistic context. They facilitate the analysis of language by modeling the rules and patterns governing the formation of phrases and sentences, enabling a detailed examination of grammatical structures in localized settings.

Key research themes

1. How can local grammar graphs (LGGs) be utilized for practical natural language understanding and data generation in domain-specific systems?

This theme focuses on the development and application of local grammar graphs as robust linguistic resources that capture lexico-syntactic patterns for diverse, domain-specific natural language understanding (NLU) tasks. The importance lies in their ability to generate large-scale, high-quality labeled datasets automatically, which address the scarcity and privacy concerns of authentic user data, and facilitate training effective machine learning models for conversational AI in complex domains like law, finance, and customer service.

Key finding: Introduces a graph-theoretic formalization for syntactic workspaces that models local regions of syntactic operations as directed graphs. This foundational approach informs how syntactic structure manipulation can be...
Key finding: Demonstrates the use of local grammar graphs to capture and generalize legal vocabulary and local syntax, enabling the generation of 700 million labeled utterances for training a DIET classifier in Korean legal chatbot NLU....
Key finding: Develops a modular linguistic resource named FIAD based on local grammar graphs capturing three core linguistic components (TOPIC, EVENT, DISCOURSE MARKER) derived from banking app review corpora. FIAD enables generation of...
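The data-generation idea in this theme can be illustrated with a minimal sketch. It assumes a local grammar graph can be approximated as a small directed acyclic graph whose nodes hold alternative tokens; the graph `TOY_LGG`, the intent label `check_account`, and both function names are hypothetical and are not taken from any of the papers listed here.

```python
from typing import Dict, Iterator, List, Tuple

# Hypothetical toy graph: node -> (alternative tokens, successor nodes).
# An empty token marks a structural node (start/end) that emits nothing.
Graph = Dict[str, Tuple[List[str], List[str]]]

TOY_LGG: Graph = {
    "START": ([""], ["VERB"]),
    "VERB": (["check", "show"], ["DET"]),
    "DET": (["my"], ["NOUN"]),
    "NOUN": (["balance", "statement"], ["END"]),
    "END": ([""], []),
}


def generate(graph: Graph, node: str = "START",
             prefix: Tuple[str, ...] = ()) -> Iterator[str]:
    """Enumerate every token sequence along a path from `node` to a sink."""
    tokens, successors = graph[node]
    for tok in tokens:
        path = prefix + ((tok,) if tok else ())
        if not successors:
            # Sink node reached: emit the finished utterance.
            yield " ".join(path)
        for nxt in successors:
            yield from generate(graph, nxt, path)


def labeled_utterances(graph: Graph, intent: str) -> List[Tuple[str, str]]:
    """Pair each generated utterance with one intent label (assumed scheme)."""
    return [(utterance, intent) for utterance in generate(graph)]


if __name__ == "__main__":
    for utterance, label in labeled_utterances(TOY_LGG, "check_account"):
        print(f"{label}\t{utterance}")
```

Because every path through the graph yields one utterance, the number of outputs grows multiplicatively with the alternatives at each node, which is one plausible reason a compact set of graphs can produce very large labeled datasets.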

2. How do graph-theoretic and topological frameworks advance linguistic theory by modeling syntax and grammar structures as graphs?

This research area addresses theoretical syntactic modeling by leveraging graph theory, topology, and formal grammar graphs to represent syntactic dependencies, workspace operations, and morphological processes. It is significant because it offers precise mathematical characterizations of syntactic derivations and grammar structures, allows new computational interpretations of movement and locality, supports morphological-syntactic integration, and offers a unifying formalism beyond string-based representations.

Key finding: Proposes a topological and graph-theoretic formalization of the syntactic workspace within minimalist syntax as local directed graphs. This approach makes explicit how syntactic operations affect local regions of derivations,...
Key finding: Introduces the concept of the graph of a generative grammar (Γ), a specialized and-or graph that extends classical dependency graphs and finite automata to represent production rules of arbitrary generative grammars,...
Key finding: Presents a systematic method to transform context-free grammars into Directed Marked Graphs (DMGs) annotated with typed nodes representing AND/OR nonterminals and terminals. This representation facilitates grammatical...
Key finding: Develops a formal framework to interpret string languages as graph languages using typed i/o-hypergraphs and sequential graph composition respecting input/output interfaces. This formalism relates existing string grammar...
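As a rough illustration of representing a grammar as a typed graph, the sketch below encodes a context-free grammar as a generic and-or graph: each nonterminal becomes an OR node over its alternatives, each alternative becomes an AND node over its symbols, and every other symbol is a terminal. This is a textbook-style encoding chosen for illustration, not the DMG or Γ construction of the papers summarized above; `cfg_to_and_or`, `TOY_CFG`, and the `LHS#i` naming scheme are assumptions.

```python
from typing import Dict, List, Tuple

Grammar = Dict[str, List[List[str]]]  # nonterminal -> list of right-hand sides


def cfg_to_and_or(grammar: Grammar) -> Tuple[Dict[str, str], List[Tuple[str, str]]]:
    """Return (node types, directed edges) of a simple and-or graph for a CFG."""
    nodes: Dict[str, str] = {}
    edges: List[Tuple[str, str]] = []
    for lhs, alternatives in grammar.items():
        nodes[lhs] = "OR"  # a nonterminal chooses among its alternatives
        for i, rhs in enumerate(alternatives):
            and_name = f"{lhs}#{i}"  # hypothetical name for the i-th production
            nodes[and_name] = "AND"  # a production needs all of its symbols
            edges.append((lhs, and_name))
            for symbol in rhs:
                if symbol not in grammar:
                    nodes.setdefault(symbol, "TERMINAL")
                edges.append((and_name, symbol))
    return nodes, edges


# Hypothetical toy grammar used only for demonstration.
TOY_CFG: Grammar = {
    "S": [["NP", "VP"]],
    "NP": [["the", "dog"]],
    "VP": [["barks"], ["sleeps"]],
}

if __name__ == "__main__":
    node_types, graph_edges = cfg_to_and_or(TOY_CFG)
    for source, target in graph_edges:
        print(f"{source} ({node_types[source]}) -> {target} ({node_types[target]})")
```

Keeping production nodes separate from nonterminal nodes means the choice between alternatives and the sequencing within one alternative stay distinguishable in the graph.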

3. What computational approaches enable unsupervised or weakly supervised learning of construction grammars integrating multi-level, multi-length linguistic patterns?

This theme explores algorithms and computational modeling for induction of construction grammars from corpus data without requiring strong innate linguistic constraints. Emphasis is placed on learning flexible units that generalize across mixed representations ranging from item-specific to schematic forms, including recursive and discontinuous structures. Understanding these learning mechanisms is critical for data-driven grammar acquisition, linguistic typology, and modeling language evolution.

Key finding: Presents an algorithm inducing construction grammars from large corpora by identifying minimal sets of multi-length, multi-level schematic and item-specific constructions, including recursive and discontinuous patterns, based...
Key finding: Employs graph-theoretic concepts, specifically the chromatic number (minimal coloring), to analyze syntactic network evolution during child language acquisition and detect phase transitions from simple two-word structures to...

All papers in Local Grammar Graphs

In language, semantic shift is one of the most convenient and economical ways to develop the meanings of a word. The result of semantic shift is a polysemous word. Starting from the original meaning of a word, speakers rely on the relationships...
This paper aims to construct a linguistic resource of Korean Multiword Expressions for Feature-Based Sentiment Analysis (FBSA): DECO-MWE. Dealing with multiword expressions (MWEs) has been a critical issue in FBSA since many constructs...
Natural language understanding (NLU) is integral to task-oriented dialog systems, but demands a considerable amount of annotated training data to increase the coverage of diverse utterances. In this study, we report the construction of a...
Chatbots are robots that can communicate with humans using text or voice signals. Legal chatbots improve access to justice, since legal representation and legal advice by lawyers come with a high cost that excludes disadvantaged and...
The aim of this research is to identify English lexical loanwords in Indonesian in a tourism magazine. In this research, the writer uses a descriptive qualitative method in which she describes the corpus of English lexical loanwords...
Adopting the concept of “Local Grammars” (M. Gross), which was successfully applied in practice by Geierhos (2010) to biographical information extraction in English, our project aims to detect, encode, and finally visualize relations...
We report the construction of a Korean evaluation-annotated corpus, hereafter called 'Evaluation Annotated Dataset (EVAD)', and its use in Aspect-Based Sentiment Analysis (ABSA) extended in order to cover e-commerce reviews containing...
This study describes a methodology we adopted for constructing Multilingual Sentiment-Annotated Corpora (named MUSE), which consist of two types of annotated corpora: Sentence-based Sentiment-Annotated Corpora (MUSE-SESAC) and Token-based...
Finite State Automata (FSA) and their variants are natural tools adapted to the description of various linguistic phenomena which must be dealt with at various points in different types of automatic processing of texts written in Natural...
NooJ's linguistic engine integrates all its parsers (from the lexical to the syntactic level) with its morphological and paraphrase generators. In particular, both NooJ's syntactic parser and NooJ's transformational generator...
Syntactic parsing is a major area of NLP which has been widely studied with the help of many approaches. Usually, parsers take tagged texts as input, that is to say texts whose lexical units have been annotated with information such as...
This paper presents PEAS, the first comparative evaluation framework for parsers of French whose annotation formalism allows the annotation of both constituents and functional relations. A test corpus containing an assortment of different...
This paper presents EASY, which has been the first campaign evaluating syntactic parsers on all the common syntactic phenomena and a large set of dependency relations. During this campaign, an annotation scheme has been elaborated with...
This article presents the methodology of the PASSAGE project, aiming at syntactically annotating large corpora by composing annotations. It introduces the annotation format and the syntactic annotation specifications. It describes an...
The Named Entity Recognition task needs high-quality and large-scale resources. In this paper, we present RENCO, a rule-based system focused on the recognition of entities in the Cosmetic domain (brand names, product names, …). RENCO has...
In this paper, we present the founding elements of a formal model of the evaluation paradigm in natural language processing. We propose an abstract model of objective quantitative evaluation based on rough sets, as well as the notion of...
This paper describes the unfolding of the EASy evaluation campaign for French parsers as well as the techniques employed for the participation of the LPL laboratory in this campaign. Three symbolic parsers based on the same resource and the same...
This paper presents a French corpus annotated for multiword nouns. This corpus is designed for investigation in information retrieval and extraction, as well as in deep and shallow syntactic parsing. We delimit which kind of multiword...
This paper presents a French corpus annotated for multiword expressions (MWEs) with adverbial function. This corpus is designed for investigation on information retrieval and extraction, as well as on deep and shallow syntactic parsing....
Using standard methods and formats established at LADL, and adopted by several European research teams to construct large-coverage electronic dictionaries and grammars, we elaborated for Portuguese a set of lexical resources that were...
Like common noun phrases, proper names contain ambiguous conjoined phrases that make their delimitation and classification difficult in text. This paper presents a finite-state approach to the disambiguation of Portuguese candidate proper...
Visual shapes inherent in different aspects of language processing have been manifesting themselves as important not only for enhancing that process itself, but also for helping solve open problems in ways that are more economical and more...
The purpose of this demo is to introduce the linguistic development tool NooJ. The tool has been in development for a number of years and it has a solid community of computational linguists developing grammars in two dozen languages...
In this paper we present the PASSAGE project, which aims at automatically building a large French treebank by combining the output of several parsers, using the EASY annotation scheme. We also present the results of the...
This paper presents EASY (Evaluation of Analyzers of SYntax), an ongoing evaluation campaign of syntactic parsing of French, a subproject of EVALDA in the French TECHNOLANGUE program. After presenting the elaboration of the annotation...
Existing syntactic grammars of natural languages, even with a far from complete coverage, are complex objects. Assessments of the quality of parts of such grammars are useful for the validation of their construction. We extended a grammar...
E-learning systems should deliver contents which reflect various phenomena of the language as it is used. E-learning systems that would include real-world Korean expressions such as those in web documents, mobile text messages, or Twitter...
The present paper is written within the framework of the French ANR-Passage project that gathers ten parser developers. The main motivations of the project are to evaluate parsers for French, to test their accuracy and robustness on...
This paper proposes a model for the design of an interlanguage corpus with error analysis annotation. The data is obtained from the ICNALE (Ishikawa, 2013) corpus, a corpus of English learners in Asia. I focus the extraction on the Indonesian...
Machine Readable Grammar (MRG) is aimed at supporting the computer in performing Natural Language Processing (NLP) tasks. This paper discusses one of the essential functions of MRG, which is to perform automatic retrieval in a text...
There is a considerable number of loanwords in the Indonesian language, as it has been, and continues to be, in contact with other languages. The contact takes place via different media; one of them is the machine-readable medium. As the...
We present the PASSAGE syntactic representation based on syntactic relations, initially developed for French in the scope of national evaluation campaigns. After a brief presentation of the non-nested chunks and syntactic relations of...
The ongoing project Nomage aims at describing the aspectual properties of deverbal nouns in an empirical way. It is centered on the development of two resources: a semantically annotated corpus of deverbal nouns, and an electronic...
The local grammar approach was first used to discuss recursive phrases that are commonly found in specialist literature like biochemistry and then extended to extract time, date and address expressions from letters. It has recently been...
Existing syntactic grammars of natural languages, even with a far from complete coverage, are complex objects. Assessments of the quality of parts of such grammars are useful for the validation of their construction. We evaluated the...
Assessments of the quality of parts of syntactic grammars of natural languages are useful for the validation of their construction. We extended a grammar of French determiners that takes the form of a recursive transition network and...