Papers by Marie-Laure Reinberger
IEEE Intelligent …, Jan 1, 2004
Before we can give industry recommendations for incorporating ontology technology into its IT systems, we must consider two types of evaluation: content evaluation and ontology technology evaluation. Evaluating content is a must for preventing applications from using inconsistent, incorrect, or redundant ontologies. It's unwise to publish an ontology that one or more software applications will use without first evaluating it. A well-evaluated ontology won't guarantee the absence of problems, but it will make its use safer. Similarly, evaluating ontology technology will ease its integration with other software environments, ensuring a correct technology transfer from the academic to the industrial world.
On the relevance of performing shallow parsing before clustering
Computational Linguistics in the …, Jan 1, 2002
Proceeding of LREC, Jan 1, 2004
We report on a comparative evaluation carried out in the field of unsupervised text mining. We have worked on a parsed medical corpus, on which we have used different statistical measures. Using those measures, we rate the verb-object dependencies and we select the most reliable ones according to each measure. We then apply pattern matching and clustering algorithms to the classes of dependencies in order to build sets of semantically related words and establish semantic links between them. Finally, we evaluate the impact of the statistical measures used for the initial selection of the dependencies on the quality of the results.
Unsupervised text mining for ontology learning
Machine Learning for the Semantic …, Jan 1, 2005
On the Move to …, Jan 1, 2003
Ontologies in current computer science parlance are computer based resources that represent agreed domain semantics. This paper first introduces the DOGMA ontology engineering approach that separates "atomic" conceptual relations from "predicative" domain rules. A DOGMA ontology consists of an ontology base that holds sets of intuitive context-specific conceptual relations and a layer of "relatively generic" ontological commitments that hold the domain rules. Secondly, we describe and experimentally evaluate work in progress on a potential method to automatically derive the atomic conceptual relations mentioned above from a corpus of English medical texts. Preliminary outcomes are presented based on the clustering of nouns and compound nouns according to co-occurrence frequencies in the subject-verb-object syntactic context.
Procesamiento del lenguaje …, Jan 1, 1999
Modélisation pour une construction dynamique de la signification: le cas des expressions adjectif-nom
… Learning from Text: Methods, Applications and …, Jan 1, 2005
Ontologies in current computer science parlance are computer based resources that represent shared conceptualizations for a specific domain. This paper first introduces ontologies in general and then briefly outlines the DOGMA ontology learning approach. The paper also introduces the reader to the field of Knowledge Discovery in Text before, in the main part, work in progress is described and experimentally evaluated. It concerns a potential method that ultimately aims at automatically extracting concepts and conceptual relationships from texts in an unsupervised way. Preliminary outcomes are presented based on the clustering of nominal terms and prepositional phrases according to co-occurrence frequencies in the verb-object syntactic context.
Lexically evaluating ontology triples automatically generated from text
Proceedings of the second European Semantic Web …, Jan 1, 2005
Proceedings of the EKAW2004 …, Jan 1, 2004
Our purpose was to devise a method to evaluate the results of extracting semantic relations from text corpora in an unsupervised way. We have worked on a legal corpus (EU VAT directive) consisting of 43K words. Using a shallow parser, a set of "lexons" has been produced. These are to be used as preprocessed material for the construction of ontologies from scratch. A knowledge engineer has judged that the outcome is useful to support and speed up the ontology modelling task. In addition, a quantitative scoring method (coverage and accuracy measures resulting in a 52.38% and 47.12% score respectively) has been applied.
ECAI Workshop on Ontology Learning and …, Jan 1, 2004
Ontologies in current computer science parlance are computer based resources that represent shared conceptualizations for a specific domain. This paper first introduces ontologies in general and then briefly outlines the DOGMA ontology learning approach. The paper also introduces the reader to the field of Knowledge Discovery in Text before, in the main part, work in progress is described and experimentally evaluated. It concerns a potential method to automatically extract concepts and conceptual relationships from texts. Preliminary outcomes are presented based on the clustering of nominal terms and prepositional phrases according to co-occurrence frequencies in the verb-object syntactic context.
Unsupervised text mining for designing a virtual web environment
Text Mining Research, Practice and Opportunities, Jan 1, 2005
Ontology Learning from Text: Methods, Applications and Evaluation, chapter Unsupervised Text Mining for the Learning of DOGMA-inspired Ontologies
The Semantic Web: Research and …, Jan 1, 2005
Our purpose is to present a method to lexically evaluate the results of extracting material from text corpora in an unsupervised way in order to build ontologies. We have worked on a legal corpus (EU VAT directive) consisting of 43K words. The unsupervised text miner has produced a set of triples. These are to be used as preprocessed material for the construction of ontologies from scratch. A quantitative scoring method (coverage, accuracy, recall and precision metrics resulting in scores of 38.68%, 52.1%, 9.84% and 75.81% respectively) has been defined and applied.

On the Move to …, Jan 1, 2004
We report on a set of experiments carried out in the context of the Flemish OntoBasis project. Our purpose is to extract semantic relations from text corpora in an unsupervised way and use the output as preprocessed material for the construction of ontologies from scratch. The experiments are evaluated in a quantitative and "impressionistic" manner. We have worked on two corpora: a 13M-word corpus composed of Medline abstracts related to proteins (SwissProt), and a small legal corpus (EU VAT directive) consisting of 43K words. Using a shallow parser, we select functional relations from the syntactic structure subject-verb-direct-object. Those functional relations correspond to what is called a "lexon". The selection is done using prepositional structures and statistical measures in order to select the most relevant lexons. Therefore, the paper stresses the filtering carried out in order to automatically discard all irrelevant structures. Domain experts have evaluated the precision of the outcomes on the SwissProt corpus. The global precision has been rated 55%, with a precision of 42% for the functional relations or lexons, and a precision of 76% for the prepositional relations. For the VAT corpus, a knowledge engineer has judged that the outcomes are useful to support and can speed up his modelling task. In addition, a quantitative scoring method (coverage and accuracy measures resulting in a 52.38% and 47.12% score respectively) has been applied.