Key research themes
1. How can ontology and lexical taxonomy structures improve semantic similarity and relatedness measurement?
This research area focuses on exploiting structured knowledge bases, such as WordNet and domain-specific ontologies, to calculate semantic similarity and relatedness. These methods leverage hierarchical relations (hypernymy/hyponymy), synonymy, and sometimes meronymy to compute similarity scores that reflect human judgments of semantic closeness. Their value lies in interpretable, knowledge-driven similarity metrics that can outperform purely corpus-based methods in precision and support applications such as ontology matching, information retrieval, and word sense disambiguation.
2. What corpus-based and distributional semantic models best capture semantic textual similarity in practical applications?
This line of research investigates methods that use statistical information from large corpora and distributional semantics to compute semantic similarity between words, sentences, or documents. These approaches rely on co-occurrence patterns, word embeddings, and vector space models to represent meaning through context and usage frequency. They aim to deliver scalable, domain-independent solutions for natural language processing tasks such as semantic textual similarity, document clustering, and short text similarity.
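The distributional idea can be shown in a minimal count-based form: build word–context co-occurrence vectors from a corpus and compare them with cosine similarity. The toy corpus and the window size below are illustrative assumptions; real systems use large corpora and typically learned embeddings (e.g., word2vec, GloVe) rather than raw counts.

```python
import math
from collections import Counter, defaultdict

# Toy corpus (illustrative only)
corpus = [
    "the cat drinks milk",
    "the dog drinks water",
    "the cat chases the mouse",
    "the dog chases the cat",
]

WINDOW = 2  # symmetric context window (an assumption for this sketch)
vectors = defaultdict(Counter)
for sentence in corpus:
    tokens = sentence.split()
    for i, w in enumerate(tokens):
        lo, hi = max(0, i - WINDOW), min(len(tokens), i + WINDOW + 1)
        for j in range(lo, hi):
            if j != i:
                vectors[w][tokens[j]] += 1  # count context word

def cosine(u, v):
    """Cosine similarity of two sparse count vectors (Counters)."""
    dot = sum(u[k] * v[k] for k in u)
    norm = lambda x: math.sqrt(sum(c * c for c in x.values()))
    return dot / (norm(u) * norm(v))

# Words sharing contexts ("drinks", "chases") get a high score:
print(cosine(vectors["cat"], vectors["dog"]))
```

The same cosine comparison carries over unchanged to dense embedding vectors; only the way the vectors are produced differs.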
3. Can lexico-syntactic pattern-based and hybrid lexical-corpus methods provide effective semantic similarity without reliance on hand-crafted knowledge bases?
This theme explores semantic similarity measures derived from automatically harvested lexical patterns and statistical co-occurrence, often implemented via pattern extraction or web-based statistics. The goal is to achieve wide coverage and reasonable precision without depending on curated resources like WordNet, which have limited domain coverage. These methods facilitate scalable semantic similarity computation applicable to named entity similarity, relation extraction, and semantic search.
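A representative web-statistics measure in this family scores two terms by Jaccard overlap of search-engine page counts, as in the WebJaccard coefficient of Bollegala et al.; a small conjunction count is discarded as noise. The sketch below follows that formula, but the hit counts and the noise threshold are made-up placeholders, not real search-engine data.

```python
def web_jaccard(hits_p, hits_q, hits_pq, threshold=10):
    """Jaccard over page counts: H(P AND Q) / (H(P) + H(Q) - H(P AND Q)).

    hits_p, hits_q  -- page counts for each term alone
    hits_pq         -- page count for the conjunctive query "P AND Q"
    threshold       -- conjunction counts below this are treated as noise
                       (the cutoff value here is an assumption)
    """
    if hits_pq < threshold:
        return 0.0
    return hits_pq / (hits_p + hits_q - hits_pq)

# Hypothetical hit counts (placeholders, not real queries):
print(web_jaccard(hits_p=5_000, hits_q=4_000, hits_pq=1_500))  # 0.2
print(web_jaccard(hits_p=5_000, hits_q=4_000, hits_pq=5))      # 0.0 (noise)
```

Pattern-based variants of this theme additionally extract lexico-syntactic contexts (e.g., "X such as Y") from snippets and feed those features, together with page-count measures like this one, into a combined similarity score.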