Key research themes
1. How can directional (asymmetric) distributional similarity measures improve lexical expansion and related NLP tasks?
This research theme focuses on developing and analyzing distributional similarity measures that are directional, reflecting asymmetric semantic relations such as hyponymy or lexical entailment. Symmetric measures cannot, by construction, capture the directionality of such relations (e.g., that "sparrow" entails "bird" but not the reverse). Directional measures instead quantify the degree to which the distributional features of a more specific term are included in those of a more general term, thereby enhancing lexical expansion, information retrieval, and related tasks where asymmetric semantic relations are critical.
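The feature-inclusion idea can be sketched as a small directional score in the spirit of Weeds precision: the score from a narrow term to a broad term is the fraction of the narrow term's feature mass covered by the broad term's features. The toy co-occurrence vectors below are hypothetical illustrations, not data from any cited study.

```python
def weeds_precision(narrow, broad):
    """Directional inclusion score: fraction of `narrow`'s feature weight
    whose features also occur in `broad`. Asymmetric by design."""
    total = sum(narrow.values())
    if total == 0:
        return 0.0
    included = sum(w for f, w in narrow.items() if f in broad)
    return included / total

# Hypothetical co-occurrence features for a specific term ("sparrow")
# and a more general term ("bird").
sparrow = {"fly": 3.0, "nest": 2.0, "chirp": 1.0}
bird = {"fly": 5.0, "nest": 4.0, "wing": 2.0, "egg": 3.0}

# The specific term's features are largely included in the general term's,
# but not vice versa, so the score is higher in the specific-to-general direction.
forward = weeds_precision(sparrow, bird)   # 5/6
backward = weeds_precision(bird, sparrow)  # 9/14
```

Swapping the arguments changes the score, which is exactly the asymmetry that symmetric measures such as cosine cannot express.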
2. What are the advantages and computational considerations of embedding complex or variable-length data sequences into metric manifold spaces to facilitate similarity search?
This theme examines methods for representing multivariate, variable-length data sequences—such as text, time series, or trajectories—in a manifold space that preserves meaningful similarity and metric properties. These embeddings address the difficulty that natural sequence comparisons (e.g., alignment-based costs) are often non-metric and must handle inputs of differing lengths, enabling effective and computationally feasible similarity search, clustering, and downstream analysis in domains like sensor networks, image retrieval, and linguistics.
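One simple family of such embeddings can be sketched as a reference-based (landmark) construction: each variable-length sequence is mapped to a fixed-length vector of its distances to a small set of reference sequences, after which ordinary metric indexing applies. The DTW routine and the toy sequences below are illustrative assumptions, not any specific method from the literature.

```python
import math

def dtw(a, b):
    """Classic O(len(a)*len(b)) dynamic time warping distance on 1-D sequences."""
    n, m = len(a), len(b)
    D = [[math.inf] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            D[i][j] = cost + min(D[i - 1][j], D[i][j - 1], D[i - 1][j - 1])
    return D[n][m]

def embed(seq, references):
    """Fixed-dimensional embedding: vector of DTW distances to landmarks."""
    return [dtw(seq, r) for r in references]

refs = [[0, 1, 2], [2, 1, 0]]   # hypothetical landmark sequences
x = [0, 0, 1, 2, 2]             # variable-length input sequences
y = [2, 2, 1, 0]
ex, ey = embed(x, refs), embed(y, refs)
dist = math.dist(ex, ey)        # plain Euclidean distance now applies
```

Whatever the original sequence lengths, every embedded vector has one coordinate per landmark, so standard metric data structures (k-d trees, ball trees) become usable even though raw DTW itself violates the triangle inequality.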
3. How do different semantic similarity models and pre-processing techniques impact word and document similarity measurement in constrained and morphologically rich language scenarios?
This research theme explores semantic similarity measures applicable to words and senses, focusing on knowledge-based, distributional, and hybrid models. It additionally investigates the effects of language-specific preprocessing, such as root-based and stem-based techniques for morphologically rich languages like Arabic, on similarity computations. The work emphasizes method selection for constrained computing environments (e.g., IoT), the choice of embedding and lexical resource models, and their impact on semantic similarity accuracy.
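The effect of stem-based preprocessing on a similarity score can be sketched with a bag-of-words cosine measure. The toy suffix-stripping "stemmer" below is a hypothetical stand-in for real root/stem extraction (e.g., for Arabic); it only shows the mechanics of why conflating morphological variants changes the score, not a production pipeline.

```python
import math
from collections import Counter

def toy_stem(token):
    """Illustrative suffix stripper (an assumption, not a real stemmer)."""
    for suffix in ("ing", "ed", "s"):
        if token.endswith(suffix) and len(token) > len(suffix) + 2:
            return token[: -len(suffix)]
    return token

def bow(text, stem=False):
    """Bag-of-words vector, optionally after toy stemming."""
    tokens = text.lower().split()
    if stem:
        tokens = [toy_stem(t) for t in tokens]
    return Counter(tokens)

def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

d1 = "the model computes similarities"
d2 = "the models computed similarity"
raw = cosine(bow(d1), bow(d2))                          # only "the" overlaps
stemmed = cosine(bow(d1, stem=True), bow(d2, stem=True))  # "model" now overlaps too
```

Such lightweight, dictionary-free preprocessing is also the kind of trade-off relevant to constrained environments, where a full morphological analyzer may be too costly.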