Key research themes
1. How can semantic and structural information be incorporated into text representations to improve understanding and communication?
This research theme explores methods of representing text beyond simple bag-of-words models by integrating semantic, syntactic, and graphical information to enhance comprehension, disambiguation, and communication effectiveness. Incorporating semantic relations helps bridge the gaps that synonymy, polysemy, and structural features of language open up in surface-level representations, and it is essential for applications such as Augmentative and Alternative Communication (AAC), cognitive modeling, and natural language understanding.
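A minimal sketch of the synonymy gap described above, using hypothetical sentences and a hand-built synonym table (standing in for a lexical resource such as WordNet): two paraphrases share no content words, so a pure bag-of-words comparison scores them as unrelated, while a semantic expansion step recovers the similarity.

```python
# Sketch only: hypothetical sentences and synonym sets, not a real resource.
from collections import Counter
import math

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words vectors."""
    dot = sum(a[t] * b[t] for t in set(a) & set(b))
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# Two sentences that mean roughly the same thing but share no content words.
s1 = "the physician examined the patient".split()
s2 = "the doctor checked the sick person".split()
stop = {"the"}
bow1 = Counter(t for t in s1 if t not in stop)
bow2 = Counter(t for t in s2 if t not in stop)
print(cosine(bow1, bow2))  # 0.0 — plain bag-of-words sees no overlap

# Hand-built synonym map (hypothetical) collapses synonyms to one canonical term.
synonyms = {"physician": "doctor", "examined": "checked", "patient": "sick"}

def expand(tokens):
    """Bag-of-words after mapping each token to its canonical synonym."""
    return Counter(synonyms.get(t, t) for t in tokens if t not in stop)

print(cosine(expand(s1), expand(s2)))  # high — semantic relations bridge the gap
```

The same idea scales up when the hand-built map is replaced by a real semantic resource or learned embeddings.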
2. What role do dimensionality reduction and embedding methods play in constructing effective low-dimensional text representations?
This theme focuses on how dimensionality reduction techniques, including deep learning autoencoders and embedding models, can produce compact, informative representations of text data. These techniques address the curse of dimensionality inherent in sparse, high-dimensional text features, capturing latent semantic structure at multiple levels (word, sentence, document) to improve similarity measurement, classification, and other downstream tasks.
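One classical instance of this idea is latent semantic analysis: a sparse term-document count matrix is compressed with a truncated SVD, and similarity is then measured in the low-dimensional latent space. A minimal sketch on a hypothetical toy corpus (NumPy only; real systems would use learned embeddings or an autoencoder):

```python
# Sketch only: toy corpus invented for illustration.
import numpy as np

docs = [
    "cat sat mat",        # feline document
    "kitten sat mat",     # near-paraphrase of the first
    "stock market rose",  # unrelated finance document
]
vocab = sorted({w for d in docs for w in d.split()})

# Term-document matrix: rows = terms, columns = documents.
A = np.array([[d.split().count(w) for d in docs] for w in vocab], float)

# Truncated SVD keeps only the k strongest latent dimensions.
U, s, Vt = np.linalg.svd(A, full_matrices=False)
k = 2
doc_vecs = (np.diag(s[:k]) @ Vt[:k]).T  # one k-dimensional vector per document

def cos(u, v):
    """Cosine similarity between two dense vectors."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

# The two cat documents land near each other in the latent space;
# the finance document stays orthogonal to them.
print(cos(doc_vecs[0], doc_vecs[1]), cos(doc_vecs[0], doc_vecs[2]))
```

Autoencoders generalize the same compression step: a nonlinear encoder replaces the truncated SVD, but the goal — a compact code that preserves latent semantic structure — is identical.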
3. How can multimodal and cross-lingual methods be leveraged to generate or expand text representations from other modalities or enhance text understanding?
This research area investigates approaches that integrate information across modalities (e.g., visual to text) and across linguistic levels (e.g., text expansion, normalization, or anaphora resolution) to produce richer, more context-aware textual representations. These methods are crucial for tasks such as image captioning in low-resource languages, text normalization for speech and translation systems, expanding text representations for narratology, and resolving linguistic ambiguities in machine translation (MT).
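To make one of the tasks above concrete, here is a deliberately simplified sketch of text normalization for a speech system: written numerals are expanded into their spoken English form before synthesis. This is an illustrative toy (English-only, numbers up to 999), not a production normalizer.

```python
# Sketch only: simplified English numeral expansion for illustration.
import re

ONES = ["zero", "one", "two", "three", "four", "five", "six", "seven",
        "eight", "nine", "ten", "eleven", "twelve", "thirteen", "fourteen",
        "fifteen", "sixteen", "seventeen", "eighteen", "nineteen"]
TENS = ["", "", "twenty", "thirty", "forty", "fifty",
        "sixty", "seventy", "eighty", "ninety"]

def number_to_words(n: int) -> str:
    """Spell out 0..999 in English (enough to show the idea)."""
    if n < 20:
        return ONES[n]
    if n < 100:
        word = TENS[n // 10]
        return word + ("-" + ONES[n % 10] if n % 10 else "")
    word = ONES[n // 100] + " hundred"
    return word + (" " + number_to_words(n % 100) if n % 100 else "")

def normalize(text: str) -> str:
    """Replace each standalone numeral with its spoken form."""
    return re.sub(r"\b\d+\b", lambda m: number_to_words(int(m.group())), text)

print(normalize("the 3 cats saw 123 birds"))
# → "the three cats saw one hundred twenty-three birds"
```

Real normalization systems must additionally handle dates, currencies, abbreviations, and language-specific inflection, which is why the task remains an active research area for speech and translation pipelines.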