Academia.edu

Unstructured Information Source

16 papers
3 followers
About this topic
An unstructured information source refers to data that does not have a predefined format or organization, making it difficult to collect, process, and analyze using traditional data management tools. This type of information includes text, images, audio, and video, which require advanced techniques for extraction and interpretation.

Key research themes

1. How can ontologies guide the extraction and structuring of information from unstructured text?

This research area focuses on leveraging domain ontologies to improve the precision and semantic richness of information extraction from unstructured text. By linking extracted data to ontology concepts and relationships, it enables structured semantic representations such as RDF triples, which facilitate enhanced querying and integration with semantic web technologies. This theme matters because it bridges the gap between raw textual data and machine-processable structured knowledge, empowering applications like semantic search, question answering, and knowledge graph construction in specific domains.

Key finding: Presented a domain-specific ontology-based information extraction framework that identifies relevant ontologies, extracts semantic triples from unstructured text guided by ontology concepts, and converts these into RDF,...
Key finding: Developed a knowledge graph and mining framework using machine learning and ontological guidance to extract cyber incident information from free text. Utilized non-technical cyber-ontology for entity recognition and linking...
Key finding: Introduced a semantic web-based mediator architecture integrating heterogeneous structured databases and unstructured textual sources using ontologies expressed in OWL. Employed semantic document representations and reverse...
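
As a rough illustration of the ontology-guided extraction this theme describes, the sketch below matches relation phrases between known entity labels and keeps only the triples that satisfy the ontology's domain and range constraints. It is not taken from any of the listed papers: the toy ontology, the `http://example.org/onto#` namespace, and the function names `extract_triples` and `to_ntriples` are all assumptions made for the example, and real systems would use proper entity recognition rather than regular expressions.

```python
import re

# Hypothetical namespace and toy ontology, invented for illustration only.
BASE = "http://example.org/onto#"

# Entity labels mapped to ontology classes.
ENTITY_CLASSES = {
    "aspirin": "Drug",
    "ibuprofen": "Drug",
    "headache": "Symptom",
    "fever": "Symptom",
}

# Relations with surface phrases plus domain/range constraints.
RELATIONS = {
    "treats": {"phrase": r"treats|relieves", "domain": "Drug", "range": "Symptom"},
}


def extract_triples(text):
    """Return (subject, predicate, object) label triples permitted by the ontology."""
    labels = "|".join(re.escape(label) for label in ENTITY_CLASSES)
    triples = []
    for name, rel in RELATIONS.items():
        pattern = re.compile(
            rf"\b({labels})\b\s+(?:{rel['phrase']})\s+\b({labels})\b",
            re.IGNORECASE,
        )
        for match in pattern.finditer(text):
            subj, obj = match.group(1).lower(), match.group(2).lower()
            # Keep the triple only if it satisfies the domain/range constraints.
            if (ENTITY_CLASSES[subj] == rel["domain"]
                    and ENTITY_CLASSES[obj] == rel["range"]):
                triples.append((subj, name, obj))
    return triples


def to_ntriples(triples):
    """Serialize label triples as N-Triples lines using the example namespace."""
    return [f"<{BASE}{s}> <{BASE}{p}> <{BASE}{o}> ." for s, p, o in triples]


if __name__ == "__main__":
    sample = "Aspirin relieves headache. Fever treats ibuprofen. Ibuprofen treats fever."
    for line in to_ntriples(extract_triples(sample)):
        print(line)
```

In the sample sentence, "Fever treats ibuprofen" matches the surface pattern but is rejected because its subject is not a `Drug`, which is the kind of precision gain that ontology constraints are meant to provide.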

2. What are effective methods for modeling, visualizing, and authoring unstructured and semantic content to enhance usability and exploration?

This theme investigates interfaces and models that integrate unstructured content with underlying semantic representations to enable richer exploration, editing, and understanding. By formalizing mappings between semantic elements and user interface components, these approaches make semantic data more accessible and usable by domain experts and end-users, facilitating personalized views, annotation, and improved interaction with heterogeneous content forms. This line of research is crucial for closing the gap between raw unstructured information and its meaningful use in applications.

Key finding: Presented the WYSIWYM concept formalizing the binding between semantic representation models (e.g., RDFa) and UI elements for authoring, visualization, and exploration of semantic and unstructured content. Introduced RDFaCE...
Key finding: Developed and validated, via Delphi method, a comprehensive usability enhancement model for unstructured text that incorporates multiple usability dimensions, determinants, and enhancement rules. The model systematically...
Key finding: Proposed a generalized expressive JSON-based document model integrating provenance (inspired by PROV-O) and temporal metadata for enriched representation of content extracted by NLP enrichment pipelines. This model supports...

3. How can multidimensional modeling be adapted to represent and analyze unstructured XML documents combining semantic and structural aspects?

This research explores extensions and new models of multidimensional data warehousing to capture the complex structure and semantics of XML documents and textual data. By integrating semantic hierarchies as special dimensions and relaxing classical constraints like dimension orthogonality, such models enable advanced OLAP analysis over document collections, supporting navigation across both structural and semantic levels. This approach enhances the analytical capability on semi-structured and unstructured information, allowing decision-makers to discover insights grounded in document content and context.

Key finding: Introduced the Diamond model combining structural and semantic dimensions in a multidimensional warehouse dedicated to XML document collections. By linking semantic dimensions to conventional ones and breaking orthogonality...
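
To make the idea of navigating structural and semantic levels together more concrete, here is a minimal roll-up sketch over a tiny fact table. It is a generic OLAP-style aggregation, not the Diamond model itself. The fact rows, the `TOPIC_PARENT` hierarchy, and the `roll_up` function are hypothetical names and data chosen for the example.

```python
from collections import Counter

# Hypothetical facts: one row per document fragment, with a structural
# coordinate (doc, section), a semantic coordinate (topic), and a measure.
FACTS = [
    {"doc": "d1", "section": "intro", "topic": "neural networks", "words": 120},
    {"doc": "d1", "section": "body", "topic": "decision trees", "words": 300},
    {"doc": "d2", "section": "intro", "topic": "olap", "words": 80},
    {"doc": "d2", "section": "body", "topic": "neural networks", "words": 200},
]

# Hypothetical semantic hierarchy: each topic rolls up to a broader concept.
TOPIC_PARENT = {
    "neural networks": "machine learning",
    "decision trees": "machine learning",
    "olap": "data warehousing",
}


def roll_up(facts, structural_level, semantic_level):
    """Sum word counts per (structural value, semantic value) pair.

    structural_level is "doc" or "section"; semantic_level is "topic"
    (the finest level) or "concept" (one step up the hierarchy).
    """
    totals = Counter()
    for fact in facts:
        topic = fact["topic"]
        semantic_value = topic if semantic_level == "topic" else TOPIC_PARENT[topic]
        totals[(fact[structural_level], semantic_value)] += fact["words"]
    return dict(totals)


if __name__ == "__main__":
    print(roll_up(FACTS, "section", "concept"))
    print(roll_up(FACTS, "doc", "topic"))
```

Switching `semantic_level` from `"topic"` to `"concept"` aggregates along the semantic hierarchy, while switching `structural_level` changes the structural granularity independently.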

All papers in Unstructured Information Source

CNN Universal Machines, which contain two different processors working interactively with each other, have an important impact on image processing applications thanks to their advanced computing features. These processors are named as the...
Engineering design review meetings are unique opportunities for all the parties involved to share information about the product and its related engineering processes. For product development teams, the knowledge and information transfer...
Currently, one of the greatest threats to companies is being unable to cope with the constant changes in the market because they fail to predict them far enough in advance. Therefore, the development of new processes...
Despite the hopes raised a few years ago, Content-Based Image Retrieval (CBIR) systems have not reached their initial goal, i.e. to manage and search images in a database: we are unable to link the semantic sense of an image to numerical...
At present, everyone, especially managers and businessmen, is exposed to ever-present information pollution. This is why business intelligence tools are of great importance; nevertheless, current methods can hardly...
In this paper, the development of informatics tools and their testing for active learning support is described. The tools (programming scripts, optional menus, and tailored applications) are a part of the developed "Batch Information and...
Increasing safety and reducing road accidents, thereby saving lives, is of great interest in the context of Advanced Driver Assistance Systems. Among the complex and challenging tasks of future road vehicles is road...
The language barrier in medical consultations limits healthcare provision to the larger population of the developing and underdeveloped world. The doctor-to-patient ratio is dismally low, especially in cases...
Despite the hopes raised a few years ago, Content-Based Image Retrieval (CBIR) systems have not reached their initial goal, i.e. to manage and search images in a database: we are unable to link the semantic sense of an image to numerical values....
In this paper, the development of informatics tools and their testing for active learning support is described. The tools (programming scripts, optional menus, and tailored applications) are a part of the developed "Batch Information and...
We have been witnessing a meticulous expansion in the number of biological databases as an outcome of the human genome-sequencing project. These biological databases are created and updated by the inventions of new molecules by the...
Customer Relationship Management has become the main interest of researchers and practitioners, especially in the domains of Marketing and Information Systems (IS). This paper is an overview of success factors that could facilitate...
At present, everyone, especially managers and businessmen, is exposed to ever-present information pollution. This is why business intelligence tools are of great importance; nevertheless, current methods can hardly cope with...