An Ontology based document management
2008
Abstract
This article demonstrates, in a real-world application, an approach to the problem of associating documents with a knowledge base. The approach combines annotating documents with concepts from the knowledge base and grouping documents into clusters. Our knowledge base is an ontology provided by a dedicated ontology server.
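The combination of annotation and clustering described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration, not the system's actual method: the toy in-memory `ONTOLOGY` stands in for the dedicated ontology server, annotation is plain keyword matching, and grouping is a greedy pass over Jaccard similarity of concept sets. The names `annotate`, `jaccard`, and `cluster` are hypothetical.

```python
import re
from typing import Dict, List, Set

# Hypothetical toy ontology: concept name -> lexical labels that signal it.
# A real deployment would query an ontology server instead.
ONTOLOGY: Dict[str, Set[str]] = {
    "Budget": {"budget", "funding", "expenses"},
    "Transport": {"bus", "road", "traffic"},
    "Education": {"school", "teacher", "pupil"},
}


def annotate(text: str, ontology: Dict[str, Set[str]]) -> Set[str]:
    """Return the concepts whose labels occur as words in the text."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    return {concept for concept, labels in ontology.items() if words & labels}


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard similarity of two concept sets (0.0 when both are empty)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def cluster(annotations: Dict[str, Set[str]], threshold: float = 0.5) -> List[List[str]]:
    """Greedy grouping: a document joins the first cluster whose first
    member is similar enough, otherwise it starts a new cluster."""
    clusters: List[List[str]] = []
    for doc_id, concepts in annotations.items():
        for group in clusters:
            if jaccard(concepts, annotations[group[0]]) >= threshold:
                group.append(doc_id)
                break
        else:
            clusters.append([doc_id])
    return clusters


docs = {
    "d1": "The city budget covers road repairs",
    "d2": "Traffic on the main road and bus delays",
    "d3": "New funding for road maintenance",
    "d4": "Teacher and pupil numbers at the school",
}
annotations = {doc_id: annotate(text, ONTOLOGY) for doc_id, text in docs.items()}
groups = cluster(annotations)
```

Because documents are compared through their concept annotations rather than raw words, `d1` and `d3` end up in the same group even though they share only the word "road"; the ontology maps "budget" and "funding" to the same concept.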
Related papers
ADHO, 2018
Explains and presents the three-part ontology of documents, acts of communication and texts developed by the author, proposing that a text is an instance of an act of communication inscribed in a document. Accordingly, all texts have a double aspect: they are acts of communication, and they are present in documents. Each aspect may be represented as a tree, with each tree independent of the other. Text may therefore be conceived as a collection of leaves, with each leaf present on both the document and act of communication trees. The talk briefly describes how this model is implemented in the textual communities system.
2010
Software documentation is an important means for stakeholders to collaborate in the software development context. However, several works point out that gathering relevant information from different documents can be so wearing that the people involved may tend not to do it. Combining ontologies and documents by adding semantic annotations to documents can help diminish the burden of gathering information later on. However, this approach also adds an overhead in documentation, namely the time spent on document annotation. In order to overcome some of these obstacles, we developed an infrastructure for managing semantic documents, combining semantic annotation on document templates, versioning data extracted from semantic documents and notifying interested people when extracted data has changed.
2004
Abstract. We describe the recent enhancement of the CAFETIERE formalism (Conceptual Annotation of Facts, Events, Terms, Individual Entities and RElations) with the ability to link natural language words and phrases in textual documents with instances and classes from a language-enabled ontology. The language-enabled ontology is one with an index from one or more natural language expressions to each concept (as in WordNet). In an information extraction application.
Emerging Challenges, Solutions, and Best Practices for Digital Enterprise Transformation, 2021
This study presents a method for the storage of data organized in digital documents, which is proven in practice. The discussed method does not bear any disadvantages of the relational model used for data organization, such as the loss of data context and complications caused by the lack of data redundancy. The method presented here can be used for data organization into documents (digital and paper) as classified aggregates and for data classification. The study also describes a new metamodel for the data structure which assumes that documents, being data structures, form compact aggregates, classified as objects or event descriptions, thus always assigning them a specific and unambiguous context. Furthermore, the study presents a design method for documents as context aggregates that allows leveling the disadvantages of the relational model and ensures efficient information management. The work also contains practical examples of the application of the described method.
Knowledge is a critical resource but we still do not have many new ideas on how to manage it. Most (online) knowledge is currently kept in conventional documents that are hard to structure, classify, browse, search, and even find. Organizations struggle with masses of such documents in hundreds of formats. Classical AI has largely ignored this real and serious problem, and while information retrieval research has tackled some of the problems, it is totally at odds with how AI tries to deal with knowledge problems. Cooperative work systems such as the Web and Lotus Notes are beginning to tackle that aspect. Database systems can contribute much of the required functionality. Hence we seek to integrate functionality and ideas from these sources.
Journal of Computer and Knowledge Engineering, 2020
One of the most challenging aspects of developing information systems is the processing and management of large volumes of information. One way to overcome this problem is to implement efficient data indexing and classification systems. As large volumes of generated data consist of non-structured textual data, developing text processing, management and indexing frameworks can play an important role in providing users with accurate information according to their preferences. In this paper, a novel method of semantic information processing, management and indexing is introduced. The main goals of this study are to integrate structured knowledge of ontology and Knowledge Bases (KBs) in the core components of the method, to enrich the contents of the documents, to have a multi-level semantic network representation of textual resources, to introduce a hybrid weighting schema (salient score) and finally to propose a hybrid method of semantic similarity computation. The structured knowledge of ontology and KBs is integrated into all aspects of the proposed method. The obtained results indicate the accuracy and optimal performance of the proposed framework. They suggest that using knowledge-based models leads to higher performance and accuracy in identifying and classifying documents according to user preferences; however, if learning-based models are not provided with a sufficient amount of training data, they cannot yield satisfying results. The results also demonstrate that the complete integration of ontology and KBs in information systems can significantly contribute to a better representation of documents and evidently superior functionality of information processing, management and indexing systems.
2000
The CONCERTO project is concerned with the creation and management of knowledge repositories. The distinctive approach is to maintain an association between the textual form in which knowledge is expressed in source documents, and an expressive narrative knowledge representation language that supports inference and query operations.
2009
As an extension of the Web, the Semantic Web faces the same problems during its construction, such as the difficulty of sharing and reusing knowledge. The aim of this article is to present the development of an ontology in the context of a digital library, based on the use of Natural Language Processing (NLP) tools. Our approach is based on the analysis of scientific documents and the use of a term acquisition tool called Nomino. A corpus was processed by extracting noun phrases to be used with LIKES, a tool capable of identifying relationships between concepts. The final ontology was modeled using Protégé-2000. In this way, our ontology provides a comprehensive representation of scientific terms and is used to enhance users' requests.
2007
Being able to create views on the document space by grouping documents is a key functionality in intelligent document management for browsing and querying. Hierarchically grouped sets of documents can be viewed as simple, extensionally defined ontological concepts. In an example knowledge management system (KnowCat) developed at UAM, Madrid, we investigate how agents for the maintenance of this ontology (these document groupings) can be constructed. We discuss two examples: a classification agent and a maintenance agent, which support users and administrators of the system in keeping the ontology tight and functional. The agents are developed and tested for Spanish natural language documents, which requires adapted NLP techniques.
2010
The architecture of future digital libraries should allow any user to access available knowledge resources from anywhere, at any time, and in an efficient manner. Moreover, for the individual user there is a great deal of useless information alongside the substantial amount of useful information. The goal is to investigate how best to combine Artificial Intelligence and Semantic Web technologies for semantic searching across largely distributed and heterogeneous digital libraries. Artificial Intelligence and the Semantic Web have provided both new possibilities and challenges for automatic information processing in search engines. The major research tasks involved are to apply appropriate infrastructure for specific digital library system construction, to enrich metadata records with ontologies, and to enable semantic searching on such an intelligent system infrastructure. We study how to improve the efficiency of search methods over a distributed data space such as a digital library. This paper outlines the development of a Case-Based Reasoning prototype system based on an ontology for information retrieval in the Digital Library of the University of Seville. The results demonstrate that using an expert system and the ontology in the retrieval process enhances the effectiveness of information retrieval.