The paper presents a method for the automatic detection of "non-trivial" word combinations in text. It is based on automatic syntactic analysis. The method shows better precision and recall than the baseline method (bigrams). It was...
    • Syntactic Analysis
The problem of document categorization is considered. The set of domains and the keywords specific to these domains are assumed to be selected beforehand as initial data. We apply the well-known statistical hypothesis test that considers...
    • Statistics, Natural Language Processing, Computational Linguistics, Applied Linguistics
Software Requirements Elicitation is a basic process for ensuring software quality. It is generally carried out between Analysts and Stakeholders in Natural Language in order to obtain a specification; this...
The Particle Swarm Optimization (PSO) technique has proved its ability to deal with very complicated optimization and search problems. Several variants of the original algorithm have been proposed. This paper proposes a novel hybrid PSO...
    • Swarm Intelligence, Evolutionary Algorithm, Hybrid Algorithm, Speed of Convergence
The fast growth of the Internet is creating a society where the demand for information storage, organization, access, and analysis services is continuously growing. This constantly increases the number of inexperienced users that need to...
    • Relational Database, Spanish, Text to Speech, Information Analysis
We have previously suggested that Internet word usage statistics can help select a word choice variant out of a set of hypotheses. In this paper we show how Google's wildcard operator can help either to directly obtain the specific word...
The use of conceptual graphs for the representation of text contents in information retrieval is discussed. A method for measuring the similarity between two texts represented as conceptual graphs is presented. The method is based on...
    • Information Retrieval, Information Retrieval System
In major digital libraries and other web repositories, free access to full-text scientific papers is limited to their abstracts, which consist of no more than several dozen words. Current keyword-based techniques allow for clustering...
    • Natural Language Processing
A simple representation framework for ontological knowledge with dynamic and deontic characteristics is presented. It represents structural relationships (is-a, part/whole), dynamic relationships (actions such as register, pay, etc.), and...
    • Software Engineering, Artificial Intelligence, Ontology, Unified Modelling Language
A malapropism is a (real-word) error in a text that consists in the unintended replacement of one content word by another existing content word similar in sound but semantically incompatible with the context, thus destroying text cohesion,...
The main objective of Bolshakov and Gelbukh's new textbook in computational linguistics is to provide students in computer science with a foundation in the fundamentals of general linguistics necessary to develop applied software systems,...
    • Cognitive Science, Computational Linguistics
The problem of automatic text segmentation is divided into two different problems: thematic segmentation into rather large, topically self-contained sections, and splitting into paragraphs, i.e., lexico-grammatical segmentation of...
    • Text Segmentation
Conceptual graphs allow for powerful and computationally affordable representation of the semantic contents of natural language texts. We propose a method of comparison (approximate matching) of conceptual graphs. The method takes into...
    • Information Retrieval, Semantics, Text
We propose a method of synonymous paraphrasing of a text based on WordNet synonymy data and Internet statistics of stable word combinations (collocations). Given a text, we look for words or expressions in it for which WordNet provides...
    • Information Hiding
It is argued that the collections of content-word combinations in modern general-purpose electronic dictionaries should be broadened as much as possible.
Abstract: We present a heuristic technique for converting a corpus syntactically annotated in the constituency formalism into a corpus annotated in the dependency formalism. In particular, we comment on our...
Prepositional Phrase (PP) attachment can be addressed by considering frequency counts of dependency triples seen in a non-annotated corpus. However, not all triples appear even in very large corpora. To solve this problem, several...
    • Computational Linguistics