Academia.edu

Text Mining and Information Retrieval

217 papers
1,280 followers
About this topic
Text Mining and Information Retrieval is the interdisciplinary field that focuses on the extraction of meaningful information and patterns from unstructured text data, utilizing techniques from natural language processing, machine learning, and data mining to enhance the retrieval and organization of relevant information from large text corpora.
The DC Universe is a fictional universe populated by superheroes and supervillains based on characters that appear in comic books published by DC Comics. DC Comics itself is the largest and oldest comic book publisher that... more
In recent years, considerable attention has been given to mining the huge amount of publicly available data to gain situational awareness, which may help prevent or reduce the effects of a disaster by taking the correct... more
The world is intrigued by data. In fact, huge amounts of capital are invested in devising means to apply statistics and extract analytics from these sources. However, when we examine the studies performed on applicant tracking systems that... more
This paper introduces a three-step methodology of identifying skills demand on labour markets. By accessing publicly available vacancy data, with web and text mining tools, we are able to extract valuable facts about competences and... more
Natural Language Processing is a programmed approach to analyzing text that is based on both a set of theories and a set of technologies. This forum aims to bring together researchers who have designed and built software that will analyze,... more
Each argument begins with a conclusion, which is followed by one or more premises supporting the conclusion. The warrant is a critical component of Toulmin's argument model; it explains why the premises support the claim. Despite its... more
Abundant lexical resources are available to accelerate and simplify sentiment analysis in English. In Arabic, resources are few and not comprehensive. Most of the current research efforts for... more
This study aimed to propose a seven-phase process for mining social media content to support the management of tourist destinations, developed on the basis of the methodologies proposed by Neves (2013),... more
Digitization makes information dissemination faster, more up to date, and cheaper. Much of the disseminated information takes the form of text. Because of the vast amount of important... more
An essential part of bioinformatic research concerns the iterative process of validating hypotheses by analyzing facts stored in databases and in published literature. This process can be enhanced by language technology methods, in... more
Text mining is the process of extracting interesting and non-trivial knowledge or information from unstructured text data. It is a multidisciplinary field that draws on data mining, machine learning, information retrieval,... more
Recently the public has been offered a wide variety of news sites, ranging from thin-content sites with few news categories to rich-content sites with many categories. Usually, however, these news sites... more
Text documents are classed as unstructured data. Compared with information stored in structured form (for example, in tables in a database), unstructured data is relatively more difficult in terms of... more
Autism spectrum disorder (ASD) is increasingly being recognized as a major public health issue that affects approximately 0.5-0.6% of the population. Promoting the general awareness of the disorder, increasing the engagement with the... more
Written sources are the documents on which history as a discipline depends. The documents of history and the history of documents are closely intertwined. Automatic content analysis and computer processing of historical texts have acquired their own... more
There are two pre-nominal anaphoric demonstratives in Japanese: kono and sono. I have shown some differences between these demonstratives in previous papers. Some of them are as follows: kono can be used in rewording, a case in which the... more
This study applied text mining techniques, machine learning approaches and statistical methods to construct a predictive model of a prioritized English vocabulary list to help nonnative English speakers prepare for college entrance... more
With the advancement of technology and reduced storage costs, individuals and organizations are tending towards the usage of electronic media for storing textual information and documents. It is time consuming for readers to retrieve... more
The study provides the frequency distribution of significant terms over past time periods for text trend analysis of an Indonesian newspaper. The approach consists of two steps: (1) data preprocessing and (2) significant term extraction and term... more
Data mining is knowledge discovery in databases; its goal is to extract patterns and knowledge from large amounts of data. An important area of data mining is text mining. Text mining extracts high-quality information from... more
Volcanic activity may influence climate parameters and affect people's safety, and hence monitoring its characteristic indicators and their temporal evolution is crucial. Several databases, communications and literature providing data,... more
Since its inception in 2001, the Center for Management and Strategic Studies (CGEE) has had as its main activity the conduct of foresight studies in support of the decision-making process related to the establishment of ST&I policies and... more
Nowadays sentiment analysis plays a vital role in text mining. In essence, web mining is a very broad area of the data mining field for extracting the sentiment of text. Identifying the sentiment of textual data is a very... more
Content-Based Image Retrieval (CBIR) locates, retrieves and displays images similar to one given as a query, using a set of features. It demands accessible data in medical archives and from medical equipment, to infer meaning after some... more
A huge array of personalized healthcare and wellness systems has been introduced into the portfolio of the digital health and quantified-self movements in recent years. These systems share common capabilities including self-tracking/monitoring and... more
The Efficient Market Hypothesis is a popular theory about stock prediction. With its failure, much research has been carried out in the area of stock prediction. This project is about taking non-quantifiable data such as financial news... more
Vector Space Model, Cosine Similarity, Part-of-Speech Tagging (POS tagging), Hidden Markov Model (HMM), information extraction with the Naive Bayes-based NER algorithm, and text summarization in text mining. Technique... more
by Irina Pak and 1 more
Minimal research has been done on how letter repetition affects readers' perception of expressed sentiment within a text. To the best of the researchers' knowledge, no studies have tested samples of text with letter repetition using... more
This paper analyses the features of Web Search Engines, Vertical Search Engines, and Meta Search Engines, and proposes a Meta Search Engine for searching and retrieving documents on Multiple Domains in the World Wide Web (WWW). A web search... more
The survey conducted for this study reveals that more than 84% of respondents have never encountered the term "agile commerce" and do not understand its meaning. At the same time, they are active participants in this strategy. Using... more
Rich information is scattered across the Indonesian Choral Lovers (ICL) mailing list, and many of its members prefer posting a query-mail to using the available search engine. A text retrieval system based on ontology is then proposed. However,... more
Various embodiments of the present invention provide systems, methods, and computer programs for generating a hypothesis. Specifically, some method embodiments include steps for accessing a system for extracting relationships and... more
Nowadays a huge amount of data is generated on the internet, and it is important to extract useful information from it. Different data mining techniques are used to extract information and are applied to solve diverse types of problems. In the... more
Digital text documents are distributed in various formats; the most widely used today include Word and PDF. This research attempts to build a text search application for text documents using the vector space model. The... more
I. Introduction: This documentation presents the installation notes for ARDA supervisor and analyst interfaces. The main feature of the supervisor interface is to allow a supervisor to assign a task to an analyst. A task is defined to be... more
Knowledge about climate change is very important in human life. The information gained can be used to learn about the timing of weather seasons from January to December. Indonesia is a tropical country with two weather seasons; they are... more
We aim to model an adaptive log file parser. As the content of log files often evolves over time, we established a dynamic statistical model which learns and adapts processing and parsing rules. First, we limit the amount of unstructured... more
ABSTRACT
The improvement of the health and nutritional status of society has been one of the thrust areas of the country's social development programmes. The present state of healthcare facilities in India is inadequate when compared to... more
[The document is in Portuguese] The immense volume of textual documents stored in electronic media and available in public repositories or internally within an organization is an important source of knowledge. The areas of... more
Several studies in the literature have shown that the words people use are indicative of their psychological states. In particular, depression was found to be associated with distinctive linguistic patterns. However, there is a lack of... more