Topic Classification

16 papers

1 follower

About this topic

Topic classification is the process of categorizing text or documents into predefined topics or themes using algorithms and machine learning techniques. It involves analyzing the content to identify relevant features and assigning labels that represent the main subject matter, facilitating information retrieval and organization.


Key research themes

1. How do machine learning approaches address document representation and classifier construction for effective automated text classification?

This research focuses on the machine learning (ML) paradigm for automated text classification, emphasizing methods for representing textual data as numerical features, constructing classifiers from labeled datasets, and evaluating their effectiveness. This theme is vital because textual data is inherently high-dimensional and sparse, and successful categorization demands carefully engineered document representations and robust classification algorithms to improve accuracy while ensuring scalability.
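As a rough illustration of the two steps named in this theme (representing documents as term counts and building a classifier from labeled examples), the sketch below trains a multinomial Naive Bayes model with add-one smoothing. The toy training set, the whitespace tokenizer, and all function names are assumptions made for this example; none of them come from the papers listed here.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Lowercase and split on whitespace; a real system would do much more.
    return text.lower().split()


def train_naive_bayes(labeled_docs):
    """Count documents per class and term occurrences per class."""
    class_doc_counts = Counter()
    term_counts = defaultdict(Counter)
    vocabulary = set()
    for text, label in labeled_docs:
        class_doc_counts[label] += 1
        for term in tokenize(text):
            term_counts[label][term] += 1
            vocabulary.add(term)
    return class_doc_counts, term_counts, vocabulary


def classify(text, model):
    """Return the class with the highest log-posterior score."""
    class_doc_counts, term_counts, vocabulary = model
    total_docs = sum(class_doc_counts.values())
    best_label, best_score = None, -math.inf
    for label, n_docs in class_doc_counts.items():
        score = math.log(n_docs / total_docs)  # log prior
        total_terms = sum(term_counts[label].values())
        for term in tokenize(text):
            if term not in vocabulary:
                continue  # ignore terms never seen in training
            count = term_counts[label][term]
            # Add-one (Laplace) smoothing avoids zero probabilities.
            score += math.log((count + 1) / (total_terms + len(vocabulary)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy data, for illustration only.
TOY_TRAINING_SET = [
    ("the team won the match", "sports"),
    ("the striker scored a goal", "sports"),
    ("the bank raised interest rates", "finance"),
    ("stocks and bonds fell sharply", "finance"),
]

model = train_naive_bayes(TOY_TRAINING_SET)
print(classify("a late goal won the match", model))
```

The bag-of-words counts make the high dimensionality and sparsity mentioned above concrete: every distinct term becomes a feature, and each document uses only a few of them.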

Machine Learning in Automated Text Categorization

by JOSEPH ALEXANDER

2020

Key finding: This foundational survey delineates the transformation from knowledge-engineering to machine learning approaches for text classification, highlighting three main problems: document representation (text indexing), classifier...

A SURVEY OF TEXT CLASSIFICATION ALGORITHMS

by yousef abofathy

2017

Key finding: This survey contextualizes text classification as a variant of the broader classification problem and highlights specific challenges arising from text's high-dimensionality and sparse features. It analyzes various...

A SURVEY ON MACHINE LEARNING TECHNIQUES FOR TEXT CLASSIFICATION

by IJESRT Journal

2017

Key finding: The study presents the application of supervised machine learning algorithms (Naive Bayes, Vector Space Model (VSM)-based classifiers, and methods incorporating syntactic features, e.g., the Stanford Tagger), evaluating their...

Abstract feature extraction for text classification

by Banu Diri

2024, Turkish Journal of Electrical Engineering and Computer Sciences

Key finding: This paper introduces a novel supervised feature extraction method that projects high-dimensional document vectors into an abstract feature space with dimensions equal to the number of classes by aggregating evidence for each...

Use of a weighted topic hierarchy for text retrieval and classification

by Adolfo Guzman Arenas

2021

Key finding: This work proposes a classification framework using a hierarchical topic dictionary enhanced by relevance and discrimination weights on keywords and ontology nodes. By propagating weighted relevance scores up the hierarchy,...


2. Can ontology-based methods enable dynamic, training-free text classification with semantic understanding?

This area investigates approaches that leverage structured domain knowledge in the form of ontologies to classify text without relying on labeled data for training. By modeling documents and classes semantically rather than statistically, these methods allow flexible, dynamic topic definitions and the incorporation of background knowledge, addressing limitations of conventional supervised algorithms with fixed categories and training dependence.

Ontology-based Text Classification into Dynamically Defined Topics

by Mehdi Allahyari

2016

Key finding: This paper introduces an innovative classification approach using ontological contexts as dynamic topics, treating the ontology as a classifier and obviating the need for pre-classified training documents. It employs semantic...

3. How do topic modeling techniques facilitate the discovery and classification of latent subjects in textual corpora?

Topic modeling comprises unsupervised methods that extract latent semantic structure from text collections, which aids document clustering, classification, and exploration. This research theme focuses on statistical models such as Latent Dirichlet Allocation (LDA) and Latent Semantic Analysis (LSA), their extensions, evaluation metrics such as topic coherence, and the challenges of handling short texts and semantic ambiguity.

Topic Modeling Using Latent Dirichlet allocation

by Uttam Chauhan

2024, ACM Computing Surveys

Key finding: The paper elaborates on the theoretical underpinnings of LDA in comparison to related methods like LSA and probabilistic LSA, emphasizing LDA’s virtue as a fully generative Bayesian model that assigns topics as latent...

LSA & LDA topic modeling classification: comparison study on e-books

by shaimaa abdelhafeez

2023, Indonesian Journal of Electrical Engineering and Computer Science

Key finding: This empirical study compares LSA and LDA applied to full-text classification of a large corpus of e-books, finding that while both can effectively cluster documents by topic, LSA shows superiority in some recommendation...

Topic Modeling Based Classification of Clinical Reports

by Hyeong-Ah Choi

2023

Key finding: The authors develop several topic modeling-based classification frameworks for clinical imaging reports, including binary and aggregate topic classifiers built around LDA. They show that topic distribution features outperform...

Topic Modeling: Perspectives From a Literature Review

by Andres Grisales

2023, IEEE Access

Key finding: Employing scientometric and bibliometric analyses on publications from Web of Science and Scopus, this study maps the evolution, primary application areas, and prominent models used in topic modeling research. It identifies...

A Semantics-enhanced Topic Modelling Technique: Semantic-LDA

by Dakshi Tharanga

2024

Key finding: This paper extends the traditional LDA framework by integrating external ontology concepts to capture semantic relationships and address polysemy and word ambiguity more effectively. Semantic-LDA computes dynamic word-concept...


All papers in Topic Classification

Leveraging Diabetes Prediction using the Deep Learning-based Hybrid ANN-CNN Architecture

by International Journal of Advanced Networking and Applications (IJANA)

2025, Eswar Publications

Diabetes mellitus is a chronic metabolic disease of pandemic scale that seriously threatens human health. Accurate and early prediction of diabetes is an important factor in medical treatment and diabetes management. In the...

Natural Language Processing: Text Categorization And Classifications

by Mina Atef

2024, International Journal of Advanced Networking and Applications

Huge volumes of unstructured text are obtained daily from sources such as emails, tweets, social media posts, customer comments, reviews, and reports in many different fields. Unstructured text data can be analyzed to...

A Naïve Bayes Approach to Classifying Topics in Suicide Notes

by Michael Arribas-Ayllon

2024, Biomedical informatics insights

The authors present a system developed for the 2011 i2b2 Challenge on Sentiment Classification, whose aim was to automatically classify sentences in suicide notes using a scheme of 15 topics, mostly emotions. The system combines machine...

Natural Language Processing: Text Categorization And Classifications

by mina atef

2024, International Journal of Advanced Networking and Applications


FAKE NEWS DETECTION USING DATASCIENCE

by NADIYA MD

2023, Industrial Engineering Journal

The purpose of this thesis is to assist in automating the detection of Fake News by identifying which features are more useful for different classifiers. The effectiveness of different extracted features for Fake News detection are going...

FAKE NEWS DETECTION USING DATASCIENCE

by Priyanka Mannepeli

2023, Industrial Engineering Journal


Graph-based Techniques for Topic Classification of Tweets in Spanish

by Luis Maximiliano Anto Chiroque

2023, International Journal of Interactive Multimedia and Artificial Intelligence

Topic classification of texts is one of the most interesting challenges in Natural Language Processing (NLP). Topic classifiers commonly use a bag-of-words approach, in which the classifier uses (and is trained with) selected terms from...

Text Document Classification based on Least Square Support Vector Machines with Singular Value Decomposition

by Reddy gnanendra prasad

2023, International Journal of Computer Applications

Due to the rapid growth of online information, text classification has become one of the key techniques for handling and organizing text data. One of the reasons to build a taxonomy of documents is to make it easier to find relevant documents,...

Natural Language Processing: Text Categorization And Classifications

by Mina Atef

2022, International Journal of Advanced Networking and Applications


Natural Language Processing: Text Categorization And Classifications

by mina atef

2022, International Journal of Advanced Networking and Applications


Natural Language Processing: Text Categorization And Classifications

by Mario Raouf

2022, International Journal of Advanced Networking and Applications


Natural Language Processing: Text Categorization And Classifications

by Mina Atef

2022, International Journal of Advanced Networking and Applications


Comparison of Topic Modeling Algorithms

by tilak satra

2022, INTERNATIONAL JOURNAL OF RECENT TRENDS IN ENGINEERING & RESEARCH

We present a topic identification system for news, which is based on an evaluation of the similarity between the topics and a large number of documents in the news database. Our system is able to provide the topics for every news sample....

Generating Summaries for Scientific Paper Review

by Ana Sabina Uban

2022, ArXiv

The review process is essential to ensure the quality of publications. Recently, the increase of submissions for top venues in machine learning and NLP has caused a problem of excessive burden on reviewers and has often caused concerns...

Table 3: Performance of fine-tuned model on abstract and review summary

...cal comments sections, we separately evaluate our model using the full reviews as targets, as well as against the separate sections (we consider the Strengths and Weaknesses sections), as shown in Table 4. We notice that the performance is generally lower than for the review summary, but still comparable. The Strengths section seems to have the most in common with the review summary, according to the better results.

Text Document Classification based on Least Square Support Vector Machines with Singular Value Decomposition

by ramakrishna murty

2021, International Journal of Computer Applications


LSA & LDA topic modeling classification: comparison study on e-books

by Salam Hassan Mhesn Al-augby

2021, Indonesian Journal of Electrical Engineering and Computer Science

With the rapid growth of information technology, the amount of unstructured text data in digital libraries is rapidly increasing and has become a major challenge for analyzing, organizing, and automatically classifying text in E-research...

NEWS ARTICLE TEXT CLASSIFICATION AND SUMMARY FOR AUTHORS AND TOPICS

by Computer Science & Information Technology (CS & IT) Computer Science Conference Proceedings (CSCP)

2020

News articles are important for providing timely, historic information. However, the Internet is replete with text that may contain irrelevant or unhelpful information, so means of processing it and distilling...

Natural Language Processing: Text Categorization and Classifications

by Prof. Mona Nasr and

2020, Int. J. Advanced Networking and Applications


Arabic Sentiment Analysis using Different Representation Models

by WARSE The World Academy of Research in Science and Engineering

2020, International Journal of Advanced Trends in Computer Science and Engineering (IJATCSE)

Social network users generate a large number of reviews and comments that express their opinions on different topics. As a result, there is a great need to understand and classify these opinions. Sentiment...

Figure 1: Architecture of a Text Categorization System

A sentiment analysis system can be considered a Text Categorization (TC) system. TC is the task of assigning a text to a predefined category based on its content. Machine learning is the tool that decides whether a text belongs to one of a set of predefined categories [14] [15]. In the TC process, the document passes through a series of steps (Figure 1): transforming documents into raw text; removing the stop words, which are considered irrelevant words; and finally, stemming all words. To build the internal representation of each document, the document must pass through a process consisting of three phases [8]:

a) Defining the term set containing all the terms existing in the dataset;
b) Term selection, which is the process of selecting a subset of relevant terms without using any information;
c) Term weighting, which is the process of calculating a weight for each term selected in phase (b). The weight may be calculated using a weighting scheme such as TF-IDF, which combines term frequency and inverse document frequency.
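The preprocessing and term-weighting steps described above can be sketched in a few lines. This is only an illustration under stated assumptions: the stop-word list, the crude suffix-stripping stemmer, the TF-IDF variant (relative term frequency times the natural log of the inverse document frequency), and the sample sentences are all invented for the example, and the term-selection phase is omitted. The excerpt does not say which stemmer or TF-IDF variant the paper uses.

```python
import math
from collections import Counter

# Assumed toy stop-word list; real systems use language-specific lists.
STOP_WORDS = {"the", "a", "is", "of", "and", "in"}


def naive_stem(word):
    # Crude suffix stripping, a stand-in for a real stemmer.
    for suffix in ("ing", "ed", "s"):
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[: -len(suffix)]
    return word


def preprocess(text):
    # Lowercase, drop stop words, then stem the remaining tokens.
    tokens = [t for t in text.lower().split() if t not in STOP_WORDS]
    return [naive_stem(t) for t in tokens]


def tf_idf(documents):
    """Return one {term: weight} dictionary per document."""
    tokenized = [preprocess(d) for d in documents]
    n_docs = len(tokenized)
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))  # count each term once per document
    weights = []
    for tokens in tokenized:
        tf = Counter(tokens)
        weights.append({
            term: (count / len(tokens)) * math.log(n_docs / doc_freq[term])
            for term, count in tf.items()
        })
    return weights


# Hypothetical sample opinions, for illustration only.
docs = ["the film is good", "the film is boring", "good acting in the film"]
weights = tf_idf(docs)
print(weights[1])
```

A term that appears in every document (here "film") receives weight zero, which is the intended effect of the inverse-document-frequency factor.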

Figure 2: Singular value decomposition of the term-document matrix

LSA uses a matrix X whose rows correspond to the documents, whose columns correspond to the terms, and whose components give the presence of the term (or rather, its importance) in each document. A singular value decomposition is then performed on X, which gives two orthonormal matrices U and V and a diagonal matrix Σ (Figure 2). The diagonal values of Σ are the singular values of X.
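To make the decomposition concrete, the sketch below recovers only the largest singular value of a small document-by-term matrix, together with its left and right singular vectors, using power iteration on XᵀX. This is a simplification chosen to keep the example dependency-free; a full LSA pipeline would normally call a library SVD routine and keep several singular triples. The matrix and function names are assumptions made for the example.

```python
import math


def matvec(matrix, vector):
    # Multiply a matrix (list of rows) by a vector.
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def transpose(matrix):
    return [list(column) for column in zip(*matrix)]


def normalize(vector):
    # Return the unit vector and the original Euclidean norm.
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector], norm


def leading_singular_triple(X, iterations=200):
    """Power iteration on X^T X; returns (u, sigma, v) for the top singular value."""
    Xt = transpose(X)
    v, _ = normalize([1.0] * len(X[0]))
    for _ in range(iterations):
        v, _ = normalize(matvec(Xt, matvec(X, v)))
    # X v = sigma * u, so the norm of X v is the singular value.
    u, sigma = normalize(matvec(X, v))
    return u, sigma, v


# Hypothetical matrix: rows are documents, columns are terms.
X = [
    [1.0, 1.0, 0.0],
    [1.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]
u, sigma, v = leading_singular_triple(X)
print(round(sigma, 6))
```

In this toy matrix the first two documents share the first two terms, so the leading singular vectors put all their weight on that group and none on the third document or term.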

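Figure 3: Graphical model of a typical generative model (LDA)

$$p(D) = \prod_{d \in D} \left\{ \int p(\theta_d \mid \alpha)\, p(\eta) \prod_{t=1}^{V} \sum_{k=1}^{K} p(z_t = k \mid \theta_d)\, p(w_t \mid z_t = k, \beta)\, d\theta_d\, d\eta \right\}$$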
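The generative story behind LDA can be sketched as follows: draw a topic mixture for the document from a Dirichlet prior, then for each word position draw a topic from that mixture and a word from the chosen topic's word distribution. The second function evaluates the word-level part of the likelihood for a fixed topic mixture, that is, the product over words of the sum over topics. The two-topic parameters, the vocabulary, and the function names are assumptions made for the example, and no inference (fitting topics to data) is attempted.

```python
import random


def sample_dirichlet(alpha, rng):
    # Normalized independent Gamma draws follow a Dirichlet distribution.
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]


def generate_document(n_words, alpha, beta, vocabulary, rng):
    """Sample a topic mixture theta, then one topic and one word per position."""
    theta = sample_dirichlet(alpha, rng)
    words = []
    for _ in range(n_words):
        k = rng.choices(range(len(beta)), weights=theta)[0]  # topic z_t
        w = rng.choices(range(len(vocabulary)), weights=beta[k])[0]  # word w_t
        words.append(vocabulary[w])
    return theta, words


def likelihood_given_theta(word_ids, theta, beta):
    """Product over words of sum_k p(z = k | theta) * p(w | z = k, beta)."""
    likelihood = 1.0
    for w in word_ids:
        likelihood *= sum(theta[k] * beta[k][w] for k in range(len(theta)))
    return likelihood


# Hypothetical two-topic model over a two-word vocabulary.
VOCABULARY = ["goal", "market"]
BETA = [[0.9, 0.1], [0.2, 0.8]]  # BETA[k][w] = p(word w | topic k)
ALPHA = [1.0, 1.0]

rng = random.Random(0)
theta, words = generate_document(5, ALPHA, BETA, VOCABULARY, rng)
print(words)
print(likelihood_given_theta([0, 1], [0.5, 0.5], BETA))
```

The full document likelihood additionally integrates this quantity over the topic mixture, which has no closed form and is why approximate inference is used in practice.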

Figure 4: Arabic Sentiment Analysis System

In this section, we present the obtained results of our Arabic sentiment analysis system (Figure 4). The system includes two main steps: preparing the opinions and choosing the representation model to use.
