Identification of Text Similarity Based On Context

2021

Abstract

1 Student, Dept. of IT Engineering, Pillai College of Engineering, New Panvel, Maharashtra, India
2 Student, Dept. of IT Engineering, Pillai College of Engineering, New Panvel, Maharashtra, India
3 Student, Dept. of IT Engineering, Pillai College of Engineering, New Panvel, Maharashtra, India

Text similarity computation plays an important role in natural language processing. Similarity calculation for short texts suffers from the small number of word features each text provides, so its accuracy is low. A common improvement is therefore to calculate the similarity of short texts from word semantic similarity, where the word similarity calculation combines two word semantic similarity measures through some strategy. Instead of doing a word-for-word comparison, we also need to pay attention to context in order to capture more of the semantics. Calculating...

Key takeaways

  1. Text similarity relies on semantic understanding, not just lexical overlap, enhancing natural language processing accuracy.
  2. The study compared three algorithms for efficiency: Random Forest, Recursive Partitioning (rpart), and Boosted Tree.
  3. The rpart algorithm outperformed the others in classifying research papers, with the highest accuracy and the shortest running time.
  4. Cosine similarity measures were utilized to identify the most similar research papers based on their content.
  5. The proposed system involves tokenization, stop word removal, and lemmatization to preprocess text for similarity analysis.
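
The preprocessing and cosine-similarity steps named in the takeaways can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the stop-word list and the `LEMMAS` lookup table are small hypothetical stand-ins (a real system would likely use a dictionary-based lemmatizer), the vectors are plain term counts, and the function names `preprocess`, `cosine_similarity`, and `most_similar` are invented for this example.

```python
import math
import re
from collections import Counter

# Assumption: a tiny illustrative stop-word list; the source does not give one.
STOP_WORDS = {"a", "an", "the", "is", "are", "of", "on", "in", "and", "to", "for"}

# Assumption: a toy lookup table standing in for a real lemmatizer.
LEMMAS = {"papers": "paper", "measures": "measure", "texts": "text"}


def preprocess(text):
    """Tokenize, drop stop words, and map each token to its lemma."""
    tokens = re.findall(r"[a-z]+", text.lower())  # tokenization
    kept = [t for t in tokens if t not in STOP_WORDS]  # stop word removal
    return [LEMMAS.get(t, t) for t in kept]  # lookup-based lemmatization


def cosine_similarity(tokens_a, tokens_b):
    """Cosine of the angle between two term-count vectors."""
    a, b = Counter(tokens_a), Counter(tokens_b)
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty text is treated as similar to nothing
    return dot / (norm_a * norm_b)


def most_similar(query, documents):
    """Return (index, score) of the document closest to the query."""
    if not documents:
        raise ValueError("documents must not be empty")
    q = preprocess(query)
    scores = [cosine_similarity(q, preprocess(d)) for d in documents]
    best = max(range(len(documents)), key=scores.__getitem__)
    return best, scores[best]
```

Calling `most_similar("text similarity measures", titles)` on a list of paper titles returns the position and score of the closest title. Because the vectors are raw term counts, this sketch only captures lexical overlap after normalization; the context-aware similarity the abstract argues for would need word-level semantic information on top of it.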
