Claim Detection in Persian Twitter Posts
2024, International Journal of Information and Communication Technology Research
Abstract
False information spreading on social media harms many aspects of people's lives. To mitigate these effects, numerous studies have developed automated fact-checking systems that aim to improve the accuracy and reliability of news and information. Claim detection, the first stage in building such systems, has been explored in several languages. In this paper, we introduce a corpus of Persian tweets annotated with 11 labels, derived from linguistic analysis, that represent different types of claims. We also establish a baseline claim detection model to assess the dataset. The study frames claim detection as a classification task and uses a transformer-based approach to train a multi-label classifier that identifies various types of claims in Persian texts.
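To make the multi-label framing concrete, here is a minimal sketch of the decision step such a classifier typically uses: one independent sigmoid per label, a binary cross-entropy loss averaged over labels, and a threshold that lets a tweet receive zero, one, or several labels. This is not the authors' implementation. The label names, the 0.5 threshold, and the example logits are assumptions for illustration; only the count of 11 labels comes from the abstract. In a real system the logits would come from a transformer encoder with a linear output layer.

```python
import math

# The abstract states 11 claim-type labels; the names here are placeholders.
NUM_LABELS = 11
LABELS = [f"claim_type_{i}" for i in range(NUM_LABELS)]


def sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def predict_labels(logits, threshold=0.5):
    """Return every label whose independent probability reaches the threshold.

    Unlike softmax (single-label), each label is decided on its own,
    so the result may be empty or contain several labels.
    """
    if len(logits) != NUM_LABELS:
        raise ValueError(f"expected {NUM_LABELS} logits, got {len(logits)}")
    return [name for name, z in zip(LABELS, logits) if sigmoid(z) >= threshold]


def bce_loss(logits, targets):
    """Binary cross-entropy averaged over labels (targets are 0 or 1)."""
    eps = 1e-12  # guards against log(0)
    total = 0.0
    for z, t in zip(logits, targets):
        p = sigmoid(z)
        total += -(t * math.log(p + eps) + (1 - t) * math.log(1 - p + eps))
    return total / len(logits)


# Hypothetical logits for one tweet: labels 0 and 2 score at or above 0.5.
example_logits = [2.0, -1.0, 0.0] + [-3.0] * 8
predicted = predict_labels(example_logits)
```

Treating each label as its own binary decision is what separates multi-label from multi-class classification: the per-label probabilities need not sum to one, so the threshold can be tuned per label if some claim types are rarer than others.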