Prediction of reading difficulty in Russian academic texts

2019, Journal of Intelligent & Fuzzy Systems

https://doi.org/10.3233/JIFS-179007

Abstract

Education policy makers regard measuring the readability of academic texts and profiling classroom textbooks as a primary task of education management, aimed at sustaining the quality of reading programs. Because Russian readability metrics, i.e. "objective" features of a text that determine its complexity for readers, remain a research niche, we undertook a comparative analysis of the features of academic texts, exemplified by Social Science textbooks and examination texts of Russian as a foreign language. Experiments with 7 classifiers and 4 linear regression methods on the Russian Readability corpus demonstrated that ranking textbooks for native speakers is a much more difficult task than ranking examination texts written (or designed) for foreign students. The authors see a possible reason for this in the differences between two processes: acquiring a native language on the one hand and learning a foreign language on the other. The results of the current study are highly relevant in modern Russia, which is joining the Bologna Process and needs to provide profiled texts for all types of learners and test takers. Based on a qualitative and quantitative analysis of texts, the research offers a guide for education managers to help build consensus on selecting reading material when educators hold differing views.
