The Dynamics of Extensive Text Variables in Russian Short Stories

2020

Abstract

This paper analyzes the dynamic organization of literary texts. Using statistical time series methods, it examines the dynamics of the two main extensive text variables: mean paragraph length and mean sentence length. The material is an annotated subcorpus of the Corpus of Russian Short Stories of 1900-1930, comprising 310 stories written by 300 Russian writers. Only narrative fragments (the narrator's speech) were analyzed; dialogic fragments were excluded. The study yields the most frequent dynamic profiles of paragraph length and sentence length, which reflect the most typical structures in the dynamic organization of short literary texts.
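
As a rough illustration of what a dynamic profile of sentence length might look like in practice, the sketch below treats the sentence lengths of a text as an ordered series and averages them within consecutive positional segments. It is not the authors' procedure: the regex-based sentence splitter, the measurement of length in words, the default of five segments, and the function names `sentence_lengths` and `dynamic_profile` are all assumptions made for this example.

```python
import re
from statistics import mean


def sentence_lengths(text):
    """Return the length in words of each sentence, in text order.

    Assumption: a sentence ends at '.', '!' or '?' followed by whitespace.
    Real corpora need a proper sentence tokenizer.
    """
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    return [len(s.split()) for s in sentences]


def dynamic_profile(values, n_bins=5):
    """Average an ordered series within n_bins consecutive segments.

    The result is a short sequence of means that shows how the variable
    changes from the beginning of the text to its end.
    """
    if len(values) < n_bins:
        raise ValueError("series is shorter than the number of bins")
    # Integer boundaries guarantee that every segment is non-empty.
    bounds = [i * len(values) // n_bins for i in range(n_bins + 1)]
    return [mean(values[a:b]) for a, b in zip(bounds, bounds[1:])]


if __name__ == "__main__":
    sample = "One two. Three four five! Six? Seven eight nine ten. Eleven."
    lengths = sentence_lengths(sample)
    print(lengths)
    print(dynamic_profile(lengths, n_bins=5))
```

The same `dynamic_profile` function could be applied to a series of paragraph lengths. Profiles from texts of different sizes become comparable once each text is reduced to the same number of segments.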
