Papers by Filip Larsson

Humanities and Social Sciences Communications
Languages of diverse structures and different families tend to share common patterns if they are spoken in geographic proximity. This convergence is often explained by horizontal diffusibility, which is typically ascribed to language contact. In such a scenario, speakers of two or more languages interact and influence each other's languages, and in this interaction, more grammaticalized features tend to be more resistant to diffusion compared to features of more lexical content. An alternative explanation is vertical heritability: languages in proximity often share genealogical descent. Here, we suggest that the geographic distribution of features globally can be explained by two major pathways, which are generally not distinguished within quantitative typological models: feature diffusion and language expansion. The first pathway corresponds to the contact scenario described above, while the second occurs when speakers of genetically related languages migrate. We take the worldwide...
Linguistic distance against chronological distance
Linguistic distance plotted against chronological distance (Ba = Basque, I-E = Indo-European, Kar = Kartvelian, NeC = North-East Caucasian, NwC = North-West Caucasian, Tk = Turkic, Ur = Uralic).
Linguistic distance against geographic distance
Linguistic distance measure plotted against geographic distance (Ba = Basque, I-E = Indo-European, Kar = Kartvelian, NeC = North-East Caucasian, NwC = North-West Caucasian, Tk = Turkic, Ur = Uralic).
Typological dendrogram of Eurasian languages
Dendrogram of data set DiACL Typology/Eurasia, based on Manhattan distance values, using hierarchical clustering by means of Ward's method.
Linguistic distance of language families
Boxplot showing linguistic distance values as a function of language family differentials (Ba = Basque, I-E = Indo-European, Kar = Kartvelian, NeC = North-East Caucasian, NwC = North-West Caucasian, Tk = Turkic, Ur = Uralic).
Explanation of coding variants of alignment [51]
Online interface of DiACL
Screenshot of the interface of DiACL Typology/Eurasia.
Samples of coding variants for the alignment feature Noun/ Present progressive in the data, with explanation (N.B.: The list of languages is not complete)
The lexical and typological trajectory of Indo-European gender evolution
Annual Meeting of the Societas Linguistica Europaea, Aug 31, 2021

The Mouton Atlas of Languages and Cultures
The study of cultural aspects of language variation and change is a growing field. However, collective works on the current stance within this domain are still scarce. The Mouton Atlas of Languages and Cultures embraces a substantial part of the Eurasian continent and equips the reader to better observe, reconstruct and understand the impact of culture and cultural changes on language diversity and linguistic developments. Along the way, a fascinating range of interdisciplinary issues, from database encoding conventions to etymologies and cultural anthropology, are discussed. Based on an extensive database assembled by Gerd Carling and her team in Lund, Sweden, the atlas presents typological and lexical data of more than 200 ancient and modern languages, many encoded for the very first time. Alongside classic maps, the atlas features new visualizations, such as polygons and network diagrams, which smartly illustrate complex linguistic patterns of borrowability, co-lexification ...

This thesis aims to examine the lexical and typological change found in the Western European language families of Germanic, Romance and Celtic over the last two millennia. The method used was to create one lexical and one typological database, and the data were analysed according to etic grids. Tree models were generated from the results of the databases, and the groups found in the tree models were plotted on a map of Western Europe. The lexical results were similar to traditional classifications. The lexical results also showed that the changes followed a pattern that could be described by the wave theory, where lexical changes spread from the centre to the periphery. The results of the typological data differed from traditional classifications, as they did not follow the boundaries of the three language families. In general, the wave theory was applicable to a lesser extent to the typological data, but it was relevant for the verbal morphology. Contrary to the lexical results, the typological results indicated the existence of conservative centres, with the peripheral languages being more typologically innovative. The conclusion drawn was that lexical change and typological change are two diametrically different and independent processes.

Diachronica. International Journal for Historical Linguistics
This article investigates the evolutionary and spatial dynamics of typological characters in 117 Indo-European languages. We partition types of change (i.e., gain or loss) for each variant according to whether they bring about a simplification in morphosyntactic patterns that must be learned, whether they are neutral (i.e., neither simplifying nor introducing complexity) or whether they introduce a more complex pattern. We find that changes which introduce complexity show significantly less areal signal (according to a metric we devise) than changes which simplify and neutral changes, but we find no significant differences between the latter two groups. This result is compatible with a scenario where certain types of parallel change are more likely to be mediated by advergence and contact between proximate speech communities, while other developments are due purely to drift and are largely independent of intercultural contact.
PLoS ONE, 2018
Feature stability, time and tempo of change, and the role of genealogy versus areality in creating linguistic diversity are important issues in current computational research on linguistic typology. This paper presents a database initiative, DiACL Typology, which aims to provide a resource for addressing these questions with specific reference to the extended Indo-European language area of Eurasia, the region with the best documented linguistic history. The database is prepared for statistical and phylogenetic analyses and contains both linguistic typological data from languages spanning over four millennia, and linguistic metadata concerning geographic location, time period, and reliability of sources. The typological data has been organized according to a hierarchical model of increasing granularity in order to create datasets that are complete and representative.

This thesis aims to explain the variation that is found among the Latin orthographies of Europe. The main question is whether it can be explained as genealogical, areal or social. The hypothesis presented in this thesis is that genealogical factors are the most important. Orthographies are relevant to study in their own right since they are autonomous from spoken languages. Since orthographies essentially express the relation between phonemes and graphemes, the study was carried out as a comparative analysis of the number of shared combinations of phonemes and graphemes in 45 orthographies. These shared combinations constituted the basis for a tree model of the relations among the studied orthographies. The results of the tree model and the database showed that orthographical variation is not random and that genealogical factors were the most important, but historical factors were also important. The tree model also showed that the variation is greater among vowels than among consonants. Another conclusion was that political dominance is a relevant factor when new orthographies are created.
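The comparison of shared phoneme-grapheme combinations described in this abstract can be illustrated with a minimal sketch. Everything here is an assumption for illustration: the three orthographies `A`, `B` and `C` and their mappings are hypothetical toy data, not drawn from the thesis database, and the Jaccard distance is one possible way to normalise the shared counts before building a tree; the abstract does not say which measure was actually used.

```python
from itertools import combinations

# Hypothetical toy data: each orthography is a set of (phoneme, grapheme) pairs.
# These mappings are illustrative only and are NOT taken from the thesis.
orthographies = {
    "A": {("/ʃ/", "sh"), ("/k/", "k"), ("/u/", "u")},
    "B": {("/ʃ/", "sch"), ("/k/", "k"), ("/u/", "u")},
    "C": {("/ʃ/", "sz"), ("/k/", "k"), ("/u/", "ó")},
}


def shared_pairs(a, b):
    """Count the phoneme-grapheme combinations found in both orthographies."""
    return len(a & b)


def jaccard_distance(a, b):
    """Return 1 - |A ∩ B| / |A ∪ B| (assumed normalisation, see lead-in)."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def pairwise(orths, fn):
    """Apply fn to every unordered pair of orthographies, keyed by name pair."""
    return {
        (x, y): fn(orths[x], orths[y])
        for x, y in combinations(sorted(orths), 2)
    }


shared = pairwise(orthographies, shared_pairs)
distances = pairwise(orthographies, jaccard_distance)

for pair in sorted(shared):
    print(pair, shared[pair], round(distances[pair], 2))
```

A distance table of this kind could then be passed to any standard tree-building or clustering routine; which routine the thesis used is not stated in the abstract.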

Published papers by Filip Larsson

Nature Humanities & Social Sciences Communications, 2021
Languages of diverse structures and different families tend to share common patterns if they are spoken in geographic proximity. This convergence is often explained by horizontal diffusibility, which is typically ascribed to language contact. In such a scenario, speakers of two or more languages interact and influence each other's languages, and in this interaction, more grammaticalized features tend to be more resistant to diffusion compared to features of more lexical content. An alternative explanation is vertical heritability: languages in proximity often share genealogical descent. Here, we suggest that the geographic distribution of features globally can be explained by two major pathways, which are generally not distinguished within quantitative typological models: feature diffusion and language expansion. The first pathway corresponds to the contact scenario described above, while the second occurs when speakers of genetically related languages migrate. We take the worldwide distribution of nominal classification systems (grammatical gender, noun class, and classifier) as a case study to show that more grammaticalized systems, such as gender, and less grammaticalized systems, such as classifiers, are almost equally widespread, but the former spread more by language expansion historically, whereas the latter spread more by feature diffusion. Our results indicate that quantitative models measuring the areal diffusibility and stability of linguistic features are likely to be affected by language expansion that occurs by historical coincidence. We anticipate that our findings will support studies of language diversity in a more sophisticated way, with relevance to other parts of language, such as phonology.
Books by Filip Larsson

Nominal and verbal affixation in the Caucasus : A morphological and phonological approach
This thesis aims to investigate the interaction between complex morphology and complex phonology in the languages of the Caucasus. The Caucasus is well-known for containing languages with exceptionally large case systems and complex polysynthetic verbal morphology, which is paired with some of the largest consonant inventories in the world outside Africa. The study focuses specifically on nominal and verbal affixation, the morphological process of adding bound morphemes to lexical stems, as the languages of the Caucasus present some of the most intricate affixation patterns in the world.
The underlying hypothesis of the thesis is that larger consonant inventories enable more complex morphology, which was operationalised as the number of grammatical functions expressed by affixation. A data set of more than 11,000 affixes was compiled, which enabled a comparison of the vast variety of grammatical functions expressed by affixation and the related phonological forms in 56 languages from the five language families of the Caucasus, i.e. Kartvelian, Nakh-Dagestanian, Northwest Caucasian, Indo-European and Turkic. The results indicated a significant positive correlation between the number of grammatical functions expressed by affixation and the size of a language's consonant phoneme inventory, which was also true for the combined inventories of both consonant and vowel phonemes.
It has previously been proposed that the three endemic language families of the Caucasus, i.e. Kartvelian, Nakh-Dagestanian and Northwest Caucasian, belong to a common linguistic area, known as the Caucasian Sprachbund. The thesis also tested whether the nominal and verbal affixation inventories support the notion of a morphological Caucasian Sprachbund; the results did not support it. A second hypothesis postulated that there are systematic phonological differences between affixes and lexical stems, which motivated a second data set of more than 21,500 lexical items from 52 of the 56 languages of the affixal data set. When the affixal data set and the lexical data set were compared, a significant difference could be observed between phonological distributions of combinations of place and manner of articulation. The results also demonstrated that voiceless consonants are significantly more common in lexical stems than in affixes. The phonological results further indicated significant differences for certain combinations of place, manner and voicing; in particular, the various ejective consonants of the Caucasus all presented significantly different distributions in the affixal and lexical data sets. This suggests that the large inventories of ejectives in the Caucasus potentially facilitate the distinction between affixes and lexical stems in these languages.