slides (pdf)
Abstract
Slide excerpt: Slav Petrov, Leon Barrett and Dan Klein, "Non-Local Modeling with a Mixture of PCFGs". The "Empirical Motivation" slide shows a treebank verb-phrase parse of the fragment "increased 11 % to # 2.5 billion from # 2.25 billion" (a VP expanding to VBD NP PP PP, with QP number phrases inside the prepositional objects), under the heading "Verb Phrase Expansion: capture with lexicalization."
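A minimal sketch of the PCFG machinery behind the slide example may help: it reads a bracketed treebank-style tree and estimates rule probabilities by relative frequency. The tree string is the slide's VP fragment; the function names are our own, and this is plain PCFG estimation, not the paper's mixture-of-PCFGs training.

```python
from collections import Counter

def parse_tree(s):
    """Parse a bracketed tree string into nested (label, children) tuples;
    leaf tokens stay plain strings."""
    tokens = s.replace("(", " ( ").replace(")", " ) ").split()
    pos = 0

    def read():
        nonlocal pos
        pos += 1                          # consume "("
        label = tokens[pos]; pos += 1
        children = []
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                children.append(read())
            else:
                children.append(tokens[pos]); pos += 1
        pos += 1                          # consume ")"
        return (label, children)

    return read()

def count_rules(node, counts):
    """Count one production per internal node: LHS -> sequence of child labels/terminals."""
    label, children = node
    rhs = tuple(c[0] if isinstance(c, tuple) else c for c in children)
    counts[(label, rhs)] += 1
    for c in children:
        if isinstance(c, tuple):
            count_rules(c, counts)

# The VP fragment from the slide, re-bracketed.
example = ("(VP (VBD increased) (NP (CD 11) (NN %)) "
           "(PP (TO to) (NP (QP (# #) (CD 2.5) (CD billion)))) "
           "(PP (IN from) (NP (QP (# #) (CD 2.25) (CD billion)))))")

counts = Counter()
count_rules(parse_tree(example), counts)

lhs_totals = Counter()
for (lhs, _), c in counts.items():
    lhs_totals[lhs] += c

# Relative-frequency estimate P(LHS -> RHS | LHS), the standard PCFG estimator.
for (lhs, rhs), c in sorted(counts.items()):
    print(f"{lhs} -> {' '.join(rhs)}    {c / lhs_totals[lhs]:.2f}")
```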
Related papers
2001
The present paper investigates the word order alternation of English transitive phrasal verbs such as to pick up the book versus to pick the book up. It builds on traditional monofactorial analyses, but argues that previously used methods of analysis are grossly inadequate to describe, explain and predict the word order choice by native speakers. A hypothesis integrating virtually all relevant variables ever postulated is proposed and investigated from a multifactorial perspective (using GLM, linear discriminant analysis and CART). As a result, more than 84% of native speakers' choices can be predicted. Further implications (linguistic and methodological) are discussed. [Footnote 1: The grammatical notation is not committed to any particular grammatical framework and serves expository reasons only. Likewise, the choice of terminology in terms of movement processes is not meant to truly imply any such processes; it merely reflects that these phenomena have most frequently been dealt with within the transformational-generative paradigm.]
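As a rough illustration of the multifactorial approach described above, the sketch below fits a logistic regression (one member of the GLM family mentioned in the abstract) to a handful of invented particle-placement examples. The predictors, the toy data, and the resulting coefficients are purely illustrative and are not the study's actual variables or findings.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical predictors per example:
#   [length of the direct object in words,
#    object is a pronoun (1/0),
#    object was mentioned in prior discourse (1/0)]
X = np.array([
    [1, 1, 1],   # "pick it up"            -> split order
    [1, 1, 0],
    [2, 0, 1],
    [3, 0, 0],   # "pick up the old book"  -> joined order
    [5, 0, 0],
    [2, 0, 0],
])
# 1 = split order (pick the book up), 0 = joined order (pick up the book)
y = np.array([1, 1, 1, 0, 0, 1])

model = LogisticRegression().fit(X, y)
print("accuracy on the toy data:", model.score(X, y))
print("coefficients:", model.coef_)   # sign and size hint at each factor's pull
```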
2009
Principles and parameters theory (PPT), as developed by Chomsky and other linguists, aims to explain and account for both the similarities and differences exhibited by the grammatical structures of the world's human languages. The principles are certain generic properties which grammars of all languages are thought to possess. The parameters are aspects of grammatical structures that have limited variability, and are fixed in one of a limited number of possible configurations. The lexicon, which is a list of words with their meanings, pronunciations and various properties, also has a significant role to play in this model. This paper outlines each of these components, and then examines them in depth to reveal the nature of some of the interactions. The rationale for the model and its major components are discussed, as well as the implications of the relevant modules and elements of the theory.
Sergi Torner and Elisenda Bernal (eds.): Collocations and other lexical combinations in Spanish, 2017
In this chapter, we provide an overview of one of the theoretical frameworks that encode selectional constraints in the lexicon, the Generative Lexicon theory. We will review the different compositional mechanisms put forward in GL (with special attention to type shifting, or coercion) and apply them to analyze a set of predicate-argument (verb-argument) and modification (adjectival modifier-noun) constructions in Spanish.
2002
In this paper we investigate the phenomenon of verb-particle constructions, discussing their characteristics and the challenges that they present for a computational grammar. We concentrate our discussion on the treatment adopted in a wide-coverage HPSG grammar: the LinGO ERG. Given the constantly growing number of verb-particle combinations, possible ways of extending this treatment are investigated, taking into account the regular patterns found in some productive combinations of verbs and particles. We analyse possible ways of identifying regular patterns using different resources. One possible way to try to capture these is by means of lexical rules, and we discuss the difficulties encountered when adopting such an approach. We also investigate how to restrict the productivity of lexical rules to deal with subregularities and exceptions to the patterns found.
2020
Sentence formation is a highly structured, history-dependent, and sample-space reducing (SSR) process. While the first word in a sentence can be chosen from the entire vocabulary, typically, the freedom of choosing subsequent words gets more and more constrained by grammar and context as the sentence progresses. This sample-space reducing property offers a natural explanation of Zipf's law in word frequencies; however, it fails to capture the structure of the word-to-word transition probability matrices of English text. Here we adopt the view that grammatical constraints (such as subject–predicate–object) locally re-order the words in sentences that are sampled by the word generation process. We demonstrate that superimposing grammatical structure, as a local word re-ordering (permutation) process, on a sample-space reducing word generation process is sufficient to explain both word frequencies and word-to-word transition probabilities. We compare the performance of the grammat...
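The SSR mechanism described above can be illustrated with a few lines of simulation. The sketch below is a toy sample-space reducing word generation process; the vocabulary size, the number of sentences, and the uniform restart rule are arbitrary choices of ours, not the paper's exact model. Its rank-frequency output is approximately Zipfian.

```python
import random
from collections import Counter

random.seed(0)
V = 1000              # vocabulary: word ids 1..V
N_SENTENCES = 20000
counts = Counter()

for _ in range(N_SENTENCES):
    w = random.randint(1, V)          # first word: whole vocabulary available
    while True:
        counts[w] += 1
        if w == 1:                    # smallest id reached: sentence ends
            break
        w = random.randint(1, w - 1)  # next word drawn from a reduced sample space

# Rank-frequency list; an SSR process yields an approximately Zipfian
# (exponent close to 1) distribution of word frequencies.
freqs = sorted(counts.values(), reverse=True)
for rank in (1, 10, 100, 1000):
    if rank <= len(freqs):
        print(f"rank {rank:>4}: frequency {freqs[rank - 1]}")
```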
We propose a statistical measure for the degree of acceptability of light verb constructions, such as take a walk, based on their linguistic properties. Our measure shows good correlations with human ratings on unseen test data. Moreover, we find that our measure correlates more strongly when the potential complements of the construction (such as walk, stroll, or run) are separated into semantically similar classes. Our analysis demonstrates the systematic nature of the semi-productivity of these constructions.
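A minimal sketch of the evaluation idea, under the assumption that agreement with human ratings is measured as a rank correlation: toy acceptability scores and judgments (all numbers invented) are correlated both pooled and within one semantically similar complement class.

```python
from scipy.stats import spearmanr

constructions = ["take a walk", "take a stroll", "take a run",
                 "take a decision", "take a look", "take a shower"]
model_score  = [0.92, 0.85, 0.70, 0.40, 0.88, 0.35]   # hypothetical measure
human_rating = [4.8, 4.5, 3.9, 2.8, 4.6, 3.1]         # hypothetical judgments

rho_all, _ = spearmanr(model_score, human_rating)
print(f"pooled correlation: {rho_all:.2f}")

# Restrict to one semantically similar complement class
# (motion nouns: walk, stroll, run) and correlate within it.
motion = [0, 1, 2]
rho_motion, _ = spearmanr([model_score[i] for i in motion],
                          [human_rating[i] for i in motion])
print(f"within the motion class: {rho_motion:.2f}")
```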
2012
Abstract: Statistical language models used in deployed systems for speech recognition, machine translation and other human language technologies are almost exclusively n-gram models. They are regarded as linguistically naïve, but estimating them from any amount of text, large or small, is straightforward. Furthermore, they have doggedly matched or outperformed numerous competing proposals for syntactically well-motivated models. This unusual resilience of n-grams, as well as their weaknesses, is examined here.
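As a concrete (and deliberately tiny) illustration of what an n-gram model is, the sketch below estimates a bigram model with add-one smoothing from a three-sentence toy corpus and scores two word sequences. The corpus, the smoothing choice, and the function names are our illustrative assumptions, not those of any deployed system.

```python
import math
from collections import Counter

corpus = [
    "the cat sat on the mat",
    "the dog sat on the rug",
    "the cat saw the dog",
]

unigrams, bigrams = Counter(), Counter()
for line in corpus:
    tokens = ["<s>"] + line.split() + ["</s>"]
    unigrams.update(tokens[:-1])              # context counts
    bigrams.update(zip(tokens, tokens[1:]))   # word-pair counts

vocab = {w for line in corpus for w in line.split()} | {"</s>"}
V = len(vocab)

def bigram_logprob(prev, word):
    """log P(word | prev) with add-one (Laplace) smoothing."""
    return math.log((bigrams[(prev, word)] + 1) / (unigrams[prev] + V))

def sentence_logprob(sentence):
    tokens = ["<s>"] + sentence.split() + ["</s>"]
    return sum(bigram_logprob(p, w) for p, w in zip(tokens, tokens[1:]))

print(sentence_logprob("the cat sat on the rug"))   # mostly seen word pairs
print(sentence_logprob("rug the on sat cat the"))   # same words, worse order
```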
An Integrated View of Language Development. Papers in Honor of Henning Wode, eds. P. Burmeister, T. Piske & A. Rohde, pp. 109-134. Wissenschaftlicher Verlag Trier, 2002
Table 1. The 20 most frequent verb types in the corpus (tokens and share of all verb tokens):

Rank  Verb     Gloss         Tokens   %
1     vara     'be'          24094    13.7
2     ha       'have'        13826     7.8
3     kunna    'can'          7265     4.1
4     ska      'shall'        5606     3.1
5     få       'get; may'     4588     2.6
6     komma    'come'         3348     1.9
7     bli      'become'       3113     1.7
8     säga     'say'          2868     1.6
9     göra     'make; do'     2669     1.5
10    se       'see'          2592     1.4
11    gå       'go'           2476     1.4
12    finnas   'there is'     2382     1.3
13    ta       'take'         2189     1.2
14    vilja    'want'         1536     0.8
15    ge       'give'         1399     0.7
16    måste    'must'         1251     0.7
17    stå      'stand'        1105     0.6
18    känna    'feel'         1067     0.6
19    veta     'know'         1032     0.5
20    gälla    'apply to'      995     0.5

Total, 1-20 most frequent verb types     85401    48.7
Total, 1-50 most frequent verb types    104 327   59.5
Total, 1-100 most frequent verb types   119 537   68.2
Total corpus                            175 255  100

One important observation that can be made by inspecting Table 1 is the extreme dominance, in terms of frequency, of a small number of verbs. The 20 most frequent verb types cover close to 50% of all the verb tokens, and the 100 most frequent verbs close to 70%, in spite of the fact that the corpus contains close to 4000 verb types and larger printed dictionaries of Swedish list up to 10 000 verb types.

1.1 Nuclear verbs
Some of the basic verbs are language-specific in the sense that they tend to lack an equivalent in other languages. One example of that in Swedish is the verb få 'get; may', with rank 5 in the table (see Viberg 2001a). There is, however, an important set of verb meanings that tend to ...
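The coverage percentages discussed above follow from simple arithmetic over the token counts. The sketch below reproduces the "Total, 1-20" figure from the twenty counts in the table and the corpus total of 175 255 tokens; the function and variable names are our own.

```python
# Cumulative-coverage arithmetic behind Table 1: what share of all verb
# tokens do the k most frequent verb types cover?
top20_counts = [24094, 13826, 7265, 5606, 4588, 3348, 3113, 2868, 2669, 2592,
                2476, 2382, 2189, 1536, 1399, 1251, 1105, 1067, 1032, 995]
corpus_total = 175255   # total verb tokens, from the running text

def coverage(counts, k, total):
    """Share of all tokens covered by the k most frequent types."""
    return sum(sorted(counts, reverse=True)[:k]) / total

print(f"top 20 verbs cover {coverage(top20_counts, 20, corpus_total):.1%}")
# -> roughly 48.7%, matching the 'Total, 1-20' row of the table
```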
… Acquisition and Development: Proceedings of GALA …, 2006

References (1)
- Recent development: "Learning Accurate, Compact, and Interpretable Tree Annotation", Petrov et al., ACL 2006: F1 = 90.2%. A more flexible learning framework; split-and-merge training keeps the grammar compact. Similar in spirit to Klein & Manning (2003) and Matsuzaki et al. (2005).