Improving Semantic Composition with Offset Inference
Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)
https://doi.org/10.18653/V1/P17-2069
Abstract
Count-based distributional semantic models suffer from sparsity because many plausible co-occurrences go unobserved in any text collection. This problem is amplified for models such as Anchored Packed Trees (APTs), which take the grammatical type of a co-occurrence into account. We therefore introduce a novel form of distributional inference that exploits the rich type structure in APTs and infers missing data by the same mechanism that is used for semantic composition.