Learning Graph Embeddings from WordNet-based Similarity Measures

Alexander Panchenko; Andrey Kutuzov

2019, Proceedings of the Eighth Joint Conference on Lexical and Computational Semantics (*SEM 2019)

Abstract

We present path2vec, a new approach for learning graph embeddings that relies on structural measures of pairwise node similarities. The model learns representations for nodes in a dense space that approximate a given user-defined graph distance measure, such as the shortest path distance or distance measures that take information beyond the graph structure into account. Evaluation of the proposed model on semantic similarity and word sense disambiguation tasks, using various WordNet-based similarity measures, shows that our approach yields competitive results, outperforming strong graph embedding baselines. The model is computationally efficient, being orders of magnitude faster than the direct computation of graph-based distances.
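The core idea in the abstract, dense node vectors whose pairwise scores approximate a chosen graph-based measure, can be illustrated with a toy example. The following is a minimal sketch under stated assumptions, not the authors' implementation: the similarity `1 / (1 + shortest path length)`, the dot-product scoring, the squared-error loss, the tiny hand-made taxonomy, and all function names and hyperparameters are hypothetical choices made only for illustration.

```python
import random
from collections import deque
from itertools import combinations


def shortest_path_lengths(graph, source):
    """Breadth-first search: hop distance from `source` to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in graph[node]:
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    return dist


def pairwise_similarities(graph):
    """Assumed similarity for this sketch: 1 / (1 + shortest path length)."""
    nodes = sorted(graph)
    dist = {n: shortest_path_lengths(graph, n) for n in nodes}
    return {(a, b): 1.0 / (1.0 + dist[a][b]) for a, b in combinations(nodes, 2)}


def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def mean_squared_error(emb, sims):
    """Average squared gap between embedding dot products and target similarities."""
    return sum((dot(emb[a], emb[b]) - s) ** 2 for (a, b), s in sims.items()) / len(sims)


def train(sims, nodes, dim=4, epochs=300, lr=0.05, seed=0):
    """Plain SGD on the squared error; returns embeddings plus loss before and after."""
    rng = random.Random(seed)
    emb = {n: [rng.uniform(-0.5, 0.5) for _ in range(dim)] for n in nodes}
    initial = mean_squared_error(emb, sims)
    pairs = list(sims.items())
    for _ in range(epochs):
        rng.shuffle(pairs)
        for (a, b), s in pairs:
            err = dot(emb[a], emb[b]) - s
            va, vb = emb[a][:], emb[b][:]  # copies, so both updates use the old vectors
            for k in range(dim):
                emb[a][k] -= lr * 2 * err * vb[k]
                emb[b][k] -= lr * 2 * err * va[k]
    return emb, initial, mean_squared_error(emb, sims)


# Hypothetical, connected, undirected toy taxonomy (adjacency lists).
graph = {
    "entity": ["animal", "plant"],
    "animal": ["entity", "dog", "cat"],
    "plant": ["entity", "tree"],
    "dog": ["animal"],
    "cat": ["animal"],
    "tree": ["plant"],
}

sims = pairwise_similarities(graph)
emb, initial_loss, final_loss = train(sims, graph)
print(f"loss before training: {initial_loss:.4f}, after: {final_loss:.4f}")
```

Once such vectors are trained, scoring a pair of nodes costs one dot product instead of a graph traversal, which is the kind of saving the abstract refers to when it compares the model with direct computation of graph-based distances.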

Figures (7)

Table 2: Spearman correlations with human SimLex999 noun similarities (WordNet synset selection).

... embeddings (including path2vec) on the semantic similarity task. For example, the word2vec model of vector size 300 trained on the Google News corpus (Mikolov et al., 2013) achieves Spearman correlation of only 0.449 with SimLex999, when testing only on nouns. The GloVe embeddings (Pennington et al., 2014) of the same vector size trained on the Common Crawl corpus achieve 0.404.

Figure 3: Evaluation on SimLex999 noun pairs, model-based synset selection: JCN-S (left) and WuP (right).

Table 4: F1 scores on all-words WSD tasks.

Figure 4: A sentence graph for WSD, where a column lists all the possible synsets of a corresponding word.
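The semantic similarity results above are reported as Spearman correlations with SimLex999 human judgements. As a reminder of what that metric measures, here is a small pure-Python sketch: it ranks both score lists (averaging ranks over ties) and takes the Pearson correlation of the ranks. The function names and the four example scores are hypothetical and are not taken from the paper or from SimLex999.

```python
def average_ranks(values):
    """1-based ranks in ascending order; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        rank = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x ** 0.5 * var_y ** 0.5)


def spearman(x, y):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical scores for four word pairs (illustration only).
human_scores = [9.0, 7.5, 3.1, 1.2]
model_scores = [0.8, 0.9, 0.3, 0.1]
rho = spearman(human_scores, model_scores)
print(f"Spearman correlation: {rho:.3f}")
```

Because only the rank order matters, a model can obtain a high score without matching the scale of the human ratings; swapping the order of the top two pairs, as in this example, is what lowers the correlation below 1.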

References (41)

Steven Bird, Ewan Klein, and Edward Loper. 2009. Natural language processing with Python: analyzing text with the natural language toolkit. O'Reilly Media, Inc.
Antoine Bordes, Nicolas Usunier, Alberto Garcia-Duran, Jason Weston, and Oksana Yakhnenko. 2013. Translating embeddings for modeling multi-relational data. In C. J. C. Burges, L. Bottou, M. Welling, Z. Ghahramani, and K. Q. Weinberger, editors, Advances in Neural Information Processing Systems 26, pages 2787-2795. Curran Associates, Inc.
Antoine Bordes, Jason Weston, Ronan Collobert, and Yoshua Bengio. 2011. Learning structured embeddings of knowledge bases. In Twenty-Fifth AAAI Conference on Artificial Intelligence, pages 301-306, San Francisco, CA, USA. AAAI Press.
Ulrik Brandes. 2001. A faster algorithm for betweenness centrality. Journal of mathematical sociology, 25(2):163-177.
Alexander Budanitsky and Graeme Hirst. 2006. Evaluating WordNet-based measures of lexical semantic relatedness. Computational Linguistics, 32(1):13
Shaosheng Cao, Wei Lu, and Qiongkai Xu. 2015. Grarep: Learning graph representations with global structural information. In Proceedings of the 24th ACM International on Conference on Information and Knowledge Management, pages 891-900. ACM.
Edsger W Dijkstra. 1959. A note on two problems in connexion with graphs. Numerische mathematik, 1(1):269-271.
Robert W Floyd. 1962. Algorithm 97: shortest path. Communications of the ACM, 5(6):345.
Francois Fouss, Alain Pirotte, Jean-Michel Renders, and Marco Saerens. 2007. Random-walk computation of similarities between nodes of a graph with application to collaborative recommendation. IEEE Transactions on knowledge and data engineering, 19(3):355-369.
Aditya Grover and Jure Leskovec. 2016. Node2vec: Scalable feature learning for networks. In Proceedings of the 22nd ACM SIGKDD international conference on Knowledge discovery and data mining, pages 855-864. ACM.
William Hamilton, Rex Ying, and Jure Leskovec. 2017a. Representation learning on graphs: Methods and applications. IEEE Data Engineering Bulletin, 40(3):52-74.
William Hamilton, Zhitao Ying, and Jure Leskovec. 2017b. Inductive representation learning on large graphs. In Advances in Neural Information Processing Systems, pages 1024-1034.
Richard Hamming. 1950. Error detecting and error correcting codes. Bell System technical journal, 29(2):147-160.
Felix Hill, Roi Reichart, and Anna Korhonen. 2015. SimLex-999: Evaluating Semantic Models With (Genuine) Similarity Estimation. Computational Linguistics, 41(4):665-695.
Jay J. Jiang and David W. Conrath. 1997. Semantic similarity based on corpus statistics and lexical taxonomy. In Proceedings of the 10th Research on Computational Linguistics International Conference, pages 19-33, Taipei, Taiwan. The Association for Computational Linguistics and Chinese Language Processing (ACLCLP).
Diederik P. Kingma and Jimmy Ba. 2014. Adam: A method for stochastic optimization. CoRR, abs/1412.6980.
Henry Kucera and Nelson Francis. 1982. Frequency analysis of English usage: Lexicon and grammar. Boston: Houghton Mifflin.
Claudia Leacock and Martin Chodorow. 1998. Combining local context and WordNet similarity for word sense identification. WordNet: An electronic lexical database, 49(2):265-283.
Bertrand Lebichot, Guillaume Guex, Ilkka Kivimäki, and Marco Saerens. 2018. A constrained randomized shortest-paths framework for optimal exploration. arXiv preprint arXiv:1807.04551.
Yankai Lin, Zhiyuan Liu, Maosong Sun, Yang Liu, and Xuan Zhu. 2015. Learning entity and relation embeddings for knowledge graph completion. In Proceedings of the Twenty-Ninth AAAI Conference on Artificial Intelligence, volume 15, pages 2181-2187, Austin, TX, USA. AAAI Press.
Rada Mihalcea, Timothy Chklovski, and Adam Kilgarriff. 2004. The Senseval-3 English lexical sample task. In Senseval-3: Third International Workshop on the Evaluation of Systems for the Semantic Analysis of Text, pages 25-28, Barcelona, Spain. Association for Computational Linguistics.
Tomas Mikolov, Ilya Sutskever, Kai Chen, Greg S. Corrado, and Jeff Dean. 2013. Distributed representations of words and phrases and their compositionality. In Advances in Neural Information Processing Systems 26, pages 3111-3119, Lake Tahoe, NV, USA. Curran Associates, Inc.
George A. Miller. 1995. WordNet: A lexical database for English. Communications of the ACM, 38(11):39-41.
Nikola Mrkšić, Ivan Vulić, Diarmuid Ó Séaghdha, Ira Leviant, Roi Reichart, Milica Gašić, Anna Korhonen, and Steve Young. 2017. Semantic specialization of distributional word vector spaces using monolingual and cross-lingual constraints. Transactions of the Association for Computational Linguistics, 5:309-324.
Roberto Navigli. 2009. Word sense disambiguation: A survey. ACM Computing Surveys (CSUR), 41(2):10.
Diarmuid Ó Séaghdha. 2009. Semantic classification with wordnet kernels. In Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics, Companion Volume: Short Papers, pages 237-240, Boulder, CO, USA. Association for Computational Linguistics.
Mingdong Ou, Peng Cui, Jian Pei, Ziwei Zhang, and Wenwu Zhu. 2016. Asymmetric transitivity preserving graph embedding. In Proceedings of the 22nd ACM SIGKDD international conference on Knowledge discovery and data mining, pages 1105-1114. ACM.
Martha Palmer, Christiane Fellbaum, Scott Cotton, Lauren Delfs, and Hoa Trang Dang. 2001. English tasks: All-words and verb lexical sample. In Proceedings of SENSEVAL-2 Second International Workshop on Evaluating Word Sense Disambiguation Systems, pages 21-24, Toulouse, France. Association for Computational Linguistics.
Jeffrey Pennington, Richard Socher, and Christopher D. Manning. 2014. Glove: Global vectors for word representation. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 1532-1543, Doha, Qatar. Association for Computational Linguistics.
Bryan Perozzi, Rami Al-Rfou, and Steven Skiena. 2014. Deepwalk: Online learning of social representations. In Proceedings of the 20th ACM SIGKDD international conference on Knowledge discovery and data mining, pages 701-710, New York, NY, USA. ACM.
Mohammad Taher Pilehvar and Nigel Collier. 2016. De-conflated semantic representations. In Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing, pages 1680-1690, Austin, TX, USA. Association for Computational Linguistics.
Mohammad Taher Pilehvar and Roberto Navigli. 2015. From senses to texts: An all-in-one graph-based approach for measuring semantic similarity. Artificial Intelligence, 228:95-128.
Alessandro Raganato, Jose Camacho-Collados, and Roberto Navigli. 2017. Word sense disambiguation: A unified evaluation framework and empirical comparison. In Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 1, Long Papers, pages 99-110, Valencia, Spain. Association for Computational Linguistics.
Delip Rao, David Yarowsky, and Chris Callison-Burch. 2008. Affinity measures based on the graph Laplacian. In Coling 2008: Proceedings of the 3rd Textgraphs workshop on Graph-based Algorithms for Natural Language Processing, pages 41-48, Manchester, UK. Coling 2008 Organizing Committee.
Philip Resnik. 1999. Semantic similarity in a taxonomy: An information-based measure and its application to problems of ambiguity in natural language. Journal of Artificial Intelligence Research, 11(1):95-130.
Sascha Rothe and Hinrich Schütze. 2015. Autoextend: Extending word embeddings to embeddings for synsets and lexemes. In Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), pages 1793-1803, Beijing, China. Association for Computational Linguistics.
Michael Schlichtkrull, Thomas N. Kipf, Peter Bloem, Rianne van den Berg, Ivan Titov, and Max Welling. 2018. Modeling relational data with graph convolutional networks. In European Semantic Web Conference, pages 593-607, Heraklion, Greece. Springer.
Ravi Sinha and Rada Mihalcea. 2007. Unsupervised graph-based word sense disambiguation using measures of word semantic similarity. In International Conference on Semantic Computing (ICSC), pages 363-369, Irvine, CA, USA. IEEE.
Mark Steyvers and Joshua B. Tenenbaum. 2005. The large-scale structure of semantic networks: statistical analyses and a model of semantic growth. Cognitive science, 29(1):41-78.
Julien Subercaze, Christophe Gravier, and Frédérique Laforest. 2015. On metric embedding for boosting semantic similarity computations. In Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 2: Short Papers), pages 8-14, Beijing, China. Association for Computational Linguistics.
Zhen Wang, Jianwen Zhang, Jianlin Feng, and Zheng Chen. 2014. Knowledge graph embedding by translating on hyperplanes. In AAAI Conference on Artificial Intelligence, pages 1112-1119, Québec City, QC, Canada.
Zhibiao Wu and Martha Palmer. 1994. Verb semantics and lexical selection. In Proceedings of the 32nd Annual Meeting of the Association for Computational Linguistics, pages 133-138, Las Cruces, NM, USA. Association for Computational Linguistics.
