Semantic Processing of Compounds in Indian Languages
Abstract
Compounds occur very frequently in Indian languages, yet modern Indian languages have no strict orthographic conventions for writing them. In this paper, the Sanskrit compounding system is examined thoroughly, and the insight gained from Sanskrit grammar is applied to the analysis of compounds in Hindi and Marathi. Compounding in Hindi deviates from that in Sanskrit in two respects. First, the Hindi data analysed contains no instance of a Bahuvrīhi (exocentric) compound. Second, the Hindi data presents many compounds that require a verb as well as a vibhakti (case marker) for their paraphrase. Compounds requiring a verb for paraphrasing are termed madhyama-pada-lopī in Sanskrit, where they are found to be rare.