Boosting Chinese Question Answering with Two Lightweight Methods
2008, ACM Transactions on Asian Language Information Processing
https://doi.org/10.1145/1450295.1450297
Abstract
Question Answering (QA) research has been conducted in many languages. Nearly all top-performing systems use heavy methods that require sophisticated techniques, such as parsers or logic provers. However, such techniques are usually unavailable or unaffordable for under-resourced languages or in resource-limited situations. In this article, we describe how a top-performing Chinese QA system can be designed by using lightweight methods effectively. We propose two lightweight methods, namely the Sum of Co-occurrences of Question and Answer Terms (SCO-QAT) and Alignment-based Surface Patterns (ABSPs). SCO-QAT is a co-occurrence-based answer-ranking method that does not need extra knowledge, word-ignoring heuristic rules, or tools. It calculates co-occurrence scores based on the passage retrieval results. ABSPs are syntactic patterns trained from question-answer pairs with a multiple alignment algorithm. They capture the relations between terms, and these relations are then used to filter answers. We attribute the success of the ABSPs and SCO-QAT methods to the effective use of local syntactic information and global co-occurrence information.
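To make the idea of co-occurrence-based answer ranking concrete, the following is a minimal sketch and not the SCO-QAT formula itself, which this abstract does not state. It assumes that retrieved passages are available as sets of tokens and that a candidate's score is the sum, over question terms, of the fraction of term-bearing passages that also contain the candidate. The function name `cooccurrence_scores` and the toy data are hypothetical.

```python
def cooccurrence_scores(passages, question_terms, candidates):
    """Rank answer candidates by co-occurrence with question terms.

    Assumption (not taken from the paper): for each question term, a candidate
    earns the fraction of passages containing that term which also contain
    the candidate; the fractions are summed over all question terms.
    """
    scores = {}
    for cand in candidates:
        total = 0.0
        for term in question_terms:
            # Passages from the retrieval step that mention this question term.
            with_term = [p for p in passages if term in p]
            if not with_term:
                continue  # a term absent from every passage contributes nothing
            both = sum(1 for p in with_term if cand in p)
            total += both / len(with_term)
        scores[cand] = total
    return scores


# Toy retrieval result: each passage is a set of tokens (hypothetical data).
passages = [{"a", "b", "x"}, {"a", "y"}, {"b", "x"}]
scores = cooccurrence_scores(passages, ["a", "b"], ["x", "y"])
ranked = sorted(scores, key=scores.get, reverse=True)
```

In this toy run the candidate `x` outranks `y` because it appears alongside both question terms. Only passage-level counts are needed, which is in keeping with the abstract's claim that the method requires no extra knowledge or tools beyond the passage retrieval results.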