With Learning-To-Rank Topic Modeling
2015
Abstract
Topic modeling has become a popular technique not only for mining text representations, but also for modeling authors’ interests and influence and for predicting links among documents or authors. However, few existing topic models distinguish or exploit prior knowledge about the differing importance of documents (authors) over topics. In this paper, we focus on the ability of topic models to model author interests and influence. We introduce a pairwise learning-to-rank algorithm into the topic modeling process, under the hypothesis that exploiting prior knowledge about authors’ differing importance over topics helps achieve more accurate and cohesive topic modeling results. Moreover, a framework that integrates a learning-to-rank mechanism with topic modeling can facilitate the ranking of new authors. We apply this integrated model to two applications: the task of predicting future award wi...
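To make the pairwise idea concrete, the following is a minimal sketch of a pairwise ranking loss over author-topic proportions. It is an illustration under stated assumptions, not the model from the paper: the logistic loss, the function name `pairwise_logistic_loss`, the author names, and the toy proportions are all hypothetical, and the abstract does not specify which pairwise objective is actually used.

```python
import math

def pairwise_logistic_loss(theta, preferred_pairs, topic):
    """Average logistic loss over pairs (a, b) where author a should outrank b on `topic`.

    theta maps an author to a list of topic proportions (assumed layout).
    """
    total = 0.0
    for a, b in preferred_pairs:
        # Positive margin means the model already agrees with the prior ordering.
        margin = theta[a][topic] - theta[b][topic]
        total += math.log1p(math.exp(-margin))
    return total / len(preferred_pairs)

# Hypothetical author-topic proportions (one list of topic weights per author).
theta = {
    "author_a": [0.7, 0.3],
    "author_b": [0.4, 0.6],
    "author_c": [0.1, 0.9],
}

# Hypothetical prior knowledge for topic 0: a outranks b, and b outranks c.
pairs = [("author_a", "author_b"), ("author_b", "author_c")]

loss_good = pairwise_logistic_loss(theta, pairs, topic=0)
# Reversing every pair contradicts the proportions, so the loss increases.
loss_bad = pairwise_logistic_loss(theta, [(b, a) for a, b in pairs], topic=0)
```

In an integrated model, a term of this kind would be combined with the topic model's likelihood so that the inferred author-topic proportions respect the known orderings; how the two objectives are weighted is left open here.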