Learning reputation in an authorship network
2014, Proceedings of the 29th Annual ACM Symposium on Applied Computing
https://doi.org/10.1145/2554850.2555098
Abstract
The problem of searching for experts in a given academic field is hugely important in both industry and academia. We study this problem with respect to a database of authors and their publications. The idea is to use Latent Semantic Indexing (LSI) and Latent Dirichlet Allocation (LDA) to perform topic modelling in order to find authors who have worked in a query field. We then construct a coauthorship graph and motivate the use of influence maximisation and a variety of graph centrality measures to obtain a ranked list of experts. The ranked lists are further improved using a Markov chain-based rank aggregation approach. The complete method is readily scalable to large datasets. To demonstrate the efficacy of the approach, we report on an extensive set of computational simulations using the Arnetminer dataset. An improvement in mean average precision is demonstrated over the baseline case of simply using the order of authors found by the topic models.
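To make the rank aggregation step concrete, here is a minimal sketch of how several ranked lists of authors could be fused with a Markov chain. The abstract does not say which chain the authors use, so this sketch assumes a chain in the style of the MC4 scheme from the rank aggregation literature: from the current author, pick another author uniformly at random and move there only if a majority of the input lists rank that author higher. The function name `mc4_aggregate`, the `teleport` parameter, and the author names are hypothetical and chosen for illustration only.

```python
def mc4_aggregate(rankings, teleport=0.05, iterations=500):
    """Fuse several ranked lists (best first) into one consensus ranking.

    Illustrative MC4-style sketch; not necessarily the chain used in the paper.
    """
    items = sorted({item for ranking in rankings for item in ranking})
    n = len(items)
    if n == 0:
        return []

    # Position of each item within each input list (0 = best).
    positions = [{item: i for i, item in enumerate(r)} for r in rankings]

    def majority_prefers(q, p):
        # True if most of the lists that rank both p and q place q above p.
        shared = [pos for pos in positions if p in pos and q in pos]
        wins = sum(1 for pos in shared if pos[q] < pos[p])
        return bool(shared) and 2 * wins > len(shared)

    # Row-stochastic transition matrix: from p, pick q uniformly at random
    # and move there only if a majority prefers q; otherwise stay at p.
    transition = []
    for i, p in enumerate(items):
        row = [
            1.0 / n if q != p and majority_prefers(q, p) else 0.0
            for q in items
        ]
        row[i] = 1.0 - sum(row)
        transition.append(row)

    # Power iteration with a small uniform teleport so the chain is ergodic
    # (the teleport is an assumption of this sketch).
    pi = [1.0 / n] * n
    for _ in range(iterations):
        pi = [
            sum(
                pi[i] * ((1 - teleport) * transition[i][j] + teleport / n)
                for i in range(n)
            )
            for j in range(n)
        ]

    # Higher stationary probability means a better consensus rank.
    order = sorted(range(n), key=lambda j: (-pi[j], items[j]))
    return [items[j] for j in order]


# Hypothetical input: three ranked lists, e.g. one per centrality measure.
example_rankings = [
    ["alice", "bob", "carol"],
    ["alice", "carol", "bob"],
    ["bob", "alice", "carol"],
]
consensus = mc4_aggregate(example_rankings)
```

In this toy input a majority of the lists place `alice` above each of the other two authors, so the chain concentrates its stationary mass on her and she heads the consensus list. The dense transition matrix is only practical for short candidate lists; a large-scale implementation would need a sparse representation.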