Link prediction in complex networks based on cluster information

Abstract

A cluster in a graph is a densely connected group of vertices that is sparsely connected to other groups. Hence, when predicting a future link between a pair of vertices, the common neighbors of those vertices may play different roles depending on whether or not they belong to the same cluster. Based on this, we propose a new measure (WIC) for link prediction between a pair of vertices that considers the sets of their intra-cluster, or within-cluster (W), and between-cluster, or inter-cluster (IC), common neighbors. We also propose a set of measures, referred to as W forms, that use only the set of within-cluster common neighbors instead of the set of all common neighbors usually considered in basic local similarity measures. Consequently, a clustering scheme must first be applied to the graph. Using three different clustering algorithms, we compared the WIC measure with ten basic local similarity measures and their counterpart W forms on ten real networks. Our analyses suggest that clustering information, regardless of the clustering algorithm used, improves link prediction accuracy.
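
The split of common neighbors described above can be illustrated with a minimal sketch. The abstract does not state the formulas, so several things here are assumptions: a common neighbor is treated as within-cluster when it shares a cluster label with both endpoints, the W form shown is the common-neighbors count restricted to that set, and the ratio score with a small `delta` is only one plausible way to combine the two sets. The function names, the toy graph, and the cluster labels are all hypothetical.

```python
from typing import Dict, Hashable, Set, Tuple

Graph = Dict[Hashable, Set[Hashable]]      # adjacency sets of an undirected graph
Clusters = Dict[Hashable, int]             # vertex -> cluster label (from any clustering step)


def split_common_neighbors(
    adj: Graph, cluster: Clusters, x: Hashable, y: Hashable
) -> Tuple[Set[Hashable], Set[Hashable]]:
    """Return (within-cluster, inter-cluster) common neighbors of x and y.

    Assumption: z is within-cluster only if x, y and z share one label.
    """
    common = adj[x] & adj[y]
    within = {z for z in common if cluster[z] == cluster[x] == cluster[y]}
    return within, common - within


def common_neighbors_w_form(adj: Graph, cluster: Clusters, x: Hashable, y: Hashable) -> int:
    """Common-neighbors score computed on the within-cluster set only."""
    within, _ = split_common_neighbors(adj, cluster, x, y)
    return len(within)


def within_inter_ratio(
    adj: Graph, cluster: Clusters, x: Hashable, y: Hashable, delta: float = 1e-3
) -> float:
    """Illustrative ratio of within- to inter-cluster common neighbors.

    Assumption: delta is a small constant that avoids division by zero
    when there are no inter-cluster common neighbors.
    """
    within, inter = split_common_neighbors(adj, cluster, x, y)
    return len(within) / (len(inter) + delta)


# Toy graph: a and b share neighbors c, d (same cluster) and e (other cluster).
adj: Graph = {
    "a": {"c", "d", "e"},
    "b": {"c", "d", "e"},
    "c": {"a", "b"},
    "d": {"a", "b"},
    "e": {"a", "b"},
}
cluster: Clusters = {"a": 0, "b": 0, "c": 0, "d": 0, "e": 1}

within, inter = split_common_neighbors(adj, cluster, "a", "b")
w_score = common_neighbors_w_form(adj, cluster, "a", "b")
ratio = within_inter_ratio(adj, cluster, "a", "b")
```

In this toy example the pair (a, b) has two within-cluster common neighbors and one inter-cluster common neighbor, so a score built on the within-cluster set alone ignores `e`, while the ratio penalizes it. The cluster labels would come from whichever clustering algorithm is applied to the graph beforehand.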
