Visualizing Very Large Graphs Using Clustering Neighborhoods

2005, Lecture Notes in Computer Science

https://doi.org/10.1007/11504245_6

Abstract

This paper presents a method for visualizing large graphs, such as a collection of Web pages, in a two-dimensional space. The main contribution is a change of representation that makes the data easier to handle. The method consists of three major steps: (1) First, we transform the graph into a sparse matrix in which each vertex of the graph is represented by one sparse vector. Each sparse vector has non-zero components for the vertices that are close to the vertex it represents. (2) Next, we perform hierarchical clustering (e.g., hierarchical K-Means) on the set of sparse vectors, which yields a hierarchy of clusters. (3) In the last step, we map the hierarchy of clusters into a two-dimensional space so that more similar clusters appear closer together in the picture. The effect of the whole procedure is that each vertex is assigned unique X and Y coordinates such that vertices, or groups of vertices on several levels of the hierarchy, that are more strongly connected in the graph are placed closer together in the picture. The method is particularly useful for power-law distributed graphs. We show applications of the method to real-world examples: visualization of an institution collaboration graph and of a cross-sell recommendation graph.
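The three steps can be illustrated with a minimal Python sketch. It is not the authors' implementation: all function names are hypothetical, the distance-decay weighting of the sparse vectors is an assumption, recursive 2-means under cosine similarity stands in for hierarchical K-Means, and the rectangle-splitting layout is only one possible way to map a cluster hierarchy to the plane.

```python
from collections import deque
import math

def neighborhood_vectors(adj, depth=2, decay=0.5):
    """Step 1: one sparse vector (a dict) per vertex, built by bounded BFS."""
    vectors = {}
    for v in adj:
        vec, queue = {v: 1.0}, deque([(v, 0)])
        while queue:
            u, d = queue.popleft()
            if d == depth:
                continue
            for w in adj[u]:
                if w not in vec:  # first visit is the shortest distance
                    vec[w] = decay ** (d + 1)  # assumed weighting scheme
                    queue.append((w, d + 1))
        vectors[v] = vec
    return vectors

def cosine(a, b):
    dot = sum(x * b.get(k, 0.0) for k, x in a.items())
    na = math.sqrt(sum(x * x for x in a.values()))
    nb = math.sqrt(sum(x * x for x in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def centroid(vecs):
    c = {}
    for vec in vecs:
        for k, x in vec.items():
            c[k] = c.get(k, 0.0) + x / len(vecs)
    return c

def two_means(ids, vectors, iters=10):
    """Split ids into two groups with 2-means under cosine similarity."""
    ids = sorted(ids)
    # Deterministic seeding: the first vertex and the one least similar to it.
    far = min(ids, key=lambda i: cosine(vectors[ids[0]], vectors[i]))
    cents = [vectors[ids[0]], vectors[far]]
    groups = [ids, []]
    for _ in range(iters):
        new = [[], []]
        for i in ids:
            s0, s1 = cosine(vectors[i], cents[0]), cosine(vectors[i], cents[1])
            new[0 if s0 >= s1 else 1].append(i)
        if not new[0] or not new[1] or new == groups:
            break
        groups = new
        cents = [centroid([vectors[i] for i in g]) for g in groups]
    return groups

def build_hierarchy(ids, vectors, leaf_size=1):
    """Step 2: recursive bisection; leaves are lists, inner nodes are pairs."""
    if len(ids) <= leaf_size:
        return sorted(ids)
    left, right = two_means(ids, vectors)
    if not left or not right:  # degenerate split: fall back to halving
        ids = sorted(ids)
        left, right = ids[:len(ids) // 2], ids[len(ids) // 2:]
    return (build_hierarchy(left, vectors, leaf_size),
            build_hierarchy(right, vectors, leaf_size))

def size(tree):
    return len(tree) if isinstance(tree, list) else size(tree[0]) + size(tree[1])

def layout(tree, x0=0.0, y0=0.0, x1=1.0, y1=1.0, out=None):
    """Step 3: each cluster gets a rectangle, split among children by size."""
    out = {} if out is None else out
    if isinstance(tree, list):  # leaf: spread vertices along the cell diagonal
        for k, v in enumerate(tree):
            t = (k + 0.5) / len(tree)
            out[v] = (x0 + t * (x1 - x0), y0 + t * (y1 - y0))
        return out
    left, right = tree
    frac = size(left) / (size(left) + size(right))
    if (x1 - x0) >= (y1 - y0):  # cut across the longer side
        xm = x0 + frac * (x1 - x0)
        layout(left, x0, y0, xm, y1, out)
        layout(right, xm, y0, x1, y1, out)
    else:
        ym = y0 + frac * (y1 - y0)
        layout(left, x0, y0, x1, ym, out)
        layout(right, x0, ym, x1, y1, out)
    return out

# Toy graph: two triangles joined by the edge 2-3.
adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2, 4, 5], 4: [3, 5], 5: [3, 4]}
vectors = neighborhood_vectors(adj)
tree = build_hierarchy(list(adj), vectors)
coords = layout(tree)
```

On the toy graph, the two triangles end up in separate halves of the unit square, and every vertex receives its own coordinate pair. Because sibling clusters share a parent rectangle, closeness in the hierarchy translates into closeness in the picture, which is the property the method aims for.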
