Spectral Decomposition for Optimal Graph Index Prediction
2013, Lecture Notes in Computer Science
https://doi.org/10.1007/978-3-642-37453-1_16
Abstract
Recently, there has been ample research on indexing for structural graph queries. However, as verified by our experiments with a large number of random graphs and scale-free graphs, the performance of indexes for graph queries may vary greatly. Unfortunately, the structures of graph indexes are too often complex and ad hoc, and deriving an accurate performance model appears to be a daunting task. As a result, database practitioners may encounter difficulties in choosing the optimal index for their data graphs. In this paper, we address this problem with a spectral decomposition method for predicting the relative performance of graph indexes. Specifically, given a graph, we compute its spectrum. We propose a similarity function to compare the spectra of graphs. We adopt a classification algorithm to build a model and a voting algorithm to predict the optimal index. Our empirical studies on a large number of random graphs and scale-free graphs and four structurally distinguishable indexes demonstrate that our spectral decomposition is robust and almost always exhibits accuracies higher than 70%.
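The pipeline outlined in the abstract (compute a spectrum, compare spectra, classify, vote) can be illustrated with a minimal sketch. The abstract does not specify the matrix, the similarity function, or the classifier, so everything here is an assumption made for illustration: the spectrum is taken from the combinatorial Laplacian of the symmetrized graph, the similarity is a hypothetical inverse Euclidean distance over the largest eigenvalues, and the prediction is a plain nearest-neighbor majority vote. The function names and the index labels are hypothetical and are not taken from the paper.

```python
from collections import Counter

import numpy as np


def laplacian_spectrum(adj, k):
    """Return the k largest Laplacian eigenvalues, descending, zero-padded.

    Assumption: the graph is treated as undirected and the combinatorial
    Laplacian L = D - A is used; the paper may use a different matrix.
    """
    a = np.asarray(adj, dtype=float)
    a = np.maximum(a, a.T)                 # symmetrize the adjacency matrix
    lap = np.diag(a.sum(axis=1)) - a       # L = D - A
    eig = np.linalg.eigvalsh(lap)[::-1]    # eigvalsh sorts ascending; reverse it
    out = np.zeros(k)
    out[: min(k, eig.size)] = eig[:k]
    return out


def spectral_similarity(s1, s2):
    """Hypothetical similarity in (0, 1]; equals 1 for identical spectra."""
    return 1.0 / (1.0 + np.linalg.norm(s1 - s2))


def predict_index(query_adj, labeled_graphs, k_eigs=4, k_votes=3):
    """Majority vote among the k_votes training graphs most similar to the query.

    labeled_graphs is a list of (adjacency_matrix, best_index_label) pairs,
    where each label is the index that performed best on that training graph.
    """
    q = laplacian_spectrum(query_adj, k_eigs)
    ranked = sorted(
        labeled_graphs,
        key=lambda pair: spectral_similarity(q, laplacian_spectrum(pair[0], k_eigs)),
        reverse=True,
    )
    votes = Counter(label for _, label in ranked[:k_votes])
    return votes.most_common(1)[0][0]
```

In this sketch, each training graph would be labeled with whichever index ran fastest on it in an offline benchmark, and `predict_index` then suggests an index for an unseen graph without building every candidate index on it. The parameters `k_eigs` and `k_votes` are tuning knobs of the sketch, not values reported by the authors.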