An extensible index for spatial databases

2001

Abstract

Emerging database applications require indexing structures beyond B-trees and R-trees. Examples are the k-D tree, the trie, the quadtree, and their variants, which are often proposed as supporting structures in data mining, GIS, and CAD/CAM applications. A common feature of all these indexes is that they recursively divide the space into partitions. A new extensible index structure, termed SP-GiST, is presented that supports this class of data structures, mainly the class of space-partitioning unbalanced trees. Simple method implementations are provided that demonstrate how SP-GiST can behave as a k-D tree, a trie, a quadtree, or any of their variants. Issues related to clustering tree nodes into pages, as well as concurrency control for SP-GiST, are addressed. A dynamic minimum-height clustering technique is applied to minimize disk accesses and to make such trees practical and efficient in database systems. A prototype implementation of SP-GiST is presented, along with performance studies of SP-GiST's various tuning parameters.
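
The idea that one generic tree can behave as different space-partitioning indexes once a partitioning method is plugged in can be illustrated with a minimal in-memory sketch. Everything named here (`SpacePartitionTree`, `quadtree_split`, `kd_split`, `bucket_size`, `max_depth`) is hypothetical and chosen for illustration; it is not the SP-GiST interface defined in the paper. The k-D variant splits at region midpoints rather than at data points, which is a simplification, and the sketch ignores page clustering and concurrency control entirely.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

Point = Tuple[float, float]
Box = Tuple[float, float, float, float]  # (xmin, ymin, xmax, ymax)


@dataclass
class Node:
    region: Box
    points: List[Point] = field(default_factory=list)  # used while the node is a leaf
    children: Optional[List["Node"]] = None            # set once the node is split


def contains(box: Box, p: Point) -> bool:
    return box[0] <= p[0] <= box[2] and box[1] <= p[1] <= box[3]


def intersects(a: Box, b: Box) -> bool:
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def quadtree_split(box: Box, depth: int) -> List[Box]:
    """Hypothetical plug-in: four equal quadrants, independent of depth."""
    xmin, ymin, xmax, ymax = box
    xm, ym = (xmin + xmax) / 2, (ymin + ymax) / 2
    return [(xmin, ymin, xm, ym), (xm, ymin, xmax, ym),
            (xmin, ym, xm, ymax), (xm, ym, xmax, ymax)]


def kd_split(box: Box, depth: int) -> List[Box]:
    """Hypothetical plug-in: two halves, alternating the split axis by depth."""
    xmin, ymin, xmax, ymax = box
    if depth % 2 == 0:
        xm = (xmin + xmax) / 2
        return [(xmin, ymin, xm, ymax), (xm, ymin, xmax, ymax)]
    ym = (ymin + ymax) / 2
    return [(xmin, ymin, xmax, ym), (xmin, ym, xmax, ymax)]


class SpacePartitionTree:
    """Generic unbalanced tree; the caller supplies the partitioning rule."""

    def __init__(self, region: Box, split, bucket_size: int = 1, max_depth: int = 16):
        self.root = Node(region)
        self.split = split              # (Box, depth) -> list of sub-regions
        self.bucket_size = bucket_size  # points a leaf may hold before splitting
        self.max_depth = max_depth      # stops endless splitting on duplicates

    def _child_for(self, node: Node, p: Point) -> Node:
        # Boundary points go to the first child whose closed region holds them.
        return next(c for c in node.children if contains(c.region, p))

    def insert(self, p: Point) -> None:
        if not contains(self.root.region, p):
            raise ValueError("point lies outside the indexed space")
        node, depth = self.root, 0
        while node.children is not None:
            node, depth = self._child_for(node, p), depth + 1
        node.points.append(p)
        self._split_if_full(node, depth)

    def _split_if_full(self, node: Node, depth: int) -> None:
        if len(node.points) <= self.bucket_size or depth >= self.max_depth:
            return
        node.children = [Node(r) for r in self.split(node.region, depth)]
        pending, node.points = node.points, []
        for q in pending:
            self._child_for(node, q).points.append(q)
        for child in node.children:     # a child may itself still overflow
            self._split_if_full(child, depth + 1)

    def range_search(self, query: Box) -> List[Point]:
        found, stack = [], [self.root]
        while stack:
            node = stack.pop()
            if not intersects(node.region, query):
                continue                # prune partitions outside the query
            if node.children is None:
                found.extend(q for q in node.points if contains(query, q))
            else:
                stack.extend(node.children)
        return sorted(found)

    def height(self) -> int:
        def h(n: Node) -> int:
            return 0 if n.children is None else 1 + max(h(c) for c in n.children)
        return h(self.root)


points = [(1, 1), (2, 7), (6, 3), (7, 7), (6.5, 3.5)]
quad = SpacePartitionTree((0, 0, 8, 8), quadtree_split)
kd = SpacePartitionTree((0, 0, 8, 8), kd_split)
for pt in points:
    quad.insert(pt)
    kd.insert(pt)
```

Both trees share the same insertion and search code; only the split function differs, so a range query such as `quad.range_search((5, 2, 8, 8))` returns the same points from either tree while the tree shapes (and heights) differ. The unbalanced shape is why the abstract singles out node-to-page clustering as a separate concern.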
