An extensible index for spatial databases
2001
Abstract
Emerging database applications require new indexing structures beyond B-trees and R-trees. Examples are the k-D tree, the trie, the quadtree, and their variants, which are often proposed as supporting structures in data mining, GIS, and CAD/CAM applications. A common feature of all these indexes is that they recursively divide the space into partitions. A new extensible index structure, termed SP-GiST, is presented that supports this class of data structures, mainly the class of space-partitioning unbalanced trees. Simple method implementations are provided that demonstrate how SP-GiST can behave as a k-D tree, a trie, a quadtree, or any of their variants. Issues related to clustering tree nodes into pages, as well as concurrency control for SP-GiST, are addressed. A dynamic minimum-height clustering technique is applied to minimize disk accesses and to make the use of such trees in database systems both possible and efficient. A prototype implementation of SP-GiST is presented, along with performance studies of SP-GiST's various tuning parameters.
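To illustrate the idea that one generic tree can behave as different space-partitioning structures once a few methods are plugged in, here is a minimal in-memory sketch. It is not the SP-GiST interface: the names `Scheme`, `PartitionTree`, `child_index`, `child_box`, `bucket_size`, and `max_depth` are hypothetical, the data are 2-D points, and the sketch ignores page clustering and concurrency control entirely.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple

Point = Tuple[float, float]
Box = Tuple[float, float, float, float]  # (xmin, ymin, xmax, ymax)


@dataclass
class Scheme:
    """Hypothetical plug-in describing how a region is cut into sub-regions."""
    fanout: int
    child_index: Callable[[Box, int, Point], int]  # which partition holds a point
    child_box: Callable[[Box, int, int], Box]      # region of the i-th partition


@dataclass
class Node:
    box: Box
    depth: int
    points: List[Point] = field(default_factory=list)
    children: Optional[List["Node"]] = None  # None marks a leaf


class PartitionTree:
    """Generic unbalanced tree; all structure-specific logic lives in Scheme."""

    def __init__(self, scheme: Scheme, box: Box, bucket_size: int = 1,
                 max_depth: int = 16):
        self.scheme = scheme
        self.bucket_size = bucket_size
        self.max_depth = max_depth
        self.root = Node(box, 0)

    def _leaf_for(self, p: Point) -> Node:
        node = self.root
        while node.children is not None:
            i = self.scheme.child_index(node.box, node.depth, p)
            node = node.children[i]
        return node

    def insert(self, p: Point) -> None:
        leaf = self._leaf_for(p)
        leaf.points.append(p)
        self._split_if_needed(leaf)

    def _split_if_needed(self, node: Node) -> None:
        # Recursively divide an overfull region until each bucket fits.
        if len(node.points) <= self.bucket_size or node.depth >= self.max_depth:
            return
        s = self.scheme
        node.children = [Node(s.child_box(node.box, node.depth, i), node.depth + 1)
                         for i in range(s.fanout)]
        for p in node.points:
            node.children[s.child_index(node.box, node.depth, p)].points.append(p)
        node.points = []
        for child in node.children:
            self._split_if_needed(child)

    def contains(self, p: Point) -> bool:
        return p in self._leaf_for(p).points

    def height(self) -> int:
        def h(n: Node) -> int:
            return 0 if n.children is None else 1 + max(h(c) for c in n.children)
        return h(self.root)


def quad_index(box: Box, depth: int, p: Point) -> int:
    # Quadtree-style: four quadrants around the region's midpoint.
    cx, cy = (box[0] + box[2]) / 2, (box[1] + box[3]) / 2
    return (1 if p[0] >= cx else 0) + (2 if p[1] >= cy else 0)


def quad_box(box: Box, depth: int, i: int) -> Box:
    cx, cy = (box[0] + box[2]) / 2, (box[1] + box[3]) / 2
    x0, x1 = (cx, box[2]) if i & 1 else (box[0], cx)
    y0, y1 = (cy, box[3]) if i & 2 else (box[1], cy)
    return (x0, y0, x1, y1)


def kd_index(box: Box, depth: int, p: Point) -> int:
    # k-D-style: two halves, alternating the split axis with depth.
    axis = depth % 2
    return 1 if p[axis] >= (box[axis] + box[axis + 2]) / 2 else 0


def kd_box(box: Box, depth: int, i: int) -> Box:
    axis = depth % 2
    mid = (box[axis] + box[axis + 2]) / 2
    b = list(box)
    b[axis if i else axis + 2] = mid  # upper half raises min; lower half lowers max
    return (b[0], b[1], b[2], b[3])


QUADTREE = Scheme(4, quad_index, quad_box)
KDTREE = Scheme(2, kd_index, kd_box)
```

The same `PartitionTree` code runs for both schemes; only the two small functions differ. Because the tree is unbalanced, its height depends on how the points are distributed and on the fanout, which is one reason node-to-page clustering matters for disk-based use.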