A Clustering-based Scheme for Labeling XML Trees

2006

Abstract

Tree labeling plays a key role in XML query processing. In this paper, we propose a new labeling scheme, called Clustering-based Labeling. Unlike all previous labeling methods, this scheme separates elements into groups and assigns a label to a group of elements instead of to a single element. Based on Clustering-based Labeling, we design a new relational schema, similar to the OrdPath scheme, for storing XML documents in a relational database. Grouping sibling nodes into one record reduces the number of relational records needed to store an XML document. Our experimental results show that our storage scheme is significantly better than three well-known relational XML storage methods in terms of number of stored records, document reconstruction time, and query processing performance.
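To illustrate why grouping siblings reduces the record count, here is a minimal sketch in Python. It is not the paper's actual schema or label format, which the abstract does not specify. The Dewey-style labels, the function names `per_node_records` and `sibling_group_records`, and the sample document are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document, used only for this illustration.
DOC = (
    "<book><title/>"
    "<author><first/><last/></author>"
    "<author><first/><last/></author></book>"
)

def per_node_records(root):
    """Baseline: one record per element, labeled with a Dewey-style path."""
    records = []

    def visit(elem, label):
        records.append((label, elem.tag))
        for i, child in enumerate(elem, start=1):
            visit(child, f"{label}.{i}")

    visit(root, "1")
    return records

def sibling_group_records(root):
    """Assumed grouping: one record per set of siblings.

    The label identifies the whole group; a node is addressed by its
    group label plus its position inside the group.
    """
    records = [("1", [root.tag])]  # the root forms a group of its own

    def visit(elem, group_label):
        children = list(elem)
        if not children:
            return  # leaf elements produce no child group
        records.append((group_label, [c.tag for c in children]))
        for i, child in enumerate(children, start=1):
            visit(child, f"{group_label}.{i}")

    visit(root, "1.1")
    return records

root = ET.fromstring(DOC)
node_records = per_node_records(root)
group_records = sibling_group_records(root)
print(len(node_records), "per-node records")
print(len(group_records), "sibling-group records")
```

In this sketch the number of group records equals one plus the number of elements that have children, so the saving grows with the average fan-out of the document.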
