Extended XML tree pattern matching: theories and algorithms
journal contribution
posted on 2023-05-18, 01:21authored byLu, J, Ling, TW, Bao, Z, Wang, C
As business and enterprises generate and exchange XML data more often, there is an increasing need for efficient processing of queries on XML data. Searching for the occurrences of a tree pattern query in an XML database is a core operation in XML query processing. Prior works demonstrate that holistic twig pattern matching algorithm is an efficient technique to answer an XML tree pattern with parent-child (P-C) and ancestor-descendant (A-D) relationships, as it can effectively control the size of intermediate results during query processing. However, XML query languages (e.g., XPath and XQuery) define more axes and functions such as negation function, order-based axis, and wildcards. In this paper, we research a large set of XML tree pattern, called extended XML tree pattern, which may include P-C, A-D relationships, negation functions, wildcards, and order restriction. We establish a theoretical framework about “matching cross” which demonstrates the intrinsic reason in the proof of optimality on holistic algorithms. Based on our theorems, we propose a set of novel algorithms to efficiently process three categories of extended XML tree patterns. A set of experimental results on both real-life and synthetic data sets demonstrate the effectiveness and efficiency of our proposed theories and algorithms.
History
Publication title
IEEE Transactions on Knowledge and Data Engineering
Volume
23
Pagination
402-416
ISSN
1041-4347
Department/School
School of Information and Communication Technology
Publisher
IEEE Computer Soc
Place of publication
10662 Los Vaqueros Circle, Po Box 3014, Los Alamitos, USA, Ca, 90720-1314
Rights statement
Copyright 2011 IEEE Computer Society
Repository Status
Restricted
Socio-economic Objectives
Information systems, technologies and services not elsewhere classified