In this paper, we focus on the efficient construction of tightest matched subtree (TMSubtree) results for keyword queries on XML data based on SLCA semantics. Here, “matched” means that every node in a returned subtree satisfies the constraint that the set of distinct keywords of the subtree rooted at that node is not subsumed by that of any of its sibling nodes, while “tightest” means that no two subtrees rooted at sibling nodes can contain the same set of keywords. Let d be the depth of a given TMSubtree and m the number of keywords in a given query Q. We prove that if d ≤ m, a matched subtree result has at most 2m! nodes; otherwise, the size of a matched subtree result is bounded by (d−m+2)m!. Based on this theoretical result, we propose a pipelined algorithm that constructs TMSubtree results without rescanning all node labels. Experiments verify the benefits of our algorithm in aiding keyword search over XML data.
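The two sibling constraints in the abstract can be illustrated with a minimal sketch. This is not the paper's pipelined algorithm. It only filters one group of sibling nodes, and it rests on assumptions that are not in the source: the function name `prune_siblings` and the input layout (a mapping from node ID to keyword set) are hypothetical, "subsumed" is read as a proper subset, and when siblings share the same keyword set the first one encountered is kept.

```python
from typing import Dict, FrozenSet, List


def prune_siblings(siblings: Dict[str, FrozenSet[str]]) -> List[str]:
    """Return the IDs of siblings that pass both filters (hypothetical helper).

    siblings maps a node ID to the set of distinct query keywords found in
    the subtree rooted at that node.
    """
    kept: List[str] = []
    seen_sets = set()
    for node_id, keywords in siblings.items():
        # "matched" (assumed reading): drop a node whose keyword set is a
        # proper subset of some sibling's keyword set.
        subsumed = any(
            keywords < other
            for other_id, other in siblings.items()
            if other_id != node_id
        )
        if subsumed:
            continue
        # "tightest" (assumed tie-break): keep only the first sibling for
        # each distinct keyword set.
        if keywords in seen_sets:
            continue
        seen_sets.add(keywords)
        kept.append(node_id)
    return kept


siblings = {
    "a": frozenset({"xml", "keyword"}),
    "b": frozenset({"xml"}),             # proper subset of a's set
    "c": frozenset({"xml", "keyword"}),  # same set as a
    "d": frozenset({"search"}),
}
print(prune_siblings(siblings))  # ['a', 'd']
```

In this example, node b is dropped because its keyword set is subsumed by that of a, and node c is dropped because it duplicates a's keyword set.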
History
Publication title
Database Systems for Advanced Applications Part I
Editors
S-G Lee, Z Peng, X Zhou, Y-S Moon, R Unland, J Yoo
Pagination
95-109
ISBN
978-3-642-29037-4
Department/School
School of Information and Communication Technology
Publisher
Springer-Verlag
Place of publication
Berlin, Germany
Event title
The 17th International Conference on Database Systems for Advanced Applications 2012
Event Venue
Busan, South Korea
Date of Event (Start Date)
2012-04-15
Date of Event (End Date)
2012-04-18
Rights statement
Copyright 2012 Springer
Repository Status
Restricted
Socio-economic Objectives
Information systems, technologies and services not elsewhere classified