In this paper, we focus on efficient keyword query processing for XML data based on SLCA and ELCA semantics. For each keyword, we propose a novel form of inverted list that includes the IDs of all nodes that directly or indirectly contain the keyword. Building on these lists, we propose a family of efficient algorithms for both semantics that are based on the set intersection operation. We show that the problem of SLCA/ELCA computation reduces to finding a set of nodes that appear in all involved inverted lists and satisfy certain conditions. We also propose several optimization techniques to further improve query processing performance. We have conducted extensive experiments comparing our methods with many alternative methods. The results demonstrate that our proposed methods outperform existing ones by up to two orders of magnitude in many cases.
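The set-intersection idea in the abstract can be illustrated with a minimal sketch for the SLCA case only. This is not the paper's algorithm or its optimizations: the tree encoding (a `parent` map), the function names `build_inverted_lists` and `slca`, and the toy document are all assumptions made for illustration. It relies on one observation: if every inverted list holds a node together with all of its ancestors, the intersection of the lists is closed under ancestors, so a common node is a smallest LCA exactly when none of its children is also in the intersection.

```python
from typing import Dict, Iterable, List, Optional, Set

# Hypothetical encoding: node ID -> parent ID (None for the root).
Parent = Dict[int, Optional[int]]


def build_inverted_lists(parent: Parent,
                         keywords_at: Dict[int, Set[str]]) -> Dict[str, Set[int]]:
    """For each keyword, collect IDs of nodes that contain it directly or indirectly."""
    lists: Dict[str, Set[int]] = {}
    for node, words in keywords_at.items():
        for word in words:
            bucket = lists.setdefault(word, set())
            cur: Optional[int] = node
            # Walk up to the root; stop early once an ancestor is already recorded,
            # because its own ancestors were added on an earlier walk.
            while cur is not None and cur not in bucket:
                bucket.add(cur)
                cur = parent[cur]
    return lists


def slca(parent: Parent, lists: Dict[str, Set[int]],
         query: Iterable[str]) -> List[int]:
    """Return SLCA node IDs for the query keywords, in ascending ID order."""
    sets = [lists.get(keyword, set()) for keyword in query]
    if not sets:
        return []
    # Nodes that contain every keyword (the common ancestors).
    common = set.intersection(*sets)
    # The intersection is ancestor-closed, so a common node has a common
    # descendant exactly when it is the parent of some common node.
    has_common_child = {parent[n] for n in common if parent[n] is not None}
    return sorted(common - has_common_child)


# Toy document (an assumption): root 1 has children 2 and 5;
# node 2 has children 3 and 4; node 5 has child 6.
parent: Parent = {1: None, 2: 1, 3: 2, 4: 2, 5: 1, 6: 5}
keywords_at = {3: {"xml"}, 4: {"query"}, 5: {"xml"}, 6: {"query"}}
lists = build_inverted_lists(parent, keywords_at)
result = slca(parent, lists, ["xml", "query"])  # nodes 2 and 5
```

In this toy document the root also contains both keywords, but it is excluded because its children 2 and 5 already do. ELCA requires additional conditions and is not covered by this sketch.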
History
Publication title
Proceedings of the IEEE 28th International Conference on Data Engineering
Pagination
905-916
ISBN
978-1-4673-0042-1
Department/School
School of Information and Communication Technology
Publisher
IEEE
Place of publication
United States of America
Event title
IEEE 28th International Conference on Data Engineering
Event Venue
Washington DC, USA
Date of Event (Start Date)
2012-04-01
Date of Event (End Date)
2012-04-05
Rights statement
Copyright 2012 IEEE
Repository Status
Restricted
Socio-economic Objectives
Information systems, technologies and services not elsewhere classified