We present a new approach based on a Genetic Algorithm for generating maximal frequent itemsets from large databases. The new algorithm, called GeneticMax, is a heuristic that mimics natural selection to find maximal frequent itemsets efficiently. Its search strategy uses a lexicographic tree that avoids level-by-level searching, which in turn reduces the time required to mine maximal frequent itemsets in a linear way. Our implementation of the search strategy represents the nodes of the lexicographic tree as bitmaps and identifies frequent itemsets from the superset-subset relationships between nodes. Because the algorithm follows the principles of a Genetic Algorithm, it performs a global search; its time complexity is lower than that of other algorithms, which we attribute to the greedy approach on which the genetic algorithm is based. We separate the effect of each step of the algorithm through experimental analysis on real databases, including Tic Tac Toe, Zoo, and a 10000×8 database. Our experimental results show that the approach is efficient and scalable for different sizes of itemsets. It accesses the main database to calculate support values for fewer nodes when finding frequent itemsets, even when the search space is very large, which dramatically reduces the search time.
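To make the terms in the abstract concrete, the following is a minimal Python sketch of bitmap-based support counting and of the superset-subset test that defines a maximal frequent itemset. It is not the GeneticMax algorithm: the search here is an exhaustive enumeration used only as a reference, and the function names, the toy transactions, and the choice of Python integers as bitmaps are all assumptions made for illustration.

```python
from itertools import combinations


def build_bitmaps(transactions):
    """Map each item to an int whose bit i is set when transaction i contains it."""
    bitmaps = {}
    for tid, items in enumerate(transactions):
        for item in items:
            bitmaps[item] = bitmaps.get(item, 0) | (1 << tid)
    return bitmaps


def support(itemset, bitmaps, n_transactions):
    """Count the transactions containing every item by AND-ing the item bitmaps."""
    mask = (1 << n_transactions) - 1  # start with all transactions selected
    for item in itemset:
        mask &= bitmaps.get(item, 0)
    return bin(mask).count("1")


def maximal_frequent_itemsets(transactions, min_support):
    """Exhaustive reference search (assumption: small item universe).

    An itemset is maximal frequent when it is frequent and no proper
    superset of it is frequent.
    """
    bitmaps = build_bitmaps(transactions)
    n = len(transactions)
    items = sorted(bitmaps)
    frequent = [
        frozenset(candidate)
        for size in range(1, len(items) + 1)
        for candidate in combinations(items, size)
        if support(candidate, bitmaps, n) >= min_support
    ]
    # Keep only itemsets that have no frequent proper superset.
    return [s for s in frequent if not any(s < t for t in frequent)]


# Hypothetical toy data, not taken from the paper's experiments.
transactions = [
    {"a", "b", "c"},
    {"a", "b"},
    {"a", "c"},
    {"b", "c"},
    {"a", "b", "c"},
]
result = maximal_frequent_itemsets(transactions, min_support=3)
```

With a minimum support of 3, every pair of items is frequent but the full set {a, b, c} appears in only two transactions, so the three pairs are the maximal frequent itemsets. A heuristic search such as the one the abstract describes would aim to reach these itemsets while computing support for far fewer candidates than this enumeration does.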
History
Publication title
Proceedings of the 9th International Conference on Information Technology and Applications
Pagination
1-6
ISBN
978-0-9803267-6-5
Department/School
School of Information and Communication Technology
Publisher
IEEE-Inst Electrical Electronics Engineers Inc
Place of publication
New York, USA
Event title
9th International Conference on Information Technology and Applications
Event Venue
Sydney, Australia
Date of Event (Start Date)
2014-07-01
Date of Event (End Date)
2014-07-04
Rights statement
Copyright 2014 ICITA
Repository Status
Restricted
Socio-economic Objectives
Information systems, technologies and services not elsewhere classified