New evolutionary algorithms for mining interesting association rules
Thesis posted on 2023-05-27, 10:05, authored by Kabir, MMJ
This PhD thesis deals with evolutionary algorithms for mining frequent patterns and discovering useful and interesting Boolean association rules from large data sets. Initially, the classical algorithms for mining frequent patterns, and single- and multi-objective evolutionary algorithms for discovering association rules using different measures, are studied. Secondly, the problem of extracting frequent patterns using classical algorithms and obtaining a set of high-quality association rules using evolutionary algorithms is addressed. The objectives of this thesis are as follows:
1. Designing evolutionary algorithms for extracting frequent patterns from large data sets.
2. Designing multi-objective evolutionary algorithms for discovering a reduced set of high-quality Boolean association rules from categorical data sets.
3. Improving the single-seed-based genetic algorithm by designing a multiple-seeds-based genetic algorithm for mining Boolean association rules.
To accomplish these objectives, this research developed different evolutionary algorithms for mining frequent patterns efficiently and for obtaining high-quality Boolean association rules (BARs). Firstly, the method named GeneticMax, a new approach based on a genetic algorithm, is used to mine maximal frequent item sets by accessing a large data set for a smaller number of nodes. This method is improved by another approach named Hybrid GeneticMax. This new model outperforms the GeneticMax algorithm if there is a reasonable number of infrequent items among the 1-item sets. This proposal shows the power of using an evolutionary algorithm along with a local search mechanism for generating maximal frequent item sets from a lexicographic tree. In addition, this research proposed a particle swarm optimization (PSO) based approach, a new heuristic algorithm for mining association rules for both frequent and infrequent items. This approach can mine rules for more than three items.
Secondly, a new multi-objective evolutionary model named Association Rules Mining with Genetic Algorithm Using an Adaptive Mutation Method (ARMGAAM) is proposed, which is very useful for mining reduced sets of Boolean association rules from categorical data sets. Another method named Mining Boolean Association Rules with Evolutionary Algorithm (MBAREA), a new evolutionary model which extends the existing Association Rule Mining with Genetic Algorithm (ARMGA) and Multi-objective Association Rule Mining with Genetic Algorithm (ARMMGA), maximizes two objectives: performance and interestingness. The former method uses a re-initialization technique along with an adaptive mutation method, whereas the latter uses a class-based mutation method along with a best-population technique. Both methods discover a reduced set of BARs from different data sets with a good trade-off between the number of generated rules and different measures. Finally, MSGA, a new genetic algorithm based on multiple seeds for producing an effective initial population, has a higher search efficiency along with good convergence speed, and overcomes the limitation of having to select a single effective seed for generating an initial population for mining BARs. Of particular note, the selection among the above-mentioned evolutionary algorithms depends on the specific needs of users.
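The abstract refers to maximal frequent item sets and to rule measures without defining them. As a minimal sketch of the standard definitions only, the Python below computes support and confidence and finds maximal frequent item sets by brute-force enumeration. It is not GeneticMax or any other algorithm from the thesis; the toy transactions, the function names, and the choice of support and confidence as the measures are all assumptions made for illustration.

```python
from itertools import combinations

# Hypothetical toy transactions; not data from the thesis.
TRANSACTIONS = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"butter", "milk"},
    {"bread", "butter", "milk"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    itemset = frozenset(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Support of the whole rule divided by the support of its antecedent."""
    ante = support(antecedent, transactions)
    if ante == 0:
        return 0.0
    return support(set(antecedent) | set(consequent), transactions) / ante


def frequent_itemsets(transactions, min_support):
    """Brute-force enumeration; exponential, so only for tiny examples."""
    items = sorted(set().union(*transactions))
    found = []
    for size in range(1, len(items) + 1):
        for combo in combinations(items, size):
            if support(combo, transactions) >= min_support:
                found.append(frozenset(combo))
    return found


def maximal_frequent_itemsets(transactions, min_support):
    """Keep the frequent item sets that have no frequent proper superset."""
    frequent = frequent_itemsets(transactions, min_support)
    return [s for s in frequent if not any(s < other for other in frequent)]


if __name__ == "__main__":
    for itemset in maximal_frequent_itemsets(TRANSACTIONS, min_support=0.5):
        print(sorted(itemset), support(itemset, TRANSACTIONS))
    print(confidence({"bread"}, {"milk"}, TRANSACTIONS))
```

The enumeration visits every subset of items, which is exactly the cost that search-based methods such as those described in the abstract aim to avoid on large data sets.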
Rights statement: Copyright 2016 the author