Main Article Content
This paper presents a new approach based on genetic algorithms (GAs) to generate maximal frequent itemsets (MFIs) from large datasets. This new algorithm, GeneticMax, is heuristic which mimics natural selection approaches for finding MFIs in an efficient way. The search strategy of this algorithm uses a lexicographic tree that avoids level by level searching which reduces the time required to mine the MFIs in a linear way. Our implementation of the search strategy includes bitmap representation of the nodes in a lexicographic tree and identifying frequent itemsets (FIs) from superset-subset relationships of nodes. This new algorithm uses the principles of GAs to perform global searches. The time complexity is less than many of the other algorithms since it uses a non-deterministic approach. We separate the effect of each step of this algorithm by experimental analysis on real datasets such as Tic-Tac-Toe, Zoo, and a 10000x8 dataset. Our experimental results showed that this approach is efficient and scalable for different sizes of itemsets. It accesses a major dataset to calculate a support value for fewer number of nodes to find the FIs even when the search space is very large, dramatically reducing the search time. The proposed algorithm shows how evolutionary method can be used on real datasets to find all the MFIs in an efficient way.