E-Business
Permanent URI for this collection
Browse
Browsing E-Business by Author "Hammoud, Suhel"
Now showing 1 - 2 of 2
Results Per Page
Sort Options
Item Mr-arm : a map-reduce association rule mining framework(World Scientific Publishing Co. Pte Ltd, 2013) Thabtah, Fadi; Hammoud, SuhelAssociation rule is one of the primary tasks in data mining that discovers correlations among items in a transactional database. The majority of vertical and horizontal association rule mining algorithms have been developed to improve the frequent items discovery step which necessitates high demands on training time and memory usage particularly when the input database is very large. In this paper, we overcome the problem of mining very large data by proposing a new parallel Map-Reduce (MR) association rule mining technique called MR-ARM that uses a hybrid data transformation format to quickly finding frequent items and generating rules. The MR programming paradigm is becoming popular for large scale data intensive distributed applications due to its efficiency, simplicity and ease of use, and therefore the proposed algorithm develops a fast parallel distributed batch set intersection method for finding frequent items. Two implementations (Weka, Hadoop) of the proposed MR association rule algorithm have been developed and a number of experiments against small, medium and large data collections have been conducted. The ground bases of the comparisons are time required by the algorithm for: data initialisation, frequent items discovery, rule generation, etc. The results show that MR-ARM is very useful tool for mining association rules from large datasets in a distributed environment. © 2013 World Scientific Publishing Company.Item Parallel associative classification data mining frameworks based mapreduce(World Scientific Publishing Co. Pte Ltd, 2015-06) Thabtah, Fadi; Hammoud, Suhel; Abdel-Jaber, HusseinAssociative classification (AC) is a research topic that integrates association rules with classification in data mining to build classifiers. After dissemination of the Classification-based Association Rule algorithm (CBA), the majority of its successors have been developed to improve either CBA's prediction accuracy or the search for frequent ruleitems in the rule discovery step. Both of these steps require high demands in processing time and memory especially in cases of large training data sets or a low minimum support threshold value. In this paper, we overcome the problem of mining large training data sets by proposing a new learning method that repeatedly transforms data between line and item spaces to quickly discover frequent ruleitems, generate rules, subsequently rank and prune rules. This new learning method has been implemented in a parallel Map-Reduce (MR) algorithm called MRMCAR which can be considered the first parallel AC algorithm in the literature. The new learning method can be utilised in the different steps within any AC or association rule mining algorithms which scales well if contrasted with current horizontal or vertical methods. Two versions of the learning method (Weka, Hadoop) have been implemented and a number of experiments against different data sets have been conducted. The ground bases of the comparisons are classification accuracy and time required by the algorithm for data initialization, frequent ruleitems discovery, rule generation and rule pruning. The results reveal that MRMCAR is superior to both current AC mining algorithms and rule based classification algorithms in improving the classification performance with respect to accuracy. © 2015 World Scientific Publishing Company.