How to cite this WIREs title:
WIREs Data Mining Knowl Discov
Impact Factor: 2.111

Evaluation and comparison of open source software suites for data mining and knowledge discovery

The growing interest in extracting useful knowledge from data for the benefit of the data owner is giving rise to multiple data mining tools. The research community is especially aware of the importance of open source data mining software in ensuring and easing the dissemination of novel data mining algorithms. The availability of these tools at no cost, together with the chance to understand the approaches better by examining their source code, gives the research community an opportunity to tune and improve the algorithms. Documentation, updating, variety of algorithms, extensibility, and interoperability, among others, can be major factors that motivate users to opt for a specific open source data mining tool. The aim of this paper is to evaluate 19 open source data mining tools and to provide the research community with an extensive study based on a wide set of features that any tool should satisfy. The evaluation follows two methodologies. The first is based on scores provided by experts and produces a subjective judgment of each tool. The second performs an objective analysis of which features are satisfied by each tool. The ultimate aim of this work is to provide the research community with an extensive study of the different features included in any data mining tool, from both a subjective and an objective point of view. Results reveal that RapidMiner, Konstanz Information Miner (KNIME), and Waikato Environment for Knowledge Analysis (WEKA) are the tools that include the highest percentage of these features. WIREs Data Mining Knowl Discov 2017, 7:e1204. doi: 10.1002/widm.1204

Summary of scores obtained in each category by each of the studied tools.
Summary of the results obtained for the two methodologies (score and characterization procedures).
Summary of percentage of features satisfied by each data mining tool according to a set of four categories.

Related Articles

Data mining tools
Software mining and fault prediction

Browse by Topic

Application Areas > Data Mining Software Tools
Technologies > Computer Architectures for Data Mining
