DataLearner is an easy-to-use tool for data mining and knowledge discovery from your own compatible ARFF and CSV-formatted training datasets (* see below). It is fully self-contained and requires no external storage or network connectivity: it builds models directly on your phone or tablet. This is not a training course or book; it is a genuine machine-learning-based data mining app.
>> ARFF and CSV support <<
Training datasets must conform to either the Weka ARFF format or CSV (comma-separated values). CSV files must have the following features:
* must include a header row
* class attribute is initially set as last column
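As a rough illustration of the CSV features listed above, here is a minimal Python sketch that checks a file's text for a header row and reports the last column as the initial class attribute. It is not DataLearner's own loader: the function name describe_csv and the sample weather-style data are hypothetical and used only for this example.

```python
import csv
import io


def describe_csv(text):
    """Return (attribute_names, class_attribute, data_row_count) for CSV text.

    Hypothetical helper for illustration; not part of DataLearner.
    """
    # Parse the text and skip completely blank lines.
    rows = [row for row in csv.reader(io.StringIO(text)) if row]
    if len(rows) < 2:
        raise ValueError("need a header row plus at least one data row")
    header, data = rows[0], rows[1:]
    # Every data row should have one value per header column.
    for index, row in enumerate(data, start=1):
        if len(row) != len(header):
            raise ValueError(
                f"data row {index} has {len(row)} values, expected {len(header)}"
            )
    # The class attribute is taken to be the last column, as described above.
    return header, header[-1], len(data)


# Made-up sample data, for illustration only.
sample = (
    "outlook,temperature,humidity,play\n"
    "sunny,85,85,no\n"
    "overcast,83,86,yes\n"
)
names, class_attr, n = describe_csv(sample)
print(names, class_attr, n)
```

Running a check like this on a desktop before copying a dataset to your device can catch a missing header row or a ragged row early.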
>> Force class attribute to nominal <<
Most of DataLearner's algorithms expect a nominal (categorical) class attribute, and using a numeric class attribute will cause most of them to fail. The new 'force class attribute to nominal' option overcomes this. However, a nominal class attribute with too many distinct values may use up too much RAM.
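To show the idea behind forcing a class attribute to nominal, the short Python sketch below treats each numeric class value as a text label and collects the distinct labels. This is an assumption-laden illustration, not the app's actual implementation; the function name force_nominal is hypothetical.

```python
def force_nominal(values):
    """Treat every class value as a text label and list the distinct labels.

    Hypothetical helper for illustration; not DataLearner's implementation.
    """
    labels = [str(v) for v in values]
    distinct = sorted(set(labels))
    return labels, distinct


# A numeric class with only a few distinct values converts cleanly.
labels, distinct = force_nominal([0, 1, 1, 0, 2])
print(labels)    # ['0', '1', '1', '0', '2']
print(distinct)  # ['0', '1', '2']

# A class with many distinct numeric values produces one label per value,
# which is the situation that can use up a lot of memory.
_, many = force_nominal(range(1000))
print(len(many))  # 1000
```

A class attribute with a handful of distinct values is a good candidate for this option; a continuous measurement with hundreds of distinct values generally is not.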
*** NEWS! DataLearner research has been selected for presentation at ADMA 2019 (15th International Conference on Advanced Data Mining and Applications) and will be published in 'Lecture Notes in Artificial Intelligence' (Springer) ***
DataLearner features classification, association and clustering algorithms from the open-source Weka (Waikato Environment for Knowledge Analysis) package, plus new algorithms developed by the Data Science Research Unit (DSRU) at Charles Sturt University. Combined, the app provides 42 machine-learning/data-mining algorithms, including RandomForest, C4.5 (J48) and NaiveBayes.
DataLearner collects no information – it requires access to your device storage simply to load your datasets and build your machine-learning models.
DataLearner is being used as a teaching tool in the ITC573 Data and Knowledge Engineering subject for the Master of Information Technology post-graduate degree at Charles Sturt University.
Get the resources:
GPL3-licensed source code on Github: https://github.com/darrenyatesau/DataLearner
Quick video on YouTube: https://youtu.be/H-7pETJZf-g
Research paper on arXiv: https://arxiv.org/abs/1906.03773
AusDM 2018 conference paper that initiated DataLearner: https://www.researchgate.net/publication/331126867
Researchers, if you use this app in research applications, please cite the research papers above. Thanks.
Machine-learning algorithms include:
• Bayes – BayesNet, NaiveBayes
• Functions – Logistic, SimpleLogistic, MultiLayerPerceptron (Neural Network)
• Lazy – IBk (K Nearest Neighbours), KStar
• Meta – AdaBoostM1, Bagging, LogitBoost, MultiBoostAB, Random Committee, RandomSubSpace, RotationForest
• Rules – Conjunctive Rule, Decision Table, DTNB, JRip, OneR, PART, Ridor, ZeroR
• Trees – ADTree, BFTree, DecisionStump, ForestPA, J48 (C4.5), LADTree, Random Forest, RandomTree, REPTree, SimpleCART, SPAARC, SysFor
• Clusterers – DBSCAN, Expectation Maximisation (EM), Farthest-First, FilteredClusterer, SimpleKMeans
• Associations – Apriori, FilteredAssociator, FPGrowth
>> Where to find dataset files? <<
DataLearner comes with a built-in demo dataset called 'rain.csv', but you'll also find plenty of datasets at the OpenML website (https://www.openml.org), including the popular 'ecoli' set. Download the ARFF versions to your phone and load them into DataLearner to build models. Watch our new video tutorial: https://www.youtube.com/watch?v=81tSbclMVT8
This software is supplied AS-IS - while it has been tested, no warranty is implied or given.