Study of WEKA tool

Practical - 6


Introduction:
Weka is open-source software released under the GNU General Public License. It was developed at the University of Waikato in New Zealand. "Weka" stands for the Waikato Environment for Knowledge Analysis.

Weka is a collection of machine learning algorithms for data mining tasks. The algorithms can either be applied directly to a dataset or called from your own Java code. Weka contains tools for data pre-processing, classification, regression, clustering, association rules, and visualization. It is also well-suited for developing new machine learning schemes.

Weka has extensive built-in help facilities and comes with a comprehensive manual. It supports several standard data mining tasks: data preprocessing, clustering, classification, regression, visualization, and feature selection. All of Weka's techniques are predicated on the assumption that the data is available as one flat file or relation, where each data point is described by a fixed number of attributes. Weka provides access to SQL databases using Java Database Connectivity (JDBC) and can process the result returned by a database query. It is not capable of multi-relational data mining.
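
To make the flat-file assumption concrete, the sketch below parses a tiny ARFF-style relation using only the Python standard library. The sample rows and the helper name parse_arff are illustrative assumptions, not the contents of any file shipped with Weka, and the parser handles only the simplest subset of the format (no quoting, sparse data, or missing values).

```python
# Minimal sketch: read a small ARFF-style relation into a flat table.
# The sample data below is made up for illustration only.
SAMPLE_ARFF = """\
@relation weather_demo
@attribute outlook {sunny, overcast, rainy}
@attribute temperature numeric
@attribute play {yes, no}
@data
sunny,85,no
overcast,83,yes
rainy,70,yes
"""


def parse_arff(text):
    """Return (relation_name, attributes, rows) for a simple ARFF string.

    attributes is a list of (name, type) pairs; rows is a list of
    string lists, one per data point.
    """
    relation = None
    attributes = []
    rows = []
    in_data = False
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("%"):  # skip blank lines and comments
            continue
        lower = line.lower()
        if in_data:
            values = [v.strip() for v in line.split(",")]
            # Every data point must supply one value per declared attribute.
            if len(values) != len(attributes):
                raise ValueError(
                    f"expected {len(attributes)} values, got {len(values)}"
                )
            rows.append(values)
        elif lower.startswith("@relation"):
            relation = line.split(None, 1)[1]
        elif lower.startswith("@attribute"):
            _, name, attr_type = line.split(None, 2)
            attributes.append((name, attr_type))
        elif lower.startswith("@data"):
            in_data = True
    return relation, attributes, rows


relation, attributes, rows = parse_arff(SAMPLE_ARFF)
print(relation)
print([name for name, _ in attributes])
print(len(rows), "rows, each with", len(attributes), "values")
```

The length check in the data section mirrors the requirement stated above: a row with more or fewer values than the declared attributes does not fit the single flat relation and is rejected.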

Advantages of Weka include:
· Free availability under the GNU General Public License.
· Portability, since it is fully implemented in the Java programming language and thus runs on almost any modern computing platform.
· A comprehensive collection of data preprocessing and modeling techniques.
· Ease of use due to its graphical user interfaces.

Weka's main user interface is the Explorer, but essentially the same functionality can be accessed through the component-based Knowledge Flow interface and from the command line. There is also the Experimenter, which allows the systematic comparison of the predictive performance of Weka's machine learning algorithms on a collection of datasets.

Installing Weka

The main task is to install and run Weka, a widely used, free data mining software toolbox written in Java. The following are the basic steps for installing and running the software, building classifiers, and labeling test cases.

Step 1: Installing Weka
Go to the Weka website, http://www.cs.waikato.ac.nz/ml/weka/, and download the software. On the left-hand side, click on the link that says download. Select the link for the version of the software that matches your operating system. Save the self-extracting executable to disk and then double-click on it to install Weka. Answer yes or next to the questions during the installation. Click yes to accept the Java agreement if necessary. After you install the program, Weka should appear on your Start menu under Programs.

Step 2: Running Weka
From the Start menu, select Programs, then Weka, then Weka 3*. You will see the Weka GUI Chooser. Select Explorer. The Weka Explorer will then launch.




Step 3: Load Demo Set
You will find the training set, Weather-numeric.arff, on the course website. The Weather-numeric.arff file contains the following data:



In the Weka Explorer, click the button that says open file and open Weather-numeric.arff.

Step 4: Constructing the Initial Decision Tree
Select the tab that says Classify. In the box that says classifier, you can choose a classifier: click on the Choose button and you will be presented with a hierarchy of methods. Pick weka, classifiers, trees, J48. Then click on the text box in the classifier box (which now shows J48 and its option string instead of ZeroR, the default classifier). The popup that appears offers further settings. Click OK.
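
For context on the ZeroR default mentioned above: ZeroR is a baseline that ignores every attribute and always predicts the most frequent class in the training data, which is why a tree learner such as J48 is chosen instead. The sketch below illustrates that idea in plain Python. The function names and the label list are assumptions made for illustration, and this is not Weka's implementation.

```python
from collections import Counter


def zero_r_fit(labels):
    """Return the most frequent class label in the training labels."""
    if not labels:
        raise ValueError("need at least one training label")
    return Counter(labels).most_common(1)[0][0]


def zero_r_predict(majority_label, n_instances):
    """Predict the same majority label for every instance."""
    return [majority_label] * n_instances


# Made-up class labels for illustration (not taken from the course file).
train_labels = ["yes", "no", "yes", "yes", "no"]

majority = zero_r_fit(train_labels)
predictions = zero_r_predict(majority, len(train_labels))

# Training accuracy of the baseline: the share of the majority class.
accuracy = sum(p == t for p, t in zip(predictions, train_labels)) / len(train_labels)
print(majority, accuracy)
```

Any classifier worth using should beat this baseline's accuracy, so it is a useful reference point when reading the results of a tree built on the same data.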





Step 6: Results
You may have to scroll up and down in the classifier output box to see all the results.

Features of Weka:
· 49 data preprocessing tools
· 76 classification/regression algorithms
· 8 clustering algorithms
· 15 attribute/subset evaluators + 10 search algorithms for feature selection
· 3 algorithms for finding association rules
· 3 graphical user interfaces:
            "The Explorer" (exploratory data analysis)
            "The Experimenter" (experimental environment)
            "The Knowledge Flow" (new process-model-inspired interface)
