Generate Decision Tree
Clicking Create Decision Tree after you have configured the settings on the Decision Tree screen navigates you to the generated model screen.
All the available network Views are shown in a series of tabs on the left panel. The following actions are available at the top right:
- Conditional Formatting (applies only to Tree and Paths views): The configurable options are displayed under the Threshold heading. Click New Threshold to add a condition and use the percentage slider to set its range. Choose a colour for each threshold by clicking the relevant box under the Colour heading and either typing the hex code or using the pop-up colour picker. To delete a threshold, click the x icon next to the relevant colour. Click Apply to save changes and close the window.
- New Decision Tree: Restarts the generation process
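The conditional-formatting threshold logic described above can be sketched as follows: each threshold covers a percentage range and maps to a colour, and a node's percentage resolves to the first threshold whose upper bound covers it. The threshold ranges and hex codes below are hypothetical examples, not values from the tool.

```python
# Hypothetical sketch of threshold-based conditional formatting: a node's
# percentage resolves to the colour of the first threshold covering it.

def colour_for(percentage, thresholds):
    """Return the colour of the first threshold whose upper bound covers the value."""
    for upper_bound, colour in sorted(thresholds):
        if percentage <= upper_bound:
            return colour
    return None  # no threshold matches

# Three example thresholds: up to 25% red, up to 75% amber, up to 100% green.
thresholds = [(25, "#d9534f"), (75, "#f0ad4e"), (100, "#5cb85c")]

print(colour_for(10, thresholds))   # "#d9534f"
print(colour_for(50, thresholds))   # "#f0ad4e"
print(colour_for(90, thresholds))   # "#5cb85c"
```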
Display Options
The nodes used to build the decision tree are displayed under the Nodes heading on the left panel on the Tree and Paths views. The following options are available:
- Sort: Changes the sorting of the nodes as follows:
- Name A-Z (default)
- Name Z-A
- Largest Percentage
- Smallest Percentage
- Percentage: Changes the percentage type for each node.
- Show Percentage of Parent Node: Displays the node's percentage of the total divided by the percentage of the total for the node's immediate parent.
- Show Percentage of Total: Displays the percentage of the records in the data that match the outcomes of the node and all the nodes above it in the tree.
- Colour Nodes: Allows you to colour the background of any nodes selected in the drop-down list. Click Apply to close the pop-up window and view your colour selections.
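The two percentage types above differ only in the denominator: one divides a node's matching records by all records, the other divides the node's percentage of the total by its parent's percentage of the total. A sketch with hypothetical record counts:

```python
# Sketch of the two percentage calculations using hypothetical record counts.
total_records = 1000
parent_matches = 400   # records matching the parent node's path
node_matches = 100     # records matching this node's path (a subset of the parent's)

pct_of_total = node_matches / total_records           # 0.10 -> 10% of all records
parent_pct_of_total = parent_matches / total_records  # 0.40 -> 40% of all records
pct_of_parent = pct_of_total / parent_pct_of_total    # 0.25 -> 25% of the parent's records

print(f"{pct_of_total:.0%} of total, {pct_of_parent:.0%} of parent")
```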
Views
You can access the views through the following tabs:
Tree
This view displays the model as a series of interactive nodes, allowing the exploration of connected variables and how outcomes are directly and indirectly linked.
Initially, the tree displays the classifier and the first row of nodes. Click a node to expand the next row, and click the same node again to collapse it. Following a branch to its end reaches a leaf node once all other connected variables have been displayed.
Click anywhere in the background and drag to move the tree into the desired position. Use the mouse wheel to zoom the tree of nodes in and out. Additionally, the following display options are available on the left panel:
- Centre View: Applies an optimal zoom that displays the entire tree in the middle of the screen
- Collapse/Expand Row: Quickly expands or collapses an entire row at a time
- Expand/Collapse All: Quickly expands or collapses the entire tree
Evaluation
Using the default algorithm settings, 80% of the data is used to generate the network and the remaining 20% is used for testing the performance of the network. In this scenario, the Evaluation view will display the results of the analysis detailed below.
If the network has been generated from 100% of the data, there are two options for evaluation:
- Cross Validation: First divides the dataset into five parts. Four parts are used for training and the fifth is used for testing. This is repeated so that each of the five parts serves as the test set once, and the results of the evaluations are averaged. To run this evaluation, click Run Cross Validation. The results of the analysis will then be displayed.
Note: Running cross validation takes approximately five times longer than the original network generation.
- Rerunning the network with 80% of the data is also an option, using the remaining 20% to test the model for accuracy. Click Rerun Network with 80/20 Split to generate the model with this analysis.
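The two evaluation schemes can be sketched in plain Python. The data, the "model" (a stand-in that predicts the majority class), and the fold arithmetic are all hypothetical; the point is the splitting logic, not the learner.

```python
import random

def evaluate(train, test):
    """Stand-in for training a network on `train` and scoring accuracy on `test`.
    Here the 'model' simply predicts the majority class seen in training."""
    labels = [lbl for _, lbl in train]
    majority = max(set(labels), key=labels.count)
    correct = sum(1 for _, lbl in test if lbl == majority)
    return correct / len(test)

random.seed(0)
data = [(i, random.choice(["yes", "no"])) for i in range(100)]

# 80/20 split: train on the first 80% of rows, test on the held-out 20%.
split = int(len(data) * 0.8)
print("80/20 accuracy:", evaluate(data[:split], data[split:]))

# Five-fold cross validation: each fifth is the test set once; average the scores.
k = 5
fold_size = len(data) // k
scores = []
for fold in range(k):
    test = data[fold * fold_size:(fold + 1) * fold_size]
    train = data[:fold * fold_size] + data[(fold + 1) * fold_size:]
    scores.append(evaluate(train, test))
print("Cross-validation accuracy:", sum(scores) / k)
```

Because cross validation trains the model five times (once per fold), it takes roughly five times longer than a single generation, which matches the note above.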
Once one of these options has been selected, statistics evaluating the classifier node are available.
Confusion Matrix
For each row of data in the test set, the value of the classifier variable predicted by the network is compared with the actual value. Correctly classified rows lie on the diagonal from the top left to the bottom right of the table.
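A small sketch of reading such a table, with hypothetical class labels and counts: rows are actual values, columns are predicted values, so the diagonal holds the correct classifications.

```python
# Hypothetical 3-class confusion matrix: rows are actual values, columns are
# predicted values, so correct classifications lie on the diagonal.
labels = ["low", "medium", "high"]
matrix = [
    [50,  3,  1],   # actual "low"
    [ 4, 40,  6],   # actual "medium"
    [ 2,  5, 39],   # actual "high"
]

correct = sum(matrix[i][i] for i in range(len(labels)))  # diagonal: 129
total = sum(sum(row) for row in matrix)                  # all cells: 150
print(f"Correctly classified: {correct}/{total} ({correct / total:.0%})")
```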

The following analysis is performed on the network:
| Analysis | Description |
|---|---|
| Correctly Classified | The number of rows in the test set for which the classifier variable was correctly classified by the network. |
| Incorrectly Classified | The number of rows in the test set for which the classifier variable was classified incorrectly by the network. |
| Kappa Statistic | Classification accuracy normalised by the imbalance of the classes in the data. An alternative to the simple percentage agreement calculation that takes into account the possibility of the agreement occurring by chance. The closer the result is to 1, the more accurately the network has classified the variables. |
| Mean Absolute Error | Measures how close forecasts or predictions are to the eventual outcomes, regardless of direction. The closer the result is to 1, the less accurately the network has scored. A score of 0 indicates no errors. |
| Root Mean Squared Error | Represents the sample standard deviation of the differences between predicted values and observed values. The greater the difference between the Root Mean Squared Error and the Mean Absolute Error, the greater the variance in the individual errors in the sample. If the two measures are equal, all the errors are of the same magnitude. The closer the result is to 1, the less accurately the network has scored. A score of 0 indicates no errors. |
| Relative Absolute Error | Takes the total absolute error and normalises it by dividing by the total absolute error of a simple predictor. |
| ROC Curve | Select a value from the Outcome drop-down list. The resulting ROC curve plots the true positive rate against the false positive rate for varying threshold values on the probability estimates. For example, a threshold value of 0.5 means that the predicted probability of ‘positive’ must be higher than 0.5 for the instance to be predicted as ‘positive’. This displays the model’s ability to predict the outcome compared to a random classifier. |
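The error statistics above can be sketched for a toy test set of probability estimates. The values are hypothetical, and the formulas follow the standard textbook definitions (for Relative Absolute Error, the simple baseline predictor is assumed here to predict the mean of the actual values); whether the tool computes them identically is an assumption.

```python
import math

# Hypothetical predicted probabilities of the 'positive' outcome vs actual outcomes (1 or 0).
predicted = [0.9, 0.8, 0.3, 0.6, 0.2]
actual    = [1,   1,   0,   1,   0]

n = len(actual)
errors = [abs(p - a) for p, a in zip(predicted, actual)]

mae = sum(errors) / n                              # Mean Absolute Error
rmse = math.sqrt(sum(e * e for e in errors) / n)   # Root Mean Squared Error

# Relative Absolute Error: normalise by a naive predictor that always
# predicts the mean of the actual values (an assumed baseline).
mean_actual = sum(actual) / n
naive_errors = [abs(mean_actual - a) for a in actual]
rae = sum(errors) / sum(naive_errors)

print(f"MAE={mae:.3f} RMSE={rmse:.3f} RAE={rae:.1%}")
```

RMSE is always at least as large as MAE, and the gap between them grows with the variance of the individual errors, which is the comparison the table above describes.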