Key Terminology

The following table describes the most important terms used in the Analytics Model functionality.

Term Description
Algorithm The process that is used to learn the structure of the network.
AdaBoost Adaptive Boosting. Designed to improve the performance of the model generation process. Creates ten classifiers with different strengths, which are combined into a weighted sum representing the final output of the boosted classifier.

Arc A directional link between two nodes. These are represented by arrows in the Bayesian network Model view.
Attribute Selection Removes redundant or irrelevant fields from the data, which reduces overfitting and simplifies the model with little impact on its performance.
Child Node An event that depends on its parent node. The outcomes of the parent node have a causal effect on the outcomes of the child node.
Classifier The classifier node in a Bayesian network is the field of interest that is to be predicted. While predictions can be made on other outcomes of the network, the structure of the network can be affected by which node is the classifier. The accuracy of the network can be evaluated by how accurately the classifier is predicted.
DAG Directed acyclic graph. A finite directed graph with no directed cycles.
Leaf Node Represents a classification or decision in a decision tree.
Markov Blanket The Markov blanket of a node is the only knowledge needed to predict the behaviour of that node. It consists of the node’s parents, its children, and the other parents of its children. Nodes outside the Markov blanket do not directly affect the node itself.
Node Represents an event, together with all of its possible outcomes and their probabilities.
Parent Node An event on which its child node depends. The outcomes of the parent node have a causal effect on the outcomes of the child node.
Positive Rate The true positive rate is the proportion of records that actually have an outcome and are correctly classified as that outcome, while the false positive rate is the proportion of records that do not have the outcome but are incorrectly classified as having it.
Random Classifier A baseline for assessing a model’s accuracy: its predictions can be compared directly with the predictions made by the model. Chooses the outcome of the classifier at random, with probabilities adjusted according to the distribution of potential outcomes in the data.

ROC Curve Plots the true positive rate against the false positive rate for varying threshold values on the probability estimates.
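
As an illustration of the Positive Rate and ROC Curve entries, the following Python sketch computes the true and false positive rates at one threshold and then collects the points of an ROC curve. It is not taken from the Analytics Model functionality: the function names `rates` and `roc_points` and the sample labels and scores are made up for this example, and it assumes a two-outcome classifier in which a record is predicted positive when its probability estimate is at or above the threshold.

```python
def rates(labels, scores, threshold):
    """Return (true positive rate, false positive rate) at one threshold.

    labels: 1 if the record actually has the outcome, otherwise 0.
    scores: probability estimate for the outcome, one per record.
    """
    tp = fp = fn = tn = 0
    for actual, score in zip(labels, scores):
        predicted = score >= threshold  # assumption: ">=" counts as positive
        if predicted and actual:
            tp += 1
        elif predicted and not actual:
            fp += 1
        elif actual:
            fn += 1
        else:
            tn += 1
    tpr = tp / (tp + fn) if (tp + fn) else 0.0
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    return tpr, fpr


def roc_points(labels, scores):
    """Return ROC points as (false positive rate, true positive rate) pairs."""
    points = [(0.0, 0.0)]  # threshold above every score: nothing predicted positive
    for threshold in sorted(set(scores), reverse=True):
        tpr, fpr = rates(labels, scores, threshold)
        points.append((fpr, tpr))
    return points


# Hypothetical data: three records with the outcome, three without.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]

for fpr, tpr in roc_points(labels, scores):
    print(f"FPR={fpr:.2f}  TPR={tpr:.2f}")
```

Each distinct score is used as a threshold, so lowering the threshold moves the curve from (0, 0) towards (1, 1). A random classifier would be expected to fall near the diagonal between those two points, which is why it serves as a comparison.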