Tree-Based Classification and Regression Models
The decision tree model is probably the most widely known classification and regression model. Originally it was studied in the fields of decision theory and statistics; however, it has also proved effective in other disciplines such as data mining, machine learning, and pattern recognition. Decision trees are implemented in many real-world applications, and given the long history of and high level of interest in this approach, several surveys on decision trees are available in the literature, such as [70, 71].

A decision tree is a classifier expressed as a recursive partition of the instance space. It consists of nodes that form a rooted tree: a directed tree with a node called the root that has no incoming edges, while every other node has exactly one incoming edge. A node with outgoing edges is called an internal (or test) node; all other nodes are called leaves (also known as terminal or decision nodes). Each internal node splits the instance space into two or more subspaces according to a discrete function of the input attribute values. In the simplest and most frequent case, each test considers a single attribute, so the instance space is partitioned according to that attribute's value; for numeric attributes, the condition refers to a range. Each leaf is assigned to the class representing the most appropriate target value; alternatively, a leaf may hold a probability vector giving the probability of each possible target value. Instances are classified by navigating them from the root of the tree down to a leaf, according to the outcome of the tests along the path.

Algorithms that automatically construct a decision tree from a given data set are called decision tree inducers. Typically, the goal is to find the optimal decision tree by minimizing the generalization error.
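The root-to-leaf classification procedure described above can be sketched in a few lines of Python. This is a minimal illustration, not any particular library's implementation; the `Node`, `Leaf`, and `classify` names, and the small weather-style tree, are invented for the example.

```python
from dataclasses import dataclass

@dataclass
class Leaf:
    label: str          # class assigned to instances reaching this leaf

@dataclass
class Node:
    attribute: str      # single attribute tested at this internal node
    children: dict      # attribute value -> child Node or Leaf

def classify(tree, instance):
    """Navigate from the root down to a leaf, following the outcome
    of the test at each internal node along the path."""
    while isinstance(tree, Node):
        tree = tree.children[instance[tree.attribute]]
    return tree.label

# Hypothetical two-level tree over discrete attributes.
tree = Node("outlook", {
    "sunny": Node("humidity", {"high": Leaf("no"), "normal": Leaf("yes")}),
    "overcast": Leaf("yes"),
    "rain": Leaf("yes"),
})
print(classify(tree, {"outlook": "sunny", "humidity": "normal"}))  # yes
```

Each test inspects one attribute of the instance, matching the single-attribute splits described in the simplest case above.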
There are various top-down decision tree inducers, such as ID3 [72], C4.5 (J48 in Weka) [73], and CART [74]. C4.5 is an evolution of ID3, a very simple decision tree algorithm presented by the same author [73]. ID3 uses information gain as its splitting criterion: growing stops when all instances belong to a single value of the target feature or when the best information gain is not greater than zero. In contrast, C4.5 uses the gain ratio as its splitting criterion: splitting ceases when the number of instances to be split falls below a certain threshold, and error-based pruning is performed.
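The two splitting criteria can be made concrete with a short sketch, assuming discrete attributes and rows represented as dictionaries (the function names and data layout are illustrative, not taken from any of the cited implementations). Information gain is the reduction in entropy of the target after partitioning by an attribute; the gain ratio, as used in C4.5, normalizes that gain by the entropy of the split itself, which penalizes attributes with many distinct values.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy of a list of class labels, in bits."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(rows, attr, target):
    """ID3 criterion: entropy of the target minus the weighted
    entropy of the partitions induced by attr."""
    n = len(rows)
    base = entropy([r[target] for r in rows])
    parts = {}
    for r in rows:
        parts.setdefault(r[attr], []).append(r[target])
    return base - sum(len(p) / n * entropy(p) for p in parts.values())

def gain_ratio(rows, attr, target):
    """C4.5 criterion: information gain divided by the split
    information (the entropy of the partition sizes)."""
    n = len(rows)
    counts = Counter(r[attr] for r in rows)
    split_info = -sum((c / n) * math.log2(c / n) for c in counts.values())
    gain = information_gain(rows, attr, target)
    return gain / split_info if split_info > 0 else 0.0

# Toy data: attribute "a" perfectly separates the target "y".
rows = [{"a": "x", "y": "0"}, {"a": "x", "y": "0"},
        {"a": "z", "y": "1"}, {"a": "z", "y": "1"}]
print(information_gain(rows, "a", "y"))  # 1.0 (entropy drops from 1 bit to 0)
print(gain_ratio(rows, "a", "y"))        # 1.0 (split info is also 1 bit)
```

An inducer would evaluate each candidate attribute with one of these criteria at every node and split on the best-scoring one, stopping under the conditions described above.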
