|This class implements the Classification and Regression Trees (CART) algorithm by Breiman et al. for decision tree learning. A CART tree is a binary decision tree that is built by repeatedly splitting a node into two child nodes, starting from the root node, which contains the whole dataset. |
TREE GROWING PROCESS:
During the tree growing process, we recursively split a node into a left child and a right child so that the resulting nodes are as "pure" as possible. We do this until one of the stopping criteria is met. To find the best split, we scan through all possible splits on all predictive attributes. The best split is the one that scores best on the splitting criterion, i.e. the one that gives the largest reduction in impurity from the parent node to its two children. For classification tasks, i.e. when the dependent variable is categorical, the Gini index is used. For regression tasks, i.e. when the dependent variable is continuous, least squares deviation is used. The algorithm uses two stopping criteria: a node is not split further if it is completely "pure", i.e. all its members have the same value of the dependent variable, or if all its members have identical predictive attributes (independent variables), in which case no split can separate them.
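
The split search described above can be illustrated with a minimal Python sketch. It is not this class's actual implementation: the function names (`gini_cost`, `least_squares_cost`, `best_split`) are hypothetical, only numeric attributes are handled, candidate thresholds are assumed to be midpoints between consecutive distinct values, and the Gini index is scaled by node size so that both criteria can be compared as "parent cost minus children cost".

```python
from collections import Counter


def gini_cost(labels):
    """Node size times Gini index, where Gini = 1 - sum of squared class proportions."""
    n = len(labels)
    if n == 0:
        return 0.0
    return n * (1.0 - sum((c / n) ** 2 for c in Counter(labels).values()))


def least_squares_cost(values):
    """Sum of squared deviations of the targets from their mean."""
    n = len(values)
    if n == 0:
        return 0.0
    mean = sum(values) / n
    return sum((v - mean) ** 2 for v in values)


def best_split(rows, targets, cost):
    """Scan all attributes and thresholds; return (attribute, threshold, gain).

    Returns None when the node should not be split: it is pure (cost 0),
    or no threshold separates the rows (identical predictive attributes).
    """
    parent_cost = cost(targets)
    if parent_cost == 0:  # stopping criterion 1: pure node
        return None
    best, best_gain = None, 0.0
    for j in range(len(rows[0])):
        distinct = sorted({row[j] for row in rows})
        # Assumed candidate thresholds: midpoints between distinct values.
        # If all rows share one value, this loop body never runs
        # (stopping criterion 2 when that holds for every attribute).
        for lo, hi in zip(distinct, distinct[1:]):
            threshold = (lo + hi) / 2
            left = [y for row, y in zip(rows, targets) if row[j] <= threshold]
            right = [y for row, y in zip(rows, targets) if row[j] > threshold]
            gain = parent_cost - cost(left) - cost(right)
            if gain > best_gain:
                best, best_gain = (j, threshold, gain), gain
    return best


rows = [[1.0, 5.0], [2.0, 5.0], [3.0, 5.0], [4.0, 5.0]]
classification = best_split(rows, ["a", "a", "b", "b"], gini_cost)
regression = best_split(rows, [1.0, 1.0, 5.0, 5.0], least_squares_cost)
```

In this toy dataset the second attribute is constant, so only the first attribute offers candidate thresholds, and both criteria pick the threshold 2.5 that separates the two groups cleanly. Growing a full tree would apply `best_split` recursively to the left and right subsets until it returns `None`.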