#### World's Best AI Learning Platform with profoundly Demanding Certification Programs

Designed by IITian's, only for AI Learners.

Download our e-book of Introduction To Python

How to leave/exit/deactivate a Python virtualenvironment Exception Type: JSONDecodeError at /update/ Exception Value: Expecting value: line 1 column 1 (char 0) How to extracting text from PDF file using python What is Ensemble Learning? Which are different modes to open a file ? How to integrate Sales force and Django? How to Unpacking dictionaries using the ** operator? How to plot Bubble plot with Encircling? Join Discussion

4 (4,001 Ratings)

218 Learners

Dec 4th (7:00 PM) 213 Registered

Chiranjivi Viru

16 days ago

Classification or Regression can be categorized in two steps that is learning and predicting. In the first stage learning, the algorithm will learn from the training data and predict the unseen data in the second stage that is prediction. For these prediction part we are having plenty of algorithms, in all these algorithms Decision tree algorithm has its own value and demand in these set of algorithms because of its high interpret-ability and systematic representation.

Decision tree comes under the family of supervised learning algorithms. unlike other algorithms decision tree has capability to handle classification and regression based problems.

The goal of decision tree helps to find the value or class of the target variable from the algorithm which has learned by the prior data or training data.

In general Decision tree has two types:

Let me take a example and explain you clearly:-

If they are handling a problem like loan payments. Customer will be paying his loan amount or not. That is here we are trying to predict (yes/no). This comes under problem of Categorical Decision Tree. In second stage we have to predict, how much amount will be payed to bank, as here the target variable is continues it comes for Continues Decision tree.

The decision of making strategic splits affects on the accuracy of the model. The decision criteria is different for classification and regression.

It uses entropy or Information gain to select the root node and from there the splitting starts. For the next split it selects the node which has low entropy or high Information gain. Entropy helps in identifying the randomness of the model. While coming to pruning part as Machine Learning expert we should decide the pruning part. That is up-to which part of the tree have to pruned. If we prune too much model will be affected with under fitting or If we allowed to make more splits, model will be suffered with over fitting.

Entropy is the measure of randomness. If we are having the higher entropy, it is difficult to draw conclusions out of that information. So, that always we should select entropy that is having less value.

From the above graph, we can clearly say that H(X), the entropy is zero when the probability value is 0 or 1. We can find the maximum entropy when the probability value is 0.5. That is the reason why, we can't make any conclusion, when we have high entropy.

In simple terminology we can say Information gain is (Entropy before split) - (Entropy after split). Information Gain, or IG for short, measures the reduction in entropy or surprise by splitting a dataset according to a given value of a random variable.

A larger information gain suggests a lower entropy group or groups of samples, and hence less surprise.