AdaBoost - Explained Clearly!

Hi, guys! Welcome to the boosting series. This article will help you understand the AdaBoost algorithm in simple terms. By the end of it, you (and I) will have a deeper understanding of AdaBoost and its implementation.

First, what is boosting? Boosting (originally called hypothesis boosting) combines multiple weak learners (models that are only slightly better than random guessing) into a strong learner. The models are trained sequentially, with each one trying to correct the mistakes of the previous one. There are many boosting methods available; we start with the AdaBoost algorithm.


Notations:

  • Base Learners - Decision tree
  • Stump - decision tree with one node and two leaves.

AdaBoost:

It is also called Adaptive Boosting. It is a supervised learning algorithm used for classification problems.

It is mainly used to reduce the bias (and, to some extent, the variance) of our model.

It comprises several steps, let me break them down!

The most common way to describe AdaBoost is by comparing it with a decision tree and a random forest.

[Image: sample heart-disease dataset]


Consider this as our dataset. We need to predict whether a person has heart disease or not, where 1 means the person has heart disease and 0 means they do not. A full decision tree would use all the features to build the model, but a stump takes a single feature at a time to make a decision. Stumps are weak learners, and that is exactly why AdaBoost likes them: AdaBoost combines multiple weak learners for classification.

STEP 1: (Assign a Sample Weight)

W = 1 / n

n - number of observations in our data


All sample weights are equal to start with. Now we are ready to build our first stump.
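As a rough sketch (assuming the data has already been loaded and n is the number of rows), initializing the sample weights in NumPy could look like this:

import numpy as np

n = 8                        # number of observations in our (hypothetical) dataset
weights = np.full(n, 1 / n)  # every sample starts with the same weight, 1/n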

STEP 2: (Creating and selecting base learners)

To find the weak learners, we create multiple stumps (a small forest of them) and select the best one for this round so the model trains well.

  • We create one stump for each feature in our data.
  • Then we select the best stump based on its Gini or entropy score.
  • The stump with the lowest Gini score becomes the first stump in the forest (see the sketch after this list).
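Here is a minimal sketch of this selection step, assuming the features X, labels y, and the weights from STEP 1 are NumPy arrays (all names are illustrative):

import numpy as np
from sklearn.tree import DecisionTreeClassifier

best_stump, best_feature, best_gini = None, None, float("inf")
for feature in range(X.shape[1]):
    # a stump = a depth-1 decision tree trained on a single feature
    stump = DecisionTreeClassifier(max_depth=1)
    stump.fit(X[:, [feature]], y, sample_weight=weights)
    t = stump.tree_
    # weighted Gini impurity of the two leaves created by the split
    gini = np.sum(t.impurity[1:] * t.weighted_n_node_samples[1:]) / t.weighted_n_node_samples[0]
    if gini < best_gini:
        best_stump, best_feature, best_gini = stump, feature, gini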

STEP 3: (Calculating the total error)

  • We look at the rows that the chosen stump classifies wrongly.
  • The total error is the sum of the sample weights of those wrongly classified rows (see the sketch below).
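In code, the total error is just the sum of the weights of the rows the chosen stump gets wrong (a sketch reusing the illustrative best_stump, best_feature, X, y, and weights names from above):

import numpy as np

predictions = best_stump.predict(X[:, [best_feature]])
misclassified = predictions != y                  # True where the stump is wrong
total_error = np.sum(weights[misclassified])      # sum of the weights of those rows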

STEP 4: (Calculate the performance of the stump)

  • After calculating the total error, we calculate the performance (the "amount of say") of the stump:

Performance of the stump = ½ × ln((1 − Total Error) ÷ Total Error)
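In NumPy this is a one-liner (a sketch continuing from the previous step; the small epsilon keeps the logarithm finite when the total error is exactly 0 or 1):

import numpy as np

eps = 1e-10
performance = 0.5 * np.log((1 - total_error + eps) / (total_error + eps))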

For example, to build an AdaBoost classifier, a first base classifier (such as a Decision Tree) is trained and used to make predictions on the training set. The relative weight of misclassified training instances is then increased. A second classifier is trained using the updated weights and again it makes predictions on the training set, weights are updated, and so on.


STEP 5: (Update the weights)

After calculating the total error and the performance, we update the sample weights so the next stump focuses on the rows the current stump got wrong. To update the weights, we use a simple formula.

Updated weight (wrongly classified row) = weight × e^(performance)
Updated weight (correctly classified row) = weight × e^(−performance)

For example, with a performance of 0.895, a misclassified row's weight becomes weight × e^0.895.
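A sketch of this update in NumPy, continuing with the illustrative names from the earlier steps: wrongly classified rows are scaled up by e^performance and correctly classified rows are scaled down by e^(−performance):

import numpy as np

new_weights = np.where(misclassified,
                       weights * np.exp(performance),    # wrong rows: weight goes up
                       weights * np.exp(-performance))   # correct rows: weight goes down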

STEP 6: (Normalize)

The updated weights no longer add up to 1, so we normalize them with a simple formula:

Normalized weight[i] = updated weight[i] ÷ sum of all updated weights
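In NumPy this normalization is a one-liner (continuing the sketch):

normalized_weights = new_weights / new_weights.sum()  # the weights now sum to 1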

STEP 7: (Separate them into buckets)

  • The normalized weights are stacked into buckets between 0 and 1; rows with larger weights get larger buckets. We then build a new dataset by sampling from these buckets.
  • We draw a random number, say 0.91.
  • This value falls into the 2nd bucket (the 2nd row, which was wrongly classified by the stump), so we copy that row into the new dataset.
  • This repeats as many times as there are rows in the data, and the new dataset is built up row by row (see the sketch after this list).
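A sketch of this bucketing step: the normalized weights define buckets on [0, 1], and drawing random numbers picks rows with probability proportional to their weight (np.random.choice does the bucket lookup for us; X and y are the illustrative arrays from before):

import numpy as np

n = len(normalized_weights)
# draw n row indices with probability proportional to the normalized weights;
# heavily weighted (previously misclassified) rows are picked more often
indices = np.random.choice(n, size=n, p=normalized_weights)
X_new, y_new = X[indices], y[indices]   # the new dataset for the next stump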

 

Then we have a new dataset; we create a new stump for it, and the whole process continues.

All the steps restart (loop) until the error on the data is small enough.

Afterwards, to classify new data, we run it through the stump created in each iteration and take the majority vote!

  • This process continues until the stumps classify the training data well (or a set number of learners is reached).
  • The main idea is that we combine multiple weak learners to build one strong learner.


In scikit-learn, an AdaBoost classifier built from decision stumps looks like this:

from sklearn.ensemble import AdaBoostClassifier
from sklearn.tree import DecisionTreeClassifier

# 200 decision stumps (max_depth=1) trained sequentially
ada_clf = AdaBoostClassifier(DecisionTreeClassifier(max_depth=1),
                             n_estimators=200, algorithm="SAMME.R", learning_rate=0.5)
ada_clf.fit(X_train, y_train)
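Once fitted, the ensemble is used like any other scikit-learn classifier (a usage sketch; X_test and y_test are assumed to come from your own train/test split):

y_pred = ada_clf.predict(X_test)         # weighted vote of all 200 stumps
print(ada_clf.score(X_test, y_test))     # accuracy on the held-out data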


Hope you like it!

Thank you!


