Adaptive Boosting (In-depth intuition)
This post assumes you are already familiar with decision trees and bootstrapping.
AdaBoost is a boosting technique. In simple words, boosting builds models sequentially, one after another. It is powerful for classification problems because each tree is grown while taking the mistakes of the previous trees into account. The algorithm adapts the sample weights as it goes, changing them based on past mistakes.
AdaBoost likes weak learners. It combines multiple weak learners and turns them into one super learner. ❤️🔥
AdaBoost is usually built on top of decision trees, so knowledge of decision trees is highly recommended.
Terminology!!
Stump - A tree with one node and two leaves!
- Stumps are not good at making accurate classifications!
- Stumps are technically "weak learners".
- Why? A stump uses only one feature at a time to make a decision on the dataset!
- That's why AdaBoost likes weak learners.
In contrast to a random forest, where full-sized trees are grown, the trees in a forest made with AdaBoost are usually just a node and two leaves (stumps).
Is the order of stumps important?
- In a random forest or a single decision tree, the order in which the trees are built is not important, because the trees are independent of each other.
- But in the forest of stumps made with AdaBoost, the order is very important!
- That's because the errors the first stump makes influence how the second stump is made, the errors the second stump makes influence how the third stump is made, and so on.
Three Main Ideas Behind AdaBoost!
- AdaBoost combines a lot of weak learners to make classifications. The weak learners are almost always stumps.
- Some stumps get more say in the classification than others.
- Each stump is made by taking the previous stump's mistakes into account.
Creating AdaBoost!
Consider that we need to classify whether a patient has heart disease or not.
Step 1: (Initialize sample weights)
- Give a sample weight to each row in the dataset.
- The sample weight shows how important it is for that row to be correctly classified.
Formula:
Sample Weight = 1 / (Total number of samples)
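In NumPy, this step looks roughly like the sketch below (the dataset size of 8 rows is just an assumed example for the heart-disease data, not a fixed requirement):

```python
import numpy as np

n_samples = 8  # assumed toy dataset size, e.g. 8 patient rows
# every row starts with the same weight: 1 / (total number of samples)
sample_weights = np.full(n_samples, 1.0 / n_samples)
print(sample_weights)  # [0.125 0.125 0.125 0.125 0.125 0.125 0.125 0.125]
```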
Step 2: (Create the first stump)
- Selecting the first stump can look complicated if you don't know about Gini impurity or entropy; it's very easy once you do. In simple words, you calculate the Gini impurity for each feature in your dataset, and the feature with the lowest Gini impurity becomes the node of the first stump.
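As a rough sketch of that idea, here is how one candidate split could be scored with Gini impurity in Python. The feature values and labels below are made-up example numbers, not the actual heart-disease data:

```python
import numpy as np

def gini(labels):
    """Gini impurity of one leaf: 1 - sum of squared class probabilities."""
    if len(labels) == 0:
        return 0.0
    p = np.mean(labels)              # fraction of "has heart disease" labels
    return 1.0 - p**2 - (1.0 - p)**2

def split_gini(feature, labels, threshold):
    """Weighted average Gini of the two leaves created by splitting at `threshold`."""
    left, right = labels[feature <= threshold], labels[feature > threshold]
    n = len(labels)
    return len(left) / n * gini(left) + len(right) / n * gini(right)

# made-up values for a "Patient Weight" feature (1 = heart disease, 0 = not)
patient_weight = np.array([167, 182, 176, 173, 172, 174, 169, 183])
has_disease    = np.array([  0,   1,   1,   0,   0,   1,   0,   1])
print(split_gini(patient_weight, has_disease, threshold=175))
# repeat for every feature (and threshold) and pick the lowest Gini as the first stump
```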
Calculate the amount of say for the first stump!
- Wherever the stump makes wrong classifications, we use those mistakes to find its amount of say.
- In our stump, we have 1 wrong (incorrect) classification.
- The Patient Weight stump misclassifies 1 sample.
- The total error of the stump is the sum of the sample weights associated with the incorrectly classified samples.
- The total error will always be between 0 for a perfect stump and 1 for a horrible stump.
- We use the total error to determine the stump's amount of say in the final classification.
Formula:
Amount of Say = (1/2) × log((1 − Total Error) / Total Error)
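A small sketch of this formula in Python, using the example above where 1 sample out of 8 is misclassified (so the total error is 1/8):

```python
import numpy as np

def amount_of_say(total_error, eps=1e-10):
    # clip to avoid dividing by zero for a perfect (error = 0) or useless (error = 1) stump
    total_error = np.clip(total_error, eps, 1 - eps)
    return 0.5 * np.log((1 - total_error) / total_error)

total_error = 1 / 8            # one misclassified sample, each sample weighted 1/8
print(round(amount_of_say(total_error), 2))   # ~0.97
```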
Now we need to learn how to change the sample weights, so that the next stump takes into account the errors the current stump made.
Step 3: (Increase and decrease the sample weights)
In simple words, we need to increase the sample weights of the wrongly classified samples and decrease the sample weights of the correctly classified samples.
The formula for increasing the sample weight:
New Sample Weight = Sample Weight × e^(Amount of Say)   (for incorrectly classified samples)
The formula for decreasing the sample weight:
New Sample Weight = Sample Weight × e^(−Amount of Say)   (for correctly classified samples)
The major difference is the negative sign in the exponent here; it is what shrinks the weights of the correctly classified samples.
If you look carefully, the weights of the wrongly classified samples increase, and the weights of the correctly classified samples decrease.
These values are not normalized. Normalized means that if you add everything up it should equal 1, but the sum of the new weights is not equal to 1. So we normalize them.
The formula for normalizing:
Normalized Weight[i] = New Weight[i] / sum(New Weights)
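Putting Step 3 together, here is a rough Python sketch of the weight update plus normalization. The name `is_correct` is a hypothetical boolean mask marking which samples the current stump classified correctly:

```python
import numpy as np

def update_weights(sample_weights, is_correct, say):
    """Grow the weights of wrong samples, shrink the correct ones, then normalize."""
    new_weights = np.where(
        is_correct,
        sample_weights * np.exp(-say),   # correctly classified -> weight decreases
        sample_weights * np.exp(say),    # incorrectly classified -> weight increases
    )
    return new_weights / new_weights.sum()   # normalize so the weights sum to 1 again

sample_weights = np.full(8, 1 / 8)
is_correct = np.array([True, True, True, False, True, True, True, True])  # 4th row was wrong
say = 0.97
print(np.round(update_weights(sample_weights, is_correct, say), 2))
# the misclassified row's weight grows to ~0.5 and the rest shrink to ~0.07,
# close to the 0.49 / 0.07 figures used in the bucket example below
```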
Then we update the old sample weights with these normalized weights, so we can use them to make the second stump in the AdaBoost forest.
Norm. Weight -> Sample Weight
In theory, to create the next stump we would use a weighted Gini index, but instead of using the weighted Gini index we can make a new collection of samples that contains duplicate copies of the samples with the largest sample weights.
Process:
- In simple words, we create a new empty dataset with the same shape as the old dataset. Then we pick a random number between 0 and 1, see which bucket that number falls into, and copy the corresponding row into the new dataset.
- The buckets are built from the running sum of the normalized sample weights. For example, if the normalized weights are 0.07, 0.07, 0.07, 0.49, 0.07, and so on, the buckets are 0 to 0.07, 0.07 to 0.14, 0.14 to 0.21, 0.21 to 0.70, 0.70 to 0.77, etc.
For Example:
The 1st random number is 0.72. It falls in the 5th bucket, right?
The 2nd random number is 0.42. It falls in the 4th bucket, right?
The 3rd random number is 0.83. It falls in the 6th bucket, right?
This random sampling happens automatically inside the algorithm!
We just continue this process until the new dataset and the old dataset have the same shape (the same number of rows).
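A rough Python sketch of this bucket idea, using example weights like the ones above (one heavy 0.49 weight and several 0.07 weights; the exact values are illustrative):

```python
import numpy as np

rng = np.random.default_rng(42)

# example normalized weights: the 4th row was misclassified and carries most of the weight
weights = np.array([0.07, 0.07, 0.07, 0.49, 0.07, 0.07, 0.07, 0.07])
weights = weights / weights.sum()     # make sure the buckets end exactly at 1.0
buckets = np.cumsum(weights)          # upper edge of each row's bucket

new_rows = []
for _ in range(len(weights)):                       # until the new dataset has the same shape
    r = rng.random()                                # random number between 0 and 1
    row_index = np.searchsorted(buckets, r)         # which bucket does r fall into?
    row_index = min(row_index, len(weights) - 1)    # guard against float round-off at 1.0
    new_rows.append(int(row_index))

print(new_rows)   # the heavily weighted row (index 3) tends to show up several times
# (np.random.choice(len(weights), size=len(weights), p=weights) does the same job in one line)
```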
This is our updated dataset. Now we can use it to make a new stump from the beginning.
Then we repeat the whole process from the beginning, and this goes on for n iterations.
How does AdaBoost find the answer?
- Now that we have the AdaBoost forest of stumps, we separate the stumps that classify a patient as having heart disease from the stumps that classify the patient as not having heart disease.
- Then we add up the amounts of say of the "heart disease" stumps and the amounts of say of the "no heart disease" stumps; whichever total is higher is taken as the answer.
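A small sketch of this final vote in Python; the stump predictions and amounts of say below are made-up example values, not results from the earlier steps:

```python
import numpy as np

# hypothetical forest: each stump's prediction (1 = heart disease, 0 = not) and its say
stump_predictions = np.array([1, 1, 0, 1, 0, 1])
stump_says        = np.array([0.97, 0.34, 0.51, 0.42, 0.18, 0.27])

say_disease    = stump_says[stump_predictions == 1].sum()   # total say voting "heart disease"
say_no_disease = stump_says[stump_predictions == 0].sum()   # total say voting "no heart disease"

answer = "heart disease" if say_disease > say_no_disease else "no heart disease"
print(say_disease, say_no_disease, answer)   # ~2.0 vs ~0.69 -> heart disease
```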
This is how AdaBoost works ❄️😌
Did you like this article?
Name: R.Aravindan
Position: Content Writer
Company: Artificial Neurons.AI