Adaboost (Adaptive Boosting) Homework

Question 1. Use 3 weak learners (decision stumps) to design an AdaBoost classifier trained on the following data (with Play Baseball Today as the target attribute), showing calculations worked out by hand. Provide the final decision function for the classifier.

| Number of Previous Days of Rain | Play Baseball Today |
| --- | --- |
| 0 | Yes |
| 1 | Yes |
| 2 | Yes |
| 3 | No |
| 4 | No |
| 5 | No |
| 6 | Yes |
| 7 | Yes |
| 8 | Yes |
| 9 | No |

Solution:

AdaBoost is a boosting algorithm that combines multiple weak learners into a strong learner. Each weak learner here is a decision stump, i.e., a one-level decision tree.
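
To make the hand calculations below easy to check, here is a minimal Python encoding of the data and of a decision stump. The +1/-1 encoding of Yes/No is my own convention (not given in the problem), chosen to match the weight-update formula used later.

```python
def stump_predict(x, threshold):
    """Decision stump: predict Yes (+1) when x <= threshold, else No (-1)."""
    return 1 if x <= threshold else -1

# Training data: Number of Previous Days of Rain -> Play Baseball Today,
# with Yes encoded as +1 and No as -1.
X = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
y = [1, 1, 1, -1, -1, -1, 1, 1, 1, -1]
```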

  1. For the first weak learner: (1) Initialize the weight of every instance to 1/10. (2) Find the best decision stump (i.e., the best attribute and threshold) that minimizes the weighted error by testing each threshold from 0 to 9. The weighted error is the sum of the weights of all instances the stump misclassifies: \(\epsilon = \sum_{i=1}^{n}w_iI(y_i \neq h(x_i))\), where $w_i$ is the weight of instance $i$, $y_i$ is its true label, $h(x_i)$ is its predicted label, and $I(y_i \neq h(x_i))$ is the indicator function that returns 1 if $y_i \neq h(x_i)$ and 0 otherwise. The resulting errors are tabulated below (and reproduced by the short sweep after the table).
| Threshold | Weighted Error |
| --- | --- |
| 0 | 0.5 |
| 1 | 0.4 |
| 2 | 0.3 |
| 3 | 0.4 |
| 4 | 0.5 |
| 5 | 0.4 |
| 6 | 0.5 |
| 7 | 0.4 |
| 8 | 0.3 |
| 9 | 0.4 |
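
The table above can be regenerated with a short sweep; this sketch assumes the `X`, `y`, and `stump_predict` definitions from the earlier block.

```python
weights = [0.1] * 10  # step (1): uniform initial weights

def weighted_error(threshold, weights):
    """Step (2): sum the weights of the instances the stump misclassifies."""
    return sum(w for x_i, y_i, w in zip(X, y, weights)
               if stump_predict(x_i, threshold) != y_i)

for t in range(10):
    print(t, round(weighted_error(t, weights), 6))
# Thresholds 2 and 8 both attain the minimum error of 0.3.
```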

(3) The minimum weighted error (0.3) is achieved at both threshold 2 and threshold 8. I choose 2, the first of the two.

(4) The decision stump is: if Number of Previous Days of Rain <= 2, then Play Baseball Today = Yes, else Play Baseball Today = No.

(5) Calculate the Alpha value (performance of the stump) for the decision stump. The math formula is: \(\alpha = \frac{1}{2}ln(\frac{1-\epsilon}{\epsilon})\) where $\epsilon$ is the weighted error of the decision stump.

\[\alpha = \frac{1}{2}ln(\frac{1-0.3}{0.3}) = 0.4237\]
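As a quick arithmetic check in Python:

```python
import math

epsilon = 0.3
alpha_1 = 0.5 * math.log((1 - epsilon) / epsilon)
print(alpha_1)  # ~0.4236
```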

(6) Update the weights of all instances. The math formula is: \(w_i = w_i \cdot e^{-\alpha y_i h(x_i)}\) where $w_i$ is the weight of instance $i$, and $y_i$ and $h(x_i)$ are the true and predicted labels of instance $i$, encoded as +1 for Yes and -1 for No. Correctly classified instances have $y_i h(x_i) = 1$, so their weights are multiplied by $e^{-\alpha}$ and shrink; misclassified instances have $y_i h(x_i) = -1$, so their weights are multiplied by $e^{\alpha}$ and grow.

| Number of Previous Days of Rain | Play Baseball Today | Weight | Updated Weight |
| --- | --- | --- | --- |
| 0 | Yes | 0.1 | 0.06546 |
| 1 | Yes | 0.1 | 0.06546 |
| 2 | Yes | 0.1 | 0.06546 |
| 3 | No | 0.1 | 0.06546 |
| 4 | No | 0.1 | 0.06546 |
| 5 | No | 0.1 | 0.06546 |
| 6 | Yes | 0.1 | 0.1528 |
| 7 | Yes | 0.1 | 0.1528 |
| 8 | Yes | 0.1 | 0.1528 |
| 9 | No | 0.1 | 0.06546 |

(7) Normalize the weights of all instances. The math formula is: \(w_i = \frac{w_i}{\sum_{i=1}^{n}w_i}\) where $w_i$ is the weight of instance $i$.

| Number of Previous Days of Rain | Play Baseball Today | Normalized Weight |
| --- | --- | --- |
| 0 | Yes | 0.0714 |
| 1 | Yes | 0.0714 |
| 2 | Yes | 0.0714 |
| 3 | No | 0.0714 |
| 4 | No | 0.0714 |
| 5 | No | 0.0714 |
| 6 | Yes | 0.1667 |
| 7 | Yes | 0.1667 |
| 8 | Yes | 0.1667 |
| 9 | No | 0.0714 |

The misclassified instances (6, 7, and 8) now carry higher weights than the correctly classified ones, as the sketch below confirms.
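Steps (6) and (7) can be verified together, continuing the earlier sketch (reusing `weights`, `alpha_1`, and the round-1 stump at threshold 2).

```python
import math

# Step (6): scale each weight by e^{-alpha * y_i * h(x_i)} for the
# round-1 stump (threshold 2).
updated = [w * math.exp(-alpha_1 * y_i * stump_predict(x_i, 2))
           for x_i, y_i, w in zip(X, y, weights)]
# Correct instances:       0.1 * e^{-0.4237} ~ 0.06546
# Misclassified (6, 7, 8): 0.1 * e^{+0.4237} ~ 0.1528

# Step (7): normalize so the weights sum to 1 again.
total = sum(updated)
weights = [w / total for w in updated]
# ~0.0714 for correct instances, ~0.1667 for the misclassified ones.
```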

  2. For the second weak learner: (1) Start from the weights produced by the first round. (2) Calculate the weighted error for each threshold from 0 to 9.
| Threshold | Weighted Error |
| --- | --- |
| 0 | 0.642857 |
| 1 | 0.571429 |
| 2 | 0.500000 |
| 3 | 0.571429 |
| 4 | 0.642857 |
| 5 | 0.714286 |
| 6 | 0.547619 |
| 7 | 0.380952 |
| 8 | 0.214286 |
| 9 | 0.285714 |

(3) The threshold with the minimum weighted error (0.214286) is 8, so 8 is the second threshold.

(4) The decision stump is: if Number of Previous Days of Rain <= 8, then Play Baseball Today = Yes, else Play Baseball Today = No.

(5) Calculate the Alpha value. \(\alpha = \frac{1}{2}ln(\frac{1-0.214286}{0.214286}) = 0.650\)

(6) Update the weights of all instances and normalize them.

| Number of Previous Days of Rain | Play Baseball Today | Weight | Updated (Normalized) Weight |
| --- | --- | --- | --- |
| 0 | Yes | 0.0714 | 0.045455 |
| 1 | Yes | 0.0714 | 0.045455 |
| 2 | Yes | 0.0714 | 0.045455 |
| 3 | No | 0.0714 | 0.166667 |
| 4 | No | 0.0714 | 0.166667 |
| 5 | No | 0.0714 | 0.166667 |
| 6 | Yes | 0.1667 | 0.106061 |
| 7 | Yes | 0.1667 | 0.106061 |
| 8 | Yes | 0.1667 | 0.106061 |
| 9 | No | 0.0714 | 0.045455 |

  3. For the third weak learner: (1) Start from the weights produced by the second round.

(2) Calculate the weighted error for each threshold from 0 to 9.

| Threshold | Weighted Error |
| --- | --- |
| 0 | 0.409091 |
| 1 | 0.363636 |
| 2 | 0.318182 |
| 3 | 0.484848 |
| 4 | 0.651515 |
| 5 | 0.818182 |
| 6 | 0.712121 |
| 7 | 0.606061 |
| 8 | 0.500000 |
| 9 | 0.545455 |

(3) The threshold with the minimum weighted error (0.318182) is 2, so 2 is the third threshold.

(4) The decision stump is: if Number of Previous Days of Rain <= 2, then Play Baseball Today = Yes, else Play Baseball Today = No.

(5) Calculate the Alpha value. \(\alpha = \frac{1}{2}ln(\frac{1-0.318182}{0.318182}) = 0.381\)

(6) Update the weights of all instances and normalize them.

| Number of Previous Days of Rain | Play Baseball Today | Weight | Updated (Normalized) Weight |
| --- | --- | --- | --- |
| 0 | Yes | 0.045455 | 0.033333 |
| 1 | Yes | 0.045455 | 0.033333 |
| 2 | Yes | 0.045455 | 0.033333 |
| 3 | No | 0.166667 | 0.122222 |
| 4 | No | 0.166667 | 0.122222 |
| 5 | No | 0.166667 | 0.122222 |
| 6 | Yes | 0.106061 | 0.166667 |
| 7 | Yes | 0.106061 | 0.166667 |
| 8 | Yes | 0.106061 | 0.166667 |
| 9 | No | 0.045455 | 0.033333 |
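Rounds 2 and 3 repeat the same mechanics, so the whole sequence of tables above can be regenerated with one loop. This is a sketch under the same +1/-1 encoding and `stump_predict` helper as before; ties in the error are broken by taking the smaller threshold, matching the choice made by hand in round 1.

```python
import math

weights = [0.1] * 10
thresholds, alphas = [], []

for round_idx in range(3):
    # Pick the threshold with the smallest weighted error
    # (smaller threshold wins ties).
    errors = {t: sum(w for x_i, y_i, w in zip(X, y, weights)
                     if stump_predict(x_i, t) != y_i)
              for t in range(10)}
    best_t = min(errors, key=lambda t: (errors[t], t))
    eps = errors[best_t]
    alpha = 0.5 * math.log((1 - eps) / eps)
    thresholds.append(best_t)
    alphas.append(alpha)

    # Re-weight and normalize for the next round.
    weights = [w * math.exp(-alpha * y_i * stump_predict(x_i, best_t))
               for x_i, y_i, w in zip(X, y, weights)]
    total = sum(weights)
    weights = [w / total for w in weights]
    print(round_idx + 1, best_t, round(eps, 6), round(alpha, 3))
# Expected: thresholds 2, 8, 2 with alphas ~0.424, 0.650, 0.381.
```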

  4. In summary: (1) The final weight table:

| Data Point | Initial Weight | Weight After 1st Weak Learner | Weight After 2nd Weak Learner | Weight After 3rd Weak Learner |
| --- | --- | --- | --- | --- |
| 0 | 0.1 | 0.071429 | 0.045455 | 0.033333 |
| 1 | 0.1 | 0.071429 | 0.045455 | 0.033333 |
| 2 | 0.1 | 0.071429 | 0.045455 | 0.033333 |
| 3 | 0.1 | 0.071429 | 0.166667 | 0.122222 |
| 4 | 0.1 | 0.071429 | 0.166667 | 0.122222 |
| 5 | 0.1 | 0.071429 | 0.166667 | 0.122222 |
| 6 | 0.1 | 0.166667 | 0.106061 | 0.166667 |
| 7 | 0.1 | 0.166667 | 0.106061 | 0.166667 |
| 8 | 0.1 | 0.166667 | 0.106061 | 0.166667 |
| 9 | 0.1 | 0.071429 | 0.045455 | 0.033333 |

(2) Alphas for the weak learners:

  • 1st decision stump (threshold = 2): Alpha = 0.424
  • 2nd decision stump (threshold = 8): Alpha = 0.650
  • 3rd decision stump (threshold = 2): Alpha = 0.381

(3) Final Decision Function:

\(f(x) = sign(\sum_{t=1}^{T}\alpha_th_t(x))\) where $T$ is the number of weak learners, $\alpha_t$ is the performance of the $t$th weak learner, and $h_t(x) \in \{+1, -1\}$ is the prediction of the $t$th weak learner (+1 for Yes, -1 for No).

\(f(x) = sign(0.424h_1(x) + 0.650h_2(x) + 0.381h_3(x))\). The sign of this weighted sum gives the final classification.
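
The same decision function in code, again assuming the `stump_predict` helper and the +1/-1 label encoding from the sketches above:

```python
def final_classifier(x):
    """f(x) = sign(0.424*h1(x) + 0.650*h2(x) + 0.381*h3(x))."""
    score = (0.424 * stump_predict(x, 2)    # 1st stump, threshold 2
             + 0.650 * stump_predict(x, 8)  # 2nd stump, threshold 8
             + 0.381 * stump_predict(x, 2)) # 3rd stump, threshold 2
    return "Yes" if score > 0 else "No"

for x_i in range(10):
    print(x_i, final_classifier(x_i))
# For x = 6, 7, 8 the score is 0.650 - 0.424 - 0.381 = -0.155, so the
# two threshold-2 stumps outvote the threshold-8 stump there.
```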



