
Boosting vs Bagging: The Battle Against Overfitting in the Forest of Trees


Overfitting is generally more of a concern with boosting algorithms than with bagging algorithms as the number of trees increases.

Boosting algorithms like Gradient Boosting and AdaBoost train models sequentially, where each new model is trained to correct the mistakes made by the previous ones. This process can produce a model that fits the training data very closely. However, if too many trees are added, the ensemble can become excessively complex and begin to fit the noise in the data, performing poorly on unseen data.
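A minimal sketch of this behavior, assuming scikit-learn and a small synthetic dataset (the specific hyperparameters and dataset settings here are illustrative, not prescriptive): training accuracy keeps climbing as trees are added, while test accuracy can stall or decline.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Small, noisy dataset so the effect of extra trees shows up quickly.
X, y = make_classification(n_samples=500, n_features=20, n_informative=5,
                           flip_y=0.2, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

gb = GradientBoostingClassifier(n_estimators=500, learning_rate=0.2,
                                max_depth=3, random_state=0).fit(X_tr, y_tr)

# staged_predict yields predictions after each additional tree,
# so we can track train vs. test accuracy as the ensemble grows.
for i, (pred_tr, pred_te) in enumerate(zip(gb.staged_predict(X_tr),
                                           gb.staged_predict(X_te)), start=1):
    if i % 100 == 0:
        print(f"trees={i:3d}  "
              f"train_acc={accuracy_score(y_tr, pred_tr):.3f}  "
              f"test_acc={accuracy_score(y_te, pred_te):.3f}")
```

In practice, early stopping or a validation-based choice of the number of trees is the usual guard against this.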

Bagging algorithms like Random Forests, on the other hand, train each tree independently on a different bootstrap sample of the original data (drawn with replacement). The final prediction is made by averaging the predictions of all trees (or by majority voting for classification). This approach generally reduces variance without increasing bias, making bagging less prone to overfitting. Each individual decision tree may overfit its own bootstrapped sample, but when the trees are averaged together, their individual overfitting tendencies largely cancel out, resulting in better generalization on unseen data.
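A companion sketch under the same assumptions (scikit-learn, the same synthetic data as above): growing the forest incrementally with warm_start typically shows test accuracy plateauing rather than degrading as trees are added.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=500, n_features=20, n_informative=5,
                           flip_y=0.2, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

# warm_start=True lets us grow the same forest by raising n_estimators
# instead of refitting from scratch each time.
rf = RandomForestClassifier(warm_start=True, random_state=0)
for n_trees in (10, 50, 100, 200, 500):
    rf.set_params(n_estimators=n_trees)
    rf.fit(X_tr, y_tr)
    print(f"trees={n_trees:3d}  "
          f"train_acc={accuracy_score(y_tr, rf.predict(X_tr)):.3f}  "
          f"test_acc={accuracy_score(y_te, rf.predict(X_te)):.3f}")
```

The individual trees still overfit their bootstrap samples (training accuracy stays near 1.0), but the averaged prediction holds steady on the test set as more trees are added.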
