To address overfitting in AdaBoost, we propose a novel AdaBoost algorithm that uses K-means clustering.
AdaBoost is known, both theoretically and empirically, to be an effective method for improving the performance of base classifiers. However, previous studies have shown that AdaBoost is prone to overfitting when classes overlap. To overcome this problem, the proposed method uses K-means clustering to remove hard-to-learn samples that lie in the overlapping region. Because these samples are excluded from training, the proposed method suffers less from overfitting than conventional AdaBoost. The proposed method was tested on both synthetic and real-world data to confirm its validity.
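The idea of filtering overlapping-region samples before boosting can be illustrated with a minimal sketch. The abstract does not state how K-means is used to identify hard-to-learn samples, so the rule below is an assumption made only for illustration: a cluster whose majority-class share falls below a threshold is treated as lying in the overlapping region, and all of its samples are dropped. The function names `filter_overlap_samples` and `fit_filtered_adaboost`, and the parameters `n_clusters` and `purity_threshold`, are hypothetical and do not come from the paper. The sketch relies on scikit-learn's `KMeans` and `AdaBoostClassifier`.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.ensemble import AdaBoostClassifier


def filter_overlap_samples(X, y, n_clusters=10, purity_threshold=0.8,
                           random_state=0):
    """Return a boolean mask marking the samples to keep.

    Assumption (not taken from the paper): a cluster is considered part of
    the overlapping region when its majority class accounts for less than
    `purity_threshold` of its members; every sample in such a cluster is
    removed.
    """
    X = np.asarray(X)
    y = np.asarray(y)
    kmeans = KMeans(n_clusters=n_clusters, n_init=10,
                    random_state=random_state)
    cluster_ids = kmeans.fit_predict(X)

    keep = np.ones(len(y), dtype=bool)
    for c in np.unique(cluster_ids):
        members = cluster_ids == c
        # Share of the most frequent class label inside this cluster.
        _, counts = np.unique(y[members], return_counts=True)
        purity = counts.max() / counts.sum()
        if purity < purity_threshold:
            keep[members] = False
    return keep


def fit_filtered_adaboost(X, y, n_estimators=50, **filter_kwargs):
    """Train AdaBoost on the samples that survive the cluster filter."""
    X = np.asarray(X)
    y = np.asarray(y)
    keep = filter_overlap_samples(X, y, **filter_kwargs)
    # Safeguard: if filtering removes an entire class, use all samples.
    if len(np.unique(y[keep])) < len(np.unique(y)):
        keep[:] = True
    model = AdaBoostClassifier(n_estimators=n_estimators, random_state=0)
    model.fit(X[keep], y[keep])
    return model, keep
```

A call such as `model, keep = fit_filtered_adaboost(X_train, y_train, n_clusters=8)` returns the fitted classifier together with the mask of retained training samples. Both `n_clusters` and `purity_threshold` are illustrative defaults that would need tuning per dataset.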