Repository at Hanyang University: Feature Selection Method for Multiclass Text Classification with a Class Imbalance Problem

Feature Selection Method for Multiclass Text Classification with a Class Imbalance Problem

Other Titles: Feature Selection Method from Multiclass Text with Class Imbalance Problem

Keywords: Text Classification; Class Imbalance; Multi-Class Text Data; Feature Selection

Abstract: When the class distribution of a text collection is skewed toward a majority class, a text classification model typically assigns most documents to that majority class in order to raise overall classification accuracy. This is known as the class imbalance problem. This study proposes a feature selection method, based on a simplified chi-square statistic, that selects features for each class so that the resulting model is robust to this problem. The proposed method is compared with typical feature selection methods on the Reuters-21578 data set. The experiments show that the proposed method outperforms typical feature selection methods when used with naïve Bayes and support vector machine classifiers, which are robust to the class imbalance problem.
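
The idea of selecting features separately for each class can be illustrated with a minimal sketch. The abstract does not define the simplified chi-square statistic, so the code below substitutes the standard 2x2 chi-square statistic computed one-vs-rest for each class; it is an assumption-laden illustration, not the authors' method. The function names `chi_square` and `select_features_per_class`, the parameter `k`, and the toy documents are all hypothetical.

```python
from collections import defaultdict


def chi_square(a, b, c, d):
    """Standard 2x2 chi-square statistic (a stand-in, not the paper's simplified form).

    a: documents in the class that contain the term
    b: documents outside the class that contain the term
    c: documents in the class that lack the term
    d: documents outside the class that lack the term
    """
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 0.0
    return n * (a * d - b * c) ** 2 / denom


def select_features_per_class(docs, labels, k):
    """Return {class label: top-k terms ranked by one-vs-rest chi-square}."""
    n_docs = len(docs)
    class_sizes = defaultdict(int)    # documents per class
    term_df = defaultdict(int)        # documents containing each term
    term_class_df = defaultdict(int)  # documents per (term, class) pair

    for terms, label in zip(docs, labels):
        class_sizes[label] += 1
        for term in set(terms):
            term_df[term] += 1
            term_class_df[(term, label)] += 1

    selected = {}
    for label, size in class_sizes.items():
        scores = []
        for term, df in term_df.items():
            a = term_class_df.get((term, label), 0)
            b = df - a
            c = size - a
            d = n_docs - a - b - c
            scores.append((chi_square(a, b, c, d), term))
        # Highest score first; ties broken alphabetically for determinism.
        scores.sort(key=lambda item: (-item[0], item[1]))
        selected[label] = [term for _, term in scores[:k]]
    return selected


# Toy imbalanced corpus (hypothetical): four "crude" documents, one each of the others.
docs = [
    {"oil", "price"},
    {"oil", "barrel"},
    {"oil", "price", "barrel"},
    {"oil", "market"},
    {"wheat", "price"},
    {"gold", "market"},
]
labels = ["crude", "crude", "crude", "crude", "grain", "gold"]

selected = select_features_per_class(docs, labels, k=1)
# Union of the per-class selections forms the final vocabulary.
vocabulary = sorted(set(term for terms in selected.values() for term in terms))
```

Because the ranking is done within each class, the minority classes contribute their own top terms to the vocabulary instead of being crowded out by terms that characterize the majority class.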

URI: http://www.dbpia.co.kr/journal/articleDetail?nodeId=NODE08000390&language=ko_KR https://repository.hanyang.ac.kr/handle/20.500.11754/112975

Appears in Collections: COLLEGE OF ENGINEERING SCIENCES[E] > INDUSTRIAL AND MANAGEMENT ENGINEERING > Articles
