
Text mining using automatic classification and summarization algorithm for news reports

Title
Text mining using automatic classification and summarization algorithm for news reports
Author
서강원
Advisor(s)
배석주
Issue Date
2016-02
Publisher
Hanyang University (한양대학교)
Degree
Master
Abstract
As news search services have grown popular, interest in automatic classification and summarization has increased. Such automation is especially useful for mobile and navigation users. This research develops a methodology for quickly monitoring key intelligence areas, together with a method that consolidates information into understandable, concise groups of topics and sentences of interest. It evaluates and adapts several existing analysis methods and builds an overall framework for classification and summarization. Clustering analysis is commonly used for document classification, and among clustering methods the K-means algorithm is well known for classifying large document collections effectively. However, as computerized databases grow exponentially, clustering accuracy falls and running time increases sharply. This research addresses the classification of news reports in large data sets and the extraction of key sentences for a given topic. Unlike existing algorithms, the proposed algorithm does not simply assign categories by word frequency and extract sentences that contain frequent words. It adopts association analysis to increase classification accuracy and a modified version of the K-means algorithm to reduce clustering time. It also extracts key sentences within specific areas according to the number of sentences in a report. The proposed algorithm was applied to a real news data set of 21,974 articles published from October 7, 2014 to October 20, 2014. The results showed that it performs better than many popular existing algorithms.
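For orientation, the frequency-based sentence extraction that the abstract contrasts itself with can be sketched roughly as follows. This is a minimal illustration of that generic baseline only, not the thesis's proposed algorithm, which is not specified in this record. The function names, the naive sentence splitter, and the stopword handling are all assumptions made for the example.

```python
import re
from collections import Counter


def split_sentences(text):
    # Naive splitter (assumption): break after ., ! or ? followed by whitespace.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(sentence):
    # Lowercase alphabetic tokens only; real news text would need more care.
    return re.findall(r"[a-z]+", sentence.lower())


def top_sentences(text, k=1, stopwords=frozenset()):
    """Return the k sentences with the highest average word frequency,
    in their original order. Hypothetical helper for illustration."""
    sentences = split_sentences(text)
    # Count how often each non-stopword occurs across the whole text.
    freq = Counter(
        w for s in sentences for w in tokenize(s) if w not in stopwords
    )

    def score(sentence):
        words = [w for w in tokenize(sentence) if w not in stopwords]
        # Average frequency so that long sentences are not favored by length alone.
        return sum(freq[w] for w in words) / len(words) if words else 0.0

    ranked = sorted(
        range(len(sentences)), key=lambda i: score(sentences[i]), reverse=True
    )
    return [sentences[i] for i in sorted(ranked[:k])]
```

A baseline like this looks only at word counts. The abstract states that the proposed method goes beyond it by adding association analysis and a modified K-means step.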
URI
https://repository.hanyang.ac.kr/handle/20.500.11754/127185
http://hanyang.dcollection.net/common/orgView/200000428190
Appears in Collections:
GRADUATE SCHOOL[S](대학원) > INDUSTRIAL ENGINEERING(산업공학과) > Theses (Master)
Files in This Item:
There are no files associated with this item.
