Web Document Clustering By Using Automatic Keyphrase Extraction

Title
Web Document Clustering By Using Automatic Keyphrase Extraction
Author
최중민
Keywords
Kea-means Clustering; Key phrases
Issue Date
2007-11
Publisher
IEEE
Citation
2007 IEEE/WIC/ACM International Conferences on Web Intelligence and Intelligent Agent Technology - Workshops, pp. 56-59
Abstract
In most traditional document clustering techniques, the total number of clusters is not known in advance, and the cluster that contains the target information cannot be determined because no semantic description is associated with each cluster. The well-known K-means clustering algorithm partially solves these problems by allowing users to specify the number of clusters. However, if the pre-specified number of clusters is modified, the precision of each result also changes. To solve this problem, this paper proposes a new clustering algorithm based on the Kea keyphrase extraction algorithm, which uses machine learning techniques to return several keyphrases from the source documents. As in K-means, documents are grouped into several clusters, but the number of clusters is determined automatically by the algorithm, using heuristics over the extracted keyphrases. Our Kea-means clustering algorithm provides easy and efficient ways to extract test documents from massive quantities of resources.
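The abstract does not spell out the heuristics, so the following is only a minimal sketch of the general idea of letting extracted keyphrases determine the number of clusters. It is not the authors' Kea-means algorithm. It assumes keyphrases have already been extracted for each document (no Kea call is made), and the function name `cluster_by_keyphrases`, the `min_docs` threshold, and the tie-breaking rule are all hypothetical choices made for illustration.

```python
from collections import Counter, defaultdict


def cluster_by_keyphrases(doc_keyphrases, min_docs=2):
    """Group documents by shared keyphrases; the cluster count is derived.

    doc_keyphrases: dict mapping document id -> list of keyphrases, assumed
        to be extracted beforehand (for example by Kea).
    min_docs: hypothetical threshold; a keyphrase becomes a cluster label
        only if it occurs in at least this many documents.
    """
    # Document frequency of each keyphrase (counted once per document).
    df = Counter(p for phrases in doc_keyphrases.values() for p in set(phrases))
    labels = {p for p, n in df.items() if n >= min_docs}

    clusters = defaultdict(list)
    for doc_id, phrases in doc_keyphrases.items():
        candidates = [p for p in phrases if p in labels]
        if candidates:
            # Pick the most widely shared label; break ties alphabetically.
            best = min(candidates, key=lambda p: (-df[p], p))
            clusters[best].append(doc_id)
        else:
            # No keyphrase shared with enough other documents.
            clusters[None].append(doc_id)
    return dict(clusters)


# Illustrative input only; these documents and phrases are made up.
docs = {
    "d1": ["document clustering", "k-means"],
    "d2": ["k-means", "centroid"],
    "d3": ["keyphrase extraction", "machine learning"],
    "d4": ["keyphrase extraction", "kea"],
    "d5": ["web crawler"],
}
result = cluster_by_keyphrases(docs)
```

In this sketch the caller never supplies a cluster count: it falls out of how many keyphrases pass the threshold, and each cluster is labeled by the keyphrase that defines it, which gives the cluster a readable description.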
URI
https://ieeexplore.ieee.org/document/4427539
https://repository.hanyang.ac.kr/handle/20.500.11754/107362
ISBN
0-7695-3028-1
DOI
10.1109/WI-IATW.2007.46
Appears in Collections:
COLLEGE OF ENGINEERING SCIENCES[E] > COMPUTER SCIENCE AND ENGINEERING > Articles