Block Classification of a Web Page by Using a Combination of Multiple Classifiers
- Title
- Block Classification of a Web Page by Using a Combination of Multiple Classifiers
- Author
- 최중민
- Keywords
- web block classification; web data mining; combining multiple classifiers
- Issue Date
- 2008-09
- Publisher
- IEEE
- Citation
- 2008 Fourth International Conference on Networked Computing and Advanced Information Management, pp. 290-295
- Abstract
- In recent years, researchers have been actively studying web mining using the varied data available on the World Wide Web. Because Web pages are generally semi-structured, informative blocks are difficult to identify, and content detection techniques that remove unnecessary data (e.g., advertisements) from Web pages have become important. A Web page generally consists of many blocks containing various data and structural information. In this paper, we propose a method that classifies the blocks of a Web page into appropriate categories by building a Tree Alignment model representing the HTML structure and a Vector model representing the features of the blocks. Web sites normally have their own templates, and blocks may belong to different categories even when they occupy the same position in the Web browser or are structurally similar. Hence it is difficult to classify blocks accurately by building a single classifier. To solve this problem, our approach builds multiple classifiers, one for each training domain, and classifies blocks by combining them. © 2008 IEEE.
- URI
- https://ieeexplore.ieee.org/document/4624157
- https://repository.hanyang.ac.kr/handle/20.500.11754/104868
- ISBN
- 978-0-7695-3322-3
- DOI
- 10.1109/NCM.2008.170
- Appears in Collections:
- COLLEGE OF ENGINEERING SCIENCES[E] > COMPUTER SCIENCE AND ENGINEERING > Articles
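
The abstract describes building one classifier per training domain and classifying a block by combining them, but it does not say how the individual classifiers work or which combination rule is used. The following is only a minimal sketch of the general idea, under stated assumptions: each block is reduced to a hypothetical two-element feature vector (link density, text density), each per-domain classifier is a simple nearest-centroid stand-in rather than the paper's Tree Alignment and Vector models, and the combination is a plain majority vote. All names and data here (`DomainClassifier`, `combine_by_vote`, the feature values and category labels) are illustrative and not taken from the paper.

```python
from collections import Counter, defaultdict
from math import sqrt


def centroid(vectors):
    """Component-wise mean of a list of equal-length feature vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def distance(a, b):
    """Euclidean distance between two feature vectors."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


class DomainClassifier:
    """Nearest-centroid classifier trained on labeled blocks from one site.

    Hypothetical stand-in for a per-domain classifier; not the paper's model.
    """

    def __init__(self, labeled_blocks):
        by_label = defaultdict(list)
        for features, label in labeled_blocks:
            by_label[label].append(features)
        self.centroids = {label: centroid(vs) for label, vs in by_label.items()}

    def predict(self, features):
        # Pick the category whose centroid is closest to the block.
        return min(self.centroids,
                   key=lambda label: distance(features, self.centroids[label]))


def combine_by_vote(classifiers, features):
    """Assumed combination rule: majority vote over the per-domain predictions."""
    votes = Counter(clf.predict(features) for clf in classifiers)
    return votes.most_common(1)[0][0]


# Made-up training data: [link_density, text_density] per block, one list per site.
domain_a = [([0.9, 0.1], "navigation"), ([0.1, 0.9], "content")]
domain_b = [([0.8, 0.2], "navigation"), ([0.2, 0.8], "content")]
# A site whose template behaves differently, so its classifier disagrees.
domain_c = [([0.3, 0.7], "navigation"), ([0.7, 0.3], "content")]

classifiers = [DomainClassifier(d) for d in (domain_a, domain_b, domain_c)]
new_block = [0.15, 0.85]
combined_label = combine_by_vote(classifiers, new_block)
print(combined_label)
```

In this toy data the classifier trained on the third site labels the new block differently from the other two, which mirrors the abstract's point that a single site-specific classifier can mislead; the vote lets the other domains outweigh it. A weighted vote or a learned combiner would be equally plausible readings of "combining them".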