최중민
2021-01-26T01:31:33Z
2021-01-26T01:31:33Z
2002-08
PRICAI 2002: Trends in Artificial Intelligence. PRICAI 2002. Lecture Notes in Computer Science, vol. 2417, pp. 472-481
978-3-540-44038-3
978-3-540-45683-4
https://link.springer.com/chapter/10.1007/3-540-45683-X_51
https://repository.hanyang.ac.kr/handle/20.500.11754/157478
This paper discusses some of the issues in Web information extraction, focusing on automatic extraction methods that exploit wrapper induction. In particular, we point out the limitations of traditional heuristic-based wrapper generation systems and, as a solution, emphasize the importance of domain knowledge in the process of wrapper generation.
We demonstrate the effectiveness of domain knowledge by presenting our scheme of knowledge-based wrapper generation for semi-structured and labeled documents. Our agent-oriented information extraction system, XTROS, represents both the domain knowledge and the wrappers as XML documents to increase modularity, flexibility, and interoperability. XTROS shows good performance on several Web sites in the real estate domain, and it is expected to be easily adaptable to other domains by plugging in the appropriate XML-based domain knowledge.
en_US
SPRINGER-VERLAG BERLIN
Wrapper Generation by Using XML-Based Domain Knowledge for Intelligent Information Extraction
Article
10.1007/3-540-45683-X_51
LECTURE NOTES IN ARTIFICIAL INTELLIGENCE
Yang, Jaeyoung
Kim, Jungsun
Doh, Kyoung-Goo
Choi, Joongmin
2007206326
E
COLLEGE OF ENGINEERING SCIENCES[E]
DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING
jmchoi