Intelligent Data Engineering and Automated Learning — IDEAL 2002. Lecture Notes in Computer Science, vol. 2412, pp. 105-110
Abstract
This paper presents a scheme for knowledge-based wrapper generation for semi-structured, labeled documents, and describes the implementation of XTROS, an agent-oriented information extraction system. In contrast to previous wrapper learning agents, XTROS represents both the domain knowledge and the wrappers as XML documents to increase modularity, flexibility, and interoperability. XTROS shows good performance on several Web sites in the real estate domain, and it is expected to adapt easily to other domains when the appropriate XML-based domain knowledge is plugged in.
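As a rough illustration of the idea of keeping domain knowledge in XML and swapping it per domain, the following minimal sketch loads a small knowledge document and uses its terms to label numeric fields in one listing. The element names (`knowledge`, `object`, `term`), the function names, and the matching rule are all assumptions made for this example; they are not the XTROS schema or algorithm.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical domain knowledge for real estate. The element and
# attribute names are illustrative only, not the actual XTROS format.
DOMAIN_KNOWLEDGE = """
<knowledge domain="real-estate">
  <object name="PRICE"><term>$</term></object>
  <object name="BED"><term>bedrooms</term><term>beds</term><term>BR</term></object>
  <object name="BATH"><term>bathrooms</term><term>baths</term><term>BA</term></object>
</knowledge>
"""

# A number with optional thousands separators, e.g. "3" or "250,000".
NUM = r"\d+(?:,\d{3})*"


def load_terms(xml_text):
    """Map each object name to the list of label terms that may mark it."""
    root = ET.fromstring(xml_text)
    return {
        obj.get("name"): [t.text for t in obj.findall("term")]
        for obj in root.findall("object")
    }


def extract(record, terms):
    """Return {object name: value} for labeled numbers found in the record."""
    found = {}
    for name, labels in terms.items():
        for label in labels:
            esc = re.escape(label)
            # Accept "number label" (3 beds) or "label number" ($250,000).
            pattern = rf"({NUM})\s*{esc}(?!\w)|(?<!\w){esc}\s*({NUM})"
            m = re.search(pattern, record, re.IGNORECASE)
            if m:
                found[name] = m.group(1) or m.group(2)
                break  # first matching term wins for this object
    return found


if __name__ == "__main__":
    terms = load_terms(DOMAIN_KNOWLEDGE)
    print(extract("$250,000 3 beds, 2 baths", terms))
```

Under these assumptions, moving to another domain would mean supplying a different XML document to `load_terms` while leaving `extract` unchanged, which is the kind of modularity the abstract attributes to an XML-based representation.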