Agents for Intelligent Information Extraction By Using Domain Knowledge and Token-based Morphological Patterns

Author
Joongmin Choi (최중민)
Keywords
Domain Knowledge; Information Extraction; Target Item; Ontological Term; Morphological Pattern
Issue Date
2003-11
Publisher
SPRINGER-VERLAG BERLIN
Citation
Intelligent Agents and Multi-Agent Systems. PRIMA 2003. Lecture Notes in Computer Science, vol. 2891, pp. 74-85
Abstract
Knowledge-based information extraction is known to be flexible in recognizing various kinds of target information, because it exploits domain knowledge to automatically generate information-extraction rules. However, most previous knowledge-based information-extraction systems are applicable only to labeled documents; as a result, ontology terms must appear in the document to guide the system in determining whether the target information is present. To make a knowledge-based information-extraction system general enough to handle both labeled and unlabeled documents, this paper proposes an enhanced scheme of knowledge-based wrapper generation that uses token-based morphological patterns. Each document is represented as a sequence of tokens, rather than a sequence of logical lines, in order to capture the meaning of data fragments more accurately and to recognize the target information contextually. The newly implemented system, XTROS+, is presented and its performance is demonstrated.
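The idea of treating a document as a token sequence and matching morphological patterns over it can be illustrated with a minimal sketch. This is not the XTROS+ implementation: the token classes (`NUM`, `CAPWORD`, `WORD`, `SYM`), the `PRICE_PATTERN` rule, the helper names, and the sample listing text are all hypothetical, chosen only to show how a value might be recognized without a label such as "Price:" appearing in the document.

```python
import re

# Hypothetical token classes; the abstract does not specify the actual
# pattern vocabulary used by XTROS+.
TOKEN_SPEC = [
    ("NUM", r"\d+(?:,\d{3})*(?:\.\d+)?"),  # 3, 1998, 245,000, 12.5
    ("CAPWORD", r"[A-Z][a-z]+"),           # capitalized word
    ("WORD", r"[A-Za-z]+"),                # any other alphabetic run
    ("SYM", r"[^\sA-Za-z0-9]"),            # single punctuation or symbol
]
TOKEN_RE = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in TOKEN_SPEC))


def tokenize(text):
    """Turn raw text into a list of (morphological_class, value) tokens."""
    return [(m.lastgroup, m.group()) for m in TOKEN_RE.finditer(text)]


def find_pattern(tokens, pattern):
    """Return the token values of every window that matches `pattern`.

    `pattern` is a list of (class, literal) pairs; a literal of None
    means any value of that class is accepted.
    """
    n = len(pattern)
    hits = []
    for i in range(len(tokens) - n + 1):
        window = tokens[i:i + n]
        if all(cls == pcls and (lit is None or val == lit)
               for (cls, val), (pcls, lit) in zip(window, pattern)):
            hits.append([val for _, val in window])
    return hits


# Assumed rule: a price is a "$" symbol immediately followed by a number.
PRICE_PATTERN = [("SYM", "$"), ("NUM", None)]

text = "Cozy 3 bed home in Austin, $245,000, built 1998"  # made-up sample
tokens = tokenize(text)
prices = find_pattern(tokens, PRICE_PATTERN)
print(prices)
```

Because the rule is defined over token classes and their order rather than over whole lines, the numbers `3` and `1998` in the same sentence are not mistaken for the price, which is the kind of contextual recognition the abstract describes.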
URI
https://link.springer.com/chapter/10.1007/978-3-540-39896-7_7
https://repository.hanyang.ac.kr/handle/20.500.11754/156478
ISBN
978-3-540-20460-2; 978-3-540-39896-7
DOI
10.1007/978-3-540-39896-7_7
Appears in Collections:
COLLEGE OF ENGINEERING SCIENCES[E](공학대학) > COMPUTER SCIENCE AND ENGINEERING(컴퓨터공학과) > Articles