Training set; Information Extraction; Relation & Event Extraction
Lecture Notes in Electrical Engineering, v. 330, pp. 1101-1107
Although training set design has received less academic attention than its importance warrants, it becomes crucial in big data environments. We propose a novel way to construct a training set for information extraction. The core of the approach is effective data collection that accounts for the trade-off between system quality and annotation difficulty. Instead of collecting data at random, as systems usually do, we use well-defined key expressions as sampling queries. This work is part of an ongoing R&D project; manual annotation is currently in progress, and the approach will be evaluated through the quality of the final system.
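
As a rough illustration of query-based sampling, the sketch below selects annotation candidates by matching key expressions against a corpus instead of drawing sentences at random. It is not the authors' implementation: the function name `sample_by_key_expressions`, the `per_query` cap, the toy corpus, and the use of simple case-insensitive phrase matching are all assumptions made for this example.

```python
import random
import re


def sample_by_key_expressions(sentences, key_expressions, per_query, seed=0):
    """Pick up to `per_query` sentences for each key expression.

    Hypothetical helper: each key expression acts as a sampling query,
    and a sentence is a candidate if it contains the expression as a
    whole phrase (case-insensitive).
    """
    rng = random.Random(seed)
    selected = {}  # sentence index -> first key expression that selected it
    for expr in key_expressions:
        pattern = re.compile(r"\b" + re.escape(expr) + r"\b", re.IGNORECASE)
        matches = [i for i, s in enumerate(sentences) if pattern.search(s)]
        rng.shuffle(matches)  # random choice among matches only
        for i in matches[:per_query]:
            selected.setdefault(i, expr)
    # Return (sentence, query) pairs in corpus order for the annotators.
    return [(sentences[i], selected[i]) for i in sorted(selected)]


# Toy corpus (invented for illustration).
corpus = [
    "Acme Corp acquired Beta Ltd in 2012.",
    "The weather was mild on Tuesday.",
    "Gamma Inc was founded by Jane Doe.",
    "Delta LLC acquired a stake in Epsilon.",
    "He walked to the station.",
]

sample = sample_by_key_expressions(corpus, ["acquired", "founded by"], per_query=1)
for sentence, query in sample:
    print(f"[{query}] {sentence}")
```

Capping the number of sentences per query is one simple way to limit annotation effort while still covering every key expression; the actual balance between system quality and annotation difficulty would depend on choices the abstract does not specify.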