
A Patent Similarity Analysis Program Using Natural Language Processing

Title
A Patent Similarity Analysis Program Using Natural Language Processing
Other Titles
A PATENT SIMILARITY ANALYSIS PROGRAM USING NATURAL LANGUAGE PROCESS
Author
유지원
Advisor(s)
남상원
Issue Date
2007-08
Publisher
Hanyang University
Degree
Master
Abstract
Many inventions are filed every year. At present, to find patent documents with similar technical content among them, a person having ordinary skill in the art narrows the search range by searching keywords and then examining the content of each patent document. However, examining the large number of patent documents published annually requires a great deal of time, cost, and effort. This thesis proposes a model that mechanically analyzes the content of an invention from patent documents and sorts the documents in descending order of similarity. For the mechanical analysis, the proposed model extracts each component and technical characteristic of the invention using natural language processing and calculates the similarity among inventions from the extracted values. Each component of the invention is extracted from the claims using the formal characteristics of patent document description together with natural language processing methods such as morpheme analysis and link parsing. The similarity of inventions is calculated by matching the extracted components and checking whether a matched component exists. Component matching is a logical inference process that decides the correspondence between components. It must decide substantial correspondence even when the linguistic expressions differ, and it involves a probabilistic calculation. To decide substantial correspondence, not only the main component but also the sub-components and technical features of each component are included as weighting factors in the similarity formula. When a user selects a patent document, a preprocessor distinguishes the independent claims from the dependent claims, and each component of the invention is extracted by analyzing the grammar and structure of the sentences.
Comparing the similarity values calculated by the program with the content of the International Search Report (ISR) shows an accuracy of about 67% for inventions belonging to category X in the ISR.
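The weighted component matching described in the abstract can be illustrated with a minimal Python sketch. This record does not give the thesis's actual formula, so everything here is an assumption made for illustration: the `Component` structure, the weights `W_MAIN`, `W_SUB`, and `W_FEAT`, and the use of `difflib.SequenceMatcher` as a stand-in for the inference step that decides correspondence between differently worded components.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from difflib import SequenceMatcher


@dataclass
class Component:
    """One component extracted from a claim (hypothetical structure)."""
    name: str
    sub_components: list[str] = field(default_factory=list)
    features: list[str] = field(default_factory=list)


# Assumed weights; the thesis's real weighting factors are not given in this record.
W_MAIN, W_SUB, W_FEAT = 0.6, 0.25, 0.15


def text_match(a: str, b: str) -> float:
    """Surface string similarity in [0, 1]; a stand-in for the inference step."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def set_match(terms: list[str], candidates: list[str]) -> float:
    """Average, over terms, of each term's best match among candidates."""
    if not terms:
        return 0.0
    best = (max((text_match(t, c) for c in candidates), default=0.0) for t in terms)
    return sum(best) / len(terms)


def component_score(a: Component, b: Component) -> float:
    """Weighted match of main name, sub-components, and features of a against b."""
    parts = [(W_MAIN, text_match(a.name, b.name))]
    if a.sub_components:
        parts.append((W_SUB, set_match(a.sub_components, b.sub_components)))
    if a.features:
        parts.append((W_FEAT, set_match(a.features, b.features)))
    # Normalize by the weights actually used so identical components score 1.0.
    total = sum(w for w, _ in parts)
    return sum(w * s for w, s in parts) / total


def invention_similarity(query: list[Component], other: list[Component]) -> float:
    """Average best component score of each query component against the other invention."""
    if not query:
        return 0.0
    best = (max((component_score(q, o) for o in other), default=0.0) for q in query)
    return sum(best) / len(query)


def rank(query: list[Component], corpus: dict[str, list[Component]]) -> list[tuple[str, float]]:
    """Sort documents in descending order of similarity to the query invention."""
    scored = [(doc_id, invention_similarity(query, comps)) for doc_id, comps in corpus.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


query = [Component("rotor", ["blade"], ["made of steel"])]
corpus = {
    "doc_b": [Component("keyboard", ["key"], ["plastic housing"])],
    "doc_a": [Component("rotor", ["blade"], ["made of steel"])],
}
ranked = rank(query, corpus)
print(ranked[0][0])
```

In this sketch the document whose components match the query exactly is ranked first. A real system would replace `text_match` with a matcher that handles differently worded but substantially equivalent components, as the abstract requires.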
URI
https://repository.hanyang.ac.kr/handle/20.500.11754/149179
http://hanyang.dcollection.net/common/orgView/200000407512
Appears in Collections:
GRADUATE SCHOOL OF ENGINEERING[S](공학대학원) > ELECTRONIC & ELECTRICAL ENGINEERING(전기 및 전자공학과) > Theses(Master)
Files in This Item:
There are no files associated with this item.