Repository at Hanyang University: Improving the performance of similarity assessment tools using code2vec

Full metadata record

DC Field	Value	Language
dc.contributor.author	도경구	-
dc.date.accessioned	2023-08-22T01:52:48Z	-
dc.date.available	2023-08-22T01:52:48Z	-
dc.date.issued	2021-01	-
dc.identifier.citation	Journal of Software Assessment and Valuation, vol. 17, no. 1, pp. 31-40	-
dc.identifier.issn	2092-8114;2733-4384	-
dc.identifier.uri	https://www.kci.go.kr/kciportal/ci/sereArticleSearch/ciSereArtiView.kci?sereArticleSearchBean.artiId=ART002726049	en_US
dc.identifier.uri	https://repository.hanyang.ac.kr/handle/20.500.11754/185679	-
dc.description.abstract	Source code plagiarism is the act of using original material as one's own without clearly crediting its source. It causes a range of problems, up to and including legal disputes. Whether source code has been plagiarized is usually decided by measuring similarity through an exhaustive comparison of every pair of source files in the software projects under comparison. Exhaustive comparison also covers code that has no chance of being plagiarized, so the time spent on those pairs is wasted; this computational burden has been a major reason that more accurate tools go unused. If only the pairs suspected of plagiarism are selected for comparison, the number of comparisons falls, which speeds up the detection tool and lets the effort go into improving detection accuracy on the parts most likely to be plagiarized. In this paper, we show that pre-classifying source code suspected of being code clones with a machine-learning model called code2vec reduces the number of comparisons and thereby improves the performance of source code plagiarism detection.	-
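dc.description.sponsorship	"This research was carried out as a result of the SW-Centered University Support Program of the Ministry of Science and ICT and the Institute of Information & Communications Technology Planning & Evaluation" (2018-0-00192)	-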
dc.language	ko	-
dc.publisher	한국소프트웨어감정평가학회	-
dc.subject	program similarity	-
dc.subject	program plagiarism	-
dc.subject	machine learning	-
dc.subject	code clone	-
dc.subject	code comparison	-
dc.title	Improving the performance of similarity assessment tools using code2vec	-
dc.title.alternative	Enhancing the performance of code-clone detection tools using code2vec	-
dc.type	Article	-
dc.relation.no	1	-
dc.relation.volume	17	-
dc.relation.page	31-40	-
dc.relation.journal	Journal of Software Assessment and Valuation	-
dc.contributor.googleauthor	엄태호	-
dc.contributor.googleauthor	홍성문	-
dc.contributor.googleauthor	양준혁	-
dc.contributor.googleauthor	장효석	-
dc.contributor.googleauthor	도경구	-
dc.sector.campus	E	-
dc.sector.daehak	College of Computing	-
dc.sector.department	Computer Science	-
dc.identifier.pid	doh	-
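
The pre-filtering idea in the abstract can be illustrated with a minimal sketch. It assumes each source file has already been mapped to an embedding vector (for example by a code2vec model) and uses a cosine-similarity threshold to pick candidate pairs. The record does not describe how the paper classifies code, so the threshold, the function names `cosine` and `candidate_pairs`, and the toy vectors are all hypothetical and serve only as illustration.

```python
from itertools import product
from math import sqrt


def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def candidate_pairs(project_a, project_b, threshold=0.8):
    """Return cross-project file pairs whose embeddings are similar enough.

    Only these pairs would be handed to the expensive similarity tool.
    The threshold value is an assumption, not taken from the paper.
    """
    return [
        (a, b)
        for a, b in product(sorted(project_a), sorted(project_b))
        if cosine(project_a[a], project_b[b]) >= threshold
    ]


# Toy vectors standing in for code2vec output (made up for illustration).
project_a = {"A1.java": [0.9, 0.1, 0.0], "A2.java": [0.0, 0.1, 0.9]}
project_b = {"B1.java": [0.8, 0.2, 0.1], "B2.java": [0.1, 0.9, 0.0]}

pairs = candidate_pairs(project_a, project_b)
total = len(project_a) * len(project_b)
print(f"{len(pairs)} of {total} pairs kept for detailed comparison: {pairs}")
```

In this toy setting only one of the four cross-project pairs passes the filter, so the detailed comparison would run once instead of four times.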

Appears in Collections: COLLEGE OF COMPUTING[E] > COMPUTER SCIENCE > Articles
