Repository at Hanyang University: Selectivity Estimation Using Frequent Itemset Mining

Full metadata record

DC Field	Value	Language
dc.contributor.author	이춘화	-
dc.date.accessioned	2016-09-12T01:52:58Z	-
dc.date.available	2016-09-12T01:52:58Z	-
dc.date.issued	2015-02	-
dc.identifier.citation	한국지식정보기술학회 논문지(Journal of Knowledge Information Technology and Systems), v. 10, NO 1, Page. 69-78	en_US
dc.identifier.issn	1975-7700	-
dc.identifier.uri	http://www.kkits.or.kr/pds/2015/2015-10-1-07.pdf	-
dc.identifier.uri	http://hdl.handle.net/20.500.11754/23097	-
dc.description.abstract	In query processing, query optimization is an important function of a database management system, since overall query execution time can be significantly affected by the quality of the plan chosen by the query optimizer. Under cost-based optimization, a query optimizer estimates the cost of each candidate query plan based on the underlying data distribution, as summarized in synopses of the database relations. The most common synopses in commercial databases have been histograms. However, when the data are correlated, one-dimensional histograms can provide poor estimation quality. Motivated by this, we propose a new approach that performs more accurate selectivity estimation, even for correlated data. To deal with the correlation that may exist among the data, we adopt well-known data mining techniques and use frequent itemset mining to extract attribute values that frequently occur together. Through experimentation, we found that our approach is effective in modeling correlations and that it approximates intermediate relations more accurately. In particular, it gives precise estimates for correlated data. Korean abstract (translated): Because the query plan chosen by the query optimizer greatly affects overall query execution speed, the query optimization function of a database management system is important in query processing. In cost-based optimization, the query optimizer estimates the costs of all possible query plans based on information about the data distribution of the database. In commercial DBMSs, the most commonly used data distribution statistics are built as histograms. However, when the data are correlated, the one-dimensional histogram method produces poor estimates. In this paper, we apply frequent itemset mining, a data mining technique, so that more accurate query costs can be computed during query optimization and used to select a query plan. Through experiments, we confirmed that the proposed method gives better estimates than histograms for correlated data.	en_US
dc.description.sponsorship	ICT R&D program of MSIP/IITP. [2014-044-042-001,Development of Open Screen Service Platform with Cooperative and Distributed Multiple Irregular Screens] and by Basic Science Research Program through the National Research Foundation of Korea(NRF) funded by the Ministry of Education(No. 2013R1A1A2007616)	en_US
dc.language.iso	en	en_US
dc.publisher	한국지식정보기술학회	en_US
dc.subject	Query optimization	en_US
dc.subject	Correlated data	en_US
dc.subject	Database management system	en_US
dc.subject	Frequent itemsets	en_US
dc.title	Selectivity Estimation Using Frequent Itemset Mining	en_US
dc.type	Article	en_US
dc.relation.no	2	-
dc.relation.volume	10	-
dc.relation.page	69-78	-
dc.relation.journal	한국지식정보기술학회 논문지	-
dc.contributor.googleauthor	Eom, Boyun	-
dc.contributor.googleauthor	Jermaine, Christopher	-
dc.contributor.googleauthor	Lee, Choonhwa	-
dc.relation.code	2015041354	-
dc.sector.campus	S	-
dc.sector.daehak	COLLEGE OF ENGINEERING[S]	-
dc.sector.department	DEPARTMENT OF COMPUTER SCIENCE	-
dc.identifier.pid	lee	-
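
The abstract above contrasts one-dimensional histograms with frequent itemsets for estimating the selectivity of predicates over correlated attributes. The following is a minimal Python sketch of that contrast, not the authors' implementation. The table, the attribute names (`city`, `country`), the function names, and the brute-force counting of itemsets of size 1 and 2 are all illustrative assumptions. The article's actual mining algorithm and synopsis format are not described in this record.

```python
from collections import Counter
from itertools import combinations

def mine_frequent_itemsets(rows, min_support):
    """Brute-force count of (attribute, value) itemsets of size 1 and 2.

    Returns {frozenset of (attr, value) pairs: relative support} for every
    itemset whose support is at least min_support. A real miner (for example
    Apriori or FP-growth) would prune candidates instead of counting all.
    """
    n = len(rows)
    counts = Counter()
    for row in rows:
        items = sorted(row.items())
        for size in (1, 2):
            for combo in combinations(items, size):
                counts[frozenset(combo)] += 1
    return {s: c / n for s, c in counts.items() if c / n >= min_support}

def independence_estimate(rows, predicate):
    """Multiply per-attribute selectivities, as one-dimensional
    histograms implicitly do when attributes are assumed independent."""
    n = len(rows)
    selectivity = 1.0
    for attr, value in predicate.items():
        selectivity *= sum(1 for r in rows if r[attr] == value) / n
    return selectivity

def itemset_estimate(itemsets, rows, predicate):
    """Use the mined joint support when the conjunction is a frequent
    itemset; otherwise fall back to the independence estimate."""
    key = frozenset(predicate.items())
    if key in itemsets:
        return itemsets[key]
    return independence_estimate(rows, predicate)

# Hypothetical, perfectly correlated data: city determines country.
rows = (
    [{"city": "Seoul", "country": "KR"}] * 4
    + [{"city": "Houston", "country": "US"}] * 6
)
itemsets = mine_frequent_itemsets(rows, min_support=0.3)
predicate = {"city": "Seoul", "country": "KR"}

true_selectivity = sum(
    1 for r in rows if all(r[a] == v for a, v in predicate.items())
) / len(rows)
independent = independence_estimate(rows, predicate)
mined = itemset_estimate(itemsets, rows, predicate)
```

On this toy table the conjunction `city = 'Seoul' AND country = 'KR'` matches 4 of 10 rows. The independence estimate multiplies 0.4 by 0.4 and so underestimates the result, while the mined itemset support matches the true selectivity.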

Appears in Collections: COLLEGE OF ENGINEERING[S](공과대학) > COMPUTER SCIENCE AND ENGINEERING(컴퓨터공학부) > Articles

Files in This Item: Selectivity Estimation Using Frequent Itemset Mining.pdf
