Improving Recommendation Systems by Exploiting Notions of Uninteresting Items, Trust Networks, and Category Experts

Title
추천 시스템 개선을 위한 무관심 아이템, 신뢰 네트워크, 카테고리 전문가 활용 방안
Other Titles
Improving Recommendation Systems by Exploiting Notions of Uninteresting Items, Trust Networks, and Category Experts
Author
황원석
Alternative Author(s)
Hwang, Won-Seok
Advisor(s)
김상욱
Issue Date
2016-02
Publisher
Hanyang University
Degree
Doctor
Abstract
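A recommendation system is a technology that automatically provides the items a user is most likely to prefer, and collaborative filtering (CF) is the most widely studied method among such systems. CF uses the ratings left by users to find neighbor users whose tastes are similar to those of the active user (the user receiving the recommendation) and recommends the items those neighbors prefer. However, because most users rate only a few items, CF has only a small number of ratings to work with, which gives rise to the data sparsity problem and lowers accuracy. In addition, performance problems arise because the numbers of users and items keep growing. To improve the accuracy and performance of CF, this dissertation proposes four approaches that exploit (1) uninteresting items, (2) trust networks, and (3) category experts.
An uninteresting item is an item that a user found unappealing and ignored without even using it. Because users leave no explicit ratings on such items, existing CF approaches could not make use of them. We make CF take uninteresting items into account and thereby obtain more accurate recommendations. Excluding uninteresting items entirely from the recommendation candidates also reduces the number of items to analyze, which improves performance.
A trust network is a kind of social network that represents the trust relationships between users. Existing approaches, instead of searching for the active user's neighbors, use the users connected to the active user in the trust network. They regard only the users whom the active user trusts (trustees) as similar to the active user. We assume that the users who trust the active user (trustors) also have tastes similar to the active user's. Based on this assumption, using trustors together with trustees yields more accurate recommendations.
Extending this idea, this dissertation proposes a data imputation approach that exploits the trust network. Existing imputation methods use only ratings, but exploiting the trust network, which carries information different from the ratings, should give more accurate imputed values. Moreover, unlike existing imputation methods, the proposed approach fills in only those values that can be inferred accurately. As a result, the accuracy of CF improves by using the imputed ratings together with the ratings given by users.
One of the bottlenecks that cause performance problems in CF is the process of finding the active user's neighbors. To address this, we propose a new approach that, instead of finding neighbors for each user, selects category experts for each category and makes recommendations through them. We define a category expert as a user who has rated the most items in the category. As a result, the computational cost of selecting and maintaining category experts is greatly reduced, so high performance can be maintained. Moreover, because users tend to follow experts' opinions in real life, there should be no loss in accuracy.
We evaluated the accuracy and performance of each approach on real-world data. The first approach showed about 5 times higher accuracy than existing approaches and produced recommendations 1.2 to 2.3 times faster. The second approach showed up to 1.9 times higher coverage than existing approaches and improved accuracy by 2%. The third approach improved accuracy by about 3% over existing approaches, and by 6% for cold-start users, for whom recommendation is especially difficult. Finally, the fourth approach improved accuracy by 5% over existing approaches, ran 9 times faster, and improved coverage by 10%.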
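The category-expert idea described above (an expert is a user who has rated the most items in a category) can be illustrated with a minimal sketch. The function names, the data layout, the tie-breaking rule, and the use of an unweighted mean of expert ratings are all assumptions made for illustration; the record does not specify how the dissertation aggregates expert opinions.

```python
from collections import defaultdict


def select_category_experts(ratings, item_category, k=1):
    """Pick, per category, the k users who rated the most items in it.

    ratings: dict mapping (user, item) -> rating value (hypothetical layout).
    item_category: dict mapping item -> category name (hypothetical layout).
    """
    counts = defaultdict(lambda: defaultdict(int))
    for user, item in ratings:
        counts[item_category[item]][user] += 1
    experts = {}
    for category, per_user in counts.items():
        # Sort by rating count (descending); break ties by user name (assumption).
        ranked = sorted(per_user.items(), key=lambda kv: (-kv[1], kv[0]))
        experts[category] = [user for user, _ in ranked[:k]]
    return experts


def predict_from_experts(ratings, experts, item_category, item):
    """Unweighted mean of the expert ratings for `item` (an assumed aggregation).

    Returns None when no expert of the item's category has rated the item.
    """
    category_experts = experts.get(item_category[item], [])
    values = [ratings[(u, item)] for u in category_experts if (u, item) in ratings]
    return sum(values) / len(values) if values else None


# Toy data, invented for this example only.
ratings = {
    ("ann", "m1"): 5, ("ann", "m2"): 4, ("bob", "m1"): 3,
    ("bob", "b1"): 2, ("cat", "b1"): 5, ("cat", "b2"): 4,
}
item_category = {"m1": "movie", "m2": "movie", "m3": "movie",
                 "b1": "book", "b2": "book"}

experts = select_category_experts(ratings, item_category, k=1)
print(experts)
print(predict_from_experts(ratings, experts, item_category, "m1"))
```

Because the experts are selected once per category by a simple count, no per-user similarity search is needed at recommendation time, which is the source of the speed-up the abstract reports.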
A recommendation system is a technique that provides an active user with a few items that she would like. Among the various kinds of recommendation systems, collaborative filtering (CF) is one of the most popular and effective. CF approaches use users' ratings to find neighbors whose tastes are similar to the active user's and recommend the items those neighbors rated highly. However, most users evaluate only a few items, leading to the data sparsity problem, which causes low accuracy and coverage in CF approaches. In addition, as the numbers of users, items, and ratings increase, analysis and prediction take longer. To improve the accuracy and performance of CF approaches, this dissertation proposes four approaches built on three notions: (1) uninteresting items, (2) trust networks, and (3) category experts.
Uninteresting items are items that users find unattractive and are therefore unlikely to purchase or use. Because users do not rate uninteresting items, existing CF approaches ignore them in their recommendations. Our first approach improves accuracy by applying the notion of uninteresting items to the framework of any existing CF approach. It also improves performance: because it completely prevents uninteresting items from being recommended as top-N items, it reduces the number of items whose preferences have to be predicted.
A trust network is a kind of social network that represents the trust relationships between users. Several existing approaches examine the trust network to find users whose tastes are similar to an active user's. They assume that only trustees (i.e., users whom a certain user trusts) are similar to the active user, but we argue that trustors (i.e., users who trust a certain user) are also similar to her. Based on this assumption, our second approach uses trustors as well as trustees and improves the accuracy and coverage of CF approaches.
Extending this idea, our third approach is a novel imputation approach that exploits the trust network to enrich the collaborative information. Unlike existing approaches, which utilize only users' ratings, our approach additionally utilizes the trust network, which carries information different from the ratings. In addition, rather than imputing all missing ratings, it imputes only those whose values are likely to be inferred correctly. Because of this careful imputation with richer information, our approach improves the accuracy of CF approaches.
One of the bottlenecks of CF approaches is finding the neighbors of an active user. To alleviate this performance problem, our fourth approach utilizes the concept of category experts, who have a large amount of knowledge in their own category. Our approach defines category experts as those users who evaluate more items than others. Using category experts instead of neighbors is more efficient because finding (or maintaining) category experts costs much less computation than finding (or maintaining) neighbors. Furthermore, our approach produces accurate results because users tend to follow the opinions of experts in the real world.
Through comprehensive experiments with several real-world datasets, this dissertation demonstrates that our approaches exploiting uninteresting items and trust networks improve accuracy and coverage compared to existing CF approaches. The fourth approach is more efficient in terms of execution time because category experts are much cheaper to find and maintain than neighbors.
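The trustee/trustor distinction in the abstract can be made concrete with a small sketch. It treats the trust network as a set of directed edges (truster, trusted) and predicts a rating as the unweighted mean over the chosen neighbors. The edge representation, the function names, and the unweighted mean are assumptions for illustration, not the dissertation's actual formulation.

```python
def trust_neighbors(trust_edges, active_user, use_trustors=True):
    """Collect the active user's neighbors in a directed trust network.

    trust_edges: set of (u, v) pairs meaning "u trusts v" (hypothetical layout).
    Trustees are users the active user trusts; trustors are users who trust her.
    """
    trustees = {v for (u, v) in trust_edges if u == active_user}
    if not use_trustors:
        return trustees
    trustors = {u for (u, v) in trust_edges if v == active_user}
    return trustees | trustors


def predict_rating(ratings, trust_edges, active_user, item, use_trustors=True):
    """Unweighted mean of neighbor ratings for `item` (an assumed aggregation).

    Returns None when no neighbor has rated the item, i.e. the item is not covered.
    """
    neighbors = trust_neighbors(trust_edges, active_user, use_trustors)
    values = [ratings[(n, item)] for n in sorted(neighbors) if (n, item) in ratings]
    return sum(values) / len(values) if values else None


# Toy data, invented for this example only.
trust_edges = {("ann", "bob"), ("cat", "ann"), ("bob", "dan")}
ratings = {("bob", "i1"): 4, ("cat", "i1"): 2, ("cat", "i2"): 5, ("dan", "i1"): 1}

# With trustees only, item i2 cannot be predicted for ann; adding trustors covers it.
print(predict_rating(ratings, trust_edges, "ann", "i2", use_trustors=False))
print(predict_rating(ratings, trust_edges, "ann", "i2", use_trustors=True))
```

In this toy network the only user who rated i2 is a trustor of the active user, so including trustors turns an unpredictable item into a predictable one, which is the kind of coverage gain the abstract attributes to the second approach.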
URI
https://repository.hanyang.ac.kr/handle/20.500.11754/126400
http://hanyang.dcollection.net/common/orgView/200000428899
Appears in Collections:
GRADUATE SCHOOL[S](대학원) > ELECTRONICS AND COMPUTER ENGINEERING(전자컴퓨터통신공학과) > Theses (Ph.D.)