Pre-Evaluation for Prediction Accuracy by Using the Customer's Ratings in Collaborative Filtering

Original Korean title: 협업필터링에서 고객의 평가치를 이용한 선호도 예측의 사전평가에 관한 연구

Seok Jun Lee, Sun Ok Kim

Abstract

The development of computer and information technology, combined with the internet as an information superhighway, has spread information widely, not only in specialized fields but also in people's daily lives. This ubiquity of information is changing the traditional way of doing transactions and is leading to a new form of e-commerce that differs from the existing one, in which not only physical goods but also non-physical services are traded. As the scale of e-commerce grows, however, it becomes harder for people to find the information they want. Recommender systems are becoming the main tools with which e-commerce mitigates this information overload.

Recommender systems can be defined as systems that suggest items (goods or services) in light of customers' interests or tastes. E-commerce web sites use them to suggest products to customers who are looking for something and to provide information that helps them decide what to purchase. There are several approaches to recommending goods to customers, but this study focuses on the collaborative filtering technique. It presents the possibility of pre-evaluating how well a customer's preferences can be predicted in collaborative filtering before the preference prediction process is run. In this pre-evaluation, customers for whom prediction performance is expected to be low are classified, before the prediction process, by using statistical features of the ratings each customer has given.

In this study, the MovieLens 100K dataset is used to analyze the accuracy of the classification. The classification criteria are set using the training set, which comprises 80% of the 100K dataset. In the classification process, customers are divided into two groups, a classified group and a non-classified group. To compare the prediction performance of the two groups, the remaining 20% test set is run through the Neighborhood Based Collaborative Filtering Algorithm and the Correspondence Mean Algorithm. The prediction errors from these algorithms are allocated to each customer, and each user's error is compared.

Research hypotheses

Two research hypotheses are formulated in this study to test the accuracy of the classification criteria.

Hypothesis 1: The estimation accuracy of groups classified according to the standard deviation of each user's ratings differs significantly.

To test Hypothesis 1, the standard deviation of the ratings is calculated for each user in the training set (80% of the MovieLens 100K dataset). Users are classified into four groups according to the quartiles of these standard deviations, and the estimation errors of each group on the test set are tested for significant differences.

Hypothesis 2: The estimation accuracy of groups classified according to the distribution of each user's ratings differs significantly.

To test Hypothesis 2, the distribution of each user's ratings is compared with the distribution of the ratings of all customers in the training set. On the assumption that customers whose rating distributions differ from that of all customers will have low prediction performance, six types of differing distributions are defined for comparison. For each assumed type, customers are classified into a fit group or a non-fit group: the agreement between each customer's distribution and the assumed distribution is assessed with a goodness-of-fit test, and the mean errors of the two groups are then tested for a difference.

The degree of goodness of fit between each user's rating distribution and the average distribution of ratings in the training set is also closely related to the prediction errors from these prediction algorithms. Through this study, customers whose prediction performance is lower than that of the other customers in the system are classified before the prediction process by these two criteria, which are set from statistical features of the customers' ratings in the training set.
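
The two classification criteria described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. Everything named here is an assumption made for illustration: the function names, the use of the population standard deviation, the "exclusive" quartile method of the standard library, and in particular the choice of a Pearson chi-square goodness-of-fit test at the 5% level (critical value 9.488 for 4 degrees of freedom), since the abstract does not say which goodness-of-fit test or significance level was used. The sketch takes any reference distribution as input, so it does not reproduce the six distribution types mentioned in the abstract.

```python
from collections import Counter, defaultdict
from statistics import pstdev, quantiles

RATING_LEVELS = (1, 2, 3, 4, 5)  # MovieLens 100K ratings are whole stars, 1 to 5
CHI2_CRITICAL = 9.488            # assumption: chi-square test, df = 4, 5% level


def ratings_by_user(triples):
    """Group (user, item, rating) training triples into {user: [ratings]}."""
    grouped = defaultdict(list)
    for user, _item, rating in triples:
        grouped[user].append(rating)
    return dict(grouped)


def std_quartile_groups(user_ratings):
    """Hypothesis 1: label each user 1..4 by the quartile of their rating std. dev."""
    stds = {user: pstdev(r) for user, r in user_ratings.items()}
    q1, q2, q3 = quantiles(stds.values(), n=4)
    # Each comparison is True (1) or False (0), so the sum lands in 1..4.
    return {user: 1 + (s > q1) + (s > q2) + (s > q3) for user, s in stds.items()}


def overall_proportions(user_ratings):
    """Share of each rating level over all users' training ratings."""
    counts = Counter(r for ratings in user_ratings.values() for r in ratings)
    total = sum(counts.values())
    return {level: counts[level] / total for level in RATING_LEVELS}


def chi_square_stat(ratings, reference):
    """Pearson chi-square statistic of one user's ratings against `reference`."""
    observed = Counter(ratings)
    n = len(ratings)
    return sum((observed[level] - n * p) ** 2 / (n * p)
               for level, p in reference.items() if p > 0)


def fit_groups(user_ratings, reference):
    """Hypothesis 2: 'fit' if the reference distribution is not rejected, else 'non-fit'."""
    return {user: "fit" if chi_square_stat(r, reference) <= CHI2_CRITICAL else "non-fit"
            for user, r in user_ratings.items()}
```

Both groupings are computed from the training ratings only, which matches the idea of classifying customers before any prediction is made. The per-user prediction errors from the test set would then be compared across the resulting groups. For users with few ratings the expected counts are small, so the chi-square approximation in this sketch is rough.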

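Table of Contents

abstract
 Ⅰ. Introduction
 Ⅱ. Theoretical Background
  2.1 Recommender Systems
  2.2 Collaborative Filtering
  2.3 Preference Prediction Algorithms
  2.4 Evaluation Measures for Preference Prediction Accuracy
 Ⅲ. Hypotheses and Experimental Design
  3.1 Experimental Dataset
  3.2 Research Hypotheses
  3.3 Experimental Design for Hypothesis Testing
 Ⅳ. Hypothesis Testing through Experiments
  4.1 Preference Prediction for Hypothesis Testing
  4.2 Hypothesis Testing
  4.3 Analysis Using the Hypothesis Test Results
  4.4 Proposal of a Prediction Method to Improve Prediction Performance for the Classified Customers
  4.5 Summary of Experimental Results
 Ⅴ. Conclusion and Implications
 <References>
 About the Authors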

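Author Information

  • Seok Jun Lee (이석준). Adjunct Professor, Department of Business Administration, College of Business and Economics, Sangji University
  • Sun Ok Kim (김선옥). Professor, School of Information and Communication Engineering, Halla University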
