DIGITAL LIBRARY ARCHIVE
Product Evaluation Criteria Extraction through Online Review Analysis: Using LDA and k-Nearest Neighbor Approach
Ji Hyeon Lee (Department of Philosophy, College of Humanities, Hanyang University)
Sang Hyung Jung (School of Business, Hanyang University)
Jun Ho Kim (Department of Mathematics, College of Natural Sciences, Hanyang University)
Eun Joo Min (School of Finance, Hanyang University)
Un Yeong Yeo (School of Finance, Hanyang University; School of Business Informatics, Hanyang University)
Jong Woo Kim (School of Business, Hanyang University)
Vol. 26, No. 1, Page: 97 ~ 117
10.13088/jiis.2020.26.1.097
Keywords
product evaluation criteria, review analysis, extracting evaluation criteria, LDA, k-NN

Abstract
Product evaluation criteria are indicators describing the attributes or values of products, which enable users or manufacturers to measure and understand the products. When companies analyze their products or compare them with competitors' products, appropriate criteria must be selected for objective evaluation. The criteria should reflect the product features that consumers consider when they purchase, use, and evaluate the products. However, current evaluation criteria do not reflect consumer opinions, which differ from product to product. Previous studies tried to use online reviews from e-commerce sites, which reflect consumer opinions, to extract the features and topics of products and use them as evaluation criteria. However, these approaches remain limited: they still produce criteria that are irrelevant to the products, because improper extracted words are not refined. To overcome this limitation, this research suggests an LDA-k-NN model, which extracts candidate criteria words from online reviews using LDA and refines them with k-nearest neighbor.
The proposed approach starts with a preparation phase, which consists of 6 steps. First, review data are collected from e-commerce websites. Most e-commerce websites classify their items into high-level, middle-level, and low-level categories. Review data for the preparation phase are gathered from each middle-level category and later merged to represent a single high-level category. Next, nouns, adjectives, adverbs, and verbs are extracted from the reviews using part-of-speech information from a morpheme analysis module. After preprocessing, LDA yields the words for each topic in the reviews, and only the nouns among the topic words are chosen as potential criteria words. Then, the words are tagged according to their possibility of being criteria for each middle-level category. Next, every tagged word is vectorized with a pre-trained word embedding model. Finally, a k-nearest neighbor case-based approach is used to classify each word according to its tag.
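The final step of the preparation phase, classifying a word vector by its tagged neighbors, can be illustrated with a minimal sketch. The abstract does not state the distance measure, the value of k, or the tag names, so cosine similarity, k=3, the tags "criterion" and "other", and the toy 2-dimensional vectors below are all assumptions made for illustration; real vectors would come from the pre-trained word embedding model.

```python
import math
from collections import Counter


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def knn_label(query, cases, k=3):
    """Return the majority tag among the k tagged cases most similar to `query`.

    `cases` is a list of (vector, tag) pairs, i.e. the tagged case base.
    """
    ranked = sorted(cases, key=lambda case: cosine(query, case[0]), reverse=True)
    votes = Counter(tag for _, tag in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical case base: toy vectors standing in for embedded, tagged words.
CASES = [
    ((0.9, 0.1), "criterion"),
    ((0.8, 0.2), "criterion"),
    ((0.7, 0.3), "criterion"),
    ((0.1, 0.9), "other"),
    ((0.2, 0.8), "other"),
    ((0.3, 0.9), "other"),
]

print(knn_label((0.85, 0.15), CASES))  # nearest cases are all tagged "criterion"
print(knn_label((0.10, 0.80), CASES))  # nearest cases are all tagged "other"
```

An odd k with two tags avoids tied votes; with more tags, a tie-breaking rule would have to be chosen.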
Once the preparation phase is set up, the criteria extraction phase is conducted on low-level categories.
This phase starts with crawling reviews in the corresponding low-level category. The same preprocessing as in the preparation phase is conducted using the morpheme analysis module and LDA. Candidate criteria words are obtained by taking the nouns from the data and are vectorized with the pre-trained word embedding model. Finally, evaluation criteria are extracted by refining the candidate criteria words using the k-nearest neighbor approach and the reference proportion of each word in the word set.
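The refinement step at the end of the extraction phase can be sketched as a two-part filter. The abstract does not define "reference proportion" precisely, so this sketch assumes it means the share of reviews that mention a word; the threshold `min_ratio`, the function names, and the sample reviews are hypothetical, and the `is_criterion` predicate is a stand-in for the k-nearest neighbor classifier.

```python
def mention_ratio(word, reviews):
    """Share of reviews (each a list of tokens) that mention `word` at least once."""
    if not reviews:
        return 0.0
    return sum(1 for tokens in reviews if word in tokens) / len(reviews)


def refine(candidates, reviews, is_criterion, min_ratio=0.3):
    """Keep candidates that the classifier accepts AND that enough reviews mention."""
    return [
        word
        for word in candidates
        if is_criterion(word) and mention_ratio(word, reviews) >= min_ratio
    ]


# Hypothetical tokenized reviews for one low-level category.
REVIEWS = [
    ["battery", "lasts", "long", "delivery", "fast"],
    ["screen", "bright", "battery", "good"],
    ["delivery", "slow", "screen", "fine"],
    ["battery", "weak", "strap", "nice"],
]

# Stand-in for the k-NN decision: words the classifier would tag as criteria.
LIKELY_CRITERIA = {"battery", "screen", "strap"}

result = refine(
    ["battery", "screen", "delivery", "strap"],
    REVIEWS,
    is_criterion=lambda word: word in LIKELY_CRITERIA,
    min_ratio=0.5,
)
print(result)  # ['battery', 'screen']
```

Here "delivery" is dropped by the classifier stand-in even though it is mentioned often, and "strap" is dropped because only one review in four mentions it.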
To evaluate the performance of the proposed model, an experiment was conducted with reviews from '11st', one of the biggest e-commerce companies in Korea. The review data came from the 'Electronics/Digital' section, one of the high-level categories in 11st. The suggested model was compared with three alternatives: the actual criteria of 11st, a model that extracts nouns with the morpheme analysis module and refines them by word frequency, and a model that extracts nouns from LDA topics and refines them by word frequency. The evaluation task was to predict the evaluation criteria of 10 low-level categories with the suggested model and the 3 models above.
Criteria words extracted from each model were combined into a single word set, which was used for the survey questionnaires. In the survey, respondents chose every item they considered an appropriate criterion for each category. A model scored whenever a chosen word was one that the model had extracted. The suggested model had higher scores than the other models in 8 out of 10 low-level categories. By conducting paired t-tests on the scores of each model, we confirmed that the suggested model showed better performance in 26 tests out of 30. In addition, the suggested model was the best model in terms of accuracy.
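The scoring rule and the paired comparison described above can be sketched as follows. The exact scoring formula is not given in the abstract, so counting one point per chosen word that a model extracted is an assumption, and the respondent scores are made-up numbers used only to show the arithmetic. The sketch computes the paired t statistic only; turning it into a p-value requires the t distribution with n - 1 degrees of freedom.

```python
import math
import statistics


def model_score(chosen_words, extracted_words):
    """One point for each word a respondent chose that the model had extracted."""
    return len(set(chosen_words) & set(extracted_words))


def paired_t(scores_a, scores_b):
    """Paired t statistic: mean of per-respondent differences over its standard error."""
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    sd = statistics.stdev(diffs)  # sample standard deviation (n - 1 denominator)
    if sd == 0:
        raise ValueError("all differences are identical; t statistic is undefined")
    return statistics.mean(diffs) / (sd / math.sqrt(len(diffs)))


# Hypothetical per-respondent scores for two models in one category.
suggested = [4, 5, 3, 5]
baseline = [2, 3, 2, 2]

print(model_score(["battery", "screen", "price"], ["battery", "price", "color"]))  # 2
print(round(paired_t(suggested, baseline), 3))  # 4.899
```

With differences of 2, 2, 1, and 3, the mean is 2 and the sample standard deviation is about 0.816, which gives t of about 4.899 on 3 degrees of freedom.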
This research proposes an evaluation criteria extraction method that combines topic extraction using LDA with refinement by the k-nearest neighbor approach. This method overcomes the limits of previous dictionary-based models and frequency-based refinement models. This study can contribute to improving review analysis for deriving business insights in the e-commerce market.
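Detailed Information in Korean (English translation)
Product Evaluation Criteria Extraction through Online Review Analysis: Using LDA and the k-Nearest Neighbor Approach
Ji Hyeon Lee (Hanyang University)
Sang Hyung Jung (Hanyang University)
Jun Ho Kim (Hanyang University)
Eun Joo Min (Hanyang University)
Un Yeong Yeo (Hanyang University)
Jong Woo Kim (Hanyang University)
Keywords
product evaluation criteria, review analysis, evaluation criteria extraction, LDA, k-NN
Abstract
Product evaluation criteria are indicators that express the attributes and values of a product, enabling users and companies to measure and understand it. For a company to evaluate and compare its own products objectively, selecting appropriate criteria is essential. These criteria should reflect the product features that consumers consider when they evaluate a product after actually purchasing and using it. However, the evaluation criteria used so far fail to reflect consumer opinions, which differ from product to product. Previous studies extracted product features and topics from online reviews, which reflect consumer opinions, and used them as evaluation criteria. Nevertheless, limitations remain: criteria with low relevance to the product are extracted, or inappropriate words are not refined.
To overcome this, this study developed and validated a model that extracts candidate evaluation criteria from reviews using Latent Dirichlet Allocation (LDA) and refines them using the k-Nearest Neighbor approach (k-NN). The proposed method consists of a preparation phase and an extraction phase. In the preparation phase, a word embedding model and a k-NN classifier for refining the candidate criteria are built. In the extraction phase, the candidates are refined using the k-NN classifier and the mention ratio, and the final results are derived.
To evaluate the performance of the proposed model, three comparison models were selected: a noun-frequency extraction model, an LDA-frequency extraction model, and the evaluation criteria actually provided by an e-commerce site. For comparison with the three models, a survey was conducted and scored, and the results were tested. In 26 of the 30 tests, the proposed model was confirmed to be superior. The proposed model can be used on e-commerce sites to derive dimensions for each product group that reflect review characteristics, and on this basis it will contribute greatly to the analysis and use of reviews for discovering insights.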
Cite this article
JIIS Style
Lee, J. H., S. H. Jung, J. H. Kim, E. J. Min, U. Y. Yeo, and J. W. Kim, "Product Evaluation Criteria Extraction through Online Review Analysis: Using LDA and k-Nearest Neighbor Approach", Journal of Intelligence and Information Systems, Vol. 26, No. 1 (2020), 97~117.

IEEE Style
Ji Hyeon Lee, Sang Hyung Jung, Jun Ho Kim, Eun Joo Min, Un Yeong Yeo, and Jong Woo Kim, "Product Evaluation Criteria Extraction through Online Review Analysis: Using LDA and k-Nearest Neighbor Approach", Journal of Intelligence and Information Systems, vol. 26, no. 1, pp. 97~117, 2020.

ACM Style
Lee, J. H., Jung, S. H., Kim, J. H., Min, E. J., Yeo, U. Y., and Kim, J. W., 2020. Product Evaluation Criteria Extraction through Online Review Analysis: Using LDA and k-Nearest Neighbor Approach. Journal of Intelligence and Information Systems. 26, 1, 97--117.
Export Formats: BibTeX, EndNote

@article{Lee:JIIS:2020:803,
author = {Lee, Ji Hyeon and Jung, Sang Hyung and Kim, Jun Ho and Min, Eun Joo and Yeo, Un Yeong and Kim, Jong Woo},
title = {Product Evaluation Criteria Extraction through Online Review Analysis: Using LDA and k-Nearest Neighbor Approach},
journal = {Journal of Intelligence and Information Systems},
issue_date = {March 2020},
volume = {26},
number = {1},
month = Mar,
year = {2020},
issn = {2288-4866},
pages = {97--117},
url = {http://dx.doi.org/10.13088/jiis.2020.26.1.097},
doi = {10.13088/jiis.2020.26.1.097},
publisher = {Korea Intelligent Information System Society},
address = {Seoul, Republic of Korea},
keywords = {product evaluation criteria, review analysis, extracting evaluation criteria, LDA, k-NN},
}
%0 Journal Article
%1 803
%A Ji Hyeon Lee
%A Sang Hyung Jung
%A Jun Ho Kim
%A Eun Joo Min
%A Un Yeong Yeo
%A Jong Woo Kim
%T Product Evaluation Criteria Extraction through Online Review Analysis: Using LDA and k-Nearest Neighbor Approach
%J Journal of Intelligence and Information Systems
%@ 2288-4866
%V 26
%N 1
%P 97-117
%D 2020
%R 10.13088/jiis.2020.26.1.097
%I Korea Intelligent Information System Society