Anomaly Detection for User Action with Generative Adversarial Networks
Namwoong Choi (Department of Industrial Engineering, Yonsei University)
Wooju Kim (Graduate School of Industrial Engineering, Yonsei University)
Vol. 25, No. 3, Page: 43 ~ 62
Autoencoder, Anomaly Score, Feature matching, Generative Adversarial Nets-Anomaly Detection, Optimizing latent variable
Anomaly detection was once dominated by methods that judged whether a record was abnormal from basic statistics derived from the data. This approach worked because data used to be low-dimensional, so classical statistical methods were effective.
However, as data have grown more complex in the era of big data, it has become difficult to analyze and predict the data generated across industry accurately in the conventional way. Machine learning models such as SVM and Decision Tree, trained with supervision, were therefore adopted.
However, supervised models predict test data accurately only when the training data contain similar numbers of abnormal and normal examples, and most data generated in industry are class-imbalanced. As a result, the predictions of a supervised model are not always valid. To overcome these drawbacks, many studies now use unsupervised models that are not influenced by class distribution, such as autoencoders and generative adversarial networks.
In this paper, we propose a method to detect anomalies using generative adversarial networks.
AnoGAN, introduced by Schlegl et al. (2017), is a model that detects anomalies in medical images. It is built from convolutional neural networks and has been applied to image-based detection. By contrast, anomaly detection for sequence data with generative adversarial networks has received far less research attention than image data. Li et al. (2018) proposed a model based on LSTM, a type of recurrent neural network, to classify anomalies in numerical sequence data, but it has not been applied to categorical sequence data, nor has the feature matching method of Salimans et al. (2016). This suggests that much remains to be explored in anomaly classification of sequence data with generative adversarial networks. To learn sequence data, we build the generative adversarial network from LSTMs: the generator is a two-layer stacked LSTM with 32-dimensional and 64-dimensional hidden units, and the discriminator is a single LSTM layer with 64-dimensional hidden units.
Existing work on anomaly detection for sequence data derives the anomaly score from the entropy of the discriminator's probability that the data are real. In this paper, as noted above, the anomaly score is instead derived with a function based on feature matching. In addition, the latent-variable optimization process was built with an LSTM to improve model performance. The modified generative adversarial model achieved higher precision than the autoencoder in every experiment and was approximately 7% higher in accuracy.
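The abstract does not give the exact scoring function, so the following is only a minimal sketch of an AnoGAN-style score that uses feature matching. It assumes the score is a weighted sum of a residual term (distance between the input and its reconstruction) and a feature matching term (distance between intermediate discriminator features of the input and of the reconstruction). The function names, the weight `lam = 0.1`, and all numeric vectors are hypothetical and serve illustration only.

```python
from typing import Sequence


def l1_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Sum of absolute element-wise differences between two vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return sum(abs(x - y) for x, y in zip(a, b))


def anomaly_score(
    x: Sequence[float],      # observed sample (assumed already numeric)
    g_z: Sequence[float],    # generator output for the optimized latent z
    f_x: Sequence[float],    # discriminator features of x (hypothetical)
    f_g_z: Sequence[float],  # discriminator features of G(z) (hypothetical)
    lam: float = 0.1,        # assumed weighting, not taken from the paper
) -> float:
    """AnoGAN-style score: weighted residual plus feature matching distance."""
    residual = l1_distance(x, g_z)
    feature_matching = l1_distance(f_x, f_g_z)
    return (1.0 - lam) * residual + lam * feature_matching


# Hypothetical example: a well-reconstructed sample and a poorly reconstructed one.
normal_score = anomaly_score([1.0, 0.0, 1.0], [0.9, 0.1, 1.0], [0.5, 0.2], [0.5, 0.3])
odd_score = anomaly_score([1.0, 0.0, 1.0], [0.1, 0.9, 0.2], [0.5, 0.2], [1.5, 1.2])
print(normal_score < odd_score)
```

A sample the generator cannot reproduce, in either input space or discriminator feature space, receives the larger score, which is the property such a score needs for thresholding.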
In terms of robustness, the generative adversarial network also performed better than the autoencoder.
Because a generative adversarial network learns the data distribution from real categorical sequence data, it is not unduly affected by any single normal sample, whereas the autoencoder is. In the robustness test, accuracy was 92% for the autoencoder and 96% for the generative adversarial network, and sensitivity was 40% for the autoencoder and 51% for the generative adversarial network.
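High accuracy can coexist with low sensitivity when anomalies are rare, which is worth keeping in mind when reading these figures. The sketch below computes accuracy, precision, and sensitivity from confusion-matrix counts. The counts are hypothetical, chosen only to show how 92% accuracy and 40% sensitivity can arise together on imbalanced data; they are not the paper's data.

```python
from typing import Dict


def classification_metrics(tp: int, fp: int, tn: int, fn: int) -> Dict[str, float]:
    """Compute accuracy, precision, and sensitivity (recall) from raw counts."""
    total = tp + fp + tn + fn
    if total == 0:
        raise ValueError("at least one sample is required")
    return {
        "accuracy": (tp + tn) / total,
        # Precision: share of flagged samples that are truly anomalous.
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        # Sensitivity: share of true anomalies that were flagged.
        "sensitivity": tp / (tp + fn) if (tp + fn) else 0.0,
    }


# Hypothetical counts: 1000 samples, of which 100 are anomalies.
metrics = classification_metrics(tp=40, fp=20, tn=880, fn=60)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With these assumed counts the large number of correctly classified normal samples dominates accuracy, while sensitivity reflects only the 100 anomalous samples.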
We also ran experiments to measure how much performance changes with the structure used to optimize the latent variables. As a result, sensitivity improved by about 1%. These results offer a new perspective on latent-variable optimization, which had previously received relatively little attention.
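A Method for Detecting Anomalous User Behavior Using Generative Adversarial Models
Namwoong Choi (Department of Industrial Engineering, Yonsei University)
Wooju Kim (Department of Industrial Engineering, Yonsei University)
Autoencoder, Anomaly Score, Latent Variable Optimization, Feature Matching, Anomaly-Detection Generative Adversarial Network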
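Anomaly detection was once dominated by methods that judged whether data were abnormal from basic statistics derived from specific data. This methodology was possible because data used to be low-dimensional, so classical statistical methods could work effectively. However, as the properties of data have grown more complex in the era of big data, it has become difficult to analyze and predict the data generated across industry accurately in the conventional way. Models such as SVM and Decision Tree, which incorporate machine learning, therefore came into use.
However, supervised models can predict accurately at test time only when the training data contain similar numbers of abnormal and normal examples, and because most data generated in industry have imbalanced labels, the predictions of a supervised model often lack validity. To overcome these drawbacks, anomaly detection models are now being built on unsupervised learning, which is not affected by class distribution, and are going through trial and error toward application in real industry.
In line with this trend, this study proposes a method for detecting anomalies with a generative adversarial network. To learn sequence data, the generative adversarial network is built from LSTMs: the generator's LSTM has two layers with 32-dimensional and 64-dimensional hidden units, respectively, and the discriminator's LSTM uses a single layer with 64-dimensional hidden units.
Existing papers on anomaly detection for sequence data derive the anomaly score from the entropy of the discriminator's probability that the data are real, but this paper instead derives the anomaly score with a function based on the feature matching technique.
In addition, building the latent-variable optimization process with an LSTM improved model performance. The modified generative adversarial model had higher precision than the autoencoder in every experiment and was confirmed to be approximately 7% higher in accuracy.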
Cite this article
JIIS Style
Choi, N., and W. Kim, "Anomaly Detection for User Action with Generative Adversarial Networks", Journal of Intelligence and Information Systems, Vol. 25, No. 3 (2019), 43~62.

IEEE Style
Namwoong Choi, and Wooju Kim, "Anomaly Detection for User Action with Generative Adversarial Networks", Journal of Intelligence and Information Systems, vol. 25, no. 3, pp. 43~62, 2019.

ACM Style
Choi, N., and Kim, W., 2019. Anomaly Detection for User Action with Generative Adversarial Networks. Journal of Intelligence and Information Systems. 25, 3, 43--62.
Export Formats : BiBTeX, EndNote

author = {Choi, Namwoong and Kim, Wooju},
title = {Anomaly Detection for User Action with Generative Adversarial Networks},
journal = {Journal of Intelligence and Information Systems},
issue_date = {September 2019},
volume = {25},
number = {3},
month = Sep,
year = {2019},
issn = {2288-4866},
pages = {43--62},
url = {},
doi = {},
publisher = {Korea Intelligent Information System Society},
address = {Seoul, Republic of Korea},
keywords = {Autoencoder, Anomaly Score, Feature matching, Generative Adversarial Nets-Anomaly Detection, Optimizing latent variable}
%0 Journal Article
%1 781
%A Namwoong Choi
%A Wooju Kim
%T Anomaly Detection for User Action with Generative Adversarial Networks
%J Journal of Intelligence and Information Systems
%@ 2288-4866
%V 25
%N 3
%P 43-62
%D 2019
%I Korea Intelligent Information System Society