Ensemble Learning for Solving Data Imbalance in Bankruptcy Prediction
Myong-Jong Kim (Division of Business, Dong Seo University)
Vol. 15, No. 3, Pages 1-15
Keywords
Support Vector Machine, Under-Sampling, Over-Sampling, Bankruptcy Prediction, Geometric Mean-based Boosting
Abstract
In a classification problem, data imbalance occurs when the instances of one class greatly outnumber those of the other class. Such data sets tend to skew the decision boundary, so the learner often degenerates into a default classifier (one that favors the majority class), and its classification accuracy suffers accordingly. This paper proposes Geometric Mean-based Boosting (GM-Boost) to resolve the problem of data imbalance. Because GM-Boost builds on the notion of the geometric mean, its learning process takes both the majority and minority classes into account while reinforcing learning on misclassified data. An empirical study of bankruptcy prediction for Korean companies shows that GM-Boost achieves higher classification accuracy than previous methods used for imbalanced data, including Under-Sampling, Over-Sampling, and AdaBoost, and that its learning performance is robust regardless of the degree of data imbalance.
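The abstract does not spell out how GM-Boost uses the geometric mean. As a minimal sketch, assuming the common definition of the geometric mean metric as the square root of sensitivity times specificity (an assumption, not confirmed by this page), the Python example below illustrates why plain accuracy can look good for a default classifier while the geometric mean exposes its failure on the minority class. The label vectors and function names are hypothetical and are not taken from the paper's data or method.

```python
import math


def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fn, tn, fp) for a binary problem."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p != positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    return tp, fn, tn, fp


def accuracy(y_true, y_pred):
    """Fraction of instances whose predicted label matches the true label."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def geometric_mean(y_true, y_pred, positive=1):
    """sqrt(sensitivity * specificity); assumed definition, see lead-in."""
    tp, fn, tn, fp = confusion_counts(y_true, y_pred, positive)
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return math.sqrt(sensitivity * specificity)


# Hypothetical imbalanced sample: 95 healthy firms (0), 5 bankrupt firms (1).
y_true = [0] * 95 + [1] * 5

# A default classifier that labels every firm as healthy.
default_pred = [0] * 100

# A classifier that gives up some majority-class accuracy
# but detects 4 of the 5 bankrupt firms.
balanced_pred = [0] * 85 + [1] * 10 + [1] * 4 + [0] * 1

for name, pred in [("default", default_pred), ("balanced", balanced_pred)]:
    print(name, round(accuracy(y_true, pred), 3), round(geometric_mean(y_true, pred), 3))
```

In this toy setting the default classifier keeps a high accuracy yet scores a geometric mean of zero, because it never identifies a minority-class instance. A criterion built on the geometric mean therefore rewards classifiers that perform well on both classes at once.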
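Detailed Information (translated from Korean)
Ensemble Learning for Solving the Data Imbalance Problem in Corporate Failure Prediction
Myong-Jong Kim (Division of Business, Dong Seo University)
Abstract
The data imbalance problem arises in classification and prediction tasks when the number of samples belonging to one class is markedly smaller than the number belonging to the other classes. As the imbalance becomes more severe, the decision boundary between classes is distorted and, as a result, the learning performance of the classifier deteriorates. To address this problem, this study proposes the Geometric Mean-based Boosting (GM-Boost) algorithm. Because GM-Boost is based on the concept of the geometric mean, it can learn with the majority and minority classes considered simultaneously and can strengthen learning by concentrating on misclassified samples. When the performance of GM-Boost was verified on a corporate failure prediction problem, it showed better classification accuracy than the existing Under-Sampling, Over-Sampling, and AdaBoost algorithms and exhibited robust learning performance regardless of the degree of data imbalance.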
Cite this article
JIIS Style
Kim, M.-J., "Ensemble Learning for Solving Data Imbalance in Bankruptcy Prediction", Journal of Intelligence and Information Systems, Vol. 15, No. 3 (2009), 1~15.

IEEE Style
Myong-Jong Kim, "Ensemble Learning for Solving Data Imbalance in Bankruptcy Prediction," Journal of Intelligence and Information Systems, vol. 15, no. 3, pp. 1-15, 2009.

ACM Style
Kim, M.-J. 2009. Ensemble Learning for Solving Data Imbalance in Bankruptcy Prediction. Journal of Intelligence and Information Systems. 15, 3, 1--15.
Export Formats: BibTeX, EndNote

@article{Kim:JIIS:2009:370,
author = {Kim, Myong-Jong},
title = {Ensemble Learning for Solving Data Imbalance in Bankruptcy Prediction},
journal = {Journal of Intelligence and Information Systems},
issue_date = {September 2009},
volume = {15},
number = {3},
month = Sep,
year = {2009},
issn = {2288-4866},
pages = {1--15},
url = {},
doi = {},
publisher = {Korea Intelligent Information System Society},
address = {Seoul, Republic of Korea},
keywords = {Support Vector Machine, Under-Sampling, Over-Sampling, Bankruptcy Prediction, Geometric Mean-based Boosting},
}
%0 Journal Article
%1 370
%A Myong-Jong Kim
%T Ensemble Learning for Solving Data Imbalance in Bankruptcy Prediction
%J Journal of Intelligence and Information Systems
%@ 2288-4866
%V 15
%N 3
%P 1-15
%D 2009
%R
%I Korea Intelligent Information System Society