Predicting stock movements based on financial news with systematic group identification
NohYoon Seong (College of Business, KAIST)
Kihwan Nam (College of Business, KAIST)
Vol. 25, No. 3, Page: 1 ~ 17
Online News, Stock prediction, Random matrix theory, hierarchical clustering
Stock price forecasting matters both academically and practically, and it has been studied actively. The research divides into work that uses structured data and work that uses unstructured data. Studies based on structured data, such as historical stock prices and financial statements, have usually relied on technical analysis and fundamental analysis. In the big data era, the amount of information has grown rapidly, and so have artificial intelligence methods that extract meaning by quantifying text, the unstructured data that makes up a large share of that information. As a result, many studies now apply text mining to online news in order to predict stock prices.
Many papers forecast a stock price using only news about the target company. According to previous research, however, a company's stock price is affected not only by its own news but also by news about related companies. Finding highly relevant companies is not easy because of market-wide effects and random signals. Existing studies have therefore identified relevant companies mainly through predetermined international industry classification standards. According to recent research, however, homogeneity differs across the sectors of the Global Industry Classification Standard, so forecasting with every company in a sector, rather than only the relevant ones, can hurt predictive performance.
To overcome this limitation, we are the first to combine random matrix theory with text mining for stock prediction.
When the dimension of the data is large, the classical limit theorems are no longer suitable because statistical efficiency is reduced. A simple correlation analysis of the financial market therefore does not reveal the true correlation. To solve this issue, we adopt random matrix theory, which is mainly used in econophysics, to remove market-wide effects and random signals and to find the true correlation between companies. Using the true correlation, we perform cluster analysis to find relevant companies.
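The filtering and clustering steps described above can be sketched as follows. This is a minimal illustration and not the authors' implementation. It rests on assumptions the abstract does not state: the noise band is taken from the Marchenko-Pastur upper edge (1 + sqrt(N/T))^2, the largest eigenmode is treated as the market-wide effect, the distance is sqrt(2(1 - c)), and average linkage is used. The function names and the synthetic returns are hypothetical.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import squareform


def rmt_filtered_correlation(returns):
    """Keep only eigenmodes above the Marchenko-Pastur upper edge,
    excluding the largest one (assumed here to be the market mode).

    returns: array of shape (T, N), T observations of N firms.
    """
    T, N = returns.shape
    z = (returns - returns.mean(axis=0)) / returns.std(axis=0)
    corr = (z.T @ z) / T                       # sample correlation matrix
    eigvals, eigvecs = np.linalg.eigh(corr)    # eigenvalues in ascending order
    lambda_max = (1.0 + np.sqrt(N / T)) ** 2   # upper edge of the noise band
    keep = eigvals > lambda_max                # modes that carry information
    keep[-1] = False                           # drop the market-wide mode
    vecs = eigvecs[:, keep]
    return (vecs * eigvals[keep]) @ vecs.T


def cluster_firms(filtered, n_clusters):
    """Hierarchical clustering on the filtered matrix (average linkage)."""
    # Rescale to unit diagonal so entries behave like correlations.
    d = np.sqrt(np.clip(np.diag(filtered), 1e-12, None))
    c = np.clip(filtered / np.outer(d, d), -1.0, 1.0)
    dist = np.sqrt(2.0 * (1.0 - c))            # correlation-based distance
    dist = (dist + dist.T) / 2.0
    np.fill_diagonal(dist, 0.0)
    tree = linkage(squareform(dist, checks=False), method="average")
    return fcluster(tree, t=n_clusters, criterion="maxclust")


# Synthetic example: one market factor, three group factors, four firms each.
rng = np.random.default_rng(0)
T, N = 500, 12
market = rng.standard_normal((T, 1))
groups = np.repeat(rng.standard_normal((T, 3)), 4, axis=1)
noise = rng.standard_normal((T, N))
returns = 0.5 * market + 0.8 * groups + 0.5 * noise

labels = cluster_firms(rmt_filtered_correlation(returns), n_clusters=3)
print(labels)
```

On this synthetic data, firms that share a group factor should receive the same cluster label even though every firm also loads on the common market factor. Real return series would need a choice of cluster count or cut height that the abstract does not specify.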
Based on the cluster analysis, we also use a multiple kernel learning algorithm, an ensemble of support vector machines, to incorporate the effects of the target firm and its relevant firms simultaneously. Each kernel was assigned to predict stock prices from financial news features of the target firm and its relevant firms.
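The idea of giving each news source its own kernel can be illustrated with a simplified sketch. It is not the multiple kernel learning algorithm used in the paper: the kernel weight is picked from a small grid on a validation split instead of being learned jointly with the classifier. The RBF kernels, the feature blocks, the helper names and the synthetic labels are all assumptions made for illustration.

```python
import numpy as np
from sklearn.metrics.pairwise import rbf_kernel
from sklearn.svm import SVC


def combined_kernel(blocks_a, blocks_b, weights):
    """Weighted sum of one RBF kernel per feature block."""
    return sum(w * rbf_kernel(a, b) for w, a, b in zip(weights, blocks_a, blocks_b))


def fit_best_weight(train_blocks, y_train, val_blocks, y_val, grid):
    """Grid search over the weight of the first kernel (a stand-in for
    learning the kernel weights, which real MKL solvers do jointly)."""
    best = None
    for w in grid:
        weights = (float(w), 1.0 - float(w))
        k_train = combined_kernel(train_blocks, train_blocks, weights)
        model = SVC(kernel="precomputed", C=1.0).fit(k_train, y_train)
        # Rows are validation samples, columns are training samples.
        k_val = combined_kernel(val_blocks, train_blocks, weights)
        acc = model.score(k_val, y_val)
        if best is None or acc > best[0]:
            best = (acc, weights, model)
    return best


# Synthetic stand-ins for news features of the target firm and of its
# relevant firms; the label depends on both blocks.
rng = np.random.default_rng(0)
x_target = rng.standard_normal((400, 5))
x_related = rng.standard_normal((400, 5))
y = (x_target[:, 0] + x_related[:, 0] > 0).astype(int)

train = slice(0, 300)
val = slice(300, 400)
best_acc, best_weights, model = fit_best_weight(
    (x_target[train], x_related[train]), y[train],
    (x_target[val], x_related[val]), y[val],
    grid=np.linspace(0.0, 1.0, 5),
)
print(best_weights, best_acc)
```

Because the weight is selected on the same split that reports accuracy, the printed score is optimistic; a separate test split would be needed for an honest estimate.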
The results of this study are as follows. (1) In line with existing research, we confirmed that using news from relevant companies is an effective way to forecast stock prices. (2) Identifying relevant companies in the wrong way can lower the prediction performance of the AI model. (3) The proposed approach with random matrix theory outperforms previous studies when cluster analysis is based on the true correlation obtained by removing market-wide effects and random signals.
The contributions of this study are as follows. First, it shows that random matrix theory, which is used mainly in econophysics, can be combined with artificial intelligence to produce effective methodologies. This suggests that adopting theory from physics is as important as developing AI algorithms, and it extends earlier research that integrated artificial intelligence with complex systems theory through transfer entropy. Second, it stresses that finding the right companies in the stock market is an important issue. This suggests that it is important not only to study artificial intelligence algorithms but also to adjust the input values on theoretical grounds. Third, we confirmed that firms classified in the same Global Industry Classification Standard (GICS) sector may have low relevance to one another, and we suggest that relevance should be defined theoretically rather than simply read from the GICS.
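Detailed Information in Korean (translated)
Stock price prediction using systematic cluster identification and news
NohYoon Seong (School of Management Engineering, KAIST)
Kihwan Nam (School of Management Engineering, KAIST)
Online news, stock price prediction, random matrix theory, hierarchical cluster analysis
In the big data era, the amount of information has surged, and artificial intelligence methods that find meaning by quantifying string information, which accounts for a large share of it, have developed alongside. As a result, attempts to predict stock prices from online news by applying text mining have become more varied. These methods usually predict a stock price using news about the company being forecast. However, a company's stock price is affected not only by its own news but also by news about companies that are highly related to it. Finding highly related companies is not easy because of common market-wide influence and random signals. Existing studies have therefore found highly related companies mainly on the basis of a predetermined international industry classification standard. According to recent research, however, the homogeneity of the international industry classification standard differs by sector, and for sectors with low homogeneity, predicting stock prices by considering all of their companies together can harm performance.
To overcome this limitation, this paper is the first in stock price prediction research to use random matrix theory, which is mainly used in econophysics, to remove market-wide effects and random signals and then perform cluster analysis to find highly related companies. Building on this, it proposes an artificial intelligence model that jointly considers news about highly related companies and uses multiple kernel learning. The results show that performing cluster analysis on accurate correlation coefficients, obtained by removing market-wide effects and random signals through random matrix theory, yields better performance than previous studies.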
Cite this article
JIIS Style
Seong, N., and K. Nam, "Predicting stock movements based on financial news with systematic group identification", Journal of Intelligence and Information Systems, Vol. 25, No. 3 (2019), 1~17.

IEEE Style
NohYoon Seong, and Kihwan Nam, "Predicting stock movements based on financial news with systematic group identification", Journal of Intelligence and Information Systems, vol. 25, no. 3, pp. 1~17, 2019.

ACM Style
Seong, N., and Nam, K., 2019. Predicting stock movements based on financial news with systematic group identification. Journal of Intelligence and Information Systems. 25, 3, 1--17.
Export Formats: BibTeX, EndNote

author = {Seong, NohYoon and Nam, Kihwan},
title = {Predicting stock movements based on financial news with systematic group identification},
journal = {Journal of Intelligence and Information Systems},
issue_date = {September 2019},
volume = {25},
number = {3},
month = Sep,
year = {2019},
issn = {2288-4866},
pages = {1--17},
url = {},
doi = {},
publisher = {Korea Intelligent Information System Society},
address = {Seoul, Republic of Korea},
keywords = {Online News, Stock prediction, Random matrix theory, hierarchical clustering}
%0 Journal Article
%1 779
%A NohYoon Seong
%A Kihwan Nam
%T Predicting stock movements based on financial news with systematic group identification
%J Journal of Intelligence and Information Systems
%@ 2288-4866
%V 25
%N 3
%P 1-17
%D 2019
%I Korea Intelligent Information System Society