DIGITAL LIBRARY ARCHIVE
Journal of Intelligence and Information Systems,
Vol. 22, No. 4, December 2016
|Recommender Systems using SVD with Social Network Information
Vol. 22, No. 4, Page: 1 ~ 18
Keywords : Recommender systems, Social network information, Collaborative filtering, Singular value decomposition, Business analytics
Collaborative filtering (CF) predicts a focal user’s preference for a particular item from users’ preference rating data and recommends items by drawing on similar users. It is a popular personalization technique in e-commerce for reducing information overload, but it has some limitations, including sparsity and scalability problems. In this paper, we integrate social network information into collaborative filtering in order to mitigate these two major limitations of typical collaborative filtering and to reflect users’ qualitative and emotional information in the recommendation process. Specifically, we use a novel recommendation algorithm, Social SVD++, which incorporates social network information into SVD++, an extension of singular value decomposition (SVD) that can reflect implicit information. In particular, this study evaluates the performance of the model by reflecting real-world users’ social network information in the recommendation process.
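The SVD step at the core of such a recommender can be sketched as follows. The ratings, the mean-fill strategy, and the rank are illustrative assumptions, and the social-neighbour and implicit-feedback terms of Social SVD++ are omitted:

```python
import numpy as np

# Toy user-item rating matrix (0 = unrated). This sketches only the plain
# SVD step; Social SVD++ additionally folds implicit feedback and social
# neighbours into the user factors, which is omitted here.
R = np.array([[5.0, 3.0, 0.0, 1.0],
              [4.0, 0.0, 0.0, 1.0],
              [1.0, 1.0, 0.0, 5.0],
              [0.0, 1.0, 5.0, 4.0]])

# Fill unrated cells with each user's mean rating before factorising.
mask = R > 0
user_means = (R.sum(axis=1) / mask.sum(axis=1))[:, None]
R_filled = np.where(mask, R, user_means)

# Rank-2 truncated SVD gives the latent factors used to predict ratings.
U, s, Vt = np.linalg.svd(R_filled, full_matrices=False)
k = 2
R_hat = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]

print(R_hat.shape)  # (4, 4)
```

Entries of `R_hat` at originally unrated positions serve as predicted ratings; Social SVD++ would additionally shift each user's latent vector toward those of trusted neighbours.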
|The Prediction of Currency Crises through Artificial Neural Networks
Vol. 22, No. 4, Page: 19 ~ 43
Keywords : Financial crises, exchange rate, datamining, structural equation model, neural network
This study examines the causes of the Asian exchange rate crisis and compares it to the European Monetary System crisis. In 1997, emerging countries in Asia experienced financial crises; previously, in 1992, currencies in the European Monetary System had undergone the same experience, followed by Mexico in 1994. The objective of this paper lies in generating useful insights from these crises. This research compares South Korea, the United Kingdom, and Mexico, and then compares three different prediction models.
Previous studies of economic crises focused largely on the manual construction of causal models using linear techniques. However, the weakness of such models stems from the prevalence of nonlinear factors in reality. This paper uses a structural equation model to analyze the causes, followed by a neural network model to circumvent the linear models’ weaknesses. The models are examined in the context of predicting exchange rates. The data were quarterly, with the Consumer Price Index, Gross Domestic Product, interest rate, stock index, current account, and foreign reserves as independent variables for the prediction; however, the time periods of each country’s data differ.
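The neural-network step can be sketched as a tiny one-hidden-layer network fit by gradient descent. The data, network size, and training settings below are illustrative assumptions, not the paper's actual specification; the six columns merely stand in for the six predictors named above:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical standardised quarterly indicators -> exchange-rate change.
# Columns stand in for CPI, GDP, interest rate, stock index, current
# account, and foreign reserves; all values are simulated.
X = rng.normal(size=(40, 6))
w_true = np.array([0.5, -0.8, 1.2, -0.3, 0.7, -0.5])
y = np.tanh(X @ w_true) + 0.05 * rng.normal(size=40)   # nonlinear target

# One hidden layer of 8 tanh units, trained with plain batch gradient descent.
W1 = rng.normal(scale=0.1, size=(6, 8)); b1 = np.zeros(8)
W2 = rng.normal(scale=0.1, size=8);      b2 = 0.0
lr = 0.1
for _ in range(3000):
    H = np.tanh(X @ W1 + b1)                 # hidden activations
    err = H @ W2 + b2 - y                    # prediction error
    gW2 = H.T @ err / len(y); gb2 = err.mean()
    gH = np.outer(err, W2) * (1 - H ** 2)    # backprop through tanh
    gW1 = X.T @ gH / len(y);  gb1 = gH.mean(axis=0)
    W2 -= lr * gW2; b2 -= lr * gb2
    W1 -= lr * gW1; b1 -= lr * gb1

mse = float(np.mean((np.tanh(X @ W1 + b1) @ W2 + b2 - y) ** 2))
print(round(mse, 4))
```

The tanh hidden layer is what lets the model capture the nonlinear factors that a linear causal model misses.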
LISREL is an emerging method and as such requires a fresh approach to financial crisis prediction model design, along with the flexibility to accommodate unexpected change. This paper indicates that the neural network model has the greater prediction performance overall across Korea, Mexico, and the United Kingdom; in Korea, however, multiple regression shows the better performance, and in Mexico multiple regression is almost indistinguishable from LISREL.
Although LISREL does not show significant performance here, a refined model is expected to yield better results.
Future work on the structural model should incorporate psychological factors and other unobservable areas. The low hit ratio stems from the alternative model’s reliance on financial market data alone, so other important factors could not be considered. Korea’s hit ratio is lower than that of the United Kingdom, suggesting that other constructs also affect its financial market; the same holds for Mexico. By contrast, the United Kingdom’s financial market is more strongly influenced and explained by financial factors than those of Korea and Mexico.
|A Method for Evaluating News Value based on Supply and Demand of Information Using Text Analysis
Vol. 22, No. 4, Page: 45 ~ 67
Keywords : Big Data, News Value Index, SNS, Text Mining, Topic Modeling
Given the recent development of smart devices, users produce, share, and acquire a variety of information via the Internet and social network services (SNSs). Because users tend to use multiple media simultaneously according to their goals and preferences, domestic SNS users use 2.09 media concurrently on average. Since the information provided by such media is usually textual, recent studies have actively conducted textual analysis to understand users more deeply.
Earlier studies using textual analysis focused on analyzing a document's contents without substantive consideration of the diverse characteristics of the source medium. However, current studies argue that analytical and interpretive approaches should be applied differently according to the characteristics of a document's source.
Documents can be classified into the following types: informative documents for delivering information, expressive documents for expressing emotions and aesthetics, operational documents for inducing the recipient's behavior, and audiovisual media documents for supplementing the above three functions through images and music. Further, documents can be classified according to their contents, which comprise facts, concepts, procedures, principles, rules, stories, opinions, and descriptions.
Documents have unique characteristics according to the source media by which they are distributed.
In terms of newspapers, only highly trained people tend to write articles for public dissemination. In contrast, with SNSs, various types of users can freely write any message and such messages are distributed in an unpredictable way. Again, in the case of newspapers, each article exists independently and does not tend to have any relation to other articles. However, messages (original tweets) on Twitter, for example, are highly organized and regularly duplicated and repeated through replies and retweets.
There have been many studies focusing on the different characteristics between newspapers and SNSs. However, it is difficult to find a study that focuses on the difference between the two media from the perspective of supply and demand. We can regard the articles of newspapers as a kind of information supply, whereas messages on various SNSs represent a demand for information. By investigating traditional newspapers and SNSs from the perspective of supply and demand of information, we can explore and explain the information dilemma more clearly. For example, there may be superfluous issues that are heavily reported in newspaper articles despite the fact that users seldom have much interest in these issues.
Such overproduced information is not only a waste of media resources but also makes it difficult to find valuable, in-demand information. Further, some issues that are covered by only a few newspapers may be of high interest to SNS users.
To alleviate the deleterious effects of information asymmetries, it is necessary to analyze the supply and demand of each information source and, accordingly, provide information flexibly. Such an approach would allow the value of information to be explored and approximated on the basis of the supply-demand balance. Conceptually, this is very similar to the price of goods or services being determined by the supply-demand relationship. Adopting this concept, media companies could focus on the production of highly in-demand issues that are in short supply.
In this study, we selected Internet news sites and Twitter as representative media for investigating information supply and demand, respectively. We present the notion of a News Value Index (NVI), which evaluates the value of news information in terms of the magnitude of Twitter messages associated with it. In addition, we visualize the change of information value over time using the NVI. We conducted an analysis using 387,014 news articles and 31,674,795 Twitter messages. The analysis revealed interesting patterns: most issues show a lower NVI than the average across all issues, whereas a few issues show a steadily higher NVI than the average.
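The supply-demand idea behind such an index can be sketched as a topic's demand share divided by its supply share. The topics, counts, and this particular ratio are illustrative assumptions, not the paper's exact NVI definition:

```python
from collections import Counter

# Hypothetical topic labels for news articles (supply) and tweets (demand).
# NVI is sketched here as a topic's demand share divided by its supply share.
articles = ["election", "election", "economy", "sports", "economy", "economy"]
tweets = ["election"] * 8 + ["economy"] * 1 + ["sports"] * 3

supply, demand = Counter(articles), Counter(tweets)
n_articles, n_tweets = len(articles), len(tweets)

nvi = {topic: (demand[topic] / n_tweets) / (supply[topic] / n_articles)
       for topic in supply}

# Topics with NVI > 1 are in higher demand than supply.
print(sorted(nvi, key=nvi.get, reverse=True))  # ['election', 'sports', 'economy']
```

Here "economy" is the overproduced issue (heavily reported, little Twitter interest), while "election" is undersupplied relative to demand.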
|Prediction of Commitment and Persistence in Heterosexual Involvements according to the Styles of Loving using a Datamining Technique
Vol. 22, No. 4, Page: 69 ~ 85
Keywords : Type of love, Commitment to a lover, Prediction of breakup, Decision tree, Regression
A successful relationship with a loving partner is one of the most important factors in life. In psychology, there has been some previous research studying the factors that influence romantic relationships. However, most of this research was based on statistical analysis and is thus limited in analyzing complex nonlinear relationships or rule-based reasoning.
This research analyzes commitment and persistence in heterosexual involvement according to styles of loving, using a datamining technique as well as statistical methods. We consider six different styles of loving - 'eros', 'ludus', 'storge', 'pragma', 'mania', and 'agape' - which influence romantic relationships between lovers, in addition to the factors suggested by previous research. These six types of love are defined by Lee (1977) as follows: 'eros' is romantic, passionate love; 'ludus' is a game-playing or uncommitted love; 'storge' is a slow-developing, friendship-based love; 'pragma' is a pragmatic, practical, mutually beneficial relationship; 'mania' is an obsessive or possessive love; and, lastly, 'agape' is a gentle, caring, giving type of love, brotherly love, not concerned with the self.
For this research, data from 105 heterosexual couples were collected. Using the data, linear regression was first performed to identify the important factors associated with commitment to a partner. The results show that 'satisfaction', 'eros', and 'agape' are significant factors associated with the commitment level for both males and females. Interestingly, for males, 'agape' has a greater effect on commitment than 'eros', whereas for females, 'eros' is a more significant factor than 'agape'. In addition, the male's 'investment' is also a crucial factor for male commitment.
Next, decision tree analysis was performed to characterize high-commitment and low-commitment couples. The decision tree models in this experiment were built with the 'decision tree' operator of the datamining tool RapidMiner. The experimental results show that males with a high satisfaction level in the relationship show a high commitment level. However, even if a male does not have a high satisfaction level, if he has made substantial financial or emotional investments in the relationship and his partner shows him a certain amount of 'agape', then he also shows a high commitment level. In the case of females, a woman with high 'eros' and 'satisfaction' levels shows a high commitment level. Otherwise, even if a female does not have a high satisfaction level, if her partner shows a certain amount of 'mania', then she also shows a high commitment level.
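The kind of threshold split such a tree learns can be sketched with a hand-rolled one-level stump. The records, attributes, and Gini criterion here are illustrative assumptions (the study itself used RapidMiner's 'decision tree' operator):

```python
# Hand-rolled one-level decision stump: find the single attribute threshold
# that best separates high- from low-commitment respondents.
# Hypothetical records: (satisfaction 1-7, agape 1-7, high_commitment?)
data = [
    (6, 2, True), (7, 3, True), (5, 6, True), (2, 6, True),
    (3, 2, False), (2, 3, False), (4, 1, False), (1, 1, False),
]

def gini(rows):
    """Gini impurity of a list of (..., label) rows."""
    if not rows:
        return 0.0
    p = sum(1 for r in rows if r[-1]) / len(rows)
    return 2 * p * (1 - p)

def best_split(rows, feature):
    """Return (threshold, weighted impurity) of the best split on `feature`."""
    best = (None, float("inf"))
    for thr in sorted({r[feature] for r in rows}):
        left = [r for r in rows if r[feature] <= thr]
        right = [r for r in rows if r[feature] > thr]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(rows)
        if score < best[1]:
            best = (thr, score)
    return best

t, score = best_split(data, 0)   # split on satisfaction
print(t, round(score, 3))        # 4 0.2
```

In this toy data the best split lands at satisfaction <= 4, echoing the finding that satisfaction is the primary separator of commitment levels.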
Finally, this research built a decision tree model to predict whether a relationship will persist or break up. The results show that the most important factor influencing a breakup is the male's 'narcissistic tendency'. In addition, the 'satisfaction', 'investment', and 'mania' of both the male and the female also affect a breakup. Interestingly, while the male's 'mania' level works positively to maintain the relationship, the female's has a negative influence.
The contribution of this research is the adoption of a datamining method as a new analysis technique in psychology. In addition, the results can provide useful advice to couples for building a harmonious relationship with each other.
This research has several limitations. First, the experimental data were oversampled to balance the class sizes, which limits objective evaluation of the predictive models' performance. Second, the outcome data - whether the relationship persisted or not - were collected over a relatively short period, six months after the initial data collection. Lastly, most survey respondents were in their twenties. To obtain more general results, we would like to extend this research to the general population.
|Discovery of Market Convergence Opportunity Combining Text Mining and Social Network Analysis: Evidence from Large-Scale Product Databases
Vol. 22, No. 4, Page: 87 ~ 107
Keywords : Market Convergence, Association Analysis, Social Network Analysis, Structural Hole, B2B
Understanding market convergence has become essential for small and mid-sized enterprises.
Identifying convergence items among heterogeneous markets can lead to product innovation and successful market introduction. Previous research has two limitations. First, traditional studies focusing on patent databases are suitable for detecting technology convergence but fail to capture market demand. Second, most studies concentrate on identifying relationships among existing products or technologies. This study presents a platform for identifying market convergence opportunities using product databases from a global B2B marketplace. We also attempt to identify convergence opportunities across industries by applying structural hole theory. This paper describes the mechanisms for detecting market convergence: attribute extraction from products and services using text mining, association analysis among attributes, and network analysis based on structural holes. To discover market demand, we analyzed 240,002 e-catalogs from January 2013 to July 2016.
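One structural-hole signal can be sketched with Burt's "effective size" of each node's ego network: nodes bridging otherwise unconnected market clusters score high. The category names, the single bridging link, and the use of effective size (rather than the paper's full analysis) are illustrative assumptions:

```python
# Toy product-category network from co-occurring catalogue attributes.
# Edges and category names are illustrative; the lone "apparel-sensor" link
# stands for a wearables-style bridge between two market clusters.
edges = {("textile", "apparel"), ("textile", "dye"), ("apparel", "dye"),
         ("sensor", "chip"), ("sensor", "battery"), ("chip", "battery"),
         ("apparel", "sensor")}

def neighbours(node):
    return {b for a, b in edges if a == node} | {a for a, b in edges if b == node}

def effective_size(node):
    """Burt's effective size: degree minus average ties among neighbours."""
    nbrs = neighbours(node)
    ties = sum(1 for a in nbrs for b in nbrs
               if a < b and ((a, b) in edges or (b, a) in edges))
    return len(nbrs) - 2 * ties / len(nbrs)

scores = {v: effective_size(v) for e in edges for v in e}
bridges = sorted(v for v in scores if scores[v] > 2)
print(bridges)  # ['apparel', 'sensor'] -- the cluster-bridging nodes
```

The two endpoints of the bridging link stand out, which is exactly where a convergence opportunity (here, a wearables-style product) would be hypothesized.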
|Design of Client-Server Model For Effective Processing and Utilization of Bigdata
Vol. 22, No. 4, Page: 109 ~ 122
Keywords : Big Data, Client, Server, Spark, Pre-Analysis
Recently, big data analysis has developed into a field of interest to individuals and non-experts as well as companies and professionals. Accordingly, it is used for marketing and for solving social problems by analyzing openly available or directly collected data. In Korea, various companies and individuals are attempting big data analysis, but they face difficulties from the initial stage because of limited big data disclosure and collection difficulties.
Nowadays, system improvements for promoting big data and big data disclosure services are being carried out in Korea and abroad, mainly as services for opening public data, such as the Korean Government 3.0 portal (data.go.kr). In addition to the government's efforts, services that share data held by corporations or individuals are running, but it is difficult to find useful data because of the lack of shared data. Moreover, big traffic problems can occur, because the entire dataset must be downloaded and examined to grasp the attributes of, and simple information about, the shared data.
Therefore, a new system for big data processing and utilization is needed. First, big data pre-analysis technology is needed as a way to solve the big data sharing problem. Pre-analysis, a concept proposed in this paper, means providing users with results generated by analyzing the data in advance. Through pre-analysis, the usability of big data improves, because a user searching for big data receives information that conveys its properties and characteristics. In addition, by sharing the summary data or sample data generated through pre-analysis, the security problems that may arise when the original data are disclosed can be avoided, enabling big data sharing between data providers and data users.
Second, it is necessary to quickly generate appropriate preprocessing results according to the disclosure level or network status of the raw data and to deliver those results to users through distributed big data processing using Spark.
Third, to solve the big traffic problem, the system monitors network traffic in real time. When preprocessing the data requested by a user, the data should be reduced to a size the current network can carry before transmission, so that no big traffic occurs. In this paper, we present various data sizes according to disclosure level through pre-analysis. This method is expected to produce a low traffic volume compared with the conventional approach of sharing only raw data across many systems.
In this paper, we describe how to solve the problems that occur when big data are released and used, and how to facilitate sharing and analysis. The client-server model uses Spark for fast analysis and processing of user requests, with a Server Agent and a Client Agent deployed on the server and client sides, respectively. The Server Agent, required by the data provider, performs pre-analysis of big data to generate a Data Descriptor containing Sample Data, Summary Data, and Raw Data information. It also performs fast and efficient preprocessing through distributed big data processing and continuously monitors network traffic. The Client Agent, placed on the data user's side, lets the user quickly search big data through the Data Descriptor produced by pre-analysis and request the desired data from the server for download according to its disclosure level. Separating the Server Agent and Client Agent allows data published by a provider to be used by any user. In particular, we focus on big data sharing, distributed big data processing, and the big traffic problem; we construct the detailed modules of the client-server model and present the design of each module.
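The pre-analysis output can be sketched as a small Data Descriptor combining Summary Data and Sample Data. The field names and summary statistics below are illustrative assumptions, not the paper's actual schema:

```python
import random

random.seed(0)

# Hypothetical pre-analysis by the Server Agent: summarise raw data into a
# Data Descriptor so users can judge a dataset before downloading it.
raw = [random.gauss(50, 10) for _ in range(10_000)]

descriptor = {
    "rows": len(raw),
    "summary": {                                 # Summary Data
        "min": round(min(raw), 2),
        "max": round(max(raw), 2),
        "mean": round(sum(raw) / len(raw), 2),
    },
    "sample": [round(x, 2) for x in raw[:5]],    # Sample Data
    "disclosure_level": "summary+sample",        # Raw Data withheld
}

print(sorted(descriptor))  # ['disclosure_level', 'rows', 'sample', 'summary']
```

A Client Agent searching over such descriptors sees the shape of the data and a small sample without any big-traffic download of the raw dataset.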
In a system designed on the proposed model, a user who acquires data analyzes it in the desired direction or preprocesses it into new data. By publishing the newly processed data through the Server Agent, the data user takes on the role of data provider. Likewise, a data provider can obtain useful statistical information from the Data Descriptor of the data it discloses and become a data user performing new analyses on the sample data. In this way, raw data are processed and the processed big data are utilized, naturally forming a shared environment. The roles of data provider and data user are not fixed, yielding an ideal sharing service in which everyone can be both a provider and a user. The client-server model thus solves the big data sharing problem, provides a free sharing environment for secure big data disclosure, and makes big data easy to find.
|Intents of Acquisitions in Information Technology Industries
Vol. 22, No. 4, Page: 123 ~ 138
Keywords : Mergers and acquisitions, intents of acquisitions, information technology industries
This study investigates the intents of acquisitions in information technology industries. Mergers and acquisitions are corporate-level strategic decisions and have been an important tool for firm growth.
Plenty of firms in information technology industries have acquired startups to increase production efficiency, expand customer base, or improve quality over the last decades. For example, Google has made about 200 acquisitions since 2001, Cisco has acquired about 210 firms since 1993, Oracle has made about 125 acquisitions since 1994, and Microsoft has acquired about 200 firms since 1987. Although there have been many existing papers that theoretically study intents or motivations of acquisitions, there are limited papers that empirically investigate them mainly because it is challenging to measure and quantify intents of M&As. This study examines the intent of acquisitions by measuring specific intents for M&A transactions. Using our measures of acquisition intents, we compare the intents by four acquisition types: (1) the acquisition where a hardware firm acquires a hardware firm, (2) the acquisition where a hardware firm acquires a software/IT service firm, (3) the acquisition where a software/IT service firm acquires a hardware firm, and (4) the acquisition where a software /IT service firm acquires a software/IT service firm.
We presume that the reasons differ across these cases: why a hardware firm acquires another hardware firm, why a hardware firm acquires a software firm, why a software/IT service firm acquires a hardware firm, and why a software/IT service firm acquires another software/IT service firm.
Using data on M&As in US IT industries, we identified the major intents of the M&As. The acquisition intents are identified from the press releases of M&A announcements and measured in four categories. First, an acquirer may intend cost savings in operations by sharing common resources between the acquirer and the target; such savings can accrue from economies of scope and scale. Second, an acquirer may intend product enhancement/development: knowledge and skills transferred from the target may enable the acquirer to enhance product quality or expand product lines. Third, an acquirer may intend to gain an additional customer base in order to expand the market, penetrate the market, or enter a foreign market. Fourth, a firm may acquire a target intending to expand its customer channels; by complementing its existing channels to the customer, the firm can increase revenue. Our results show that acquirers have had cost-saving intents more often in acquisitions between hardware companies than in acquisitions between software companies. Hardware firms are more likely than software firms to acquire with intents of product enhancement or development. Overall, product enhancement/development is the most frequent intent across all four acquisition types, and customer base expansion is the second.
We also analyze our data with the classification of production-side intents and customer-side intents, which is based on activities of the value chain of a firm. Intents of cost saving operations and those of product enhancement/development can be viewed as production-side intents and intents of customer base expansion and those of expanding customer channels can be viewed as customer-side intents. Our analysis shows that the ratio between the number of customer-side intents and that of production-side intents is higher in acquisitions where a software firm is an acquirer than in the acquisitions where a hardware firm is an acquirer.
This study can contribute to the IS literature. First, it provides insights for understanding M&As in IT industries by answering the question of why an IT firm acquires another IT firm. Second, it provides the distribution of acquisition intents by acquisition type.
|A Study on the Intelligence Information System's Research Identity Using the Keywords Profiling and Co-word Analysis
Vol. 22, No. 4, Page: 139 ~ 155
Keywords : Profiling Methods, Keyword, Research Identity, Research Expansion, Intelligence System
The purpose of this study is to identify the research identity of the Korea Intelligent Information Systems Society through profiling methods and co-word analysis of keywords collected from the most recent three years ('2014~'2016) of studies. To locate this identity in relative terms, we collect and compare keywords and research methodologies from similar societies - the Korea Society of Management Information Systems and the Korea Association of Information Systems - as well as from the Korea Intelligent Information Systems Society itself. The society focuses on four research areas: artificial intelligence/data mining, intelligent Internet, knowledge management, and optimization techniques, so we analyze research trends in representative journals covering these four areas. For data-related journals, we also investigate the keywords and research methodologies of the Korean Society for Big Data Service and the Korean Journal of Big Data. Through this research, we identify recent trends in research keywords and compare research methodologies and analysis tools.
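The co-word step can be sketched by counting keyword pairs that appear on the same paper; heavily co-occurring pairs mark a journal's core themes. The papers and keywords below are toy examples, not the journals' actual data:

```python
from itertools import combinations
from collections import Counter

# Toy co-word analysis: keywords co-occurring on the same paper form the
# weighted edges of a keyword network.
papers = [
    {"data mining", "recommender", "SVD"},
    {"data mining", "text mining"},
    {"text mining", "SNS"},
    {"data mining", "SVD"},
]

cooc = Counter()
for kws in papers:
    for a, b in combinations(sorted(kws), 2):
        cooc[(a, b)] += 1

print(cooc.most_common(1))  # [(('SVD', 'data mining'), 2)]
```

Profiling then compares each society's heaviest keyword pairs against those of neighbouring societies to delineate its distinctive research areas.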
Finally, it becomes possible to locate the position and orientation of current research trends in the Korea Intelligent Information Systems Society. As a result, this study reveals the research areas that the society uniquely pursues, establishing its legitimacy and identity, and can therefore suggest specific future research areas for intelligent information systems. Furthermore, we predict the possibility of convergence between similar research areas and the Korea Intelligent Information Systems Society from an overall ecosystem perspective.
|Determinants of Mobile Application Use: A Study Focused on the Correlation between Application Categories
Vol. 22, No. 4, Page: 157 ~ 176
Keywords : smartphone, mobile app, app category, app usage, multivariate probit model
For a long time, the mobile phone had the sole function of communication. Recently, however, rapid innovations in technology have extended the sphere of mobile phone activities. Technological development has enabled an almost computer-like environment even on a very small device. Such advancement yielded several new high-tech devices, such as the smartphone and the tablet PC, which quickly proliferated. Along with the diffusion of these mobile devices, mobile applications for them also prospered and soon became deeply embedded in consumers' daily lives.
Numerous mobile applications have been released in app stores, yielding trillions of cumulative downloads. However, a large majority of these applications are disregarded by consumers. Even after applications are purchased, they do not survive long on consumers' mobile devices and are soon abandoned.
Nevertheless, it is imperative for both app developers and app-store operators to understand consumer behavior and to develop marketing strategies for a sustainable business, by increasing sales of mobile applications and by designing survival strategies for them. Therefore, this research analyzes consumers' mobile application usage behavior in a framework of substitution/complementarity between application categories, with several explanatory variables.
Considering that consumers use multiple apps simultaneously on their mobile devices, this research adopts multivariate probit models to explain mobile application usage behavior and to derive correlations between application categories, thereby observing substitution/complementarity in application use. The research adopts several explanatory variables: sociodemographic data; user experience with purchased applications, which reflects future purchasing of paid applications; and consumer attitudes toward marketing efforts, namely variables representing attitudes toward app ratings and toward app-store promotion efforts (i.e., the top developer badge and the editor's choice badge).
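As a simplified stand-in for the correlation structure a multivariate probit estimates jointly, the pairwise correlations between binary category-use indicators can be sketched as follows. The users, categories, and latent "screen time" factor are simulated assumptions, not the study's data or estimator:

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy daily app-use indicators for 200 users over three hypothetical
# categories. A multivariate probit models the latent correlations jointly;
# here we only sketch the pairwise correlations it would capture.
n = 200
z = rng.normal(size=n)                            # latent "screen time" factor
game = (z + rng.normal(size=n) > 0.0)
entertainment = (z + rng.normal(size=n) > 0.0)    # co-moves with game
info = (-z + rng.normal(size=n) > 0.0)            # substitutes for game

X = np.column_stack([game, entertainment, info]).astype(float)
corr = np.corrcoef(X, rowvar=False)

print(round(corr[0, 1], 2), round(corr[0, 2], 2))
```

A positive game-entertainment correlation and a negative game-information correlation mirror the complementary and substitutive patterns reported below.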
The results can be explained in a hedonic/utilitarian framework. Consumers who use hedonic applications, such as game- and entertainment-related ones, tend to be young with lower education levels, whereas older consumers with higher education levels prefer utilitarian application categories such as life and information. There are disputable arguments over whether SNS users are hedonic or utilitarian; in our results, younger consumers and those with higher education levels prefer SNS-category applications, which sits between the utilitarian and hedonic results. Also, applications directly related to tangible assets, such as banking, stock, and mobile shopping, are negatively related only to the experience of purchasing paid apps, meaning that consumers who put weight on tangible assets do not prefer buying paid applications.
Regarding categories, most correlations among categories are significantly positive, because someone who spends more time on mobile devices tends to use more applications. The game and entertainment categories show a significantly positive correlation; however, there are significantly negative correlations between the game and information categories and between the game and e-commerce categories. Meanwhile, the game and SNS categories, as well as the game and finance categories, show no significant correlations. This result clearly shows that mobile application usage behavior is quite distinguishable - the purposes of using mobile devices are polarized into utilitarian and hedonic.
This research demonstrates several arguments that can be examined only with observed behavioral data, not survey data, and offers behavioral explanations of mobile application usage from the consumer's perspective. It also shows substitution/complementarity patterns in application usage, which in turn explain consumers' mobile application usage behavior. However, this research has limitations in some respects.
The classification of categories is itself disputable, as it diverges among studies; the results could therefore change depending on the classification. Lastly, although the data were collected at the individual-application level, we reduced the observations to the individual level.
Further research will be done to resolve these limitations.
|VKOSPI Forecasting and Option Trading Application Using SVM
Vol. 22, No. 4, Page: 177 ~ 192
Keywords : Machine Learning, Support Vector Machine, VKOSPI, Option Trading
Machine learning is a field of artificial intelligence. It refers to an area of computer science concerned with giving machines the ability to perform their own data analysis, decision making, and forecasting. For example, one representative machine learning model is the artificial neural network, a statistical learning algorithm inspired by biological neural network structure. Other machine learning models include the decision tree, naive Bayes, and SVM (support vector machine) models.
Among the machine learning models, we use the SVM model in this study because it is mainly used for classification and regression analysis, which fits our study well. The core principle of the SVM is to find a reasonable hyperplane that separates different groups in the data space. Given information about the data in any two groups, the SVM model judges to which group a new data point belongs based on the hyperplane obtained from the given data set. Thus, the more meaningful data there are, the better the learning ability.
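The separating-hyperplane idea can be sketched with a linear SVM trained by the Pegasos sub-gradient method. The features and up/down labels are simulated assumptions, not market data, and this is not the paper's exact SVM setup:

```python
import numpy as np

rng = np.random.default_rng(7)

# Sketch of the direction-classification step: a linear SVM (hinge loss,
# Pegasos sub-gradient updates) on synthetic "features -> next-day up/down"
# data. All values are simulated for illustration.
n, d = 300, 5
X = rng.normal(size=(n, d))
w_true = np.array([1.0, -1.0, 0.5, 0.0, 0.0])
y = np.where(X @ w_true + 0.3 * rng.normal(size=n) > 0, 1.0, -1.0)

lam, w = 0.01, np.zeros(d)
for t in range(1, 5001):                 # Pegasos updates
    i = rng.integers(n)
    eta = 1.0 / (lam * t)
    if y[i] * (X[i] @ w) < 1:            # inside the margin: hinge gradient
        w = (1 - eta * lam) * w + eta * y[i] * X[i]
    else:                                # outside: regularisation shrink only
        w = (1 - eta * lam) * w

acc = float(np.mean(np.sign(X @ w) == y))
print(round(acc, 2))
```

The learned weight vector `w` defines the separating hyperplane; new feature vectors are classified by the sign of their projection onto it.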
In recent years, many financial experts have focused on machine learning, seeing the potential of combining it with the financial field, where vast amounts of financial data exist.
Machine learning techniques have proved powerful in describing non-stationary and chaotic stock price dynamics, and much research on forecasting stock prices with machine learning algorithms has been conducted successfully. Recently, financial companies have begun to provide robo-advisor services - a compound of 'robot' and 'advisor' - which perform various financial tasks through advanced algorithms over rapidly changing, huge amounts of data. A robo-advisor's main tasks are to advise investors according to their personal investment propensity and to manage portfolios automatically.
In this study, we propose a method of forecasting the Korean volatility index, VKOSPI, using the SVM model, one of the machine learning methods, and apply it to real option trading to increase trading performance. VKOSPI is a measure of the future volatility of the KOSPI 200 index based on KOSPI 200 index option prices; it is similar to the VIX index, which is based on S&P 500 option prices in the United States. The Korea Exchange (KRX) calculates and announces the real-time VKOSPI index. VKOSPI behaves like ordinary volatility and affects option prices: VKOSPI and option prices move in the same direction regardless of option type (call and put options with various strike prices). If volatility increases, both call and put option premiums increase, because the probability of the option being exercised increases. The investor can see, in real time, how much the option price rises per unit rise in volatility through vega, the Black-Scholes measure of an option's sensitivity to changes in volatility. Therefore, accurate forecasting of VKOSPI movements is one of the important factors that can generate profit in option trading.
In this study, we verified with real option data that accurate forecasts of VKOSPI can yield a large profit in real option trading. To the best of our knowledge, there have been no studies on predicting the direction of VKOSPI with machine learning and applying the prediction to actual option trading.
This study predicted daily VKOSPI changes with the SVM model and then took an intraday short option strangle position, which profits as option prices fall, only on days when VKOSPI was expected to decline. We analyzed the results and tested whether trading based on the SVM's predictions is applicable to real option trading. The results showed that the prediction accuracy for VKOSPI was 57.83% on average and that the number of position entries was 43.2, less than half of the benchmark's (100).
A small number of trades indicates trading efficiency. In addition, the experiment showed that the trading performance was significantly higher than the benchmark's.