Journal of Intelligence and Information Systems,
Vol. 19, No. 1, March 2013
Comparison Studies of Hybrid and Non-hybrid Forecasting Models for Seasonal and Trend Time Series Data
Chulwoo Jeong, and Myung Suk Kim
Vol. 19, No. 1, Page: 1 ~ 17
Keywords : Forecasting, Generalized Additive Models, Seasonal ARIMA, Hybrid Models
In this article, several types of hybrid forecasting models are suggested. In particular, hybrid models using the generalized additive model (GAM) are newly suggested as an alternative to those using neural networks (NN). The prediction performances of various hybrid and non-hybrid models are evaluated using simulated time series data. Five different types of seasonal time series data with an additive or multiplicative trend are generated at different noise levels and used in the forecasting evaluation. For the simulated data with only seasonality, the autoregressive (AR) model and the hybrid AR-AR model performed equally well. On the other hand, when the time series data contained a trend, the SARIMA model and some hybrid SARIMA models performed equally well and outperformed the others. In the comparison of GAMs and NNs on the seasonal additive trend data, the SARIMA-GAM performed consistently well across the full range of noise variation, whereas the SARIMA-NN showed good performance only when the noise level was trivial.
Dynamic Virtual Ontology using Tags with Semantic Relationship on Social-web to Support Effective Search
Hyun Jung Lee, and Mye Sohn
Vol. 19, No. 1, Page: 19 ~ 33
Keywords : Tag, Search, Dynamic Virtual Ontology
In this research, the proposed Dynamic Virtual Ontology using Tags (DyVOT) supports dynamic search of resources according to user requirements, using tags attached to social web resources. Tags are generally annotations, written as a series of descriptive words, that social users attach to information resources such as web pages, images, YouTube videos, and so on. Tags therefore characterize and mirror the information resources, and as metadata they can be matched to resources. Consequently, semantic relationships between tags can be extracted, because tags that represent resources depend on one another. This extraction is limited, however, by the synonyms and homonyms that occur among tags, which are usually written as a series of words. Research on folksonomies has therefore used tags to classify words by semantic-based synonymy. Other research focuses on clustering and/or classifying resources by semantic-based relationships among tags. These studies are also limited, because they focus on semantic-based hyper/hypo relationships or clustering among tags without considering conceptual associative relationships between the classified or clustered groups. This makes it difficult to search resources effectively according to user requirements. In this research, the proposed DyVOT uses tags to construct an ontology for effective search. We assume that tags are extracted from user requirements and are used to construct multiple sub-ontologies as combinations of some or all of the tags. In addition, DyVOT constructs an ontology based on hierarchical and associative relationships among tags for effective search of a solution. The ontology is composed of a static ontology and a dynamic ontology.
The static ontology defines semantic-based hierarchical hyper/hypo relationships among tags with a tree structure. From the static ontology, DyVOT extracts multiple sub-ontologies using multiple sub-tags, each of which consists of a part of the tags. Each sub-ontology is then constructed from the hierarchy paths that contain its sub-tag. To create the dynamic ontology, it is necessary to define associative relationships among the sub-ontologies extracted from the hierarchical relationships of the static ontology. An associative relationship is defined by the resources shared between tags that are linked by the sub-ontologies. The association is measured by the degree to which the resources allocated to the tags of the sub-ontologies are shared. If the association value is larger than a threshold value, a new associative relationship among the tags is created. The associative relationships are used to merge the sub-ontologies and construct a new hierarchy. To construct the dynamic ontology, it is essential to define a new class that links two or more sub-ontologies; the class is generated by merging tags that are shown to be highly associative by their shared resources. The class is then used to generate a new hierarchy with the extracted sub-ontologies, creating the dynamic ontology. The newly created class must belong to the dynamic ontology, so it is used to form a new hyper/hypo hierarchical relationship between the class and the tags linked to the sub-ontologies. In this way, DyVOT is built from newly defined associative relationships that are extracted from hierarchical relationships among tags. Resources are matched to DyVOT, which narrows the search boundary and shortens the search paths.
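The thresholded association step described above can be illustrated with a minimal sketch. The abstract does not give the formula for the "degree of shared resources", so the Jaccard overlap used here is an assumption, and the function names, tag names, and resource identifiers are hypothetical.

```python
def association(resources_a, resources_b):
    """Degree of shared resources between two tags.

    Assumption: Jaccard overlap (shared / all resources); the abstract
    does not state which measure DyVOT actually uses.
    """
    a, b = set(resources_a), set(resources_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def associative_pairs(tag_resources, threshold):
    """Return (tag, tag, score) for every pair whose score exceeds the threshold."""
    tags = sorted(tag_resources)
    pairs = []
    for i, first in enumerate(tags):
        for second in tags[i + 1:]:
            score = association(tag_resources[first], tag_resources[second])
            if score > threshold:
                pairs.append((first, second, score))
    return pairs


# Hypothetical tags and the resources allocated to them.
example = {
    "jazz": {"r1", "r2", "r3"},
    "saxophone": {"r2", "r3", "r4"},
    "soccer": {"r9"},
}
print(associative_pairs(example, threshold=0.4))
```

In this sketch only pairs above the threshold yield a new associative relationship; merging the corresponding sub-ontologies under a new class would be a separate step.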
Whereas a static data catalog (Dean and Ghemawat, 2004; 2008) searches resources statically according to user requirements, the proposed DyVOT searches resources dynamically using multiple sub-ontologies with parallel processing. DyVOT thus improves the correctness and agility of search and reduces search effort by shortening the search path.
A study on the Success Factors and Strategy of Information Technology Investment Based on Intelligent Economic Simulation Modeling
Do-Hyung Park
Vol. 19, No. 1, Page: 35 ~ 55
Keywords : Information Technology Investment, Information Technology Value, Agent-Based Simulation, Information Diffusion Effect
Information technology is a critical resource for any company hoping to support and realize its strategic goals, which contribute to growth promotion and sustainable development. The selection of information technology and its strategic use are imperative for enhancing performance in every aspect of company management, which has led a wide range of companies to invest continuously in information technology. Despite the keen interest of researchers, managers, and policy makers in how information technology contributes to organizational performance, there is uncertainty and debate about the results of information technology investment. In other words, researchers and managers cannot easily identify the independent factors that affect the investment performance of information technology. This is mainly because many factors, ranging from a company’s internal components and strategies to its external customers, are interconnected with the investment performance of information technology. Using an agent-based simulation technique, this research extracts factors expected to affect the investment performance of information technology, simplifies the analysis of their relationships through economic modeling, and examines how performance depends on changes in the factors. In terms of economic modeling, I expand the model that highlights the way in which product quality moderates the relationship between information technology investments and economic performance (Thatcher and Pingry, 2004) by considering the cost of information technology investment and the demand creation resulting from product quality enhancement. For quality enhancement and its consequences for demand creation, I apply the concept of information quality and decision-maker quality (Raghunathan, 1999).
This concept implies that the investment on information technology improves the quality of information, which, in turn, improves decision quality and performance, thus enhancing the level of product or service quality. Additionally, I consider the effect of word of mouth among consumers, which creates new demand for a product or service through the information diffusion effect. This demand creation is analyzed with an agent-based simulation model that is widely used for network analyses. Results show that the investment on information technology enhances the quality of a company’s product or service, which indirectly affects the economic performance of that company, particularly with regard to factors such as consumer surplus, company profit, and company productivity. Specifically, when a company makes its initial investment in information technology, the resultant increase in the quality of a company’s product or service immediately has a positive effect on consumer surplus, but the investment cost has a negative effect on company productivity and profit. As time goes by, the enhancement of the quality of that company’s product or service creates new consumer demand through the information diffusion effect. Finally, the new demand positively affects the company’s profit and productivity. In terms of the investment strategy for information technology, this study’s results also reveal that the selection of information technology needs to be based on analysis of service and the network effect of customers, and demonstrate that information technology implementation should fit into the company’s business strategy. Specifically, if a company seeks the short-term enhancement of company performance, it needs to have a one-shot strategy (making a large investment at one time). On the other hand, if a company seeks a long-term sustainable profit structure, it needs to have a split strategy (making several small investments at different times). 
The findings from this study make several contributions to the literature. In terms of methodology, the study integrates both economic modeling and simulation technique in order to overcome the limitations of each methodology. It also indicates the mediating effect of product quality on the relationship between information technology and the performance of a company. Finally, it analyzes the effect of information technology investment strategies and information diffusion among consumers on the investment performance of information technology.
Clustering Method based on Genre Interest for Cold-Start Problem in Movie Recommendation
Tithrottanak You, Ahmad Nurzid Rosli, Inay Ha, and Geun-Sik Jo
Vol. 19, No. 1, Page: 57 ~ 77
Social media has become one of the most popular media in web and mobile applications. In 2011, social networks and blogs were still the top destination of online users, according to a study from the Nielsen Company. In that study, nearly 4 in 5 active users visited social networks and blogs. Social network and blog sites ruled Americans’ Internet time, accounting for 23 percent of time spent online. Facebook is the social network on which U.S. Internet users spend more time than on other services such as Yahoo, Google, AOL Media Network, Twitter, LinkedIn, and so on. Recently, most companies promote their products on Facebook by creating a “Facebook Page” that refers to a specific product. The “Like” option allows users to subscribe to the page and receive updates on their interests. Film makers, who produce many films around the world, also market and promote their films by exploiting the advantages of the “Facebook Page”. In addition, a great number of streaming service providers allow users to subscribe to their services to watch movies and TV programs. Users can instantly watch movies and TV programs over the Internet on PCs, Macs, and TVs. Netflix alone, as the world’s leading subscription service, has more than 30 million streaming members in the United States, Latin America, the United Kingdom, and the Nordics. As a matter of fact, a million movies and TV programs of different genres are offered to subscribers. However, users need to spend a lot of time finding the right movies for the genres they are interested in. In recent years, many researchers have proposed methods to improve the prediction of ratings or preferences, so as to recommend the most related items, such as books, music, or movies, to a target user or to a group of users with the same interest in particular items.
One of the most popular methods for building a recommendation system is traditional Collaborative Filtering (CF). The method computes the similarity between the target user and other users, who are then clustered by shared interest according to the items they have rated. The method then predicts other items from the same group of users to recommend to them. Many kinds of items can be suggested to users, such as books, music, movies, news, videos, and so on; in this paper, however, we focus only on movies as the items to recommend. There are many challenges in the CF task. The first is the “sparsity problem”, which occurs when there is not enough information on user preferences; recommendation accuracy is then lower than for neighbors with a large number of ratings. The second is the “cold-start problem”, which occurs whenever new users or items are added to the system with no ratings or only a few ratings. For instance, no personalized predictions can be made for a new user without any ratings on record.
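The user-similarity step of CF, and the way the cold-start problem arises from it, can be illustrated with a minimal sketch. Cosine similarity over co-rated movies is one common choice and is an assumption here; the abstract does not say which similarity measure the authors use. The user names and ratings are hypothetical.

```python
import math


def cosine_similarity(ratings_a, ratings_b):
    """Cosine similarity of two users, each given as a {movie: rating} dict.

    Assumption: the dot product runs over co-rated movies only and the
    norms over all of each user's ratings.
    """
    common = set(ratings_a) & set(ratings_b)
    if not common:
        return 0.0  # nothing in common, e.g. a new user with no ratings
    dot = sum(ratings_a[m] * ratings_b[m] for m in common)
    norm_a = math.sqrt(sum(v * v for v in ratings_a.values()))
    norm_b = math.sqrt(sum(v * v for v in ratings_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def most_similar(target, others):
    """Name of the most similar user, or None if no one has positive similarity."""
    best_name, best_score = None, 0.0
    for name, ratings in others.items():
        score = cosine_similarity(target, ratings)
        if score > best_score:
            best_name, best_score = name, score
    return best_name


others = {"bob": {"A": 5, "B": 3, "C": 4}, "carol": {"C": 2}}
print(most_similar({"A": 5, "B": 3}, others))  # an established user
print(most_similar({}, others))                # a cold-start user
```

A new user with an empty rating record has zero similarity to everyone, so no neighbor can be found from ratings alone; this is the gap that additional signals such as genre interest are meant to fill.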
In this research, we propose a clustering method based on users’ genre interests, extracted from a social network service (SNS), and on users’ movie rating information, to solve the “cold-start problem.” The proposed method clusters the target user together with other users by combining user genre interest and rating information. A huge amount of interesting and useful user information is available from the Facebook Graph, and we can extract information from the “Facebook Pages” that users “Like”. Moreover, we use the Internet Movie Database (IMDb) as the main dataset. IMDb is an online database that contains a large amount of information related to movies, TV programs, and actors. This dataset is used not only to provide movie information in our Movie Rating System, but also as a resource for the movie genre information of the “Facebook Pages”. First, the user must log in to the Movie Rating System with a Facebook account; at the same time, our system collects the user’s genre interests from the “Facebook Pages”.
We conduct many experiments to see how our method performs in comparison with other methods. First, we compare the proposed method in the case of normal recommendation to see how our system improves the recommendation result. Then we test the method in the case of the cold-start problem. Our experiments show that the proposed method outperforms the other methods and produces better results in both cases.
A Study of the Reactive Movement Synchronization for Analysis of Group Flow
Joon Mo Ryu, Seung-Bo Park, and Jae Kyeong Kim
Vol. 19, No. 1, Page: 79 ~ 94
Keywords : Group Audience, Synchronization, Flow, Group Flow
Recently, high value-added business has been growing steadily in the culture and art area. To generate high value from a performance, audience satisfaction is necessary. Flow is a critical factor for satisfaction, and it should be induced in the audience and measured. To evaluate the interest and emotion of an audience regarding content, producers or investors need an index for measuring flow. However, it is easy neither to define flow quantitatively nor to collect audience reactions immediately.
In previous studies, group flow was evaluated as the sum of the average values of each person’s reaction. The flow or “good feeling” of each audience member was extracted from the face, especially from changes in expression, and from body movement. However, it was not easy to handle the large amount of real-time data from the sensor signals. It was also difficult to set up the experimental devices, for economic and environmental reasons, because every participant needed a personal sensor to check physical signals, and a camera had to be placed in front of each participant’s head to capture the face. Therefore, a simpler system is needed to analyze group flow.
This study provides a method for measuring audience flow through group synchronization at the same time and place. To measure the synchronization, we built a real-time processing system using the Differential Image and the Group Emotion Analysis (GEA) system. The Differential Image was obtained from the camera by subtracting the previous frame from the present frame, which yields the movement variation of the audience’s reaction.
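Frame differencing of this kind can be illustrated with a minimal sketch on small grayscale frames stored as nested lists. Taking the absolute difference, the per-pixel threshold, and summarizing movement as the fraction of changed pixels are all assumptions made for illustration; the abstract only states that the previous frame is subtracted from the present one.

```python
def differential_image(previous, current):
    """Pixel-wise |current - previous| for two equally sized grayscale frames."""
    return [
        [abs(cur - prev) for prev, cur in zip(prev_row, cur_row)]
        for prev_row, cur_row in zip(previous, current)
    ]


def movement_value(diff, pixel_threshold=10):
    """Fraction of pixels whose change exceeds pixel_threshold (hypothetical measure)."""
    total = sum(len(row) for row in diff)
    if total == 0:
        return 0.0
    moved = sum(1 for row in diff for value in row if value > pixel_threshold)
    return moved / total


# Two hypothetical 2x2 frames in which one pixel changes strongly.
previous_frame = [[10, 10], [10, 10]]
current_frame = [[10, 50], [10, 10]]
diff = differential_image(previous_frame, current_frame)
print(diff, movement_value(diff))
```

A real-time system would apply the same operation to each new camera frame, typically with an image library rather than nested lists.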
We then developed a program, GEA (Group Emotion Analysis), as a flow judgment model. After the audience’s reaction is measured, the synchronization is divided into Dynamic State Synchronization and Static State Synchronization. Dynamic State Synchronization accompanies an active reaction of the audience, while Static State Synchronization means no movement of the audience.
Dynamic State Synchronization can be caused by the audience’s surprised reaction to a scary, creepy, or reversal scene, and Static State Synchronization is triggered by an impressive or sad scene. We therefore showed the audience several short movies containing the kinds of scenes mentioned above. These scenes made them sad, made them clap, or gave them a creepy feeling.
To check the movement of the audience, we defined the critical points α and β. Dynamic State Synchronization was meaningful when the movement value was over critical point β, while Static State Synchronization was effective under critical point α. β was set from the clapping movement of the audiences of 10 teams instead of using the average amount of movement.
After checking the reactive movement of the audience, the percentage (%) ratio was calculated by dividing the number of “people having a reaction” by the “total people.” A total of 37 teams were formed at the “2012 Seoul DMC Culture Open” and took part in the experiments. First, the staff induced them to clap. Second, a basic scene was shown to neutralize the emotion of the audience. Third, a flow scene was displayed to the audience. Fourth, a reversal scene was introduced. Then 24 of the teams were shown amusing and creepy scenes, and the other 10 teams were shown a sad scene. The audience clapped and laughed at the amusing scene, shaking their heads or hiding with their eyes closed, and the sad or touching scene made them silent. If the result was over about 80%, the group was judged to have achieved synchronization and flow.
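The judgment rule described above (critical points α and β, the reaction ratio, and the roughly 80% cutoff) can be illustrated with a minimal sketch. The per-person movement values, the numeric values of α and β, and the function names are hypothetical; the abstract does not say how movement is attributed to individual audience members.

```python
def classify_state(movement, alpha, beta):
    """Label one movement value using the critical points alpha and beta."""
    if movement > beta:
        return "dynamic"  # active reaction, e.g. clapping
    if movement < alpha:
        return "static"   # no movement, e.g. during a sad scene
    return "none"


def reaction_ratio(movements, alpha, beta, state):
    """Percentage of people whose movement falls in the given state."""
    if not movements:
        return 0.0
    reacting = sum(1 for m in movements if classify_state(m, alpha, beta) == state)
    return 100.0 * reacting / len(movements)


def is_synchronized(ratio_percent, cutoff=80.0):
    """Judge synchronization when the ratio is over the (approximate) 80% cutoff."""
    return ratio_percent > cutoff


# Hypothetical team of ten people: nine react strongly, one barely moves.
team = [0.9] * 9 + [0.2]
ratio = reaction_ratio(team, alpha=0.1, beta=0.5, state="dynamic")
print(ratio, is_synchronized(ratio))
```

The same functions cover the static case by passing `state="static"`, so a silent audience during a sad scene is judged with the α threshold instead of β.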
As a result, the audience showed similar reactions to similar stimulation at the same time and place. With additional normalization and experiments, we can find the flow factor through synchronization in a much bigger group, which should be useful for planning content.
Predicting the Direction of the Stock Index by Using a Domain-Specific Sentiment Dictionary
Eunji Yu, Yoosin Kim, Namgyu Kim, and Seung Ryul Jeong
Vol. 19, No. 1, Page: 95 ~ 110
Keywords : Big Data Analysis, Opinion Mining, Sentiment Dictionary Construction, Text Mining
Recently, the amount of unstructured data being generated through a variety of social media has been increasing rapidly, resulting in the increasing need to collect, store, search for, analyze, and visualize this data. This kind of data cannot be handled appropriately by using the traditional methodologies usually used for analyzing structured data because of its vast volume and unstructured nature. In this situation, many attempts are being made to analyze unstructured data such as text files and log files through various commercial or noncommercial analytical tools.
Among the various contemporary issues dealt with in the literature of unstructured text data analysis, the concepts and techniques of opinion mining have been attracting much attention from pioneer researchers and business practitioners. Opinion mining or sentiment analysis refers to a series of processes that analyze participants’ opinions, sentiments, evaluations, attitudes, and emotions about selected products, services, organizations, social issues, and so on. In other words, many attempts based on various opinion mining techniques are being made to resolve complicated issues that could not have otherwise been solved by existing traditional approaches.
One of the most representative attempts using the opinion mining technique may be the recent research that proposed an intelligent model for predicting the direction of the stock index. This model works mainly on the basis of opinions extracted from an overwhelming number of economic news reports. News content published on various media is obviously a traditional example of unstructured text data. Every day, a large volume of news content is created, digitalized, and subsequently distributed to us via online or offline channels. Many studies have revealed that we make better decisions on political, economic, and social issues by analyzing news and other related information. In this sense, we expect to predict the fluctuation of stock markets partly by analyzing the relationship between economic news reports and the pattern of stock prices.
So far, in the literature on opinion mining, most studies including ours have utilized a sentiment dictionary to elicit sentiment polarity or sentiment value from a large number of documents. A sentiment dictionary consists of pairs of selected words and their sentiment values. Sentiment classifiers refer to the dictionary to formulate the sentiment polarity of words, sentences in a document, and the whole document. However, most traditional approaches have common limitations in that they do not consider the flexibility of sentiment polarity, that is, the sentiment polarity or sentiment value of a word is fixed and cannot be changed in a traditional sentiment dictionary. In the real world, however, the sentiment polarity of a word can vary depending on the time, situation, and purpose of the analysis. It can also be contradictory in nature.
The flexibility of sentiment polarity motivated us to conduct this study. In this paper, we argue that sentiment polarity should be assigned not merely on the basis of the inherent meaning of a word but on the basis of its ad hoc meaning within a particular context. To implement our idea, we presented an intelligent investment decision-support model based on opinion mining that performs the scraping and parsing of massive volumes of economic news on the web, tags sentiment words, classifies the sentiment polarity of the news, and finally predicts the direction of the next day’s stock index. In addition, we applied a domain-specific sentiment dictionary instead of a general-purpose one to classify each piece of news as either positive or negative.
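Dictionary-based polarity classification, and the effect of swapping a general-purpose dictionary for a domain-specific one, can be illustrated with a minimal sketch. The words, sentiment values, whitespace tokenization, and the rule that a non-positive score counts as negative are all assumptions for illustration and are not taken from the study.

```python
def sentiment_score(text, dictionary):
    """Sum the sentiment values of the words in text that appear in the dictionary."""
    return sum(dictionary.get(word, 0) for word in text.lower().split())


def classify(text, dictionary):
    """Label a news item positive or negative (assumption: a zero score is negative)."""
    return "positive" if sentiment_score(text, dictionary) > 0 else "negative"


# Hypothetical general-purpose dictionary: "cut" is treated as a negative word.
general = {"rise": 1, "gain": 1, "fall": -1, "cut": -1}

# Hypothetical stock-market dictionary: the polarity of "cut" is reversed.
domain_specific = {**general, "cut": 1}

headline = "central bank announces rate cut"
print(classify(headline, general))
print(classify(headline, domain_specific))
```

The same headline receives opposite labels under the two dictionaries, which is the kind of context dependence the domain-specific dictionary is meant to capture.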
For the purpose of performance evaluation, we performed intensive experiments and investigated the prediction accuracy of our model. For the experiments to predict the direction of the stock index, we gathered and analyzed 1,072 articles about stock markets published by “M” and “E” media between July 2011 and September 2011.
The Need for Paradigm Shift in Semantic Similarity and Semantic Relatedness : From Cognitive Semantics Perspective
Youngseok Choi, and Jinsoo Park
Vol. 19, No. 1, Page: 111 ~ 123
Keywords : Semantic Relatedness, Semantic Similarity, Semantic Network
The measure of semantic similarity/relatedness between two concepts plays an important role in research on system integration and database integration. Moreover, current research on keyword recommendation or tag clustering strongly depends on this kind of semantic measure. For this reason, many researchers in various fields, including computer science and computational linguistics, have tried to improve methods for calculating semantic similarity/relatedness.
The study of similarity between concepts is meant to discover how a computational process can model the way a human determines the relationship between two concepts. Most research on calculating semantic similarity uses ready-made reference knowledge, such as a semantic network or a dictionary, to measure concept similarity. The topological method calculates relatedness or similarity between concepts based on various forms of a semantic network, including a hierarchical taxonomy. This approach assumes that the semantic network reflects human knowledge well. The nodes in a network represent concepts, and ways to measure the conceptual similarity between two nodes are also regarded as ways to determine the conceptual similarity of two words (i.e., two nodes in a network). The topological method can be categorized as node-based or edge-based, which are also called the information content approach and the conceptual distance approach, respectively. The node-based approach calculates similarity between concepts based on how much information the two concepts share in terms of a semantic network or taxonomy, while the edge-based approach estimates the distance between the nodes that correspond to the concepts being compared. Both approaches assume that the semantic network is static; that is, the topological approach does not consider changes in the semantic relations between concepts in a semantic network.
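The edge-based (conceptual distance) approach can be illustrated with a minimal sketch that counts the edges on the path between two concepts through their nearest common ancestor. The toy taxonomy, the child-to-parent dictionary representation, and the function names are hypothetical, and the sketch assumes the taxonomy is a tree without cycles.

```python
def path_to_root(concept, parent):
    """List the concept followed by its ancestors up to the root."""
    path = [concept]
    while concept in parent:
        concept = parent[concept]
        path.append(concept)
    return path


def edge_distance(a, b, parent):
    """Number of edges between a and b via their nearest common ancestor.

    Returns None if the two concepts share no ancestor.
    """
    steps_from_a = {c: i for i, c in enumerate(path_to_root(a, parent))}
    for steps_from_b, concept in enumerate(path_to_root(b, parent)):
        if concept in steps_from_a:
            return steps_from_a[concept] + steps_from_b
    return None


# Hypothetical taxonomy given as child -> parent links.
taxonomy = {"dog": "mammal", "cat": "mammal", "mammal": "animal", "bird": "animal"}
print(edge_distance("dog", "cat", taxonomy))
print(edge_distance("dog", "bird", taxonomy))
```

Because the distance is read directly from a fixed `taxonomy`, the result cannot change unless the network itself is updated, which is the static assumption discussed above.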
However, as information and communication technologies make it easier for people to share knowledge, the semantic relations between concepts in a semantic network may change. To explain this change, we adopt cognitive semantics. The basic assumption of cognitive semantics is that humans judge semantic relations based on their cognition and understanding of concepts. This cognition and understanding is called ‘World Knowledge.’ World knowledge can be categorized as personal knowledge and cultural knowledge. Personal knowledge is the knowledge gained from personal experience; everyone can have different personal knowledge of the same concept. Cultural knowledge is the knowledge shared by people who live in the same culture or use the same language. People in the same culture have a common understanding of specific concepts. Cultural knowledge can be the starting point for a discussion of the change of semantic relations. If the culture shared by people changes for some reason, their cultural knowledge may also change. Today’s society and culture are changing at a fast pace, and the change of cultural knowledge is not a negligible issue in research on the semantic relationship between concepts.
In this paper, we propose future directions for research on semantic similarity. In other words, we discuss how research on semantic similarity can reflect the change of semantic relations caused by the change of cultural knowledge. We suggest three directions for future research on semantic similarity. First, the research should include a versioning and update methodology for semantic networks. Second, a dynamically generated semantic network can be used to calculate the semantic similarity between concepts. If researchers can develop a methodology to extract the semantic network from a given knowledge base in real time, this approach can solve many problems related to the change of semantic relations. Third, a statistical approach based on corpus analysis can be an alternative to methods using a semantic network. We believe that these proposed research directions can be a milestone for research on semantic relations.
Personal Information Overload and User Resistance in the Big Data Age
Hwansoo Lee, Dongwon Lim, and Hangjung Zo
Vol. 19, No. 1, Page: 125 ~ 139
Keywords : Big data, Personal Information Overload, Information Privacy Concerns
Big data refers to data that cannot be processed with conventional data technologies. As smart devices and social network services produce vast amounts of data, big data attracts much attention from researchers. There are strong demands from governments and industries for big data, as it can create new value by drawing business insights from data. Since various new technologies for processing big data have been introduced, academic communities also show much interest in the big data domain.
Big data technology has advanced notably in various fields. It makes it possible to access, collect, and save individuals’ personal data. These technologies enable the analysis of huge amounts of data at lower cost and in less time, which is impossible with traditional methods. They can even detect personal information that people do not want to disclose. Therefore, people using information technology such as the Internet or online services have some level of privacy concerns, and such feelings can hinder continued use of information systems. For example, SNS offers various benefits, but users are sometimes highly exposed to privacy intrusions because they write too much personal information on it. Even though users post their personal information on the Internet themselves, the data is sometimes not under their control. Once private data is posted on the Internet, it can be transferred anywhere with a few clicks and can be abused to create a fake identity. In this way, privacy intrusion happens.
This study aims to investigate how perceived personal information overload in SNS affects users’ risk perception and information privacy concerns. It also examines the relationship between these concerns and user resistance behavior. A survey approach and structural equation modeling are employed for data collection and analysis. This study contributes meaningful insights for academic researchers and policy makers who are planning to develop guidelines for privacy protection. The study shows that information overload on social network services can significantly increase users’ perceived level of privacy risks. In turn, the perceived privacy risks lead to an increased level of privacy concerns. If privacy concerns increase, users can form a negative or resistant attitude toward system use. The resistant attitude may lead users to discontinue the use of social network services. Furthermore, the effect of information overload on privacy concerns is mediated by perceived risks rather than being direct. This implies that resistance to system use can be diminished by reducing users’ perceived risks. Given that users’ resistant behavior becomes salient when they have high privacy concerns, measures to alleviate users’ privacy concerns should be devised.
This study makes an academic contribution by integrating traditional information overload theory and user resistance theory to investigate perceived privacy concerns in current IS contexts. Because the research topic has only just emerged, little big data research has examined the technology with an empirical and behavioral approach. The study also makes practical contributions. Information overload is connected to an increased level of perceived privacy risks and to discontinued use of the information system. To keep users from leaving the system, organizations should develop a system in which private data is controlled and managed with ease. This study suggests that actions to lower the level of perceived risks and privacy concerns should be taken for information systems continuance.

