Journal of Intelligence and Information Systems,
Vol. 16, No. 4, December 2010
Design and Analysis of Online Advertising Expenditure Model based on Coupon Download
Jung-Ho Jun, and Kyoung-Jun Lee
Vol. 16, No. 4, Page: 1 ~ 19
Keywords : CPCD(Cost Per Coupon Download), CPC(Cost Per Click)
Unlike traditional offline advertising through TV, newspapers, and radio, online advertising draws instantaneous responses from potential consumers and is easy to measure. These characteristics have driven the growth of advertising among the various Internet business models. Internet advertising expenditure models are conventionally classified as CPM (Cost Per Mille), CPC (Cost Per Click), and CPS (Cost Per Sales). These can be compared in terms of the risk and the degree of responsibility that each stakeholder bears. Under the CPM model, which is based on the number of advertisement exposures, advertisements are mechanically exposed to users but may not actually be noticed by them, so advertisers risk spending without any advertising effect. From the media's perspective, the CPS model, which is based on conversion actions, is the riskiest because a conversion such as a product purchase depends on the capability of the advertiser, not that of the media. Thus, whereas the CPM and CPS models each disadvantage one side of the Internet advertising value network, the CPC model has been regarded as reasonable for both advertisers and media and has come to occupy the largest segment of the Internet advertising market. However, the CPC model is also open to fraudulent behavior such as click fraud, arising from competition or from dishonest inflation of advertising expenditure. On the user side, advertisements that are accessed unintentionally add further inappropriate expenditure for advertisers. In this paper, we propose the "CPCD" (Cost Per Coupon Download) model. It goes beyond the simple clicking of advertisements: the advertiser is charged when a user downloads a coupon, which places the model between CPC and CPS. To this end, we describe a scenario from the advertiser's perspective, along with the processes, participants, and participant benefits of the CPCD model.
In particular, we propose two new kinds of value in online coupons: "possibility of storage" and "complement for delivery to the target group". We also analyze the conditions under which the model works for advertisers by comparing the CPC and CPCD models through an advertising expenditure simulation. The simulation results imply that the CPCD model is better suited to advertisers of medium- and low-priced products than to advertisers of high-priced goods. This result is notable because most advertisers under the CPC model deal in medium- and low-priced products. Finally, we consider the applicability of the CPCD model in ubiquitous environments.
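The kind of comparison described above can be sketched as follows. This is not the paper's simulation, whose design the abstract does not give. It is a minimal illustration in which every rate, price, and function name is a hypothetical assumption. It compares the advertising spend needed per resulting sale when the advertiser pays per click and when the advertiser pays per coupon download.

```python
import math


def spend_per_sale(unit_cost, sale_rate):
    """Advertising spend needed to obtain one sale.

    unit_cost: price paid per billable event (a click or a coupon download).
    sale_rate: assumed probability that one billable event ends in a sale.
    """
    if not 0 < sale_rate <= 1:
        raise ValueError("sale_rate must be in (0, 1]")
    return unit_cost / sale_rate


def break_even_download_cost(cost_per_click, click_sale_rate, download_sale_rate):
    """Cost per download at which CPCD and CPC spend per sale are equal."""
    return cost_per_click * download_sale_rate / click_sale_rate


# Hypothetical inputs: a click rarely leads to a sale, while a user who
# bothers to download a coupon is assumed to be far more likely to buy.
cpc = spend_per_sale(unit_cost=0.5, sale_rate=0.02)
cpcd = spend_per_sale(unit_cost=2.0, sale_rate=0.25)
limit = break_even_download_cost(0.5, 0.02, 0.25)

print(f"CPC spend per sale:  {cpc:.2f}")
print(f"CPCD spend per sale: {cpcd:.2f}")
print(f"CPCD stays cheaper while a download costs less than {limit:.2f}")
```

Under these assumed numbers, the advertiser prefers CPCD as long as the price per download stays below the break-even value. A fuller comparison would also account for the coupon discount and the product margin.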
PIRS : Personalized Information Retrieval System using Adaptive User Profiling and Real-time Filtering for Search Results
Ho-Cheol Jeon, and Joong-Min Choi
Vol. 16, No. 4, Page: 21 ~ 41
Keywords : Adaptive User Profile, Real-time Filtering, Personalized Information Retrieval
This paper proposes a system that serves users appropriate search results through real-time filtering, and implements a personalized information retrieval system (PIRS) based on adaptive user profiling that uses users' implicit feedback, in order to address the problem that existing search systems such as Google or MSN do not satisfy users' diverse personal search needs. One reason existing search systems struggle to satisfy these needs is that users' search intentions are uncertain and therefore hard to recognize. Uncertainty of search intention means that users may want different search results for the same query. For example, a user who enters the query "java" may want results about the programming language, about Java coffee, or about the Indonesian island. In other words, this uncertainty stems from the ambiguity of search queries, and it increases as the number of words in a query decreases. Real-time filtering of search results returns only those results that belong to the user-selected domain for a given query. Although it resembles a general directory search, it differs in that the search is executed over all web documents rather than sites, and each document in the search results is classified into the given domain in real time. Applying this information filtering, which uses real-time directory classification of search results, to personalization effectively reduces the number of results delivered to users and improves their satisfaction with the results. In this paper, a user preference profile has a hierarchical structure and consists of domains, used queries, and selected documents.
Because the hierarchical structure of the user preference profile can capture the context in which a user performed a search, it can handle the uncertainty of user intentions: for the same query, the intention may differ according to context such as time or place. Furthermore, this structure can track a user's web document search behavior for each domain more effectively and recognize changes in user intentions in a timely manner. The IP address of each device was used to identify each user, and the user preference profile is continuously updated based on the observed user behavior on search results. We also measured user satisfaction with search results by observing user behavior on the selected search result. The proposed system automatically recognizes user preferences from implicit feedback, such as the time spent on the selected search result and the exit condition from the page, and dynamically updates those preferences. Whenever a user performs a search, the system looks up the user preference profile for the given IP address; if the profile does not exist, a new one is created on the server, and otherwise the profile is updated with the transmitted information. If the profile does not exist on the server, the system provides Google's results to the user, and the reflection value is increased or decreased each time the user searches. We carried out experiments to evaluate the performance of the adaptive user preference profile technique and real-time filtering, and the results were satisfactory. According to the experimental results, participants were satisfied with an average of 4.7 documents in the top 10 search results when the adaptive user preference profile technique was used with real-time filtering, which shows that our method outperforms Google's by 23.2%.
An Empirical Study on the Effect of CRM System on the Performance of Pharmaceutical Companies
Hyun-Jung Kim, and Jong-Woo Park
Vol. 16, No. 4, Page: 43 ~ 65
Keywords : Customer Relationship Management System, Balanced Scorecard, Performance
Facing a business environment that has grown more complex over the past decade, many companies are adopting new strategic frameworks such as the Customer Relationship Management (CRM) system to achieve sustainable profitability and to survive serious competition. In many business areas, CRM systems have advanced a great deal through continuous correction of defects and overall integration. However, pharmaceutical companies in Korea have been slow to adopt them, since they still tend to hold fast to traditional sales and marketing based on the individual networks of sales representatives. Against this background, this article empirically examines the current status of CRM systems and their effects on the performance of pharmaceutical companies by applying the four perspectives of the Balanced Scorecard (BSC) method: financial, customer, learning and growth, and internal process. For this purpose, a survey was conducted by e-mail and post among employers and employees working in pharmaceutical firms. Of the 140 responses collected, 113 were used for statistical analysis with the SPSS ver. 15 package. Reliability analysis, factor analysis, and regression analysis were performed. The study revealed that, as expected, the CRM system had a significant effect on improving both the financial and non-financial performance of pharmaceutical companies. The proposed regression model fits well, and among the CRM components, the CRM marketing information system had a substantial impact on company outcomes in terms of profitability, growth, and investment. Useful analytical information from the CRM marketing information system appears to enable pharmaceutical firms to set up effective marketing and sales strategies, which result in favorable financial performance by eventually enhancing value for stakeholders, in addition to short-term profit and mid-term growth potential. The CRM system influenced not only the financial performance but also the non-financial outcomes of pharmaceutical companies.
Further analysis of each component showed that the CRM marketing information system had a statistically significant effect on this performance as well, consistent with the result for financial outcomes. The CRM system is believed to give companies an efficient way of managing customers through valuable standardized business processes that respond promptly to specific customers' needs. It consequently induces customer satisfaction and retention, improving performance over the long term. That is, there is a virtuous circle of value creation that serves as the cornerstone of sustainable growth. However, the research failed to find evidence supporting the hypotheses that the CRM sales representative's records assessment system and the CRM customer analysis system favorably influence management performance. This result is considered to reflect a gap, in the understanding of salespeople and respondents, between actual work duties and the far-sighted goals of a strategic analysis framework. Ordinary salespeople seem to devote themselves to short-term goals, such as meeting sales targets and receiving incentive bonuses, in a matter-of-fact way; as a result, they tend to rely on personal networks and on sales and promotional expenses rather than on the CRM system. The study's findings suggest a link between the CRM information system and performance. They indicate empirically that pharmaceutical companies have been implementing CRM systems as an effective strategic business framework for more balanced achievement, grounded in an understanding of both the CRM system and integrated performance. As initial empirical evidence, the study suggests a positive impact of a supportive CRM system on firm performance, especially in the pharmaceutical industry. It also brings out unmet needs for more practical system design, improved employee awareness, and increased system utilization in the field.
On the basis of the insight from this exploratory study, confirmatory research with a more appropriate measurement tool and a larger sample size should be carried out.
A Study on Forecasting Accuracy Improvement of Case Based Reasoning Approach Using Fuzzy Relation
In-Ho Lee, and Kyung-Shik Shin
Vol. 16, No. 4, Page: 67 ~ 84
Keywords : Forecasting, Symbolic Data, Fuzzy Relation, Case-Based Reasoning, Similarity Matrix
In business, forecasting is the task of estimating what is expected to happen in the future in order to make managerial decisions and plans. Accurate forecasting is therefore very important for major managerial decision making and is the basis for formulating business strategies. However, it is very difficult to make unbiased and consistent estimates because of the uncertainty and complexity of the future business environment. That is why we should use scientific forecasting models to support business decision making and strive to minimize a model's forecasting error, which is the difference between the observed and estimated values. Nevertheless, minimizing the error is not an easy task. Case-based reasoning is a problem-solving method that uses similar past cases to solve the current problem. To build a successful case-based reasoning model, it is very important to retrieve not only the most similar case but also the most relevant one. To retrieve similar and relevant cases from past cases, measuring the similarity between cases is a key factor. It is especially difficult to measure distances when the cases contain symbolic data. The purpose of this study is to improve the forecasting accuracy of the case-based reasoning approach using fuzzy relation and composition. Specifically, two methods are adopted to measure the similarity between cases containing symbolic data. One derives the similarity matrix from binary logic (judging whether two symbolic values are the same); the other derives the similarity matrix from fuzzy relation and composition. The study proceeds in the following order: data gathering and preprocessing, model building and analysis, validation analysis, and conclusion. First, in the data gathering and preprocessing step, we collect a data set that includes categorical dependent variables.
The data set is cross-sectional, and its independent variables include several qualitative variables expressed as symbolic data. The research data consist of many financial ratios and the corresponding bond ratings of Korean companies. The ratings we employ in this study cover all bonds rated by one of the bond rating agencies in Korea. Our total sample includes 1,816 companies whose commercial papers were rated in the period 1997~2000. Credit grades are defined as outputs and classified into 5 rating categories (A1, A2, A3, B, C) according to credit level. Second, in the model building and analysis step, we derive the similarity matrix using binary logic and fuzzy composition to measure the similarity between cases containing symbolic data. The types of fuzzy composition used in this process are max-min, max-product, and max-average. The analysis is then carried out by the case-based reasoning approach with the derived similarity matrix. Third, in the validation analysis step, we verify the validity of the model through the McNemar test based on hit ratio. Finally, we draw conclusions from the study. As a result, the similarity measure based on fuzzy relation and composition shows better forecasting performance than the similarity measure based on binary logic for two symbolic data values. However, the differences in forecasting performance among the types of fuzzy composition are not statistically significant. The contribution of this study is as follows. We propose a methodology in which fuzzy relation and fuzzy composition are applied to measure the similarity between two symbolic data values, which is the most important factor in building a case-based reasoning model.
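The two similarity measures named above can be sketched as follows. This is a minimal illustration in plain Python of binary-logic similarity and the three standard fuzzy compositions (max-min, max-product, max-average). The function names and the membership values in the example matrices are made up for illustration and do not come from the paper's data.

```python
def binary_similarity(a, b):
    """Binary logic: two symbolic values are either the same (1.0) or not (0.0)."""
    return 1.0 if a == b else 0.0


# How one pair of membership degrees is combined under each composition type.
_COMBINE = {
    "max-min": min,
    "max-product": lambda x, y: x * y,
    "max-average": lambda x, y: (x + y) / 2,
}


def compose(r, s, mode="max-min"):
    """Compose fuzzy relations r (n x m) and s (m x p), given as lists of lists.

    Entry (i, j) of the result is the maximum over k of combine(r[i][k], s[k][j]).
    """
    combine = _COMBINE[mode]
    inner = len(s)
    if any(len(row) != inner for row in r):
        raise ValueError("inner dimensions of r and s do not match")
    return [
        [max(combine(r[i][k], s[k][j]) for k in range(inner))
         for j in range(len(s[0]))]
        for i in range(len(r))
    ]


# Hypothetical membership degrees (values chosen only for the example).
R = [[1.0, 0.5],
     [0.25, 1.0]]
S = [[0.5, 1.0],
     [1.0, 0.25]]

for mode in ("max-min", "max-product", "max-average"):
    print(mode, compose(R, S, mode))
```

Unlike the binary measure, the composed matrix can assign a graded similarity between 0 and 1 to two different symbolic values.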
Analysis of Knowledge Community for Knowledge Creation and Use
Jun-Hyuk Huh, and Jung-Seung Lee
Vol. 16, No. 4, Page: 85 ~ 97
Keywords : Knowledge Community, Knowledge Sharing, Stepwise Regression
Internet communities are a typical space for knowledge creation and use on the Internet, as people discuss their common interests within them. When we define 'Knowledge Communities' as Internet communities related to knowledge creation and use, they can be categorized into 4 types: 'Search Engine,' 'Open Communities,' 'Specialty Communities,' and 'Activity Communities.' A knowledge community of any type does not remain the same. Rather, it changes with time and is also affected by the external business environment. Therefore, it is critical to develop processes for the practical use of such changeable knowledge communities. Yet there is little research on a strategic framework for knowledge communities as a source of knowledge creation and use. The purposes of this study are (1) to find factors that affect knowledge creation and use for each type of knowledge community and (2) to develop a strategic framework for the practical use of knowledge communities. Based on previous research, we identified 7 factors that have considerable impact on knowledge creation and use: 'Fitness,' 'Reliability,' 'Systemicity,' 'Richness,' 'Similarity,' 'Feedback,' and 'Understanding.' We created 30 different questions from each type of knowledge community. The questions covered common sense, IT, business, and hobbies, and were uniformly selected from various knowledge communities. Instead of conducting a survey, we posed these questions to users of 4 representative web sites: Google for Search Engine, NAVER Knowledge iN for Open Communities, SLRClub for Specialty Communities, and Wikipedia for Activity Communities. These 4 representative web sites were selected based on popularity (i.e., the 4 most popular sites in Korea). They were also among the 4 most frequently mentioned sites in previous research.
The answers to the 30 knowledge questions were collected and evaluated by 11 IT experts, each of whom had worked for IT companies for more than 3 years. In their evaluation, the 11 experts used the 7 knowledge factors above as criteria. Using stepwise linear regression on the evaluations of the 7 knowledge factors, we found that each factor affects knowledge creation and use differently for each type of knowledge community. The results of the stepwise linear regression analysis showed the relationship between 'Understanding' and the other knowledge factors, and this relationship differed by type of knowledge community. The results indicated that 'Understanding' was significantly related to 'Reliability' in the 'Search Engine type', to 'Fitness' in the 'Open Community type', to 'Reliability' and 'Similarity' in the 'Specialty Community type', and to 'Richness' and 'Similarity' in the 'Activity Community type'. A strategic framework was created from these results, and it can be useful for knowledge communities that change over time. For a knowledge community to succeed, the results suggest that it is essential to ensure that the factors that influence knowledge communities are present. It is also vital to reinforce the factors that have a distinctive influence on the relevant type of knowledge community. Thus, these changeable knowledge communities should be transformed into an appropriate type with proper business strategies and objectives. They should also evolve into a type that covers various types of knowledge communities. For example, DCInside started as a small specialty community focusing on digital camera hardware and camerawork and was then transformed into an open community focusing on social issues through its well-known photo galleries. NAVER started as a typical search engine and now covers an open community and a specialty community through additional web services such as NAVER Knowledge iN, NAVER Cafe, and NAVER Blog.
NAVER is currently competing with an activity community such as Wikipedia through the NAVER encyclopedia, which provides its users with services similar to Wikipedia's. Finally, the results of this study provide practical guidance for practitioners on which type of knowledge community is most appropriate to a fluctuating business environment, as knowledge communities themselves evolve with time.
Optimal Selection of Classifier Ensemble Using Genetic Algorithms
Myung-Jong Kim
Vol. 16, No. 4, Page: 99 ~ 112
Keywords : Neural Networks, Ensemble, Genetic Algorithms
Ensemble learning is a method for improving the performance of classification and prediction algorithms. It finds a highly accurate classifier on the training set by constructing and combining an ensemble of weak classifiers, each of which needs only to be moderately accurate on the training set. Ensemble learning has received considerable attention in the machine learning and artificial intelligence fields because of its remarkable performance improvement and flexible integration with traditional learning algorithms such as decision trees (DT), neural networks (NN), and SVM. In this research, DT ensemble studies have consistently demonstrated impressive improvements in the generalization behavior of DT, whereas NN and SVM ensemble studies have not shown performance gains as remarkable as those of DT ensembles. Recently, several works have reported that the performance of an ensemble can be degraded when its classifiers are highly correlated with one another, resulting in a multicollinearity problem. They have also proposed differentiated learning strategies to cope with this performance degradation. Hansen and Salamon (1990) argued that it is necessary and sufficient for the performance enhancement of an ensemble that the ensemble contain diverse classifiers. Breiman (1996) showed that ensemble learning can increase the performance of unstable learning algorithms but does not yield remarkable improvement for stable learning algorithms. Unstable learning algorithms such as decision tree learners are sensitive to changes in the training data, so small changes in the training data can yield large changes in the generated classifiers. Therefore, an ensemble built with unstable learning algorithms can guarantee some diversity among its classifiers.
In contrast, stable learning algorithms such as NN and SVM generate similar classifiers despite small changes in the training data, so the correlation among the resulting classifiers is very high. This high correlation results in a multicollinearity problem, which degrades the performance of the ensemble. Kim's work (2009) compared the performance of traditional prediction algorithms such as NN, DT, and SVM in bankruptcy prediction for Korean firms. It reports that stable learning algorithms such as NN and SVM have higher predictive power than the unstable DT, whereas among ensembles, the DT ensemble shows a greater performance improvement than the NN and SVM ensembles. Further analysis with the variance inflation factor (VIF) shows empirically that the performance degradation of an ensemble is due to the multicollinearity problem, and the work proposes that ensemble optimization is needed to cope with it. This paper proposes a hybrid system for coverage optimization of NN ensembles (CO-NN) in order to improve the performance of NN ensembles. Coverage optimization is a technique for choosing a sub-ensemble from an original ensemble so as to guarantee the diversity of its classifiers. CO-NN uses a genetic algorithm (GA), which has been widely applied to various optimization problems, to solve the coverage optimization problem. The GA chromosomes for coverage optimization are encoded as binary strings, each bit of which corresponds to an individual classifier. The fitness function is defined as the maximization of error reduction, and a constraint on the variance inflation factor (VIF), one of the most commonly used measures of multicollinearity, is added to ensure the diversity of classifiers by removing high correlation among them. We use Microsoft Excel and a GA software package called Evolver.
Experiments on company failure prediction show that CO-NN effectively and stably enhances the performance of NN ensembles by choosing classifiers with the correlations within the ensemble taken into account. Classifiers with a potential multicollinearity problem are removed by the coverage optimization process of CO-NN, and CO-NN thereby shows higher performance than a single NN classifier and the NN ensemble at the 1% significance level, and than the DT ensemble at the 5% significance level. However, further research issues remain. First, a decision optimization process for finding the optimal combination function should be considered. Second, various learning strategies for dealing with data noise should be introduced in future research.
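The chromosome encoding and the VIF constraint described above can be sketched as follows. This is not the CO-NN implementation, which the abstract says was built with Microsoft Excel and Evolver. It is a minimal plain-Python illustration of how one binary chromosome might be scored. The accuracy-based fitness, the zero score for infeasible chromosomes, the `vif_limit` value, and all data are assumptions. The VIF of each selected classifier is taken as the corresponding diagonal entry of the inverse correlation matrix of the classifier outputs.

```python
from statistics import mean


def pearson(x, y):
    """Pearson correlation of two equal-length score lists (neither constant)."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def invert(matrix):
    """Gauss-Jordan inverse with partial pivoting; fails on singular input."""
    n = len(matrix)
    aug = [list(row) + [float(i == j) for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def max_vif(outputs):
    """Largest VIF among classifiers: diagonal of the inverse correlation matrix."""
    corr = [[pearson(a, b) for b in outputs] for a in outputs]
    inv = invert(corr)
    return max(inv[i][i] for i in range(len(outputs)))


def fitness(chromosome, outputs, labels, vif_limit=10.0):
    """Score a chromosome whose bit i says whether classifier i is selected.

    Assumption: accuracy of the averaged scores stands in for error reduction,
    and chromosomes violating the VIF limit simply score 0.0.
    """
    chosen = [o for bit, o in zip(chromosome, outputs) if bit]
    if not chosen:
        return 0.0
    if len(chosen) > 1 and max_vif(chosen) > vif_limit:
        return 0.0
    averaged = [mean(scores) for scores in zip(*chosen)]
    predicted = [1 if s >= 0.5 else 0 for s in averaged]
    return sum(p == y for p, y in zip(predicted, labels)) / len(labels)


# Hypothetical classifier scores for six firms (1 = failure, 0 = healthy).
labels = [1, 0, 1, 0, 1, 0]
outputs = [
    [0.9, 0.2, 0.8, 0.3, 0.4, 0.1],
    [0.8, 0.3, 0.9, 0.2, 0.45, 0.15],  # nearly a copy of the first classifier
    [0.6, 0.4, 0.3, 0.1, 0.9, 0.7],
]
print(fitness([1, 0, 0], outputs, labels))
print(fitness([1, 1, 0], outputs, labels, vif_limit=5.0))
```

A GA would evolve a population of such bit strings and keep those with the highest fitness. Here the second chromosome is rejected because its two selected classifiers are highly correlated.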
Trends of Semantic Web Services and Technologies : Focusing on the Business Support
Jin-Sung Kim, and Soon-Jae Kwon
Vol. 16, No. 4, Page: 113 ~ 130
Over the past decades, the amount of human intervention needed to comprehend web information has increased continually. The successful expansion of web services has made the web more complex and has required more contributions from users. Many researchers have tried to improve the comprehension ability of computers in order to support intelligent web services. One reasonable approach is to enrich information with machine-understandable semantics. Researchers have applied ontology design, intelligent reasoning, and other logical representation schemes to design an infrastructure for the semantic web. Given these features, the semantic web is regarded as a means of intelligent access for understanding, transforming, storing, retrieving, and processing information gathered from heterogeneous, distributed web resources. The goal of this study is first to explore the problems that restrict the application of web services, together with the basic concepts, languages, and tools of the semantic web. We then highlight some of the research, solutions, and projects that have attempted to combine the semantic web with business support, and identify the pros and cons of these approaches. Through this study, we found that semantic web technology aims to offer a new and higher level of web service to online users, overcoming the limitations of traditional web technologies and services. In traditional web services, too much human intervention was needed to seek and interpret information. The semantic web service, by contrast, is based on machine-understandable semantics and knowledge representation, so most information processing activities will be executed by computers. The main elements required to develop semantic web-based business support are business logic, ontologies, ontology languages, intelligent agents, and applications.
In using and managing the infrastructure of semantic web services, software developers, service consumers, and service providers are the main participants. Some researchers have integrated these technologies, languages, tools, mechanisms, and applications into a semantic web services framework. Therefore, future work on semantic web-based business support should start from the infrastructure.
Development of Intelligent ATP System Using Genetic Algorithm
Tai-Young Kim
Vol. 16, No. 4, Page: 131 ~ 145
Keywords : Intelligence Management System, ATP(Available-to-Promise), Genetic Algorithm
A framework for making coordinated decisions for large-scale facilities has become an important issue in supply chain (SC) management research. The competitive business environment requires companies to search continuously for ways to achieve high efficiency and lower operational costs. In production and distribution planning, many researchers and practitioners have developed and evaluated deterministic models to coordinate important and interrelated logistics decisions such as capacity management, inventory allocation, and vehicle routing. They initially investigated the various processes of the SC separately and later became more interested in problems encompassing the whole SC system. Accurate quotation of ATP (Available-To-Promise) plays a very important role in enhancing customer satisfaction and maximizing fill rate. The complexity of an intelligent manufacturing system, which includes all the linkages among procurement, production, and distribution, makes accurate ATP quotation quite difficult. In addition, various alternative models for an ATP system with time lags have been developed and evaluated, and in most cases these models assume that the time lags are integer multiples of a unit time grid. In industry practice, however, integer time lags are very rare, so models developed with integer time lags only approximate real systems. The differences introduced by this approximation frequently result in significant degradation of accuracy. To introduce the ATP model with time lags, we first introduce the dynamic production function. Hackman and Leachman's dynamic production function initiated the research directly related to the topic of this paper.
They propose a modeling framework for systems with non-integer time lags and show how to apply the framework to a variety of systems, including continuous time series, manufacturing resource planning, and the critical path method. Their formulation requires no additional variables or constraints and can represent real-world systems more accurately. Previously, to cope with non-integer time lags, modelers usually either rounded lags to the nearest integers or subdivided the time grid so that the lags became integer multiples of the grid. Each approach has a critical weakness: the first either underestimates lead times, potentially leading to infeasibility, or overestimates them, potentially resulting in excessive work-in-process; the second drastically inflates the problem size. We consider an optimized ATP system with non-integer time lags in supply chain management. We focus on a setting in which a worldwide headquarters, distribution centers, and manufacturing facilities are globally networked. We develop a mixed integer programming (MIP) model for the ATP process, which includes the definition of the required data flow. The illustrative ATP module shows that the proposed system is largely affected in SCM. The system we are concerned with is composed of multiple production facilities with multiple products, multiple distribution centers, and multiple customers. For this system, we consider an ATP scheduling and capacity allocation problem. In this study, we propose a model for the ATP system in SCM using the dynamic production function with non-integer time lags. The model is developed under a framework suited to non-integer lags and is therefore more accurate than the models usually encountered. We developed an intelligent ATP system for this model using a genetic algorithm.
We focus on a capacitated production planning and capacity allocation problem, develop a mixed integer programming model, and propose a heuristic procedure using an evolutionary system to solve it efficiently. This method makes it possible for the population to reach an approximate solution easily. Moreover, we designed and used a representation scheme that allows the proposed models to represent real variables. The proposed regeneration procedure, which evaluates each infeasible chromosome, makes the solutions converge to the optimum quickly.
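The two conventional workarounds for non-integer time lags mentioned in this abstract, rounding and grid subdivision, can be illustrated with a small sketch. The lag values, the planning horizon, and the function names are hypothetical. The sketch shows only why rounding distorts lead times and why subdivision inflates the number of periods, not the paper's model.

```python
from fractions import Fraction
from math import gcd


def round_lags(lags):
    """Workaround 1: round each lag to the nearest integer number of periods."""
    return [round(lag) for lag in lags]


def subdivided_grid(lags, horizon):
    """Workaround 2: shrink the grid step until every lag is an integer multiple.

    Returns the new step (as a fraction of the original unit period), the lags
    expressed in new steps, and the number of periods the horizon now needs.
    """
    common = 1
    for lag in lags:
        common = common * lag.denominator // gcd(common, lag.denominator)
    step = Fraction(1, common)
    lags_in_steps = [int(lag / step) for lag in lags]
    return step, lags_in_steps, horizon * common


# Hypothetical lags of 2.4 and 1.5 unit periods over a 12-period horizon.
lags = [Fraction(12, 5), Fraction(3, 2)]

rounded = round_lags(lags)
errors = [r - lag for r, lag in zip(rounded, lags)]
print("rounded lags:", rounded, "errors:", errors)

step, lags_in_steps, periods = subdivided_grid(lags, horizon=12)
print("grid step:", step, "lags in steps:", lags_in_steps, "periods:", periods)
```

In this example, rounding shortens one lead time and lengthens the other, while subdivision keeps the lags exact but multiplies the number of periods, and hence the size of a period-indexed model, by ten.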
Self-Tour Service Technology based on a Smartphone
Kyoung-Yul Bae
Vol. 16, No. 4, Page: 147 ~ 157
Keywords : Smartphone, Self-Tour, Push Service
With the emergence of the iPhone, interest in Smartphones is growing, as services can be provided directly between service providers and consumers without going through network operators. As the number of international tourists increases, the number of individual tourists is also increasing. According to the prediction of the WTO (World Tourism Organization), the number of international tourists will reach 1.56 billion in 2020, with an average growth rate of 4.1% a year. Chinese tourists, in particular, are increasing rapidly, and about 100 million will travel the world in 2020. In 2009, about 7.8 million foreign tourists visited Korea, and the Ministry of Culture, Sports and Tourism is trying to attract 12 million foreign tourists in 2014. A survey of foreign tourists carried out by a research institute showed that they experienced inconvenience with communication (about 55.8%) and directional signs (about 21.4%) when traveling in Korea. To resolve these inconveniences, multilingual services for traffic signs, tour information, shopping information, and so forth should be enhanced. The Smartphone has appeared just in time to provide a new service that addresses them. Smartphones are especially useful because every Smartphone has GPS (Global Positioning System) that can provide the user's location to the system, making location-based services possible. To improve tourists' convenience, the Seoul Metropolitan Government has initiated the u-tour service using kiosks and Smartphones, and several provincial governments have started the u-tourpia project using RFID (Radio Frequency IDentification) and a dedicated device. Even though the u-tour and u-tourpia services use Smartphones and RFID, tourists must know the location of the kiosks and have prior information, so these services have not yet solved the problem.
In this paper, I developed a new, convenient service that provides location-based information to individual tourists using GPS, WiFi, and 3G. The service was tested at Insa-dong in Seoul, and it can push tour information about the tourist's surroundings without requiring user selection. This self-tour service is designed to provide foreign travelers with a travel guide service from the airport to their destination, along with information about tourist attractions. The system reduced information traffic by constraining receipt of information to tourist themes and locations within a 20m or 40m radius of the device. In this way, service providers can provide targeted, just-in-time services to specific customers by sending the desired information. To evaluate the implemented system, the contents of 40 gift shops and traditional restaurants in Insa-dong were stored in the CMS (Content Management System). The service program shows a map displaying the current location of the tourist and a circle indicating the range within which tourist information is received. If there is information for the tourist within range, the information viewer is activated. If there is only a single result to display, the information viewer pops up directly; if there are several results, the viewer shows a list of the contents and the user can choose content manually. As a result, the proposed system can provide location-based tourist information to tourists without previous knowledge of the area. Currently, GPS has a margin of error (about 10~20m), which leads to location and information errors. However, because our Government is planning to provide DGPS (Differential GPS) information by DMB (Digital Multimedia Broadcasting), this error will be reduced to within 1m.
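The range-and-theme filtering and the single-result versus list behavior described in this abstract can be illustrated with a minimal Python sketch. It is not the author's implementation: the record layout (`name`, `theme`, `lat`, `lon`), the function names, and the use of the haversine great-circle distance are all assumptions made for illustration.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in metres

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))

def contents_in_range(contents, lat, lon, radius_m=20.0, theme=None):
    """Keep only contents within radius_m of the device and, optionally, of one theme.

    `contents` is a list of dicts with hypothetical keys: name, theme, lat, lon.
    """
    hits = []
    for item in contents:
        if theme is not None and item["theme"] != theme:
            continue
        if haversine_m(lat, lon, item["lat"], item["lon"]) <= radius_m:
            hits.append(item)
    return hits

def choose_view(hits):
    """Mirror the viewer behavior in the abstract: nothing, direct pop-up, or a list."""
    if not hits:
        return "none"
    if len(hits) == 1:
        return "popup"
    return "list"
```

With a 20m radius, a device standing between two shops that are 10m and 33m away would get a direct pop-up for the nearer one; widening the radius to 40m would return both and switch the viewer to a list. Given the 10~20m GPS error mentioned above, the choice of radius affects which contents are pushed.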
Predicting the Performance of Recommender Systems through Social Network Analysis and Artificial Neural Network
Yoon-Ho Cho, and In-Hwan Kim
Vol. 16, No. 4, Page: 159 ~ 172
Keywords : Social Network Analysis, Collaborative Filtering, Neural Network
The recommender system is one of the possible solutions to assist customers in finding the items they would like to purchase. To date, a variety of recommendation techniques have been developed. One of the most successful recommendation techniques is Collaborative Filtering (CF), which has been used in a number of different applications such as recommending Web pages, movies, music, articles and products. CF identifies customers whose tastes are similar to those of a given customer, and recommends items those customers have liked in the past. Numerous CF algorithms have been developed to increase the performance of recommender systems. Broadly, there are memory-based CF algorithms, model-based CF algorithms, and hybrid CF algorithms which combine CF with content-based techniques or other recommender systems. While many researchers have focused their efforts on improving CF performance, the theoretical justification of CF algorithms is lacking. That is, little is known about how CF works. Furthermore, the relative performances of CF algorithms are known to be domain and data dependent. It is very time-consuming and expensive to implement and launch a CF recommender system, and a system unsuited to the given domain provides customers with poor-quality recommendations that easily annoy them. Therefore, predicting the performance of CF algorithms in advance is of practical importance. In this study, we propose an efficient approach to predict the performance of CF. Social Network Analysis (SNA) and Artificial Neural Network (ANN) are applied to develop our prediction model. CF can be modeled as a social network in which customers are nodes and purchase relationships between customers are links. SNA facilitates an exploration of the topological properties of the network structure that are implicit in data for CF recommendations.
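As a rough illustration of modeling CF data as a social network, the Python sketch below links two customers when their purchase histories are sufficiently similar. The abstract does not state which similarity measure or cutoff the authors used, so the Jaccard coefficient, the `threshold` value, and the function names here are assumptions.

```python
from itertools import combinations

def jaccard(a, b):
    """Jaccard similarity of two item sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def build_customer_network(purchases, threshold=0.3):
    """Return undirected links between customers with similar purchases.

    `purchases` maps a customer id to the set of items that customer bought.
    The threshold is a hypothetical cutoff, not a value from the paper.
    """
    edges = []
    for u, v in combinations(sorted(purchases), 2):
        if jaccard(purchases[u], purchases[v]) >= threshold:
            edges.append((u, v))
    return edges
```

Customers with no sufficiently similar peer remain isolated nodes, which matters for measures that count how many nodes are connected at all.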
An ANN model is developed through an analysis of network topology, such as network density, inclusiveness, clustering coefficient, network centralization, and Krackhardt's efficiency. While network density, expressed as a proportion of the maximum possible number of links, captures the density of the whole network, the clustering coefficient captures the degree to which the overall network contains localized pockets of dense connectivity. Inclusiveness refers to the number of nodes which are included within the various connected parts of the social network. Centralization reflects the extent to which connections are concentrated in a small number of nodes rather than distributed equally among all nodes. Krackhardt's efficiency characterizes how dense the social network is beyond that barely needed to keep the social group even indirectly connected to one another. We use these social network measures as input variables of the ANN model. As an output variable, we use the recommendation accuracy measured by the F1-measure. In order to evaluate the effectiveness of the ANN model, sales transaction data from H department store, one of the well-known department stores in Korea, was used. A total of 396 experimental samples were gathered, and we used 40%, 40%, and 20% of them for training, test, and validation, respectively. 5-fold cross validation was also conducted to enhance the reliability of our experiments. The input variable measuring process consists of the following three steps: analysis of customer similarities, construction of a social network, and analysis of social network patterns. We used NetMiner 3 and UCINET 6.0 for SNA, and Clementine 11.1 for ANN modeling. The experiments showed that the ANN model has 92.61% estimated accuracy and 0.0049 RMSE. Thus, our prediction model can help decide whether CF is useful for a given application with certain data characteristics.
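Several of the network measures named above, and the F1-measure used as the output variable, can be sketched in plain Python. The formulas are common textbook definitions for undirected graphs (inclusiveness as the share of non-isolated nodes, Freeman degree centralization, the average local clustering coefficient); the abstract does not say which exact variants NetMiner or UCINET computed, so treat them as assumptions. Krackhardt's efficiency is omitted for brevity.

```python
from itertools import combinations

def build_adjacency(nodes, edges):
    """Undirected adjacency sets; self-loops are ignored."""
    adj = {n: set() for n in nodes}
    for u, v in edges:
        if u != v:
            adj[u].add(v)
            adj[v].add(u)
    return adj

def density(adj):
    """Links present as a proportion of the maximum possible number of links."""
    n = len(adj)
    m = sum(len(nb) for nb in adj.values()) / 2
    return 2 * m / (n * (n - 1)) if n > 1 else 0.0

def inclusiveness(adj):
    """Share of nodes that have at least one link (assumed definition)."""
    n = len(adj)
    return sum(1 for nb in adj.values() if nb) / n if n else 0.0

def average_clustering(adj):
    """Mean local clustering coefficient; nodes with fewer than 2 neighbours count as 0."""
    total = 0.0
    for nb in adj.values():
        k = len(nb)
        if k < 2:
            continue
        links = sum(1 for a, b in combinations(nb, 2) if b in adj[a])
        total += 2 * links / (k * (k - 1))
    return total / len(adj) if adj else 0.0

def degree_centralization(adj):
    """Freeman degree centralization: 1.0 for a star, 0.0 when all degrees are equal."""
    n = len(adj)
    if n < 3:
        return 0.0
    degrees = [len(nb) for nb in adj.values()]
    top = max(degrees)
    return sum(top - d for d in degrees) / ((n - 1) * (n - 2))

def f1_measure(recommended, purchased):
    """Harmonic mean of precision and recall for one recommendation list."""
    hits = len(set(recommended) & set(purchased))
    if hits == 0:
        return 0.0
    precision = hits / len(set(recommended))
    recall = hits / len(set(purchased))
    return 2 * precision * recall / (precision + recall)
```

Each experimental sample would then contribute one row of such measures as ANN inputs, paired with the F1 value observed when CF is run on the same data.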
