Journal of Intelligence and Information Systems,
Vol. 16, No. 3, September 2010
GA-based Normalization Approach in Back-propagation Neural Network for Bankruptcy Prediction Modeling
Qiu-Yue Tai, and Kyung-Shik Shin
Vol. 16, No. 3, Page: 1 ~ 14
The back-propagation neural network (BPN) has long been successfully applied in bankruptcy prediction problems. Despite its wide application, some major issues must be considered before its use, such as the network topology, learning parameters and normalization methods for the input and output vectors. Previous studies on bankruptcy prediction with BPN have shown that many researchers are interested in how to optimize the network topology and learning parameters to improve the prediction performance. In many cases, however, the benefits of data normalization are often overlooked. In this study, a genetic algorithm (GA)-based normalization transform, which is defined as a linearly weighted combination of several different normalization transforms, will be proposed. GA is used to extract the optimal weight for the generalization. From the results of an experiment, the proposed method was evaluated and compared with other methods to demonstrate the advantage of the proposed method.
A Survey on Public Web Service Repositories
You-Sub Hwang
Vol. 16, No. 3, Page: 15 ~ 35
Web service Technology has been developing rapidly as it provides a flexible application-to-application interaction mechanism. Several ongoing research efforts focus on various aspects of Web service technology, including the modeling, specification, discovery, composition and verification of Web services. The approaches advocated are often conflicting-based as they are differing expectations on the current status of Web services as well as differing models of their future evolution. One way of deciding the relative relevance of the various research directions is to look at their applicability to the currently available Web services. To this end, we conducted a survey on currently publicly available Web service repositories. Our aim is to get an idea of the number, complexity and composability of these Web services and see if this analysis provides useful information about the near-term fruitful research directions.
Gaze Mirroring-based Intelligent Information System for Making User's Latent Interest
Hye-Sun Park, Takatsugu Hirayama, and Takashi Matsuyama
Vol. 16, No. 3, Page: 37 ~ 54
Keywords : Gaze Estimation, Gaze-Mirroring, Intelligent Information System, Avatar
The information system that preserves and presents information collections, records, processes, retrievals, is applied in various fields recently and is supporting man's many activities. Conventional information systems are based on the reactive interaction model. Such reactive systems respond to only specific instructions, i.e. the defined commands, from the user. To go beyond the reactive interaction, it is necessary that the interactive dynamic interaction based information system which understands human's action and intention autonomously and then provides sensible information adapted to the user. Therefore, we propose a Gaze Mirroring-based intelligent information system for making user's latent interest using the internal state estimation methods based on the interactive dynamic interaction. Then, the proposed Gaze Mirroring method is that an anthropomorphic agent(avatar) actively established the joint attention with the user by imitating user's eye-gaze behavior. We verify that the Gaze Mirroring can elicit the user's behavior reflecting the latent interestand contribute to improving the accuracy of interest estimation. We also have confidence that the Gaze Mirroring promotes the self-awareness of interest. Such a Gaze Mirroring-based intelligent information system also provides suitable information to user by making user's latent interest using the internal state estimation.
Finding Weighted Sequential Patterns over Data Streams via a Gap-based Weighting Approach
Joong-Hyuk Chang
Vol. 16, No. 3, Page: 55 ~ 75
Keywords : Weighted Sequential Pattern, Gap-based Weight, Sequential Pattern, Data Mining, Data Mining
Sequential pattern mining aims to discover interesting sequential patterns in a sequence database, and it is one of the essential data mining tasks widely used in various application fields such as Web access pattern analysis, customer purchase pattern analysis, and DNA sequence analysis. In general sequential pattern mining, only the generation order of data element in a sequence is considered, so that it can easily find simple sequential patterns, but has a limit to find more interesting sequential patterns being widely used in real world applications. One of the essential research topics to compensate the limit is a topic of weighted sequential pattern mining. In weighted sequential pattern mining, not only the generation order of data element but also its weight is considered to get more interesting sequential patterns. In recent, data has been increasingly taking the form of continuous data streams rather than finite stored data sets in various application fields, the database research community has begun focusing its attention on processing over data streams. The data stream is a massive unbounded sequence of data elements continuously generated at a rapid rate. In data stream processing, each data element should be examined at most once to analyze the data stream, and the memory usage for data stream analysis should be restricted finitely although new data elements are continuously generated in a data stream. Moreover, newly generated data elements should be processed as fast as possible to produce the up-to-date analysis result of a data stream, so that it can be instantly utilized upon request. To satisfy these requirements, data stream processing sacrifices the correctness of its analysis result by allowing some error. Considering the changes in the form of data generated in real world application fields, many researches have been actively performed to find various kinds of knowledge embedded in data streams. They mainly focus on efficient mining of frequent itemsets and sequential patterns over data streams, which have been proven to be useful in conventional data mining for a finite data set. In addition, mining algorithms have also been proposed to efficiently reflect the changes of data streams over time into their mining results. However, they have been targeting on finding naively interesting patterns such as frequent patterns and simple sequential patterns, which are found intuitively, taking no interest in mining novel interesting patterns that express the characteristics of target data streams better. Therefore, it can be a valuable research topic in the field of mining data streams to define novel interesting patterns and develop a mining method finding the novel patterns, which will be effectively used to analyze recent data streams. This paper proposes a gap-based weighting approach for a sequential pattern and amining method of weighted sequential patterns over sequence data streams via the weighting approach. A gap-based weight of a sequential pattern can be computed from the gaps of data elements in the sequential pattern without any pre-defined weight information. That is, in the approach, the gaps of data elements in each sequential pattern as well as their generation orders are used to get the weight of the sequential pattern, therefore it can help to get more interesting and useful sequential patterns. Recently most of computer application fields generate data as a form of data streams rather than a finite data set. Considering the change of data, the proposed method is mainly focus on sequence data streams.
Rough Set Analysis for Stock Market Timing
Jin-Nyung Huh, Kyoung-Jae Kim, and Ingoo Han
Vol. 16, No. 3, Page: 77 ~ 97
Keywords : Rough Set, Market Timing, Discretization, Profitability
Market timing is an investment strategy which is used for obtaining excessive return from financial market. In general, detection of market timing means determining when to buy and sell to get excess return from trading. In many market timing systems, trading rules have been used as an engine to generate signals for trade. On the other hand, some researchers proposed the rough set analysis as a proper tool for market timing because it does not generate a signal for trade when the pattern of the market is uncertain by using the control function. The data for the rough set analysis should be discretized of numeric value because the rough set only accepts categorical data for analysis. Discretization searches for proper "cuts" for numeric data that determine intervals. All values that lie within each interval are transformed into same value. In general, there are four methods for data discretization in rough set analysis including equal frequency scaling, expert's knowledge-based discretization, minimum entropy scaling, and na$\ddot{i}$ 수식 이미지ve and Boolean reasoning-based discretization. Equal frequency scaling fixes a number of intervals and examines the histogram of each variable, then determines cuts so that approximately the same number of samples fall into each of the intervals. Expert's knowledge-based discretization determines cuts according to knowledge of domain experts through literature review or interview with experts. Minimum entropy scaling implements the algorithm based on recursively partitioning the value set of each variable so that a local measure of entropy is optimized. Na$\ddot{i}$ 수식 이미지ve and Booleanreasoning-based discretization searches categorical values by using Na$\ddot{i}$ 수식 이미지ve scaling the data, then finds the optimized dicretization thresholds through Boolean reasoning. Although the rough set analysis is promising for market timing, there is little research on the impact of the various data discretization methods on performance from trading using the rough set analysis. In this study, we compare stock market timing models using rough set analysis with various data discretization methods. The research data used in this study are the KOSPI 200 from May 1996 to October 1998. KOSPI 200 is the underlying index of the KOSPI 200 futures which is the first derivative instrument in the Korean stock market. The KOSPI 200 is a market value weighted index which consists of 200 stocks selected by criteria on liquidity and their status in corresponding industry including manufacturing, construction, communication, electricity and gas, distribution and services, and financing. The total number of samples is 660 trading days. In addition, this study uses popular technical indicators as independent variables. The experimental results show that the most profitable method for the training sample is the na$\ddot{i}$ 수식 이미지ve and Boolean reasoning but the expert's knowledge-based discretization is the most profitable method for the validation sample. In addition, the expert's knowledge-based discretization produced robust performance for both of training and validation sample. We also compared rough set analysis and decision tree. This study experimented C4.5 for the comparison purpose. The results show that rough set analysis with expert's knowledge-based discretization produced more profitable rules than C4.5.
A Case Study on Forecasting Inbound Calls of Motor Insurance Company Using Interactive Data Mining Technique
Woong Baek, and Nam-Gyu Kim
Vol. 16, No. 3, Page: 99 ~ 120
Keywords : Call Center Management, Classification and Prediction, Data Mining, Predicting Inbound Calls
Due to the wide spread of customers' frequent access of non face-to-face services, there have been many attempts to improve customer satisfaction using huge amounts of data accumulated throughnon face-to-face channels. Usually, a call center is regarded to be one of the most representative non-faced channels. Therefore, it is important that a call center has enough agents to offer high level customer satisfaction. However, managing too many agents would increase the operational costs of a call center by increasing labor costs. Therefore, predicting and calculating the appropriate size of human resources of a call center is one of the most critical success factors of call center management. For this reason, most call centers are currently establishing a department of WFM(Work Force Management) to estimate the appropriate number of agents and to direct much effort to predict the volume of inbound calls. In real world applications, inbound call prediction is usually performed based on the intuition and experience of a domain expert. In other words, a domain expert usually predicts the volume of calls by calculating the average call of some periods and adjusting the average according tohis/her subjective estimation. However, this kind of approach has radical limitations in that the result of prediction might be strongly affected by the expert's personal experience and competence. It is often the case that a domain expert may predict inbound calls quite differently from anotherif the two experts have mutually different opinions on selecting influential variables and priorities among the variables. Moreover, it is almost impossible to logically clarify the process of expert's subjective prediction. Currently, to overcome the limitations of subjective call prediction, most call centers are adopting a WFMS(Workforce Management System) package in which expert's best practices are systemized. With WFMS, a user can predict the volume of calls by calculating the average call of each day of the week, excluding some eventful days. However, WFMS costs too much capital during the early stage of system establishment. Moreover, it is hard to reflect new information ontothe system when some factors affecting the amount of calls have been changed. In this paper, we attempt to devise a new model for predicting inbound calls that is not only based on theoretical background but also easily applicable to real world applications. Our model was mainly developed by the interactive decision tree technique, one of the most popular techniques in data mining. Therefore, we expect that our model can predict inbound calls automatically based on historical data, and it can utilize expert's domain knowledge during the process of tree construction. To analyze the accuracy of our model, we performed intensive experiments on a real case of one of the largest car insurance companies in Korea. In the case study, the prediction accuracy of the devised two models and traditional WFMS are analyzed with respect to the various error rates allowable. The experiments reveal that our data mining-based two models outperform WFMS in terms of predicting the amount of accident calls and fault calls in most experimental situations examined.
A Template-based Interactive University Timetabling Support System
Yong-Sik Chang, and Ye-Won Jeong
Vol. 16, No. 3, Page: 121 ~ 145
Keywords : Timetable, Case, Rule
University timetabling depending on the educational environments of universities is an NP-hard problem that the amount of computation required to find solutions increases exponentially with the problem size. For many years, there have been lots of studies on university timetabling from the necessity of automatic timetable generation for students' convenience and effective lesson, and for the effective allocation of subjects, lecturers, and classrooms. Timetables are classified into a course timetable and an examination timetable. This study focuses on the former. In general, a course timetable for liberal arts is scheduled by the office of academic affairs and a course timetable for major subjects is scheduled by each department of a university. We found several problems from the analysis of current course timetabling in departments. First, it is time-consuming and inefficient for each department to do the routine and repetitive timetabling work manually. Second, many classes are concentrated into several time slots in a timetable. This tendency decreases the effectiveness of students' classes. Third, several major subjects might overlap some required subjects in liberal arts at the same time slots in the timetable. In this case, it is required that students should choose only one from the overlapped subjects. Fourth, many subjects are lectured by same lecturers every year and most of lecturers prefer the same time slots for the subjects compared with last year. This means that it will be helpful if departments reuse the previous timetables. To solve such problems and support the effective course timetabling in each department, this study proposes a university timetabling support system based on two phases. In the first phase, each department generates a timetable template from the most similar timetable case, which is based on case-based reasoning. In the second phase, the department schedules a timetable with the help of interactive user interface under the timetabling criteria, which is based on rule-based approach. This study provides the illustrations of Hanshin University. We classified timetabling criteria into intrinsic and extrinsic criteria. In intrinsic criteria, there are three criteria related to lecturer, class, and classroom which are all hard constraints. In extrinsic criteria, there are four criteria related to 'the numbers of lesson hours' by the lecturer, 'prohibition of lecture allocation to specific day-hours' for committee members, 'the number of subjects in the same day-hour,' and 'the use of common classrooms.' In 'the numbers of lesson hours' by the lecturer, there are three kinds of criteria : 'minimum number of lesson hours per week,' 'maximum number of lesson hours per week,' 'maximum number of lesson hours per day.' Extrinsic criteria are also all hard constraints except for 'minimum number of lesson hours per week' considered as a soft constraint. In addition, we proposed two indices for measuring similarities between subjects of current semester and subjects of the previous timetables, and for evaluating distribution degrees of a scheduled timetable. Similarity is measured by comparison of two attributes-subject name and its lecturer-between current semester and a previous semester. The index of distribution degree, based on information entropy, indicates a distribution of subjects in the timetable. To show this study's viability, we implemented a prototype system and performed experiments with the real data of Hanshin University. Average similarity from the most similar cases of all departments was estimated as 41.72%. It means that a timetable template generated from the most similar case will be helpful. Through sensitivity analysis, the result shows that distribution degree will increase if we set 'the number of subjects in the same day-hour' to more than 90%.
Personalized Recommendation System for IPTV using Ontology and K-medoids
Byeong-Dae Yun, Jong-Woo Kim, Yong-Seok Cho, and Sang-Gil Kang
Vol. 16, No. 3, Page: 147 ~ 161
Keywords : Ontology, k-medoids clustering, recommendation system, clustering
As broadcasting and communication are converged recently, communication is jointed to TV. TV viewing has brought about many changes. The IPTV (Internet Protocol Television) provides information service, movie contents, broadcast, etc. through internet with live programs + VOD (Video on demand) jointed. Using communication network, it becomes an issue of new business. In addition, new technical issues have been created by imaging technology for the service, networking technology without video cuts, security technologies to protect copyright, etc. Through this IPTV network, users can watch their desired programs when they want. However, IPTV has difficulties in search approach, menu approach, or finding programs. Menu approach spends a lot of time in approaching programs desired. Search approach can't be found when title, genre, name of actors, etc. are not known. In addition, inserting letters through remote control have problems. However, the bigger problem is that many times users are not usually ware of the services they use. Thus, to resolve difficulties when selecting VOD service in IPTV, a personalized service is recommended, which enhance users' satisfaction and use your time, efficiently. This paper provides appropriate programs which are fit to individuals not to save time in order to solve IPTV's shortcomings through filtering and recommendation-related system. The proposed recommendation system collects TV program information, the user's preferred program genres and detailed genre, channel, watching program, and information on viewing time based on individual records of watching IPTV. To look for these kinds of similarities, similarities can be compared by using ontology for TV programs. The reason to use these is because the distance of program can be measured by the similarity comparison. TV program ontology we are using is one extracted from TV-Anytime metadata which represents semantic nature. Also, ontology expresses the contents and features in figures. Through world net, vocabulary similarity is determined. All the words described on the programs are expanded into upper and lower classes for word similarity decision. The average of described key words was measured. The criterion of distance calculated ties similar programs through K-medoids dividing method. K-medoids dividing method is a dividing way to divide classified groups into ones with similar characteristics. This K-medoids method sets K-unit representative objects. Here, distance from representative object sets temporary distance and colonize it. Through algorithm, when the initial n-unit objects are tried to be divided into K-units. The optimal object must be found through repeated trials after selecting representative object temporarily. Through this course, similar programs must be colonized. Selecting programs through group analysis, weight should be given to the recommendation. The way to provide weight with recommendation is as the follows. When each group recommends programs, similar programs near representative objects will be recommended to users. The formula to calculate the distance is same as measure similar distance. It will be a basic figure which determines the rankings of recommended programs. Weight is used to calculate the number of watching lists. As the more programs are, the higher weight will be loaded. This is defined as cluster weight. Through this, sub-TV programs which are representative of the groups must be selected. The final TV programs ranks must be determined. However, the group-representative TV programs include errors. Therefore, weights must be added to TV program viewing preference. They must determine the finalranks.Based on this, our customers prefer proposed to recommend contents. So, based on the proposed method this paper suggested, experiment was carried out in controlled environment. Through experiment, the superiority of the proposed method is shown, compared to existing ways.
A Study on the Intelligent Quick Response System for Fast Fashion(IQRS-FF)
Hyun-Sung Park, and Kwang-Ho Park
Vol. 16, No. 3, Page: 163 ~ 179
Keywords : Fast Fashion, Quick Response, Intelligent System
Recentlythe concept of fast fashion is drawing attention as customer needs are diversified and supply lead time is getting shorter in fashion industry. It is emphasized as one of the critical success factors in the fashion industry how quickly and efficiently to satisfy the customer needs as the competition has intensified. Because the fast fashion is inherently susceptible to trend, it is very important for fashion retailers to make quick decisions regarding items to launch, quantity based on demand prediction, and the time to respond. Also the planning decisions must be executed through the business processes of procurement, production, and logistics in real time. In order to adapt to this trend, the fashion industry urgently needs supports from intelligent quick response(QR) system. However, the traditional functions of QR systems have not been able to completely satisfy such demands of the fast fashion industry. This paper proposes an intelligent quick response system for the fast fashion(IQRS-FF). Presented are models for QR process, QR principles and execution, and QR quantity and timing computation. IQRS-FF models support the decision makers by providing useful information with automated and rule-based algorithms. If the predefined conditions of a rule are satisfied, the actions defined in the rule are automatically taken or informed to the decision makers. In IQRS-FF, QRdecisions are made in two stages: pre-season and in-season. In pre-season, firstly master demand prediction is performed based on the macro level analysis such as local and global economy, fashion trends and competitors. The prediction proceeds to the master production and procurement planning. Checking availability and delivery of materials for production, decision makers must make reservations or request procurements. For the outsourcing materials, they must check the availability and capacity of partners. By the master plans, the performance of the QR during the in-season is greatly enhanced and the decision to select the QR items is made fully considering the availability of materials in warehouse as well as partners' capacity. During in-season, the decision makers must find the right time to QR as the actual sales occur in stores. Then they are to decide items to QRbased not only on the qualitative criteria such as opinions from sales persons but also on the quantitative criteria such as sales volume, the recent sales trend, inventory level, the remaining period, the forecast for the remaining period, and competitors' performance. To calculate QR quantity in IQRS-FF, two calculation methods are designed: QR Index based calculation and attribute similarity based calculation using demographic cluster. In the early period of a new season, the attribute similarity based QR amount calculation is better used because there are not enough historical sales data. By analyzing sales trends of the categories or items that have similar attributes, QR quantity can be computed. On the other hand, in case of having enough information to analyze the sales trends or forecasting, the QR Index based calculation method can be used. Having defined the models for decision making for QR, we design KPIs(Key Performance Indicators) to test the reliability of the models in critical decision makings: the difference of sales volumebetween QR items and non-QR items; the accuracy rate of QR the lead-time spent on QR decision-making. To verify the effectiveness and practicality of the proposed models, a case study has been performed for a representative fashion company which recently developed and launched the IQRS-FF. The case study shows that the average sales rateof QR items increased by 15%, the differences in sales rate between QR items and non-QR items increased by 10%, the QR accuracy was 70%, the lead time for QR dramatically decreased from 120 hours to 8 hours.
Development of User Based Recommender System using Social Network for u-Healthcare
Hyea-Kyeong Kim, Il-Young Choi, Ki-Mok Ha, and Jae-Kyeong Kim
Vol. 16, No. 3, Page: 181 ~ 199
Keywords : u-Healthcare, Ubiquitous Service, Recommender System, Social Network, Social Network
As rapid progress of population aging and strong interest in health, the demand for new healthcare service is increasing. Until now healthcare service has provided post treatment by face-to-face manner. But according to related researches, proactive treatment is resulted to be more effective for preventing diseases. Particularly, the existing healthcare services have limitations in preventing and managing metabolic syndrome such a lifestyle disease, because the cause of metabolic syndrome is related to life habit. As the advent of ubiquitous technology, patients with the metabolic syndrome can improve life habit such as poor eating habits and physical inactivity without the constraints of time and space through u-healthcare service. Therefore, lots of researches for u-healthcare service focus on providing the personalized healthcare service for preventing and managing metabolic syndrome. For example, Kim et al.(2010) have proposed a healthcare model for providing the customized calories and rates of nutrition factors by analyzing the user's preference in foods. Lee et al.(2010) have suggested the customized diet recommendation service considering the basic information, vital signs, family history of diseases and food preferences to prevent and manage coronary heart disease. And, Kim and Han(2004) have demonstrated that the web-based nutrition counseling has effects on food intake and lipids of patients with hyperlipidemia. However, the existing researches for u-healthcare service focus on providing the predefined one-way u-healthcare service. Thus, users have a tendency to easily lose interest in improving life habit. To solve such a problem of u-healthcare service, this research suggests a u-healthcare recommender system which is based on collaborative filtering principle and social network. This research follows the principle of collaborative filtering, but preserves local networks (consisting of small group of similar neighbors) for target users to recommend context aware healthcare services. Our research is consisted of the following five steps. In the first step, user profile is created using the usage history data for improvement in life habit. And then, a set of users known as neighbors is formed by the degree of similarity between the users, which is calculated by Pearson correlation coefficient. In the second step, the target user obtains service information from his/her neighbors. In the third step, recommendation list of top-N service is generated for the target user. Making the list, we use the multi-filtering based on user's psychological context information and body mass index (BMI) information for the detailed recommendation. In the fourth step, the personal information, which is the history of the usage service, is updated when the target user uses the recommended service. In the final step, a social network is reformed to continually provide qualified recommendation. For example, the neighbors may be excluded from the social network if the target user doesn't like the recommendation list received from them. That is, this step updates each user's neighbors locally, so maintains the updated local neighbors always to give context aware recommendation in real time. The characteristics of our research as follows. First, we develop the u-healthcare recommender system for improving life habit such as poor eating habits and physical inactivity. Second, the proposed recommender system uses autonomous collaboration, which enables users to prevent dropping and not to lose user's interest in improving life habit. Third, the reformation of the social network is automated to maintain the quality of recommendation. Finally, this research has implemented a mobile prototype system using JAVA and Microsoft Access2007 to recommend the prescribed foods and exercises for chronic disease prevention, which are provided by A university medical center. This research intends to prevent diseases such as chronic illnesses and to improve user's lifestyle through providing context aware and personalized food and exercise services with the help of similar users'experience and knowledge. We expect that the user of this system can improve their life habit with the help of handheld mobile smart phone, because it uses autonomous collaboration to arouse interest in healthcare.

Advanced Search
Date Range