Journal of Intelligence and Information Systems,
Vol. 23, No. 1, March 2017
A Data-based Sales Forecasting Support System for New Businesses
Seung-Pyo Jun, Tae-Eung Sung, and San Choi
Vol. 23, No. 1, Page: 1 ~ 22
Keywords : Sales Forecasting Systems, Analogical Forecasting, Sales Estimation, Technology Valuation, Business Feasibility Analysis, Latent Growth Models(LGM)
Analysis of future business or investment opportunities, such as business feasibility analysis and company or technology valuation, necessitate objective estimation on the relevant market and expected sales. While there are various ways to classify the estimation methods of these new sales or market size, they can be broadly divided into top-down and bottom-up approaches by benchmark references. Both methods, however, require a lot of resources and time. Therefore, we propose a data-based intelligent demand forecasting system to support evaluation of new business.
This study focuses on analogical forecasting, one of the traditional quantitative forecasting methods, to develop sales forecasting intelligence systems for new businesses. Instead of simply estimating sales for a few years, we hereby propose a method of estimating the sales of new businesses by using the initial sales and the sales growth rate of similar companies. To demonstrate the appropriateness of this method, it is examined whether the sales performance of recently established companies in the same industry category in Korea can be utilized as a reference variable for the analogical forecasting.
In this study, we examined whether the phenomenon of “mean reversion” was observed in the sales of start-up companies in order to identify errors in estimating sales of new businesses based on industry sales growth rate and whether the differences in business environment resulting from the different timing of business launch affects growth rate. We also conducted analyses of variance (ANOVA) and latent growth model (LGM) to identify differences in sales growth rates by industry category. Based on the results, we proposed industry-specific range and linear forecasting models.
This study analyzed the sales of only 150,000 start-up companies in Korea in the last 10 years, and identified that the average growth rate of start-ups in Korea is higher than the industry average in the first few years, but it shortly shows the phenomenon of mean-reversion. In addition, although the start-up founding juncture affects the sales growth rate, it is not high significantly and the sales growth rate can be different according to the industry classification. Utilizing both this phenomenon and the performance of start-up companies in relevant industries, we have proposed two models of new business sales based on the sales growth rate.
The method proposed in this study makes it possible to objectively and quickly estimate the sales of new business by industry, and it is expected to provide reference information to judge whether sales estimated by other methods (top-down/bottom-up approach) pass the bounds from ordinary cases in relevant industry. In particular, the results of this study can be practically used as useful reference information for business feasibility analysis or technical valuation for entering new business. When using the existing top-down method, it can be used to set the range of market size or market share. As well, when using the bottom-up method, the estimation period may be set in accordance of the mean reverting period information for the growth rate. The two models proposed in this study will enable rapid and objective sales estimation of new businesses, and are expected to improve the efficiency of business feasibility analysis and technology valuation process by developing intelligent information system.
In academic perspectives, it is a very important discovery that the phenomenon of ‘mean reversion’ is found among start-up companies out of general small-and-medium enterprises (SMEs) as well as stable companies such as listed companies. In particular, there exists the significance of this study in that over the large-scale data the mean reverting phenomenon of the start-up firms' sales growth rate is different from that of the listed companies, and that there is a difference in each industry. If a linear model, which is useful for estimating the sales of a specific company, is highly likely to be utilized in practical aspects, it can be explained that the range model, which can be used for the estimation method of the sales of the unspecified firms, is highly likely to be used in political aspects. It implies that when analyzing the business activities and performance of a specific industry group or enterprise group there is political usability in that the range model enables to provide references and compare them by data based start-up sales forecasting system.
A Study on Web-based Technology Valuation System
Tae-Eung Sung, Seung-Pyo Jun, Sang-Gook Kim, and Hyun-Woo Park
Vol. 23, No. 1, Page: 23 ~ 46
Keywords : STAR-Value System, Technology Valuation, Intelligent Support System, Valuation Model Selection Guideline, Range Inference of Valuation Results, Market Size and Sales Estimation
Although there have been cases of evaluating the value of specific companies or projects which have centralized on developed countries in North America and Europe from the early 2000s, the system and methodology for estimating the economic value of individual technologies or patents has been activated on and on. Of course, there exist several online systems that qualitatively evaluate the technology’s grade or the patent rating of the technology to be evaluated, as in ‘KTRS’ of the KIBO and ‘SMART 3.1’ of the Korea Invention Promotion Association. However, a web-based technology valuation system, referred to as ‘STAR-Value system’ that calculates the quantitative values of the subject technology for various purposes such as business feasibility analysis, investment attraction, tax/litigation, etc., has been officially opened and recently spreading.
In this study, we introduce the type of methodology and evaluation model, reference information supporting these theories, and how database associated are utilized, focusing various modules and frameworks embedded in STAR-Value system. In particular, there are six valuation methods, including the discounted cash flow method (DCF), which is a representative one based on the income approach that anticipates future economic income to be valued at present, and the relief-from-royalty method, which calculates the present value of royalties' where we consider the contribution of the subject technology towards the business value created as the royalty rate. We look at how models and related support information (technology life, corporate (business) financial information, discount rate, industrial technology factors, etc.) can be used and linked in a intelligent manner.
Based on the classification of information such as International Patent Classification (IPC) or Korea Standard Industry Classification (KSIC) for technology to be evaluated, the STAR-Value system automatically returns meta data such as technology cycle time (TCT), sales growth rate and profitability data of similar company or industry sector, weighted average cost of capital (WACC), indices of industrial technology factors, etc., and apply adjustment factors to them, so that the result of technology value calculation has high reliability and objectivity. Furthermore, if the information on the potential market size of the target technology and the market share of the commercialization subject refers to data-driven information, or if the estimated value range of similar technologies by industry sector is provided from the evaluation cases which are already completed and accumulated in database, the STAR-Value is anticipated that it will enable to present highly accurate value range in real time by intelligently linking various support modules.
Including the explanation of the various valuation models and relevant primary variables as presented in this paper, the STAR-Value system intends to utilize more systematically and in a data-driven way by supporting the optimal model selection guideline module, intelligent technology value range reasoning module, and similar company selection based market share prediction module, etc. In addition, the research on the development and intelligence of the web-based STAR-Value system is significant in that it widely spread the web-based system that can be used in the validation and application to practices of the theoretical feasibility of the technology valuation field, and it is expected that it could be utilized in various fields of technology commercialization.
Steel Plate Faults Diagnosis with S-MTS
Joon-Young Kim, Jae-Min Cha, Junguk Shin, and Choongsub Yeom
Vol. 23, No. 1, Page: 47 ~ 67
Keywords : Big Data, Multiclass Classification, Simultaneous MTS (S-MTS), Mahalanobis Taguchi System (MTS), Steel Plates Faults Diagnosis
Steel plate faults is one of important factors to affect the quality and price of the steel plates. So far many steelmakers generally have used visual inspection method that could be based on an inspector's intuition or experience. Specifically, the inspector checks the steel plate faults by looking the surface of the steel plates. However, the accuracy of this method is critically low that it can cause errors above 30% in judgment. Therefore, accurate steel plate faults diagnosis system has been continuously required in the industry. In order to meet the needs, this study proposed a new steel plate faults diagnosis system using Simultaneous MTS (S-MTS), which is an advanced Mahalanobis Taguchi System (MTS) algorithm, to classify various surface defects of the steel plates. MTS has generally been used to solve binary classification problems in various fields, but MTS was not used for multiclass classification due to its low accuracy. The reason is that only one mahalanobis space is established in the MTS. In contrast, S-MTS is suitable for multi-class classification. That is, S-MTS establishes individual mahalanobis space for each class. 'Simultaneous' implies comparing mahalanobis distances at the same time. The proposed steel plate faults diagnosis system was developed in four main stages. In the first stage, after various reference groups and related variables are defined, data of the steel plate faults is collected and used to establish the individual mahalanobis space per the reference groups and construct the full measurement scale. In the second stage, the mahalanobis distances of test groups is calculated based on the established mahalanobis spaces of the reference groups. Then, appropriateness of the spaces is verified by examining the separability of the mahalanobis diatances. In the third stage, orthogonal arrays and Signal-to-Noise (SN) ratio of dynamic type are applied for variable optimization. Also, Overall SN ratio gain is derived from the SNratio and SN ratio gain. If the derived overall SN ratio gain is negative, it means that the variable should be removed. However, the variable with the positive gain may be considered as worth keeping. Finally, in the fourth stage, the measurement scale that is composed of selected useful variables is reconstructed.
Next, an experimental test should be implemented to verify the ability of multi-class classification and thus the accuracy of the classification is acquired. If the accuracy is acceptable, this diagnosis system can be used for future applications. Also, this study compared the accuracy of the proposed steel plate faults diagnosis system with that of other popular classification algorithms including Decision Tree, Multi Perception Neural Network (MLPNN), Logistic Regression (LR), Support Vector Machine (SVM), Tree Bagger Random Forest, Grid Search (GS), Genetic Algorithm (GA) and Particle Swarm Optimization (PSO). The steel plates faults dataset used in the study is taken from the University of California at Irvine (UCI) machine learning repository. As a result, the proposed steel plate faults diagnosis system based on S-MTS shows 90.79% of classification accuracy. The accuracy of the proposed diagnosis system is 6-27% higher than MLPNN, LR, GS, GA and PSO. Based on the fact that the accuracy of commercial systems is only about 75-80%, it means that the proposed system has enough classification performance to be applied in the industry. In addition, the proposed system can reduce the number of measurement sensors that are installed in the fields because of variable optimization process. These results show that the proposed system not only can have a good ability on the steel plate faults diagnosis but also reduce operation and maintenance cost. For our future work, it will be applied in the fields to validate actual effectiveness of the proposed system and plan to improve the accuracy based on the results.
A Coexistence Model in a Dynamic Platform with ICT-based Multi-Value Chains: focusing on Healthcare Service
Hyun Jung Lee, and Yong Sik Chang
Vol. 23, No. 1, Page: 69 ~ 93
Keywords : ICT, Dynamic Platform, Tele-Healthcare service, Intelligent Healthcare Service, Multi-value Chains
The development of ICT has leaded the diversification and changes of supplies and demands in markets. It also caused the creations of a variety of values which are differentiated from those in the existing market. Therefore, a new-type market is created, which can include multi-value chains which are from ICT-based created markets as well as the existing markets. We defined the platform as the new-type market. In the platform, the multi-value chains can be coexisted with multi-values. In true market, when a new-type value chain entered into an existing market, it is general that it can be conflicted with the existing value chain in the market. The conflicted problem among multi-value chains in a market is caused by the sharing of limited market resources like suppliers, consumers, services or products among the value chains. In other words, if there are multi-value chains in the platform, then it is possible to have conflictions, overlapping, creations or losses of values among the value chains. To solve the problem, we introduce coexistence factors to reduce the conflictions to reach market equilibrium in the platform. In the other hand, it is possible to lead the creations of differentiated values from the existing market and to augment the total market values in the platform. In the early era of ICT development, ICT was introduced for improvement of efficiency and effectiveness of the value chains in the existing market. However, according to the changed role of ICT from the supporter to the promotor of the market, ICT became to lead the variations of the value chains and creations of various values in the markets. For instance, Uber Taxi created a new value chain with ICT-based new-type service or products with new resources like new suppliers and consumers. When Uber and Traditional Taxi services are playing at the same time in Taxi service platform, it is possible to create values or make conflictions among values between the new and old value chains. In this research, like Uber and traditional taxi services, if there are conflictions among the multi-value chains, then it is necessary to minimize the conflictions in the platform for the coexistence of multi-value chains which can create the value-added values in the platform. So, it is important to predict and discuss the possible conflicted problems between new and old value chains. The confliction should be solved to reach market equilibrium with multi-value chains in the platform. That is, we discuss the possibility of the coexistence of multi-value chains in the platform which are comprised of a variety of suppliers and customers. To do this, especially we are focusing on the healthcare markets. Nowadays healthcare markets are popularized in global market as well as domestic. Therefore, there are a lot of and a variety of healthcare services like Traditional-, Tele-, or Intelligent- healthcare services and so on. It shows that there are multi-suppliers, -consumers and -services as components of each different value chain in the same platform. The platform can be shared by different values that are created or overlapped by confliction and loss of values in the value chains. In this research, as was said, we focused on the healthcare services to show if a platform can be shared by different value chains like traditional-, tele-healthcare and intelligent-healthcare services and products. Additionally, we try to show if it is possible to increase the value of each value chain as well as the total value of the platform. As the result, it is possible to increase of each value of each value chain as well as the total value in the platform. Finally, we propose a coexistence model to overcome such problems and showed the possibility of coexistence between the value chains through experimentation.
Feasibility of Deep Learning Algorithms for Binary Classification Problems
Kitae Kim, Bomi Lee, and Jong Woo Kim
Vol. 23, No. 1, Page: 95 ~ 108
Keywords : Binary Classification, Deep Learning, Multi-Layer Perceptron, Convolutional Neural Network, Long Short-Term Memory
Recently, AlphaGo which is Bakuk (Go) artificial intelligence program by Google DeepMind, had a huge victory against Lee Sedol. Many people thought that machines would not be able to win a man in Go games because the number of paths to make a one move is more than the number of atoms in the universe unlike chess, but the result was the opposite to what people predicted. After the match, artificial intelligence technology was focused as a core technology of the fourth industrial revolution and attracted attentions from various application domains. Especially, deep learning technique have been attracted as a core artificial intelligence technology used in the AlphaGo algorithm.
The deep learning technique is already being applied to many problems. Especially, it shows good performance in image recognition field. In addition, it shows good performance in high dimensional data area such as voice, image and natural language, which was difficult to get good performance using existing machine learning techniques. However, in contrast, it is difficult to find deep leaning researches on traditional business data and structured data analysis. In this study, we tried to find out whether the deep learning techniques have been studied so far can be used not only for the recognition of high dimensional data but also for the binary classification problem of traditional business data analysis such as customer churn analysis, marketing response prediction, and default prediction. And we compare the performance of the deep learning techniques with that of traditional artificial neural network models.
The experimental data in the paper is the telemarketing response data of a bank in Portugal. It has input variables such as age, occupation, loan status, and the number of previous telemarketing and has a binary target variable that records whether the customer intends to open an account or not. In this study, to evaluate the possibility of utilization of deep learning algorithms and techniques in binary classification problem, we compared the performance of various models using CNN, LSTM algorithm and dropout, which are widely used algorithms and techniques in deep learning, with that of MLP models which is a traditional artificial neural network model. However, since all the network design alternatives can not be tested due to the nature of the artificial neural network, the experiment was conducted based on restricted settings on the number of hidden layers, the number of neurons in the hidden layer, the number of output data (filters), and the application conditions of the dropout technique. The F1 Score was used to evaluate the performance of models to show how well the models work to classify the interesting class instead of the overall accuracy.
The detail methods for applying each deep learning technique in the experiment is as follows. The CNN algorithm is a method that reads adjacent values from a specific value and recognizes the features, but it does not matter how close the distance of each business data field is because each field is usually independent. In this experiment, we set the filter size of the CNN algorithm as the number of fields to learn the whole characteristics of the data at once, and added a hidden layer to make decision based on the additional features. For the model having two LSTM layers, the input direction of the second layer is put in reversed position with first layer in order to reduce the influence from the position of each field.
In the case of the dropout technique, we set the neurons to disappear with a probability of 0.5 for each hidden layer.
The experimental results show that the predicted model with the highest F1 score was the CNN model using the dropout technique, and the next best model was the MLP model with two hidden layers using the dropout technique. In this study, we were able to get some findings as the experiment had proceeded. First, models using dropout techniques have a slightly more conservative prediction than those without dropout techniques, and it generally shows better performance in classification. Second, CNN models show better classification performance than MLP models. This is interesting because it has shown good performance in binary classification problems which it rarely have been applied to, as well as in the fields where it’s effectiveness has been proven. Third, the LSTM algorithm seems to be unsuitable for binary classification problems because the training time is too long compared to the performance improvement. From these results, we can confirm that some of the deep learning algorithms can be applied to solve business binary classification problems.
Emotion Detection Model based on Sequential Neural Networks in Smar t Exhibition Environment
Min Kyu Jung, II Young Choi, and Jae Kyeong Kim
Vol. 23, No. 1, Page: 109 ~ 126
Keywords : Emotion detection model, Valence-Arousal model, Sequential neural networks, Facial features
In the various kinds of intelligent services, many studies for detecting emotion are in progress. Particularly, studies on emotion recognition at the particular time have been conducted in order to provide personalized experiences to the audience in the field of exhibition though facial expressions change as time passes. So, the aim of this paper is to build a model to predict the audience’s emotion from the changes of facial expressions while watching an exhibit. The proposed model is based on both sequential neural network and the Valence-Arousal model. To validate the usefulness of the proposed model, we performed an experiment to compare the proposed model with the standard neural-network–based model to compare their performance. The results confirmed that the proposed model considering time sequence had better prediction accuracy.
A Regression-Model-based Method for Combining Interestingness Measures of Association Rule Mining
Dongwon Lee
Vol. 23, No. 1, Page: 127 ~ 141
Keywords : Recommender system, association rule mining, regression model, online shopping, model-based recommender system
Advances in Internet technologies and the proliferation of mobile devices enabled consumers to approach a wide range of goods and services, while causing an adverse effect that they have hard time reaching their congenial items even if they devote much time to searching for them. Accordingly, businesses are using the recommender systems to provide tools for consumers to find the desired items more easily. Association Rule Mining (ARM) technology is advantageous to recommender systems in that ARM provides intuitive form of a rule with interestingness measures (support, confidence, and lift) describing the relationship between items. Given an item, its relevant items can be distinguished with the help of the measures that show the strength of relationship between items. Based on the strength, the most pertinent items can be chosen among other items and exposed to a given item’s web page. However, the diversity of the measures may confuse which items are more recommendable. Given two rules, for example, one rule’s support and confidence may not be concurrently superior to the other rule’s. Such discrepancy of the measures in distinguishing one rule’s superiority from other rules may cause difficulty in selecting proper items for recommendation. In addition, in an online environment where a web page or mobile screen can provide a limited number of recommendations that attract consumer interest, the prudent selection of items to be included in the list of recommendations is very important. The exposure of items of little interest may lead consumers to ignore the recommendations. Then, such consumers will possibly not pay attention to other forms of marketing activities. Therefore, the measures should be aligned with the probability of consumer’s acceptance of recommendations. For this reason, this study proposes a model-based approach to combine those measures into one unified measure that can consistently determine the ranking of recommended items. A regression model was designed to describe how well the measures (independent variables; i.e., support, confidence, and lift) explain consumer’s acceptance of recommendations (dependent variables, hit rate of recommended items). The model is intuitive to understand and easy to use in that the equation consists of the commonly used measures for ARM and can be used in the estimation of hit rates. The experiment using transaction data from one of the Korea’s largest online shopping malls was conducted to show that the proposed model can improve the hit rates of recommendations. From the top of the list to 13th place, recommended items in the higher rakings from the proposed model show the higher hit rates than those from the competitive model’s. The result shows that the proposed model’s performance is superior to the competitive model’s in online recommendation environment. In a web page, consumers are provided around ten recommendations with which the proposed model outperforms. Moreover, a mobile device cannot expose many items simultaneously due to its limited screen size. Therefore, the result shows that the newly devised recommendation technique is suitable for the mobile recommender systems. While this study has been conducted to cover the cross-selling in online shopping malls that handle merchandise, the proposed method can be expected to be applied in various situations under which association rules apply. For example, this model can be applied to medical diagnostic systems that predict candidate diseases from a patient’s symptoms. To increase the efficiency of the model, additional variables will need to be considered for the elaboration of the model in future studies. For example, price can be a good candidate for an explanatory variable because it has a major impact on consumer purchase decisions. If the prices of recommended items are much higher than the items in which a consumer is interested, the consumer may hesitate to accept the recommendations.
Development on Early Warning System about Technology Leakage of Small and Medium Enterprises
Bong-Goon Seo, and Do-Hyung Park
Vol. 23, No. 1, Page: 143 ~ 159
Keywords : Technology Leakage, Datamining, Early Warning System, SVM
Due to the rapid development of IT in recent years, not only personal information but also the key technologies and information leakage that companies have are becoming important issues. For the enterprise, the core technology that the company possesses is a very important part for the survival of the enterprise and for the continuous competitive advantage. Recently, there have been many cases of technical infringement. Technology leaks not only cause tremendous financial losses such as falling stock prices for companies, but they also have a negative impact on corporate reputation and delays in corporate development. In the case of SMEs, where core technology is an important part of the enterprise, compared to large corporations, the preparation for technological leakage can be seen as an indispensable factor in the existence of the enterprise. As the necessity and importance of Information Security Management (ISM) is emerging, it is necessary to check and prepare for the threat of technology infringement early in the enterprise.
Nevertheless, previous studies have shown that the majority of policy alternatives are represented by about 90%. As a research method, literature analysis accounted for 76% and empirical and statistical analysis accounted for a relatively low rate of 16%. For this reason, it is necessary to study the management model and prediction model to prevent leakage of technology to meet the characteristics of SMEs. In this study, before analyzing the empirical analysis, we divided the technical characteristics from the technology value perspective and the organizational factor from the technology control point based on many previous researches related to the factors affecting the technology leakage. A total of 12 related variables were selected for the two factors, and the analysis was performed with these variables.
In this study, we use three - year data of "Small and Medium Enterprise Technical Statistics Survey" conducted by the Small and Medium Business Administration. Analysis data includes 30 industries based on KSIC-based 2-digit classification, and the number of companies affected by technology leakage is 415 over 3 years. Through this data, we conducted a randomized sampling in the same industry based on the KSIC in the same year, and compared with the companies (n = 415) and the unaffected firms (n = 415) 1:1 Corresponding samples were prepared and analyzed.
In this research, we will conduct an empirical analysis to search for factors influencing technology leakage, and propose an early warning system through data mining. Specifically, in this study, based on the questionnaire survey of SMEs conducted by the Small and Medium Business Administration (SME), we classified the factors that affect the technology leakage of SMEs into two factors(Technology Characteristics, Organization Characteristics). And we propose a model that informs the possibility of technical infringement by using Support Vector Machine(SVM) which is one of the various techniques of data mining based on the proven factors through statistical analysis.
Unlike previous studies, this study focused on the cases of various industries in many years, and it can be pointed out that the artificial intelligence model was developed through this study. In addition, since the factors are derived empirically according to the actual leakage of SME technology leakage, it will be possible to suggest to policy makers which companies should be managed from the viewpoint of technology protection. Finally, it is expected that the early warning model on the possibility of technology leakage proposed in this study will provide an opportunity to prevent technology Leakage from the viewpoint of enterprise and government in advance.

Advanced Search
Date Range