Journal of Intelligence and Information Systems,
Vol. 20, No. 2, June 2014
A Network Analysis of Information Exchange using Social Media in ICT Exhibition
Ki Mok Ha, Hyun Sil Moon, Il Young Choi, and Jae Kyeong Kim
Vol. 20, No. 2, Page: 1 ~ 17
Keywords : Social Network Analysis, User Network, Social media, Information Exchange
The proliferation of using social media and social networking services affects the lifestyles of people. These phenomena are useful to companies that wish to promote and advertise new products or services through these social media; these social media venues also come with large amounts of user data. However, studies that analyze the data of social media within the perspective of information exchanges are hard to find. Much of the previous research in this area is focused on measuring the performance of exhibitions using general statistical approaches and piecemeal measures. Therefore, in this study, we want to analyze the characteristics of information exchanges in social media by using Twitter data sets, which are relating to the Mobile World Congress (MWC). Using this methodology provides exhibition organizers and exhibitors to objectively estimate the effect of social media, and establish strategies with social media use. Through a user network analysis, we additionally found that social attributes are as important as the popular attribute regarding the sustainability of information exchanges. Consequently, this research provides a network analysis using the data derived from the use of social media to communicate information regarding the MWC exhibition, and reveals the significance of social attributes such as the degree and the betweenness centrality regarding the sustainability of information exchanges.
Towards the Virtuous Circle in Virtual Community through Knowledge Seeking and Sharing
Jae Kyung Kim
Vol. 20, No. 2, Page: 19 ~ 38
Keywords : Knowledge searching, knowledge browsing, sense of virtual community, knowledge sharing, virtual community promotion
This study focused on the role of active knowledge seeking (knowledge browsing and knowledge searching) in the context of virtual community of interest. Knowledge seeking is rarely studied as an antecedent in knowledge management (KM) research. Active knowledge seeking is considered as antecedents of sense of virtual community which mediates to knowledge sharing intention and virtual community promotion. Research hypotheses are tested by applying structure equation modeling with survey data from virtual community members in South Korea. Active knowledge seeking behavior was found to be the strong predictor of sense of virtual community, which, in turn, positively affects knowledge sharing intention and virtual community promotion. Implication to practitioners is to understand and accommodate the members’ knowledge seeking efforts, who are potential contributors and promoters of the virtual community. Knowledge seeking, knowledge sharing and promoting virtual community are more of human activities than technology and this study extends the understanding of such human activities. By providing a mechanism of how knowledge seeking and sharing could work harmoniously, a virtuous circle with win-win situation could be achieved in virtual communities.
A MVC Framework for Visualizing Text Data
Kwang Sun Choi, Kyo Sung Jeong, and Soo Dong Kim
Vol. 20, No. 2, Page: 39 ~ 58
Keywords : Text Data, Visualization, MVC Framework
As the importance of big data and related technologies continues to grow in the industry, it has become highlighted to visualize results of processing and analyzing big data. Visualization of data delivers people effectiveness and clarity for understanding the result of analyzing. By the way, visualization has a role as the GUI (Graphical User Interface) that supports communications between people and analysis systems. Usually to make development and maintenance easier, these GUI parts should be loosely coupled from the parts of processing and analyzing data. And also to implement a loosely coupled architecture, it is necessary to adopt design patterns such as MVC (Model-View-Controller) which is designed for minimizing coupling between UI part and data processing part. On the other hand, big data can be classified as structured data and unstructured data. The visualization of structured data is relatively easy to unstructured data. For all that, as it has been spread out that the people utilize and analyze unstructured data, they usually develop the visualization system only for each project to overcome the limitation traditional visualization system for structured data. Furthermore, for text data which covers a huge part of unstructured data, visualization of data is more difficult. It results from the complexity of technology for analyzing text data as like linguistic analysis, text mining, social network analysis, and so on. And also those technologies are not standardized. This situation makes it more difficult to reuse the visualization system of a project to other projects. We assume that the reason is lack of commonality design of visualization system considering to expanse it to other system. In our research, we suggest a common information model for visualizing text data and propose a comprehensive and reusable framework, TexVizu, for visualizing text data. At first, we survey representative researches in text visualization era. And also we identify common elements for text visualization and common patterns among various cases of its. And then we review and analyze elements and patterns with three different viewpoints as structural viewpoint, interactive viewpoint, and semantic viewpoint. And then we design an integrated model of text data which represent elements for visualization. The structural viewpoint is for identifying structural element from various text documents as like title, author, body, and so on. The interactive viewpoint is for identifying the types of relations and interactions between text documents as like post, comment, reply and so on. The semantic viewpoint is for identifying semantic elements which extracted from analyzing text data linguistically and are represented as tags for classifying types of entity as like people, place or location, time, event and so on. After then we extract and choose common requirements for visualizing text data. The requirements are categorized as four types which are structure information, content information, relation information, trend information. Each type of requirements comprised with required visualization techniques, data and goal (what to know). These requirements are common and key requirement for design a framework which keep that a visualization system are loosely coupled from data processing or analyzing system. Finally we designed a common text visualization framework, TexVizu which is reusable and expansible for various visualization projects by collaborating with various Text Data Loader and Analytical Text Data Visualizer via common interfaces as like ITextDataLoader and IATDProvider. And also TexVisu is comprised with Analytical Text Data Model, Analytical Text Data Storage and Analytical Text Data Controller. In this framework, external components are the specifications of required interfaces for collaborating with this framework. As an experiment, we also adopt this framework into two text visualization systems as like a social opinion mining system and an online news analysis system.
Selection Model of System Trading Strategies using SVM
Sungcheol Park, Sun Woong Kim, and Heung Sik Choi
Vol. 20, No. 2, Page: 59 ~ 71
Keywords : SVM, Strategy Portfolio, System Trading, KOSPI 200 Index Futures
System trading is becoming more popular among Korean traders recently. System traders use automatic order systems based on the system generated buy and sell signals. These signals are generated from the predetermined entry and exit rules that were coded by system traders. Most researches on system trading have focused on designing profitable entry and exit rules using technical indicators.
However, market conditions, strategy characteristics, and money management also have influences on the profitability of the system trading. Unexpected price deviations from the predetermined trading rules can incur large losses to system traders. Therefore, most professional traders use strategy portfolios rather than only one strategy. Building a good strategy portfolio is important because trading performance depends on strategy portfolios. Despite of the importance of designing strategy portfolio, rule of thumb methods have been used to select trading strategies.
In this study, we propose a SVM-based strategy portfolio management system. SVM were introduced by Vapnik and is known to be effective for data mining area. It can build good portfolios within a very short period of time. Since SVM minimizes structural risks, it is best suitable for the futures trading market in which prices do not move exactly the same as the past.
Our system trading strategies include moving-average cross system, MACD cross system, trend-following system, buy dips and sell rallies system, DMI system, Keltner channel system, Bollinger Bands system, and Fibonacci system. These strategies are well known and frequently being used by many professional traders. We program these strategies for generating automated system signals for entry and exit. We propose SVM-based strategies selection system and portfolio construction and order routing system. Strategies selection system is a portfolio training system. It generates training data and makes SVM model using optimal portfolio. We make mn data matrix by dividing KOSPI 200 index futures data with a same period. Optimal strategy portfolio is derived from analyzing each strategy performance. SVM model is generated based on this data and optimal strategy portfolio.
We use 80% of the data for training and the remaining 20% is used for testing the strategy. For training, we select two strategies which show the highest profit in the next day. Selection method 1 selects two strategies and method 2 selects maximum two strategies which show profit more than 0.1 point. We use one-against-all method which has fast processing time.
We analyse the daily data of KOSPI 200 index futures contracts from January 1990 to November 2011. Price change rates for 50 days are used as SVM input data. The training period is from January 1990 to March 2007 and the test period is from March 2007 to November 2011. We suggest three benchmark strategies portfolio. BM1 holds two contracts of KOSPI 200 index futures for testing period. BM2 is constructed as two strategies which show the largest cumulative profit during 30 days before testing starts. BM3 has two strategies which show best profits during testing period. Trading cost include brokerage commission cost and slippage cost. The proposed strategy portfolio management system shows profit more than double of the benchmark portfolios. BM1 shows 103.44 point profit, BM2 shows 488.61 point profit, and BM3 shows 502.41 point profit after deducting trading cost. The best benchmark is the portfolio of the two best profit strategies during the test period. The proposed system 1 shows 706.22 point profit and proposed system 2 shows 768.95 point profit after deducting trading cost. The equity curves for the entire period show stable pattern. With higher profit, this suggests a good trading direction for system traders. We can make more stable and more profitable portfolios if we add money management module to the system.
Scalable Collaborative Filtering Technique based on Adaptive Clustering
O-Joun Lee, Min-Sung Hong, Won-Jin Lee, and Jae-Dong Lee
Vol. 20, No. 2, Page: 73 ~ 92
Keywords : Recommendation System, Adaptive System, Collaborative Filtering, Hybrid Filtering, Clustering
An Adaptive Clustering-based Collaborative Filtering Technique was proposed to solve the fundamental problems of collaborative filtering, such as cold-start problems, scalability problems and data sparsity problems. Previous collaborative filtering techniques were carried out according to the recommendations based on the predicted preference of the user to a particular item using a similar item subset and a similar user subset composed based on the preference of users to items. For this reason, if the density of the user preference matrix is low, the reliability of the recommendation system will decrease rapidly. Therefore, the difficulty of creating a similar item subset and similar user subset will be increased. In addition, as the scale of service increases, the time needed to create a similar item subset and similar user subset increases geometrically, and the response time of the recommendation system is then increased. To solve these problems, this paper suggests a collaborative filtering technique that adapts a condition actively to the model and adopts the concepts of a context-based filtering technique. This technique consists of four major methodologies. First, items are made, the users are clustered according their feature vectors, and an inter-cluster preference between each item cluster and user cluster is then assumed. According to this method, the run-time for creating a similar item subset or user subset can be economized, the reliability of a recommendation system can be made higher than that using only the user preference information for creating a similar item subset or similar user subset, and the cold start problem can be partially solved. Second, recommendations are made using the prior composed item and user clusters and inter-cluster preference between each item cluster and user cluster. In this phase, a list of items is made for users by examining the item clusters in the order of the size of the inter-cluster preference of the user cluster, in which the user belongs, and selecting and ranking the items according to the predicted or recorded user preference information. Using this method, the creation of a recommendation model phase bears the highest load of the recommendation system, and it minimizes the load of the recommendation system in run-time. Therefore, the scalability problem and large scale recommendation system can be performed with collaborative filtering, which is highly reliable. Third, the missing user preference information is predicted using the item and user clusters. Using this method, the problem caused by the low density of the user preference matrix can be mitigated. Existing studies on this used an item-based prediction or user-based prediction. In this paper, Hao Ji’s idea, which uses both an item-based prediction and user-based prediction, was improved. The reliability of the recommendation service can be improved by combining the predictive values of both techniques by applying the condition of the recommendation model. By predicting the user preference based on the item or user clusters, the time required to predict the user preference can be reduced, and missing user preference in run-time can be predicted. Fourth, the item and user feature vector can be made to learn the following input of the user feedback. This phase applied normalized user feedback to the item and user feature vector. This method can mitigate the problems caused by the use of the concepts of context-based filtering, such as the item and user feature vector based on the user profile and item properties. The problems with using the item and user feature vector are due to the limitation of quantifying the qualitative features of the items and users. Therefore, the elements of the user and item feature vectors are made to match one to one, and if user feedback to a particular item is obtained, it will be applied to the feature vector using the opposite one. Verification of this method was accomplished by comparing the performance with existing hybrid filtering techniques. Two methods were used for verification: MAE(Mean Absolute Error) and response time. Using MAE, this technique was confirmed to improve the reliability of the recommendation system. Using the response time, this technique was found to be suitable for a large scaled recommendation system. This paper suggested an Adaptive Clustering-based Collaborative Filtering Technique with high reliability and low time complexity, but it had some limitations. This technique focused on reducing the time complexity. Hence, an improvement in reliability was not expected. The next topic will be to improve this technique by rule-based filtering.
User-Perspective Issue Clustering Using Multi-Layered Two-Mode Network Analysis
Jieun Kim, Namgyu Kim, and Yoonho Cho
Vol. 20, No. 2, Page: 93 ~ 107
Keywords : Data Mining, Issue Clustering, Social Network Analysis, Topic Analysis

In this paper, we report what we have observed with regard to user-perspective issue clustering based on multi-layered two-mode network analysis. This work is significant in the context of data collection by companies about customer needs. Most companies have failed to uncover such needs for products or services properly in terms of demographic data such as age, income levels, and purchase history. Because of excessive reliance on limited internal data, most recommendation systems do not provide decision makers with appropriate business information for current business circumstances. However, part of the problem is the increasing regulation of personal data gathering and privacy. This makes demographic or transaction data collection more difficult, and is a significant hurdle for traditional recommendation approaches because these systems demand a great deal of personal data or transaction logs. Our motivation for presenting this paper to academia is our strong belief, and evidence, that most customers' requirements for products can be effectively and efficiently analyzed from unstructured textual data such as Internet news text. In order to derive users' requirements from textual data obtained online, the proposed approach in this paper attempts to construct double two-mode networks, such as a user-news network and news-issue network, and to integrate these into one quasi-network as the input for issue clustering. One of the contributions of this research is the development of a methodology utilizing enormous amounts of unstructured textual data for user-oriented issue clustering by leveraging existing text mining and social network analysis.
In order to build multi-layered two-mode networks of news logs, we need some tools such as text mining and topic analysis. We used not only SAS Enterprise Miner 12.1, which provides a text miner module and cluster module for textual data analysis, but also NetMiner 4 for network visualization and analysis. Our approach for user-perspective issue clustering is composed of six main phases: crawling, topic analysis, access pattern analysis, network merging, network conversion, and clustering. In the first phase, we collect visit logs for news sites by crawler. After gathering unstructured news article data, the topic analysis phase extracts issues from each news article in order to build an article-news network. For simplicity, 100 topics are extracted from 13,652 articles. In the third phase, a user-article network is constructed with access patterns derived from web transaction logs. The double two-mode networks are then merged into a quasi-network of user-issue. Finally, in the user-oriented issue-clustering phase, we classify issues through structural equivalence, and compare these with the clustering results from statistical tools and network analysis.
An experiment with a large dataset was performed to build a multi-layer two-mode network. After that, we compared the results of issue clustering from SAS with that of network analysis. The experimental dataset was from a web site ranking site, and the biggest portal site in Korea. The sample dataset contains 150 million transaction logs and 13,652 news articles of 5,000 panels over one year. User-article and article-issue networks are constructed and merged into a user-issue quasi-network using Netminer. Our issue-clustering results applied the Partitioning Around Medoids (PAM) algorithm and Multidimensional Scaling (MDS), and are consistent with the results from SAS clustering.
In spite of extensive efforts to provide user information with recommendation systems, most projects are successful only when companies have sufficient data about users and transactions. Our proposed methodology, user-perspective issue clustering, can provide practical support to decision-making in companies because it enhances user-related data from unstructured textual data. To overcome the problem of insufficient data from traditional approaches, our methodology infers customers' real interests by utilizing web transaction logs. In addition, we suggest topic analysis and issue clustering as a practical means of issue identification.
Twitter Issue Tracking System by Topic Modeling Techniques
Jung-hwan Bae, Nam-gi Han, and Min Song
Vol. 20, No. 2, Page: 109 ~ 122
Keywords : Social Media Mining; Text Mining; Twitter Issue; Topic Modeling; Social Network Service;
Big Data
People are nowadays creating a tremendous amount of data on Social Network Service (SNS).
In particular, the incorporation of SNS into mobile devices has resulted in massive amounts of data
generation, thereby greatly influencing society. This is an unmatched phenomenon in history, and now
we live in the Age of Big Data. SNS Data is defined as a condition of Big Data where the amount
of data (volume), data input and output speeds (velocity), and the variety of data types (variety) are
satisfied. If someone intends to discover the trend of an issue in SNS Big Data, this information can
be used as a new important source for the creation of new values because this information covers
the whole of society. In this study, a Twitter Issue Tracking System (TITS) is designed and
established to meet the needs of analyzing SNS Big Data. TITS extracts issues from Twitter texts
and visualizes them on the web. The proposed system provides the following four functions:(1) Provide the topic keyword set that corresponds to daily ranking;
(2) Visualize the daily time series graph of a topic for the duration of a month;
(3) Provide the importance of a topic through a treemap based on the score system and
(4) Visualize the daily time-series graph of keywords by searching the keyword;
The present study analyzes the Big Data generated by SNS in real time. SNS Big Data analysis
requires various natural language processing techniques, including the removal of stop words, and
noun extraction for processing various unrefined forms of unstructured data. In addition, such analysis
requires the latest big data technology to process rapidly a large amount of real-time data, such as
the Hadoop distributed system or NoSQL, which is an alternative to relational database. We built
TITS based on Hadoop to optimize the processing of big data because Hadoop is designed to scale up from single node computing to thousands of machines. Furthermore, we use MongoDB, which is
classified as a NoSQL database. In addition, MongoDB is an open source platform, document-oriented
database that provides high performance, high availability, and automatic scaling. Unlike existing
relational database, there are no schema or tables with MongoDB, and its most important goal is that
of data accessibility and data processing performance. In the Age of Big Data, the visualization of
Big Data is more attractive to the Big Data community because it helps analysts to examine such
data easily and clearly. Therefore, TITS uses the d3.js library as a visualization tool. This library is
designed for the purpose of creating Data Driven Documents that bind document object model (DOM)
and any data; the interaction between data is easy and useful for managing real-time data stream with
smooth animation. In addition, TITS uses a bootstrap made of pre-configured plug-in style sheets and
JavaScript libraries to build a web system. The TITS Graphical User Interface (GUI) is designed using
these libraries, and it is capable of detecting issues on Twitter in an easy and intuitive manner. The
proposed work demonstrates the superiority of our issue detection techniques by matching detected
issues with corresponding online news articles.
The contributions of the present study are threefold. First, we suggest an alternative approach
to real-time big data analysis, which has become an extremely important issue. Second, we apply a
topic modeling technique that is used in various research areas, including Library and Information
Science (LIS). Based on this, we can confirm the utility of storytelling and time series analysis. Third,
we develop a web-based system, and make the system available for the real-time discovery of topics.
The present study conducted experiments with nearly 150 million tweets in Korea during March 2013.
A Methodology for Extracting Shopping-Related Keywords by Analyzing Internet Navigation Patterns
Mingyu Kim, Namgyu Kim, and Inhwan Jung
Vol. 20, No. 2, Page: 123 ~ 136
Keywords : Internet Shopping, Keyword Evaluation, Keyword Marketing, Search Keyword Extraction
Recently, online shopping has further developed as the use of the Internet and a variety of smart mobile devices becomes more prevalent. The increase in the scale of such shopping has led to the creation of many Internet shopping malls. Consequently, there is a tendency for increasingly fierce competition among online retailers, and as a result, many Internet shopping malls are making significant attempts to attract online users to their sites. One such attempt is keyword marketing, whereby a retail site pays a fee to expose its link to potential customers when they insert a specific keyword on an Internet portal site. The price related to each keyword is generally estimated by the keyword's frequency of appearance. However, it is widely accepted that the price of keywords cannot be based solely on their frequency because many keywords may appear frequently but have little relationship to shopping. This implies that it is unreasonable for an online shopping mall to spend a great deal on some keywords simply because people frequently use them. Therefore, from the perspective of shopping malls, a specialized process is required to extract meaningful keywords. Further, the demand for automating this extraction process is increasing because of the drive to improve online sales performance.
In this study, we propose a methodology that can automatically extract only shopping-related keywords from the entire set of search keywords used on portal sites. We define a shopping-related keyword as a keyword that is used directly before shopping behaviors. In other words, only search keywords that direct the search results page to shopping-related pages are extracted from among the entire set of search keywords. A comparison is then made between the extracted keywords' rankings and the rankings of the entire set of search keywords. Two types of data are used in our study's experiment: web browsing history from July 1, 2012 to June 30, 2013, and site information. The experimental dataset was from a web site ranking site, and the biggest portal site in Korea. The original sample dataset contains 150 million transaction logs. First, portal sites are selected, and search keywords in those sites are extracted. Search keywords can be easily extracted by simple parsing. The extracted keywords are ranked according to their frequency. The experiment uses approximately 3.9 million search results from Korea's largest search portal site. As a result, a total of 344,822 search keywords were extracted. Next, by using web browsing history and site information, the shopping-related keywords were taken from the entire set of search keywords. As a result, we obtained 4,709 shopping-related keywords.
For performance evaluation, we compared the hit ratios of all the search keywords with the shopping-related keywords. To achieve this, we extracted 80,298 search keywords from several Internet shopping malls and then chose the top 1,000 keywords as a set of true shopping keywords. We measured precision, recall, and F-scores of the entire amount of keywords and the shopping-related keywords. The F-Score was formulated by calculating the harmonic mean of precision and recall. The precision, recall, and F-score of shopping-related keywords derived by the proposed methodology were revealed to be higher than those of the entire number of keywords.
This study proposes a scheme that is able to obtain shopping-related keywords in a relatively simple manner. We could easily extract shopping-related keywords simply by examining transactions whose next visit is a shopping mall. The resultant shopping-related keyword set is expected to be a useful asset for many shopping malls that participate in keyword marketing. Moreover, the proposed methodology can be easily applied to the construction of special area-related keywords as well as shopping-related ones.
Resolving the ‘Gray sheep’ Problem Using Social Network Analysis (SNA) in Collaborative Filtering (CF) Recommender Systems
Minsung Kim, and Il Im
Vol. 20, No. 2, Page: 137 ~ 148
Keywords : Collaborative filtering (CF), Social Network Analysis (SNA), Gray Sheep Problem
Recommender system has become one of the most important technologies in e-commerce in these days. The ultimate reason to shop online, for many consumers, is to reduce the efforts for information search and purchase. Recommender system is a key technology to serve these needs. Many of the past studies about recommender systems have been devoted to developing and improving recommendation algorithms and collaborative filtering (CF) is known to be the most successful one. Despite its success, however, CF has several shortcomings such as cold-start, sparsity, gray sheep problems. In order to be able to generate recommendations, ordinary CF algorithms require evaluations or preference information directly from users. For new users who do not have any evaluations or preference information, therefore, CF cannot come up with recommendations (Cold-star problem). As the numbers of products and customers increase, the scale of the data increases exponentially and most of the data cells are empty. This sparse dataset makes computation for recommendation extremely hard (Sparsity problem). Since CF is based on the assumption that there are groups of users sharing common preferences or tastes, CF becomes inaccurate if there are many users with rare and unique tastes (Gray sheep problem).
This study proposes a new algorithm that utilizes Social Network Analysis (SNA) techniques to resolve the gray sheep problem. We utilize ‘degree centrality’ in SNA to identify users with unique preferences (gray sheep). Degree centrality in SNA refers to the number of direct links to and from a node. In a network of users who are connected through common preferences or tastes, those with unique tastes have fewer links to other users (nodes) and they are isolated from other users. Therefore, gray sheep can be identified by calculating degree centrality of each node. We divide the dataset into two, gray sheep and others, based on the degree centrality of the users. Then, different similarity measures and recommendation methods are applied to these two datasets. More detail algorithm is as follows:
Step 1: Convert the initial data which is a two-mode network (user to item) into an one-mode network (user to user).
Step 2: Calculate degree centrality of each node and separate those nodes having degree centrality values lower than the pre-set threshold. The threshold value is determined by simulations such that the accuracy of CF for the remaining dataset is maximized.
Step 3: Ordinary CF algorithm is applied to the remaining dataset.
Step 4: Since the separated dataset consist of users with unique tastes, an ordinary CF algorithm cannot generate recommendations for them. A ‘popular item’ method is used to generate recommendations for these users. The F measures of the two datasets are weighted by the numbers of nodes and summed to be used as the final performance metric.
In order to test performance improvement by this new algorithm, an empirical study was conducted using a publically available dataset – the MovieLens data by GroupLens research team. We used 100,000 evaluations by 943 users on 1,682 movies. The proposed algorithm was compared with an ordinary CF algorithm utilizing ‘Best-N-neighbors’ and ‘Cosine’ similarity method. The empirical results show that F measure was improved about 11% on average when the proposed algorithm was used .
Past studies to improve CF performance typically used additional information other than users’ evaluations such as demographic data. Some studies applied SNA techniques as a new similarity metric. This study is novel in that it used SNA to separate dataset. This study shows that performance of CF can be improved, without any additional information, when SNA techniques are used as proposed.
This study has several theoretical and practical implications. This study empirically shows that the characteristics of dataset can affect the performance of CF recommender systems. This helps researchers understand factors affecting performance of CF. This study also opens a door for future studies in the area of applying SNA to CF to analyze characteristics of dataset. In practice, this study provides guidelines to improve performance of CF recommender systems with a simple modification.
Ontology-based Course Mentoring System
Kyeong-Jin Oh, Ui-Nyoung Yoon, and Geun-Sik Jo
Vol. 20, No. 2, Page: 149 ~ 162
Keywords : Ontology, Ontology Modeling, Course Mentoring, Curriculum
Course guidance is a mentoring process which is performed before students register for coming classes. The course guidance plays a very important role to students in checking degree audits of students and mentoring classes which will be taken in coming semester. Also, it is intimately involved with a graduation assessment or a completion of ABEEK certification. Currently, course guidance is manually performed by some advisers at most of universities in Korea because they have no electronic systems for the course guidance. By the lack of the systems, the advisers should analyze each degree audit of students and curriculum information of their own departments. This process often causes the human error during the course guidance process due to the complexity of the process. The electronic system thus is essential to avoid the human error for the course guidance. If the relation data model-based system is applied to the mentoring process, then the problems in manual way can be solved. However, the relational data model-based systems have some limitations. Curriculums of a department and certification systems can be changed depending on a new policy of a university or surrounding environments. If the curriculums and the systems are changed, a scheme of the existing system should be changed in accordance with the variations. It is also not sufficient to provide semantic search due to the difficulty of extracting semantic relationships between subjects. In this paper, we model a course mentoring ontology based on the analysis of a curriculum of computer science department, a structure of degree audit, and ABEEK certification. Ontology-based course guidance system is also proposed to overcome the limitation of the existing methods and to provide the effectiveness of course mentoring process for both of advisors and students. In the proposed system, all data of the system consists of ontology instances. To create ontology instances, ontology population module is developed by using JENA framework which is for building semantic web and linked data applications. In the ontology population module, the mapping rules to connect parts of degree audit to certain parts of course mentoring ontology are designed. All ontology instances are generated based on degree audits of students who participate in course mentoring test. The generated instances are saved to JENA TDB as a triple repository after an inference process using JENA inference engine. A user interface for course guidance is implemented by using Java and JENA framework. Once a advisor or a student input student’s information such as student name and student number at an information request form in user interface, the proposed system provides mentoring results based on a degree audit of current student and rules to check scores for each part of a curriculum such as special cultural subject, major subject, and MSC subject containing math and basic science. Recall and precision are used to evaluate the performance of the proposed system. The recall is used to check that the proposed system retrieves all relevant subjects. The precision is used to check whether the retrieved subjects are relevant to the mentoring results. An officer of computer science department attends the verification on the results derived from the proposed system. Experimental results using real data of the participating students show that the proposed course guidance system based on course mentoring ontology provides correct course mentoring results to students at all times. Advisors can also reduce their time cost to analyze a degree audit of corresponding student and to calculate each score for the each part. As a result, the proposed system based on ontology techniques solves the difficulty of mentoring methods in manual way and the proposed system derive correct mentoring results as human conduct.

Advanced Search
Date Range