DIGITAL LIBRARY ARCHIVE
HOME > DIGITAL LIBRARY ARCHIVE
Journal of Intelligence and Information Systems,
Vol. 27, No. 4, December 2021
Target-Aspect-Sentiment Joint Detection with CNN Auxiliary Loss for Aspect-Based Sentiment Analysis
Min Jin Jeon, Ji Won Hwang, and Jong Woo Kim
Vol. 27, No. 4, Page: 1 ~ 22
Keywords : Online review analysis, ABSA, TASD, CNN, BERT
Abstract
Aspect Based Sentiment Analysis (ABSA), which analyzes sentiment based on aspects that appear in the text, is drawing attention because it can be used in various business industries. ABSA is a study that analyzes sentiment by aspects for multiple aspects that a text has. It is being studied in various forms depending on the purpose, such as analyzing all targets or just aspects and sentiments. Here, the aspect refers to the property of a target, and the target refers to the text that causes the sentiment. For example, for restaurant reviews, you could set the aspect into food taste, food price, quality of service, mood of the restaurant, etc. Also, if there is a review that says, "The pasta was delicious, but the salad was not," the words "steak" and "salad," which are directly mentioned in the sentence, become the “target.”
So far, in ABSA, most studies have analyzed sentiment only based on aspects or targets. However, even with the same aspects or targets, sentiment analysis may be inaccurate. Instances would be when aspects or sentiment are divided or when sentiment exists without a target. For example, sentences like, "Pizza and the salad were good, but the steak was disappointing." Although the aspect of this sentence is limited to “food,” conflicting sentiments coexist. In addition, in the case of sentences such as "Shrimp was delicious, but the price was extravagant," although the target here is “shrimp,” there are opposite sentiments coexisting that are dependent on the aspect. Finally, in sentences like "The food arrived too late and is cold now." there is no target (NULL), but it transmits a negative sentiment toward the aspect "service." Like this, failure to consider both aspects and targets – when sentiment or aspect is divided or when sentiment exists without a target – creates a dual dependency problem.
To address this problem, this research analyzes sentiment by considering both aspects and targets (Target-Aspect-Sentiment Detection, hereby TASD). This study detected the limitations of existing research in the field of TASD: local contexts are not fully captured, and the number of epochs and batch size dramatically lowers the F1-score. The current model excels in spotting overall context and relations between each word. However, it struggles with phrases in the local context and is relatively slow when learning. Therefore, this study tries to improve the model's performance.
To achieve the objective of this research, we additionally used auxiliary loss in aspect-sentiment classification by constructing CNN(Convolutional Neural Network) layers parallel to existing models. If existing models have analyzed aspect-sentiment through BERT encoding, Pooler, and Linear layers, this research added CNN layer-adaptive average pooling to existing models, and learning was progressed by adding additional loss values for aspect-sentiment to existing loss. In other words, when learning, the auxiliary loss, computed through CNN layers, allowed the local context to be captured more fitted. After learning, the model is designed to do aspect-sentiment analysis through the existing method.
To evaluate the performance of this model, two datasets, SemEval-2015 task 12 and SemEval-2016 task 5, were used and the f1-score increased compared to the existing models. When the batch was 8 and epoch was 5, the difference was largest between the F1-score of existing models and this study with 29 and 45, respectively. Even when batch and epoch were adjusted, the F1-scores were higher than the existing models. It can be said that even when the batch and epoch numbers were small, they can be learned effectively compared to the existing models. Therefore, it can be useful in situations where resources are limited.
Through this study, aspect-based sentiments can be more accurately analyzed. Through various uses in business, such as development or establishing marketing strategies, both consumers and sellers will be able to make efficient decisions. In addition, it is believed that the model can be fully learned and utilized by small businesses, those that do not have much data, given that they use a pre-training model and recorded a relatively high F1-score even with limited resources.
Development of the forecasting model for import volume by item of major countries based on economic, industrial structural and cultural factors: Focusing on the cultural factors of Korea
Seung-Pyo Jun, Bong-Goon Seo, and Do-Hyung Park
Vol. 27, No. 4, Page: 23 ~ 48
Keywords : Trade; Forecasting Model for Import Volume; Design Thinking; Search Traffic; Hofstede Model
Abstract
The Korean economy has achieved continuous economic growth for the past several decades thanks to the government's export strategy policy. This increase in exports is playing a leading role in driving Korea's economic growth by improving economic efficiency, creating jobs, and promoting technology development. Traditionally, the main factors affecting Korea's exports can be found from two perspectives: economic factors and industrial structural factors. First, economic factors are related to exchange rates and global economic fluctuations. The impact of the exchange rate on Korea's exports depends on the exchange rate level and exchange rate volatility. Global economic fluctuations affect global import demand, which is an absolute factor influencing Korea's exports. Second, industrial structural factors are unique characteristics that occur depending on industries or products, such as slow international division of labor, increased domestic substitution of certain imported goods by China, and changes in overseas production patterns of major export industries. Looking at the most recent studies related to global exchanges, several literatures show the importance of cultural aspects as well as economic and industrial structural factors. Therefore, this study attempted to develop a forecasting model by considering cultural factors along with economic and industrial structural factors in calculating the import volume of each country from Korea. In particular, this study approaches the influence of cultural factors on imports of Korean products from the perspective of PUSH-PULL framework. The PUSH dimension is a perspective that Korea develops and actively promotes its own brand and can be defined as the degree of interest in each country for Korean brands represented by K-POP, K-FOOD, and K-CULTURE. In addition, the PULL dimension is a perspective centered on the cultural and psychological characteristics of the people of each country. This can be defined as how much they are inclined to accept Korean Flow as each country’s cultural code represented by the country’s governance system, masculinity, risk avoidance, and short-term/long-term orientation. The unique feature of this study is that the proposed final prediction model can be selected based on Design Principles. The design principles we presented are as follows. 1) A model was developed to reflect interest in Korea and cultural characteristics through newly added data sources. 2) It was designed in a practical and convenient way so that the forecast value can be immediately recalled by inputting changes in economic factors, item code and country code. 3) In order to derive theoretically meaningful results, an algorithm was selected that can interpret the relationship between the input and the target variable. This study can suggest meaningful implications from the technical, economic and policy aspects, and is expected to make a meaningful contribution to the export support strategies of small and medium-sized enterprises by using the import forecasting model.
Knowledge graph-based knowledge map for efficient expression and inference of associated knowledge
Keedong Yoo
Vol. 27, No. 4, Page: 49 ~ 71
Keywords : Knowledge graph, Knowledge map, Graph DB, Associated knowledge, Neo4j
Abstract
Users who intend to utilize knowledge to actively solve given problems proceed their jobs with cross- and sequential exploration of associated knowledge related each other in terms of certain criteria, such as content relevance. A knowledge map is the diagram or taxonomy overviewing status of currently managed knowledge in a knowledge-base, and supports users' knowledge exploration based on certain relationships between knowledge. A knowledge map, therefore, must be expressed in a networked form by linking related knowledge based on certain types of relationships, and should be implemented by deploying proper technologies or tools specialized in defining and inferring them.
To meet this end, this study suggests a methodology for developing the knowledge graph-based knowledge map using the Graph DB known to exhibit proper functionality in expressing and inferring relationships between entities and their relationships stored in a knowledge-base. Procedures of the proposed methodology are modeling graph data, creating nodes, properties, relationships, and composing knowledge networks by combining identified links between knowledge. Among various Graph DBs, the Neo4j is used in this study for its high credibility and applicability through wide and various application cases.
To examine the validity of the proposed methodology, a knowledge graph-based knowledge map is implemented deploying the Graph DB, and a performance comparison test is performed, by applying previous research's data to check whether this study’s knowledge map can yield the same level of performance as the previous one did. Previous research’s case is concerned with building a process-based knowledge map using the ontology technology, which identifies links between related knowledge based on the sequences of tasks producing or being activated by knowledge. In other words, since a task not only is activated by knowledge as an input but also produces knowledge as an output, input and output knowledge are linked as a flow by the task. Also since a business process is composed of affiliated tasks to fulfill the purpose of the process, the knowledge networks within a business process can be concluded by the sequences of the tasks composing the process. Therefore, using the Neo4j, considered process, task, and knowledge as well as the relationships among them are defined as nodes and relationships so that knowledge links can be identified based on the sequences of tasks. The resultant knowledge network by aggregating identified knowledge links is the knowledge map equipping functionality as a knowledge graph, and therefore its performance needs to be tested whether it meets the level of previous research’s validation results. The performance test examines two aspects, the correctness of knowledge links and the possibility of inferring new types of knowledge: the former is examined using 7 questions, and the latter is checked by extracting two new-typed knowledge.
As a result, the knowledge map constructed through the proposed methodology has showed the same level of performance as the previous one, and processed knowledge definition as well as knowledge relationship inference in a more efficient manner. Furthermore, comparing to the previous research’s ontology-based approach, this study's Graph DB-based approach has also showed more beneficial functionality in intensively managing only the knowledge of interest, dynamically defining knowledge and relationships by reflecting various meanings from situations to purposes, agilely inferring knowledge and relationships through Cypher-based query, and easily creating a new relationship by aggregating existing ones, etc.
This study's artifacts can be applied to implement the user-friendly function of knowledge exploration reflecting user's cognitive process toward associated knowledge, and can further underpin the development of an intelligent knowledge-base expanding autonomously through the discovery of new knowledge and their relationships by inference. This study, moreover than these, has an instant effect on implementing the networked knowledge map essential to satisfying contemporary users eagerly excavating the way to find proper knowledge to use.
A Study on Searching for Export Candidate Countries of the Korean Food and Beverage Industry Using Node2vec Graph Embedding and Light GBM Link Prediction
Jae-Seong Lee, Seung-Pyo Jun, and Jinny Seo
Vol. 27, No. 4, Page: 73 ~ 95
Keywords : Node2vec, Graph Embedding, Light GBM, Link Prediction, Global Value Chain
Abstract
This study uses Node2vec graph embedding method and Light GBM link prediction to explore undeveloped export candidate countries in Korea's food and beverage industry. Node2vec is the method that improves the limit of the structural equivalence representation of the network, which is known to be relatively weak compared to the existing link prediction method based on the number of common neighbors of the network. Therefore, the method is known to show excellent performance in both community detection and structural equivalence of the network. The vector value obtained by embedding the network in this way operates under the condition of a constant length from an arbitrarily designated starting point node. Therefore, it has the advantage that it is easy to apply the sequence of nodes as an input value to the model for downstream tasks such as Logistic Regression, Support Vector Machine, and Random Forest. Based on these features of the Node2vec graph embedding method, this study applied the above method to the international trade information of the Korean food and beverage industry. Through this, we intend to contribute to creating the effect of extensive margin diversification in Korea in the global value chain relationship of the industry. The optimal predictive model derived from the results of this study recorded a precision of 0.95 and a recall of 0.79, and an F1 score of 0.86, showing excellent performance. This performance was shown to be superior to that of the binary classifier based on Logistic Regression set as the baseline model. In the baseline model, a precision of 0.95 and a recall of 0.73 were recorded, and an F1 score of 0.83 was recorded. In addition, the light GBM-based optimal prediction model derived from this study showed superior performance than the link prediction model of previous studies, which is set as a benchmarking model in this study. The predictive model of the previous study recorded only a recall rate of 0.75, but the proposed model of this study showed better performance which recall rate is 0.79. The difference in the performance of the prediction results between benchmarking model and this study model is due to the model learning strategy. In this study, groups were classified by the trade value scale, and prediction models were trained differently for these groups. Specific methods are (1) a method of randomly masking and learning a model for all trades without setting specific conditions for trade value, (2) arbitrarily masking a part of the trades with an average trade value or higher and using the model method, and (3) a method of arbitrarily masking some of the trades with the top 25% or higher trade value and learning the model. As a result of the experiment, it was confirmed that the performance of the model trained by randomly masking some of the trades with the above-average trade value in this method was the best and appeared stably. It was found that most of the results of potential export candidates for Korea derived through the above model appeared appropriate through additional investigation. Combining the above, this study could suggest the practical utility of the link prediction method applying Node2vec and Light GBM. In addition, useful implications could be derived for weight update strategies that can perform better link prediction while training the model. On the other hand, this study also has policy utility because it is applied to trade transactions that have not been performed much in the research related to link prediction based on graph embedding. The results of this study support a rapid response to changes in the global value chain such as the recent US-China trade conflict or Japan's export regulations, and I think that it has sufficient usefulness as a tool for policy decision-making.
1


Advanced Search
Date Range

to
Search