• Research on the Evolution of Public Opinion Themes on Microblogs of Emergency Events Based on the BERTopic: A Case Research of the Eastern Airlines Flight MU5735 Crash

    Subjects: Library Science, Information Science >> Information Processing. Submitted: 2024-04-18

    Abstract: Purpose/Significance: This research systematically analyzes how the themes of public opinion evolve during emergency events, visualizes the focal themes at each stage of development, and provides a practical reference for guiding online public opinion in the future. Method/Process: The BERTopic model is used to extract the topics at each stage of public opinion development, and cosine similarity is used to measure the similarity between topics so that their evolutionary paths can be visualized. The Eastern Airlines Flight MU5735 crash, as discussed on Sina Weibo, serves as the case study. Results/Conclusion: The empirical results show that the BERTopic model is effective for theme identification in public opinion events and produces easily visualized results, accurately capturing the hot topics in each phase of development and revealing how themes evolve as public opinion spreads. Innovation/Limitation: This research proposes a general BERTopic-based framework for analyzing the theme evolution of public opinion in short microblog texts about emergency events, analyzes how the content of the extracted themes evolves, and presents the results visually. Its limitation is that the data come only from a microblogging platform; future work could draw on more diverse data sources.
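The similarity step described in this abstract can be illustrated with a minimal sketch. It assumes each topic is represented as a sparse term-weight vector and that two topics from adjacent stages are linked when their cosine similarity reaches a cutoff. The topic names, weights, and the `threshold` value are made up for illustration and are not taken from the paper; the paper's actual topic representation is not stated in the abstract.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two sparse term-weight vectors (dicts)."""
    dot = sum(w * b[t] for t, w in a.items() if t in b)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def link_topics(stage_a, stage_b, threshold=0.3):
    """Pair topics of two adjacent stages whose similarity is >= threshold.

    The threshold is a hypothetical cutoff, not a value from the paper.
    """
    links = []
    for name_a, vec_a in stage_a.items():
        for name_b, vec_b in stage_b.items():
            sim = cosine_similarity(vec_a, vec_b)
            if sim >= threshold:
                links.append((name_a, name_b, sim))
    return links


# Toy, made-up topic vectors for two consecutive stages.
outbreak = {
    "rescue": {"rescue": 0.8, "search": 0.6},
    "mourning": {"pray": 1.0},
}
spread = {
    "black_box": {"search": 0.7, "recorder": 0.7},
    "condolence": {"pray": 0.9, "family": 0.4},
}

links = link_topics(outbreak, spread)
for source, target, sim in links:
    print(f"{source} -> {target}: {sim:.3f}")
```

Each returned pair is one edge of an evolution path; chaining the edges across all stages gives the kind of path that can then be drawn, for example as a Sankey-style diagram.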

  • Stance Detection in Chinese Microblogs

    Subjects: Library Science, Information Science >> Library Science. Submitted: 2023-10-08. Cooperative journals: 《知识管理论坛》

    Abstract: [Purpose/significance] This paper introduces a new approach to automatically detecting stance in Chinese microblogs: a serial combination model based on the Sentiment Weighted Algorithm and Naive Bayes (the SWNB model). [Method/process] First, the SWNB model simplifies complex sentences using a defined complex-sentence pattern, assigns a sentiment weight to each microblog according to calculation rules, and optimizes that weight by detecting the presence of entities associated with the target. Microblogs are thereby classified as either expressing a stance or expressing no stance. Second, the model extracts feature words and uses Naive Bayes to classify the stance-bearing microblogs as FAVOR or AGAINST. [Result/conclusion] Experiments show that the model can comprehensively process complex sentences, target-related entities, and linguistic context.
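The serial, two-stage structure described in this abstract can be sketched roughly as follows. This is an illustration under stated assumptions, not the SWNB model itself: the lexicon, the `threshold`, the `NONE` label, and the toy training data are hypothetical, and the paper's complex-sentence simplification and target-entity optimization are omitted. Stage two is a plain multinomial Naive Bayes with add-one smoothing.

```python
import math
from collections import Counter, defaultdict

# Hypothetical sentiment lexicon; the paper's calculation rules are not
# given in the abstract.
LEXICON = {"support": 1.0, "great": 0.8, "oppose": -1.0, "terrible": -0.8}


def sentiment_weight(tokens):
    """Sum the lexicon weights of the tokens (assumed weighting rule)."""
    return sum(LEXICON.get(t, 0.0) for t in tokens)


def has_stance(tokens, threshold=0.5):
    """Stage 1: treat a post as stance-bearing if |weight| >= threshold."""
    return abs(sentiment_weight(tokens)) >= threshold


class NaiveBayes:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, docs, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for tokens, label in zip(docs, labels):
            self.word_counts[label].update(tokens)
        self.vocab = {t for c in self.word_counts.values() for t in c}
        return self

    def predict(self, tokens):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            total_words = sum(self.word_counts[label].values())
            score = math.log(n_docs / total_docs)  # log prior
            for t in tokens:
                count = self.word_counts[label][t]
                score += math.log((count + 1) / (total_words + len(self.vocab)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


def detect_stance(tokens, classifier):
    """Serial combination: stance filter first, then FAVOR/AGAINST."""
    if not has_stance(tokens):
        return "NONE"  # hypothetical label for posts with no stance
    return classifier.predict(tokens)


# Toy, made-up training data (already tokenized).
train_docs = [
    ["support", "policy"],
    ["great", "policy"],
    ["oppose", "policy"],
    ["terrible", "idea"],
]
train_labels = ["FAVOR", "FAVOR", "AGAINST", "AGAINST"]
clf = NaiveBayes().fit(train_docs, train_labels)

print(detect_stance(["weather", "today"], clf))
print(detect_stance(["support", "policy"], clf))
print(detect_stance(["oppose", "idea"], clf))
```

Running the filter before the classifier means Naive Bayes only has to separate two classes, which is the point of the serial design as the abstract describes it.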

  • Towards Professional Publishing: Research on Hotspot Detection Model Based on Multi-source Data

    Subjects: Library Science, Information Science >> Library Science. Submitted: 2023-07-26. Cooperative journals: 《图书情报工作》

    Abstract: [Purpose/significance] To address the problem of topic selection for professional fields in the publishing industry, this paper integrates multi-source dynamic information from the Internet and detects hotspots in professional fields through multi-dimensional intelligence analysis. The resulting data-driven topic selection lays a foundation for the digital transformation and development of the publishing industry. [Method/process] An intelligence analysis model for topic selection is proposed to detect hotspots in professional fields. The model has two steps: hotspot discovery and hotness evaluation. In hotspot discovery, hotspots in professional fields are identified through word frequency statistics and a word growth rate algorithm. In hotness evaluation, a series of indices along the content and spread dimensions is used to calculate and evaluate the hotness of the hotspots identified in the first step. [Result/conclusion] A hotspot detection experiment was conducted on 36,550 pieces of Chinese multi-source dynamic information in the ICT field collected from January to April 2018, and it verified the effectiveness of the proposed model. The model can be used in the publishing industry to carry out topic selection in a data-driven way.
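The hotspot discovery step described in this abstract combines word frequency with a word growth rate. A minimal sketch follows, assuming the growth rate is the smoothed relative change in a term's count between two time windows; that definition, the `min_freq` and `min_growth` cutoffs, and the toy data are assumptions for illustration, since the abstract does not give the paper's formula. The hotness evaluation step is not covered here.

```python
from collections import Counter


def word_growth_rate(prev_count, curr_count):
    """Relative change between two windows, with add-one smoothing.

    This definition is an assumption; the abstract does not state the formula.
    """
    return (curr_count - prev_count) / (prev_count + 1)


def discover_hotspots(prev_docs, curr_docs, min_freq=2, min_growth=1.0):
    """Return (term, frequency, growth) for terms passing both cutoffs.

    min_freq and min_growth are hypothetical thresholds.
    """
    prev = Counter(t for doc in prev_docs for t in doc)
    curr = Counter(t for doc in curr_docs for t in doc)
    hotspots = []
    for term, freq in curr.items():
        growth = word_growth_rate(prev[term], freq)
        if freq >= min_freq and growth >= min_growth:
            hotspots.append((term, freq, growth))
    # Fastest-growing terms first; ties broken alphabetically.
    return sorted(hotspots, key=lambda h: (-h[2], h[0]))


# Toy, made-up tokenized documents for two consecutive time windows.
prev_docs = [["5g", "network"], ["cloud", "network"]]
curr_docs = [["5g", "chip"], ["5g", "network"], ["5g", "chip"]]

hotspots = discover_hotspots(prev_docs, curr_docs)
for term, freq, growth in hotspots:
    print(f"{term}: freq={freq}, growth={growth:.2f}")
```

Requiring both cutoffs keeps out terms that are frequent but stable as well as rare terms whose growth rate is inflated by a tiny base count.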