• Construction of Sensitive Thesaurus for Network Rumors: Taking the Microblog Rumors as an Example

    Subjects: Library Science, Information Science >> Library Science; Submitted: 2023-10-08; Cooperative journal: 《知识管理论坛》 (Knowledge Management Forum)

    Abstract: [Purpose/significance] Network rumors seriously disrupt the normal spread of information on the internet. This paper constructs a sensitive lexicon for microblog rumors in order to improve the accuracy of network rumor recognition. [Method/process] Given the short-text characteristics of microblog posts on social networking platforms, the paper focuses on constructing a microblog sensitive thesaurus, built with the LBCP algorithm and multi-level word expansion. First, the method extracts words directly with the LBCP algorithm, which considers the cohesion and polymerization of rumor words. Then, starting from these core words, multiple levels of related words are added to obtain the sensitive thesaurus. [Result/conclusion] In addition to text features, the recognition model exploits user characteristics, propagation characteristics, sentiment analysis, and rumor features based on the sensitive thesaurus. Experimental results show that the sensitive thesaurus greatly improves the accuracy of microblog rumor recognition.
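
    The two steps named in the abstract, scoring candidate words by cohesion and then expanding a core set level by level, might be sketched as follows. The abstract does not define LBCP, so this is not that algorithm: pointwise mutual information stands in for the cohesion measure, document-level co-occurrence stands in for the expansion rule, and the function names, thresholds, and toy posts are all hypothetical.

```python
import math
from collections import Counter

def cohesion_scores(docs):
    """PMI of each adjacent token pair (assumed stand-in for LBCP cohesion)."""
    unigrams, bigrams = Counter(), Counter()
    for tokens in docs:
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))
    n_uni = sum(unigrams.values())
    n_bi = sum(bigrams.values())
    scores = {}
    for (a, b), count in bigrams.items():
        p_ab = count / n_bi
        p_a = unigrams[a] / n_uni
        p_b = unigrams[b] / n_uni
        scores[(a, b)] = math.log(p_ab / (p_a * p_b))
    return scores

def expand(core, docs, levels=2, min_cooccur=2):
    """Grow the lexicon one level at a time (assumed expansion rule).

    A word joins the next level if it appears in at least `min_cooccur`
    documents that also contain a word from the current level.
    """
    lexicon = set(core)
    frontier = set(core)
    for _ in range(levels):
        counts = Counter()
        for tokens in docs:
            present = set(tokens)
            if present & frontier:
                counts.update(present - lexicon)
        frontier = {w for w, c in counts.items() if c >= min_cooccur}
        if not frontier:
            break
        lexicon |= frontier
    return lexicon

# Toy, already-tokenized posts (invented for illustration only).
docs = [
    ["urgent", "forward", "this", "now"],
    ["urgent", "forward", "to", "everyone"],
    ["police", "confirm", "urgent", "forward"],
    ["weather", "is", "nice", "today"],
]
scores = cohesion_scores(docs)
lexicon = expand(["urgent"], docs)
```

    On real microblog text the posts would first need word segmentation, and PMI tends to favor rare pairs, so a frequency floor would likely be needed before trusting the scores.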

  • Sentiment Classification for Micro-Blogs Based on Word Embedding

    Subjects: Library Science, Information Science >> Library Science; Submitted: 2023-08-27; Cooperative journal: 《图书情报工作》 (Library and Information Service)

    Abstract: [Purpose/significance] Weibo has become an important platform for public emotional expression. Sentiment analysis of Weibo posts plays an important role in public opinion analysis, user experience, and business opportunities. [Method/process] The sentiment orientation model proposed in this paper, WE_SDAE, uses word embedding to transform a Weibo post into a dense low-dimensional vector. It turns the simple auto-encoder into a deep denoising auto-encoder by appending a regularization term to the equation and adding noise during data pre-processing. A top-level classifier then performs the final sentiment classification. Because term usage on Weibo is flexible, the model is trained at the character level and at the word level separately. [Result/conclusion] The experimental results show that the character-level model outperforms the word-level model. Comparative experiments further show that WE_SDAE outperforms traditional classifiers such as SVM, Naive Bayes, and XGBoost, and that word-embedding preprocessing outperforms the traditional vector space model representation.
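
    The two modifications the abstract mentions, noise added to the input and a regularization term appended to the loss, can be illustrated with a single-layer sketch. The abstract does not give the WE_SDAE equations, so the choices here are assumptions: masking noise, a sigmoid encoder with tied decoder weights, squared reconstruction error, and an L2 weight penalty. The names `corrupt` and `dae_loss` are hypothetical.

```python
import math
import random

def corrupt(x, drop_prob, rng):
    """Masking noise (assumed): zero each component with probability drop_prob."""
    return [0.0 if rng.random() < drop_prob else v for v in x]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def dae_loss(x, W, b_enc, b_dec, drop_prob=0.3, weight_decay=1e-3, rng=None):
    """Reconstruction error on the clean input plus an L2 penalty on W."""
    rng = rng or random.Random(0)
    x_noisy = corrupt(x, drop_prob, rng)
    # Encode the corrupted input: h = sigmoid(W x_noisy + b_enc)
    h = [sigmoid(sum(w * v for w, v in zip(row, x_noisy)) + b)
         for row, b in zip(W, b_enc)]
    # Decode with tied weights: x_hat = W^T h + b_dec
    x_hat = [sum(W[j][i] * h[j] for j in range(len(W))) + b_dec[i]
             for i in range(len(x))]
    # The target is the clean x, which is what makes the encoder "denoising".
    recon = sum((a - b) ** 2 for a, b in zip(x, x_hat))
    penalty = weight_decay * sum(w * w for row in W for w in row)
    return recon + penalty

# Toy 4-dimensional "embedding" and a 2-unit hidden layer (invented values).
x = [0.5, -0.2, 0.1, 0.9]
W = [[0.1, 0.2, -0.1, 0.0],
     [0.0, -0.3, 0.2, 0.1]]
loss = dae_loss(x, W, b_enc=[0.0, 0.0], b_dec=[0.0, 0.0, 0.0, 0.0])
```

    A deep variant would stack several such layers and feed the top-level code to a classifier; training (gradient updates on `W` and the biases) is omitted here.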