• Construction of Text Knowledge Network Integrating Discourse Structure

    Subjects: Library Science, Information Science >> Information Science | Submitted time: 2023-04-01 | Cooperative journals: 《图书情报工作》

    Abstract: [Purpose/significance] Text vectorization is a necessary preprocessing step in text mining, information retrieval, sentiment analysis, and related fields. Making node vectors carry rich and effective semantic and structural information is a pressing problem. [Method/process] This paper first analyzes the textual characteristics of science and technology policy documents. Following a classification system for concepts and for the relations between concepts, it uses the BiLSTM-CRF algorithm and SVM, respectively, to automatically extract and index concepts and their relations. The model integrates basic features with syntactic and semantic features during feature engineering, which improves recognition accuracy and efficiency. The paper also proposes a concept knowledge network that combines reasoning knowledge, together with a construction method for a knowledge network that further integrates discourse structure. [Result/conclusion] Based on this knowledge network model, the paper implements a network representation learning model that integrates node semantics, topological structure, and category label information. The model can fully exploit and represent the semantic and structural information of text, and visualization and experiments verify the effectiveness of the proposed method.