• Empirical Research on TF-IDF Assisted Indexing Algorithm Based on Users' Natural Annotation

    Subjects: Library Science, Information Science >> Library Science. Submitted: 2023-08-26. Cooperative journal: 《图书情报工作》

    Abstract: [Purpose/significance] This paper studies a TF-IDF assisted indexing algorithm based on users' natural annotation, approached from the users' point of view. [Method/process] First, the keywords and classification numbers in Chinese core journals were taken as the data source, and a user natural annotation vocabulary was constructed by computing keyword frequencies and applying the TF-IDF algorithm. Second, feature words were extracted from scientific and technological project data using the IK Analyzer word segmentation software and the TF-IDF algorithm. Finally, the keywords and classification numbers of the scientific and technological project data were indexed simultaneously. [Result/conclusion] The experiment indicates that the scientific and technological project data account for 68.1% of the total. For these projects, the similarity ratio between machine-indexed and human-indexed keywords is more than 60%, and the first three digits of the machine-indexed and human-indexed classification numbers agree in 83.9% of cases. Adopting the TF-IDF algorithm based on users' natural annotation data is therefore feasible.
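
The TF-IDF weighting and the classification-number comparison mentioned in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation. It assumes the texts are already segmented into tokens (the paper uses IK Analyzer for that step), it uses the common tf × ln(N/df) weighting, which the abstract does not specify, and the function names `tf_idf`, `top_terms`, and `class_prefix_match` are hypothetical.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document.

    Assumption: weight = (count / doc length) * ln(N / document frequency).
    """
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc)
        weights.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


def top_terms(weight, k=3):
    """Pick the k highest-weighted terms as candidate keywords."""
    ranked = sorted(weight.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:k]]


def class_prefix_match(machine, human, n=3):
    """Check whether two classification numbers share their first n characters."""
    return machine[:n] == human[:n]


if __name__ == "__main__":
    # Toy, already-segmented documents (illustrative data only).
    docs = [
        ["index", "index", "library"],
        ["library", "catalog"],
        ["library", "index", "retrieval"],
    ]
    for weight in tf_idf(docs):
        print(top_terms(weight, k=1))
```

A term that occurs in every document gets weight zero under this formula, which is why frequent but uninformative words drop out of the candidate keywords.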