Abstract:
[Objective] To develop a quantitative framework for evaluating intelligence quality and detecting enterprise AI-washing behavior.
[Methods] This study integrates retrieval-augmented generation with a multi-agent game framework, constructing a multi-source knowledge base that combines annual report, patent, and financial data, and implementing parallel auditing, adversarial debate, and reasoning-based adjudication.
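The three-stage agent workflow named above can be pictured with the minimal Python sketch below; all class names, signatures, and the scoring scale are hypothetical illustrations of the described architecture, not the authors' implementation.

```python
# Minimal structural sketch of the auditing -> debate -> adjudication pipeline.
# All names and signatures here are hypothetical, for illustration only.
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class Evidence:
    source: str  # e.g. "annual_report", "patent", "financial_statement"
    text: str    # passage retrieved from the multi-source knowledge base

# An agent is modelled as a callable that reads evidence plus prior arguments
# and returns a textual opinion (in practice, an LLM call with a role prompt).
Agent = Callable[[List[Evidence], List[str]], str]

def run_pipeline(evidence: List[Evidence],
                 auditors: List[Agent],
                 debaters: List[Agent],
                 adjudicate: Callable[[List[str]], float],
                 debate_rounds: int = 2) -> float:
    # Stage 1: parallel auditing -- each auditor reviews the evidence independently.
    arguments = [agent(evidence, []) for agent in auditors]

    # Stage 2: adversarial debate -- debaters challenge or defend prior arguments.
    for _ in range(debate_rounds):
        arguments += [agent(evidence, arguments) for agent in debaters]

    # Stage 3: reasoning-based adjudication -- a judge reasons over all arguments
    # and returns a numeric score (e.g. degree of AI disclosure distortion).
    return adjudicate(arguments)
```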
[Results] Based on 34,952 A-share firm-year observations (2015–2024), the proposed method achieves a quadratic weighted kappa (QWK) of 0.864 (a 13.9% improvement over the multi-agent baseline), reduces RMSE to 0.892, and lowers MAE to 0.685, consistently outperforming TF-IDF and single-model approaches.
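For reference, the three evaluation metrics can be computed as in the sketch below; the gold and predicted scores shown are hypothetical placeholders rather than data from the study, and scikit-learn is assumed only for illustration.

```python
# Illustrative computation of QWK, RMSE, and MAE on hypothetical ordinal scores.
import numpy as np
from sklearn.metrics import cohen_kappa_score

gold = np.array([3, 1, 4, 2, 0, 3])  # hypothetical reference scores
pred = np.array([3, 2, 4, 2, 1, 3])  # hypothetical model scores

qwk = cohen_kappa_score(gold, pred, weights="quadratic")  # quadratic weighted kappa
rmse = float(np.sqrt(np.mean((gold - pred) ** 2)))        # root mean squared error
mae = float(np.mean(np.abs(gold - pred)))                 # mean absolute error
print(f"QWK={qwk:.3f}  RMSE={rmse:.3f}  MAE={mae:.3f}")
```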
[Limitations] The framework depends on the completeness of multi-source data and is sensitive to the quality of patent and financial disclosures; it also incurs relatively high computational costs, and its cross-market generalizability requires further validation.
[Conclusions] The proposed framework enables interpretable and scalable quantification of AI disclosure distortion, providing a novel computational paradigm for intelligence quality evaluation.