Journal of East China Normal University (Natural Science) ›› 2024, Vol. 2024 ›› Issue (2): 131-142. doi: 10.3969/j.issn.1000-5641.2024.02.014

• Computer Science •

An image caption generation algorithm based on decoupling commonsense association

Jiawei LIU, Xin LIN*

  1. School of Computer Science and Technology, East China Normal University, Shanghai 200062, China
  • Received: 2023-03-08 Online: 2024-03-25 Published: 2024-03-18
  • Contact: Xin LIN E-mail: xlin@cs.ecnu.edu.cn

Abstract:

The image caption generation algorithm based on decoupling commonsense association aims to eliminate the interference that commonsense associations between entity types exert on model reasoning, and to improve the fluency and accuracy of the generated descriptions. To address relationship sentences in current image descriptions that conform to common sense but not to the image content, the algorithm first uses a novel training method that directs the relationship detection model's attention to the relationships actually present in the image, improving the accuracy of relationship reasoning. It then applies a relation-aware entity interaction method that performs targeted information interaction between entities that share a relationship, strengthening the relationship information. Experimental results show that the proposed algorithm corrects some commonsense false relationships, generates more accurate image captions, and achieves better results on various evaluation metrics.
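
The abstract does not specify how the relation-aware entity interaction is implemented. As a rough illustration only, one plausible reading is attention restricted to entity pairs that have a detected relationship. The sketch below assumes entities are plain feature vectors and relationships are index pairs; the function name `relation_masked_attention`, the scaled dot-product scoring, and the toy data are all hypothetical and are not taken from the paper.

```python
import math

def relation_masked_attention(features, relations):
    """Hypothetical sketch: each entity aggregates features only from itself
    and from entities it shares a detected relationship with.

    features:  list of n entity vectors (lists of floats, equal length)
    relations: set of (i, j) index pairs marking related entities
    """
    n = len(features)
    dim = len(features[0])
    scale = math.sqrt(dim)
    updated = []
    for i in range(n):
        # Entity i may attend to itself and to any entity it is related to.
        allowed = [j for j in range(n)
                   if j == i or (i, j) in relations or (j, i) in relations]
        # Scaled dot-product scores over the allowed neighbours only.
        scores = [sum(a * b for a, b in zip(features[i], features[j])) / scale
                  for j in allowed]
        # Numerically stable softmax over the scores.
        peak = max(scores)
        exps = [math.exp(s - peak) for s in scores]
        total = sum(exps)
        weights = [e / total for e in exps]
        # Weighted sum of the allowed neighbours' features.
        updated.append([
            sum(w * features[j][k] for w, j in zip(weights, allowed))
            for k in range(dim)
        ])
    return updated

# Toy example (assumed data): three entities, only entities 0 and 1 are related.
entities = [[1.0, 0.0], [0.0, 1.0], [5.0, 5.0]]
detected = {(0, 1)}
result = relation_masked_attention(entities, detected)
```

In this toy run, entity 2 has no detected relationship, so it attends only to itself and its features pass through unchanged, while entities 0 and 1 exchange information with each other. The masking is the point of the sketch: unrelated entities cannot influence one another, which is one way to keep interaction "targeted".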

Key words: image captioning, decoupling commonsense association, attention

CLC Number: