QIU Shuang, ZHAO Yao, WEI Shikui. A Survey of Referring Image Segmentation[J]. JOURNAL OF SIGNAL PROCESSING, 2022, 38(6): 1144-1154. DOI: 10.16798/j.issn.1003-0530.2022.06.002

A Survey of Referring Image Segmentation

Referring image segmentation, an active topic at the intersection of computer vision and natural language processing, aims to segment the target region of an image that a natural language description refers to. As the underlying deep learning techniques have matured and large-scale datasets have emerged, the task has attracted extensive attention from researchers. This paper surveys the development of referring image segmentation. We first review existing methods, including the CNN-LSTM framework, complex modular methods, and graph-based methods, and classify them into two categories according to how they encode and decode multimodal information. We then summarize the mainstream datasets and common evaluation metrics used for the task. In addition, we compare the performance of existing referring image segmentation models through experiments. Finally, we discuss the shortcomings of existing methods and future directions for the field; in particular, complex referring descriptions call for multi-step, explicit reasoning.