HUANG Chen, PEI Jihong, ZHAO Yang. Pedestrian Sequence Attribute Recognition Method with Multi-feature Fusion Combined with Temporal Attention Mechanism[J]. JOURNAL OF SIGNAL PROCESSING, 2022, 38(1): 64-73. DOI: 10.16798/j.issn.1003-0530.2022.01.008

Pedestrian Sequence Attribute Recognition Method with Multi-feature Fusion Combined with Temporal Attention Mechanism

  • Most pedestrian attribute recognition methods operate on a single image. The information in a single image is limited, whereas an image sequence contains richer information, including temporal features, so exploiting sequence information is an important way to improve pedestrian attribute recognition. This paper proposes a multi-feature fusion network for pedestrian sequence attribute recognition based on a temporal attention mechanism. In addition to the common spatial-temporal quadratic average pooling and spatial-temporal mean-maximum pooling feature aggregation branches, the network includes a spatial-temporal attention factor weighted feature aggregation branch to further extract sequence features. Fusing the sequence features of these three branches gives the network richer information. In the attention factor weighted branch, a full-channel spatial-temporal attention factor generation network based on 3D convolution is designed to better capture the spatial-temporal features of a sequence. In addition to the cross-entropy loss, the overall loss function includes a Tversky loss, which constrains the number of false positives (FP) and false negatives (FN), so that the network achieves a better trade-off between precision and recall. Experimental results show that the proposed method outperforms single-image methods and other common feature fusion and time-series modeling methods on every performance metric.
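
The loss described in the abstract, a cross-entropy term plus a Tversky term that penalizes FP and FN, can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the soft (probability-weighted) counts of TP, FP and FN, the weights `alpha` and `beta`, and the mixing coefficient `weight` are all assumptions, since the abstract does not give the exact formulation.

```python
import math


def binary_cross_entropy(probs, labels, eps=1e-7):
    """Mean binary cross-entropy over all attribute predictions."""
    total = 0.0
    for p, y in zip(probs, labels):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(probs)


def tversky_loss(probs, labels, alpha=0.5, beta=0.5, eps=1e-7):
    """1 - Tversky index, computed from soft TP/FP/FN counts.

    alpha weights false positives and beta weights false negatives
    (assumed hyperparameters, not values taken from the paper).
    """
    tp = sum(p * y for p, y in zip(probs, labels))
    fp = sum(p * (1 - y) for p, y in zip(probs, labels))
    fn = sum((1 - p) * y for p, y in zip(probs, labels))
    index = (tp + eps) / (tp + alpha * fp + beta * fn + eps)
    return 1.0 - index


def combined_loss(probs, labels, weight=1.0, alpha=0.5, beta=0.5):
    """Cross-entropy plus a weighted Tversky term (assumed combination)."""
    return binary_cross_entropy(probs, labels) + weight * tversky_loss(
        probs, labels, alpha, beta
    )


# Hypothetical predictions for four binary attributes of one pedestrian.
example_probs = [0.9, 0.2, 0.7, 0.1]
example_labels = [1, 1, 1, 0]
print(combined_loss(example_probs, example_labels, alpha=0.3, beta=0.7))
```

In this sketch, raising `beta` relative to `alpha` makes missed attributes (FN) cost more than spurious ones (FP), which shifts the balance toward recall; the reverse setting favors precision.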
