Underwater Acoustic Target Recognition Network Based on Channel Grouping Attention Mechanism

     

    Abstract: This paper addresses the inadequate utilization of local channel information in traditional target recognition networks by proposing a feature channel grouping attention mechanism, which is combined with a residual convolutional neural network to form an effective feature extraction network. First, the features are split along the channel dimension into multiple sub-features; within each sub-feature, channel importance is assessed and corresponding weights are assigned. Channel rearrangement then regroups the channels into new sub-feature groups and the weighting process is repeated, enabling information exchange across the feature's full set of channels. Next, the average-pooled feature map of each sub-feature serves as its representative for information exchange among sub-features, enhancing and combining the global and local channel information of the features. Finally, to further improve recognition performance, the Low-Frequency Analysis and Recording (LOFAR) spectrum and the Mel spectrum of underwater acoustic target radiated noise are used as inputs to the network model, and a feature fusion network incorporating an autoencoder is constructed to exchange information between the different features, deeply fusing the two time-frequency representations and strengthening their ability to represent the information carried by the signal. Experiments on the ShipsEar dataset show that the proposed attention mechanism improves recognition accuracy by more than 1.38% over commonly used channel attention mechanisms, and that fusing the two features improves accuracy by 6.17% and 1.2% over using the LOFAR and Mel spectra alone, respectively.
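The three steps described in the abstract (per-group channel weighting, channel rearrangement with repeated weighting, and inter-group exchange via average-pooled representatives) can be sketched as follows. This is a minimal NumPy illustration, not the paper's trained network: the learned attention gates are replaced by a plain sigmoid of pooled values, and the softmax exchange between sub-feature representatives is an assumed stand-in for the paper's fusion of group-level information.

```python
import numpy as np

def channel_group_attention(x, groups=4):
    """Sketch of channel grouping attention on a single feature map.

    x      : (C, H, W) feature map, C divisible by `groups`
    returns: (C, H, W) reweighted feature map
    """
    C, H, W = x.shape
    g = C // groups
    sub = x.reshape(groups, g, H, W)

    # Step 1: within each sub-feature, weight channels by importance.
    # Global average pooling + sigmoid stands in for the learned gate.
    pooled = sub.mean(axis=(2, 3))                    # (groups, g)
    weights = 1.0 / (1.0 + np.exp(-pooled))           # sigmoid gate
    sub = sub * weights[:, :, None, None]

    # Step 2: channel rearrangement (shuffle) forms new sub-feature
    # groups, then the weighting is repeated so information flows
    # across the feature's full set of channels.
    shuffled = sub.transpose(1, 0, 2, 3).reshape(groups, g, H, W)
    pooled2 = shuffled.mean(axis=(2, 3))
    weights2 = 1.0 / (1.0 + np.exp(-pooled2))
    shuffled = shuffled * weights2[:, :, None, None]

    # Step 3: each sub-feature's average-pooled value acts as its
    # representative; a softmax over representatives exchanges
    # information between sub-features (assumed form of the exchange).
    reps = shuffled.mean(axis=(1, 2, 3))              # (groups,)
    e = np.exp(reps - reps.max())
    group_w = e / e.sum()
    out = shuffled * group_w[:, None, None, None] * groups  # keep scale
    return out.reshape(C, H, W)
```

In a real network each gate would be a small learned bottleneck (as in squeeze-and-excitation blocks) rather than a parameter-free sigmoid, but the grouping, shuffle, and representative-exchange structure is the same.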

     
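For readers unfamiliar with the LOFAR input feature, one common construction is a windowed short-time spectrum of the radiated noise with per-frequency-bin mean removal to flatten the background. The sketch below assumes a Hann window and dB magnitude; the exact window, frame length, and normalization used in the paper are not specified here. (The Mel spectrum additionally passes the magnitudes through a Mel filterbank, omitted for brevity.)

```python
import numpy as np

def lofar_spectrogram(signal, fs, n_fft=1024, hop=512):
    """LOFAR-style time-frequency map of a 1-D signal.

    Returns (freqs, spec) where spec has shape (n_frames, n_fft//2 + 1)
    and each frequency bin is mean-normalized over time.
    """
    win = np.hanning(n_fft)
    n_frames = 1 + (len(signal) - n_fft) // hop
    frames = np.stack(
        [signal[i * hop : i * hop + n_fft] * win for i in range(n_frames)]
    )
    spec = np.abs(np.fft.rfft(frames, axis=1))        # magnitude spectrum
    spec_db = 20.0 * np.log10(spec + 1e-10)           # to decibels
    spec_db -= spec_db.mean(axis=0, keepdims=True)    # per-bin mean removal
    freqs = np.fft.rfftfreq(n_fft, d=1.0 / fs)
    return freqs, spec_db
```

Tonal components of ship-radiated noise appear as horizontal lines in such a map, which is what makes LOFAR complementary to the perceptually scaled Mel spectrum when both are fused.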

