Fabric Retrieval Algorithm Using SIFT and VLAD Feature Coding

Zhu Jianqing; Lin Luxin; Shen Fei; Zeng Huanqiang; Cai Canhui; Zheng Lixin

doi:10.16798/j.issn.1003-0530.2019.10.013

Abstract

This paper proposes a fabric retrieval algorithm that uses Scale-Invariant Feature Transform (SIFT) features with Vector of Locally Aggregated Descriptors (VLAD) encoding. First, SIFT features are extracted to represent each image. However, different images generally yield different numbers of SIFT feature points, so their feature representations differ in dimension and the similarity between two images cannot be computed directly. To address this, the SIFT features of each image are further encoded with VLAD, which gives every image a feature vector of the same dimension and also improves the representation ability of the SIFT features. VLAD encoding has two steps: a visual dictionary is first learned with the K-means clustering algorithm, and the feature vectors are then locally aggregated. Local aggregation consists of three sub-steps: 1) computing the residual between each SIFT feature vector in the image and its corresponding visual word; 2) summing the residuals assigned to each visual word; 3) concatenating the per-word residual sums to obtain the VLAD code of the image. Performance is measured with the Cumulative Match Characteristic (CMC) curve averaged over ten runs. The experimental results show that the proposed method improves retrieval speed and achieves a high identification rate, with an average rank-1 identification rate of 95.03%.
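The two VLAD steps described in the abstract (dictionary learning with K-means, then residual aggregation) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: random 128-dimensional vectors stand in for real SIFT descriptors, the dictionary size `k=8` is arbitrary, and the helper names `kmeans` and `vlad_encode` are hypothetical. The abstract does not mention any normalization of the code, so none is applied here.

```python
import numpy as np

def kmeans(descriptors, k, iters=20, seed=0):
    """Plain Lloyd's K-means; returns a (k, d) array of visual words."""
    rng = np.random.default_rng(seed)
    idx = rng.choice(len(descriptors), size=k, replace=False)
    centers = descriptors[idx].copy()
    for _ in range(iters):
        # Squared distance from every descriptor to every center.
        d2 = ((descriptors[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        for j in range(k):
            members = descriptors[labels == j]
            if len(members) > 0:  # keep the old center if a cluster empties
                centers[j] = members.mean(axis=0)
    return centers

def vlad_encode(descriptors, centers):
    """Encode an (n, d) descriptor set as a fixed-length (k * d,) vector."""
    k, d = centers.shape
    d2 = ((descriptors[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    nearest = d2.argmin(axis=1)
    # Sub-step 1: residual between each descriptor and its visual word.
    residuals = descriptors - centers[nearest]
    # Sub-step 2: sum the residuals assigned to each visual word.
    code = np.zeros((k, d))
    np.add.at(code, nearest, residuals)
    # Sub-step 3: concatenate the per-word sums into one vector.
    return code.reshape(-1)

# Assumption: random vectors stand in for 128-D SIFT descriptors.
rng = np.random.default_rng(0)
centers = kmeans(rng.random((500, 128)), k=8)
img_a = rng.random((40, 128))   # an image with 40 feature points
img_b = rng.random((75, 128))   # an image with 75 feature points
code_a = vlad_encode(img_a, centers)
code_b = vlad_encode(img_b, centers)
print(code_a.shape, code_b.shape)
```

Although the two stand-in images have 40 and 75 descriptors, both codes have length 8 × 128 = 1024, so they can be compared directly with any vector distance. This is the dimension-consistency property the abstract attributes to VLAD encoding.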
