Visual Anomaly Detection in Industrial Systems: A Survey on Architectures, Modalities, and Learning Paradigms
Abstract: Anomaly detection is a key technology for the reliability of industrial systems and plays an indispensable role in modern manufacturing. As production systems demand ever higher precision, automation, and generalization, the limitations of traditional methods have become increasingly apparent. In recent years, rapid progress in deep learning and computer vision has driven advances in industrial anomaly detection along several dimensions. In terms of data, detection targets have expanded from single 2D texture images to 3D point clouds rich in geometric information. In terms of architecture, model designs combine the long-range dependency modeling of Transformers with the local perception of Convolutional Neural Networks (CNNs). In terms of information use, research is progressively shifting toward the deep fusion of multi-source, heterogeneous modalities. The core challenge of the field is that positive and negative samples in industrial scenarios are extremely imbalanced and anomaly morphologies are hard to predict. In response, mainstream methods have gradually shifted from data-intensive fully supervised learning to more flexible and efficient mechanisms such as unsupervised reconstruction, feature embedding, or weakly supervised learning.
Despite the rapid growth of related research, the existing literature remains fragmented and lacks both a systematic theoretical synthesis and an in-depth analysis from a unified perspective covering 2D and 3D modalities. This survey aims to fill that gap by comprehensively reviewing state-of-the-art detection algorithms. We trace the evolution of the technology along several dimensions, including the characteristics of dataset construction, the applicability of evaluation metrics, and the engineering use of mainstream open-source frameworks. We also discuss open challenges such as logical anomaly recognition and lightweight model deployment, as well as the future trend of large-model empowerment. We hope that this comprehensive, systematic, and forward-looking survey will serve as a useful reference for both theoretical innovation and engineering practice in industrial anomaly detection.
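To make the embedding-based paradigm mentioned in the abstract concrete, the following is a minimal sketch, not a method taken from the survey. It assumes that each sample has already been reduced to a feature vector by some pretrained extractor, stores only features of defect-free samples, and scores a test sample by its distance to the nearest stored feature. The names `nearest_distance`, `fit_threshold`, `is_anomalous`, the `margin` parameter, and the toy 2D features are all hypothetical and chosen for illustration.

```python
import math


def nearest_distance(feature, bank):
    """Euclidean distance from `feature` to its nearest neighbour in `bank`."""
    return min(math.dist(feature, ref) for ref in bank)


def fit_threshold(bank, margin=1.5):
    """Derive a decision threshold from normal samples only.

    Each normal feature is scored against the remaining ones (leave-one-out);
    the largest such score, scaled by `margin`, becomes the threshold.
    `margin` is an illustrative assumption, not a recommended value.
    """
    leave_one_out = [
        nearest_distance(f, bank[:i] + bank[i + 1:])
        for i, f in enumerate(bank)
    ]
    return margin * max(leave_one_out)


def is_anomalous(feature, bank, threshold):
    """Flag a sample whose nearest normal neighbour is farther than `threshold`."""
    return nearest_distance(feature, bank) > threshold


# Toy "memory bank" built from defect-free samples; no anomalous data is used.
normal_bank = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1)]
threshold = fit_threshold(normal_bank)

print(is_anomalous((0.05, 0.05), normal_bank, threshold))  # close to normal data
print(is_anomalous((1.0, 1.0), normal_bank, threshold))    # far from normal data
```

Because only normal samples are needed to build the bank and calibrate the threshold, a scheme of this kind sidesteps the class imbalance described above; practical systems would replace the toy vectors with deep features and typically use a more careful calibration.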