Multimodal Industrial Anomaly Detection

 

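Data modalities

The data modalities used by current multimodal detection algorithms are mainly (arguably only) RGB images, single-view point clouds (depth maps), and text.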

 

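Datasets

Commonly used datasets include MVTec 3D-AD, EyeCandies, and Real3D-AD.

| Dataset | Categories | Size | Modalities | Source | Application scenario | Official link |
| --- | --- | --- | --- | --- | --- | --- |
| MVTec 3D-AD | 10 | ~4K | RGB-D images, point clouds | Real data captured in industrial settings | Surface defect detection on industrial parts (e.g. metal and plastic products) | https://www.mvtec.com/company/research/datasets/mvtec-3d-ad |
| EyeCandies | 7 | ~10K | RGB images, point clouds | Highly realistic synthetic data simulating complex environments | Robustness testing of algorithms, with anomalies generated under complex backgrounds and lighting | https://github.com/emmanuelbranlard/eye-candies |
| Real3D-AD | 10 | ~1.5K | RGB-D, point clouds, multi-view images | Real-scene capture covering diverse lighting and viewing angles | Anomaly detection on everyday objects (e.g. electronic devices, furniture) | https://github.com/Real3DAD/Real3D-AD |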

 

Detection algorithms (RGB + Point Cloud)

(Background) PatchCore

Paper: Towards Total Recall in Industrial Anomaly Detection (CVPR 2022) [Note: a large share of the later work follows the PatchCore paradigm]

Key idea: a maximally representative memory bank of nominal patch features.
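To make the memory-bank idea concrete, here is a minimal NumPy sketch under stated assumptions: patch features are random vectors standing in for features from a pretrained CNN, the memory bank is subsampled with greedy k-center (coreset) selection, and a patch is scored by its distance to the nearest memory entry. The function names are hypothetical, and details of the actual method such as random projection before subsampling and score reweighting are left out.

```python
import numpy as np


def greedy_coreset(features: np.ndarray, n_select: int, seed: int = 0) -> np.ndarray:
    """Return indices of a greedy k-center subset of `features` (shape [N, D])."""
    rng = np.random.default_rng(seed)
    first = int(rng.integers(len(features)))
    selected = [first]
    # Distance from every feature to its nearest already-selected feature.
    min_dist = np.linalg.norm(features - features[first], axis=1)
    for _ in range(n_select - 1):
        idx = int(np.argmax(min_dist))  # farthest point from the current subset
        selected.append(idx)
        dist = np.linalg.norm(features - features[idx], axis=1)
        min_dist = np.minimum(min_dist, dist)
    return np.array(selected)


def patch_scores(memory_bank: np.ndarray, patches: np.ndarray) -> np.ndarray:
    """Anomaly score per patch: distance to the nearest memory-bank entry."""
    diff = patches[:, None, :] - memory_bank[None, :, :]
    return np.linalg.norm(diff, axis=2).min(axis=1)


def image_score(memory_bank: np.ndarray, patches: np.ndarray) -> float:
    """Image-level score: the largest patch score in the image."""
    return float(patch_scores(memory_bank, patches).max())


# Toy data (assumption): 500 nominal patch features of dimension 8.
rng = np.random.default_rng(0)
nominal = rng.normal(size=(500, 8))
memory_bank = nominal[greedy_coreset(nominal, n_select=50)]

normal_image = rng.normal(size=(20, 8))
defect_image = normal_image.copy()
defect_image[3] += 10.0  # push one patch far away from the nominal distribution
```

In this toy setup the shifted patch receives the highest patch score, so `image_score` separates the two images even though the bank keeps only a tenth of the nominal features.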

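Memory bank mechanism

MemoryBank is built on a memory store with retrieval and update mechanisms, which lets it summarize past events. It evolves through continual memory updates, builds understanding over time by synthesizing earlier information, and forgets or reinforces memories according to the elapsed time and the relative importance of each memory. Each time a query arrives, the historical conversation records are traversed, and the memory strength s of the content matched by the current query (the parameter that governs its forgetting/retention rate) is incremented by 1.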
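The forget-and-reinforce rule described above can be sketched as follows. This is an illustration only: it assumes the exponential retention form R = exp(-t / s) commonly associated with the Ebbinghaus forgetting curve, resets the elapsed time on recall, and uses plain substring matching as a hypothetical stand-in for real retrieval. The class and function names are made up for this example.

```python
import math
from dataclasses import dataclass


@dataclass
class MemoryItem:
    content: str
    strength: int = 1     # s: incremented every time the item is recalled
    elapsed: float = 0.0  # t: time since the item was last recalled

    def retention(self) -> float:
        """Assumed retention rate R = exp(-t / s)."""
        return math.exp(-self.elapsed / self.strength)

    def tick(self, dt: float) -> None:
        """Let `dt` units of time pass without a recall."""
        self.elapsed += dt

    def recall(self) -> None:
        """Reinforce the memory: s + 1, and restart the clock (assumption)."""
        self.strength += 1
        self.elapsed = 0.0


def query(bank: list, keyword: str) -> list:
    """Traverse the whole history; reinforce every item the query matches."""
    hits = []
    for item in bank:
        if keyword in item.content:
            item.recall()
            hits.append(item)
    return hits
```

With this rule, an item that is recalled often decays more slowly than one that is never matched, because a larger s flattens the exponential.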

Reference links:

MemoryBank: Enhancing Large Language Models with Long-Term Memory (CSDN blog)

Is there a mathematical model for the Ebbinghaus forgetting curve? (Zhihu, zhihu.com)


Back to the Feature (BTF, a.k.a. 3D-ADS)

Paper: Back to the feature: classical 3d features are (almost) all you need for 3d anomaly detection (CVPRW 2023)

Core idea: CNN (feature extraction from the RGB image) + FPFH (feature extraction from the point cloud / depth data)
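A rough sketch of how two per-patch descriptors can be combined in a single memory bank is given below. It is not the paper's code: the RGB and geometric features are random placeholders (16 and 33 dimensions, the latter matching the usual FPFH length), the per-modality standardization is an assumption made so that neither modality dominates the distance, and `FusedMemory` is a hypothetical name.

```python
import numpy as np


class FusedMemory:
    """Memory bank over concatenated per-patch RGB and geometric features."""

    def fit(self, rgb: np.ndarray, geo: np.ndarray) -> "FusedMemory":
        # Per-modality statistics, estimated on nominal (defect-free) patches.
        self.stats = [(m.mean(axis=0), m.std(axis=0) + 1e-8) for m in (rgb, geo)]
        self.bank = self._fuse(rgb, geo)
        return self

    def _fuse(self, rgb: np.ndarray, geo: np.ndarray) -> np.ndarray:
        # Standardize each modality, then concatenate along the feature axis.
        parts = [(m - mu) / sd for m, (mu, sd) in zip((rgb, geo), self.stats)]
        return np.concatenate(parts, axis=1)

    def score(self, rgb: np.ndarray, geo: np.ndarray) -> np.ndarray:
        """Distance from each fused test patch to its nearest nominal patch."""
        q = self._fuse(rgb, geo)
        d = np.linalg.norm(q[:, None, :] - self.bank[None, :, :], axis=2)
        return d.min(axis=1)


# Placeholder features (assumption): 200 nominal patches.
rng = np.random.default_rng(0)
rgb_train = rng.normal(size=(200, 16))  # stand-in for CNN patch features
geo_train = rng.normal(size=(200, 33))  # stand-in for FPFH descriptors
memory = FusedMemory().fit(rgb_train, geo_train)
```

Because the two descriptors share one fused vector, a patch whose colour looks nominal but whose geometry is off still ends up far from every memory entry.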

Shape-Guided Dual-Memory Learning

Paper: Shape-Guided Dual-Memory Learning for 3D Anomaly Detection (ICML 2023)


Multi-3D-Memory (M3DM)

Paper: Multimodal Industrial Anomaly Detection via Hybrid Fusion (CVPR 2023)

Reproduction: M3DM reproduction notes | “干杯( ゚-゚)っロ” (svyj.github.io)


Complementary Pseudo Multimodal Feature (CPMF)

Paper: Complementary pseudo multimodal feature for point cloud anomaly detection (PR 2024)


Multi-modality Reconstruction Network (EasyNet)

Paper: EasyNet: An Easy Network for 3D Industrial Anomaly Detection (ACMMM 2023)


Crossmodal Feature Mapping (CFM)

Paper: Multimodal Industrial Anomaly Detection by Crossmodal Feature Mapping (CVPR 2024)


3D Dual Subspace Reprojection (3DSR)

Paper: Cheating Depth: Enhancing 3D Surface Anomaly Detection via Depth Simulation (WACV 2024)


Multi-Modal Reverse Distillation (MMRD)

Paper: Rethinking Reverse Distillation for Multi-Modal Anomaly Detection (AAAI 2024)


Dual-modality Anomaly Synthesis (DAS3D)

Paper: DAS3D: Dual-modality Anomaly Synthesis for 3D Anomaly Detection (arXiv 2024)


Detection algorithms (Text + RGB + Point Cloud)

Noisy-Resistant Multi-3D-Memory (M3DM-NR)

Paper: M3DM-NR: RGB-3D Noisy-Resistant Industrial Anomaly Detection via Multimodal Denoising (arXiv 2024)


Current results

Image-AUROC

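| Method | Publication | Bagel | Cable Gland | Carrot | Cookie | Dowel | Foam | Peach | Potato | Rope | Tire | Mean |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| DepthGAN [1] | VISIGRAPP'22 | 0.538 | 0.372 | 0.580 | 0.603 | 0.430 | 0.534 | 0.642 | 0.601 | 0.443 | 0.577 | 0.532 |
| DepthAE [1] | VISIGRAPP'22 | 0.648 | 0.502 | 0.650 | 0.488 | 0.805 | 0.522 | 0.712 | 0.529 | 0.540 | 0.552 | 0.595 |
| DepthVM [1] | VISIGRAPP'22 | 0.513 | 0.551 | 0.477 | 0.581 | 0.617 | 0.716 | 0.450 | 0.421 | 0.598 | 0.623 | 0.555 |
| VoxelGAN [1] | VISIGRAPP'22 | 0.680 | 0.324 | 0.565 | 0.399 | 0.497 | 0.482 | 0.566 | 0.579 | 0.601 | 0.482 | 0.518 |
| VoxelAE [1] | VISIGRAPP'22 | 0.510 | 0.540 | 0.384 | 0.693 | 0.446 | 0.632 | 0.550 | 0.494 | 0.721 | 0.413 | 0.538 |
| VoxelVM [1] | VISIGRAPP'22 | 0.553 | 0.772 | 0.484 | 0.701 | 0.751 | 0.578 | 0.480 | 0.466 | 0.689 | 0.611 | 0.609 |
| 3D-ST [2] | WACV'23 | 0.950 | 0.483 | 0.986 | 0.921 | 0.905 | 0.632 | 0.945 | 0.988 | 0.976 | 0.542 | 0.833 |
| BTF [3] | CVPR'23 | 0.918 | 0.748 | 0.967 | 0.883 | 0.932 | 0.582 | 0.896 | 0.912 | 0.921 | 0.886 | 0.865 |
| EasyNet [4] | MM'23 | 0.991 | 0.998 | 0.918 | 0.968 | 0.945 | 0.945 | 0.905 | 0.807 | 0.994 | 0.793 | 0.926 |
| AST [5] | WACV'23 | 0.983 | 0.873 | 0.976 | 0.971 | 0.932 | 0.885 | 0.974 | 0.981 | 1.000 | 0.797 | 0.937 |
| CMDIAD [6] | arXiv'24 | 0.992 | 0.893 | 0.977 | 0.960 | 0.953 | 0.883 | 0.950 | 0.937 | 0.943 | 0.893 | 0.938 |
| M3DM [7] | CVPR'23 | 0.994 | 0.909 | 0.972 | 0.976 | 0.960 | 0.942 | 0.973 | 0.899 | 0.972 | 0.850 | 0.945 |
| M3DM-NR [8] | arXiv'24 | 0.993 | 0.911 | 0.977 | 0.976 | 0.960 | 0.922 | 0.973 | 0.899 | 0.955 | 0.882 | 0.945 |
| Shape-Guided [9] | ICML'23 | 0.986 | 0.894 | 0.983 | 0.991 | 0.976 | 0.857 | 0.990 | 0.965 | 0.960 | 0.869 | 0.947 |
| MMRD [10] | AAAI'24 | 0.999 | 0.943 | 0.964 | 0.943 | 0.992 | 0.912 | 0.949 | 0.901 | 0.994 | 0.901 | 0.950 |
| CPMF [11] | PR'24 | 0.983 | 0.889 | 0.989 | 0.991 | 0.958 | 0.802 | 0.988 | 0.959 | 0.979 | 0.969 | 0.951 |
| CFM [12] | CVPR'24 | 0.994 | 0.888 | 0.984 | 0.993 | 0.980 | 0.888 | 0.941 | 0.943 | 0.980 | 0.953 | 0.954 |
| LSFA [13] | arXiv'24 | 1.000 | 0.939 | 0.982 | 0.989 | 0.961 | 0.951 | 0.983 | 0.962 | 0.989 | 0.951 | 0.971 |
| 3DSR [14] | WACV'24 | 0.981 | 0.867 | 0.996 | 0.981 | 1.000 | 0.994 | 0.986 | 0.978 | 1.000 | 0.995 | 0.978 |
| DAS3D [15] | arXiv'24 | 0.997 | 0.973 | 0.999 | 0.992 | 0.970 | 0.995 | 0.962 | 0.954 | 0.998 | 0.977 | 0.982 |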

Pixel-AUROC

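| Method | Publication | Bagel | Cable Gland | Carrot | Cookie | Dowel | Foam | Peach | Potato | Rope | Tire | Mean |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| AST [5] | WACV'23 | - | - | - | - | - | - | - | - | - | - | 0.976 |
| CMDIAD [6] | arXiv'24 | 0.995 | 0.993 | 0.996 | 0.976 | 0.984 | 0.988 | 0.996 | 0.995 | 0.997 | 0.996 | 0.992 |
| BTF [3] | CVPR'23 | - | - | - | - | - | - | - | - | - | - | 0.992 |
| M3DM [7] | CVPR'23 | 0.995 | 0.993 | 0.997 | 0.979 | 0.985 | 0.989 | 0.996 | 0.994 | 0.997 | 0.996 | 0.992 |
| M3DM-NR [8] | arXiv'24 | 0.996 | 0.993 | 0.997 | 0.979 | 0.985 | 0.989 | 0.996 | 0.995 | 0.997 | 0.996 | 0.992 |
| CFM [12] | CVPR'24 | - | - | - | - | - | - | - | - | - | - | 0.993 |
| DAS3D [15] | arXiv'24 | - | - | - | - | - | - | - | - | - | - | 0.993 |
| 3DSR [14] | WACV'24 | - | - | - | - | - | - | - | - | - | - | 0.995 |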
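For readers unfamiliar with the two metrics, the sketch below shows one way they can be computed from ground-truth masks and predicted anomaly maps. It is a toy illustration: AUROC is evaluated as the fraction of correctly ordered (anomalous, normal) pairs, which is quadratic in the number of samples, and taking the maximum of the anomaly map as the image score is an assumption rather than what every listed method does. For real evaluations a library routine such as scikit-learn's `roc_auc_score` is the practical choice.

```python
import numpy as np


def auroc(labels, scores) -> float:
    """AUROC as the probability that an anomalous sample outscores a normal one."""
    labels = np.asarray(labels).astype(bool).ravel()
    scores = np.asarray(scores, dtype=float).ravel()
    pos, neg = scores[labels], scores[~labels]
    greater = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()  # ties count as half
    return float((greater + 0.5 * ties) / (len(pos) * len(neg)))


def pixel_auroc(masks: np.ndarray, anomaly_maps: np.ndarray) -> float:
    """Pixel-level AUROC: every pixel of every image is one sample."""
    return auroc(masks, anomaly_maps)


def image_auroc(masks: np.ndarray, anomaly_maps: np.ndarray) -> float:
    """Image-level AUROC: one label and one score per image."""
    n = len(masks)
    image_labels = masks.reshape(n, -1).max(axis=1)         # any defective pixel
    image_scores = anomaly_maps.reshape(n, -1).max(axis=1)  # assumed: map maximum
    return auroc(image_labels, image_scores)


# Two 2x2 toy images: the first has one defective pixel, the second is clean.
masks = np.array([[[0, 0], [0, 1]], [[0, 0], [0, 0]]])
maps = np.array([[[0.1, 0.2], [0.1, 0.9]], [[0.3, 0.1], [0.2, 0.1]]])
```

The Mean column in both tables is then simply the average of the per-category values.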

References

[1] Bergmann P, Jin X, Sattlegger D, et al. The mvtec 3d-ad dataset for unsupervised 3d anomaly detection and localization[J]. arXiv preprint arXiv:2112.09045, 2021.

[2] Bergmann P, Sattlegger D. Anomaly detection in 3d point clouds using deep geometric descriptors[C]//Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision. 2023: 2613-2623.

[3] Horwitz E, Hoshen Y. Back to the feature: classical 3d features are (almost) all you need for 3d anomaly detection[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops. 2023: 2968-2977.

[4] Chen R, Xie G, Liu J, et al. Easynet: An easy network for 3d industrial anomaly detection[C]//Proceedings of the 31st ACM International Conference on Multimedia. 2023: 7038-7046.

[5] Rudolph M, Wehrbein T, Rosenhahn B, et al. Asymmetric student-teacher networks for industrial anomaly detection[C]//Proceedings of the IEEE/CVF winter conference on applications of computer vision. 2023: 2592-2602.

[6] Sui W, Lichau D, Lefèvre J, et al. Incomplete Multimodal Industrial Anomaly Detection via Cross-Modal Distillation[J]. arXiv preprint arXiv:2405.13571, 2024.

[7] Wang Y, Peng J, Zhang J, et al. Multimodal industrial anomaly detection via hybrid fusion[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2023: 8032-8041.

[8] Wang C, Zhu H, Peng J, et al. M3DM-NR: RGB-3D Noisy-Resistant Industrial Anomaly Detection via Multimodal Denoising[J]. arXiv preprint arXiv:2406.02263, 2024.

[9] Chu Y M, Liu C, Hsieh T I, et al. Shape-Guided Dual-Memory Learning for 3D Anomaly Detection[C]//International Conference on Machine Learning. PMLR, 2023: 6185-6194.

[10] Gu Z, Zhang J, Liu L, et al. Rethinking Reverse Distillation for Multi-Modal Anomaly Detection[C]//Proceedings of the AAAI Conference on Artificial Intelligence. 2024, 38(8): 8445-8453.

[11] Cao Y, Xu X, Shen W. Complementary pseudo multimodal feature for point cloud anomaly detection[J]. Pattern Recognition, 2024, 156: 110761.

[12] Costanzino A, Ramirez P Z, Lisanti G, et al. Multimodal industrial anomaly detection by crossmodal feature mapping[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2024: 17234-17243.

[13] Tu Y, Zhang B, Liu L, et al. Self-supervised Feature Adaptation for 3D Industrial Anomaly Detection[J]. arXiv preprint arXiv:2401.03145, 2024.

[14] Zavrtanik V, Kristan M, Skočaj D. Cheating depth: Enhancing 3d surface anomaly detection via depth simulation[C]//Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision. 2024: 2164-2172.

[15] Li K, Dai B, Fu J, et al. DAS3D: Dual-modality Anomaly Synthesis for 3D Anomaly Detection[J]. arXiv preprint arXiv:2410.09821, 2024.