2nd Place Winning Solution for the CVPR2023 Visual Anomaly and Novelty Detection Challenge: Multimodal Prompting for Data-centric Anomaly Detection
This technical report introduces the winning solution of the team Segment Any Anomaly for the CVPR2023 Visual Anomaly and Novelty Detection (VAND) challenge. Going beyond uni-modal prompt, e.g., language prompt, we present a novel framework, i.e., Segment Any Anomaly + (SAA$+$), for zero-shot anomal...
Saved in:
Main Authors | , , , , , |
---|---|
Format | Journal Article |
Language | English |
Published |
15.06.2023
|
Subjects | |
Online Access | Get full text |
Cover
Loading…
Summary: | This technical report introduces the winning solution of the team Segment Any
Anomaly for the CVPR2023 Visual Anomaly and Novelty Detection (VAND) challenge.
Going beyond uni-modal prompt, e.g., language prompt, we present a novel
framework, i.e., Segment Any Anomaly + (SAA$+$), for zero-shot anomaly
segmentation with multi-modal prompts for the regularization of cascaded modern
foundation models. Inspired by the great zero-shot generalization ability of
foundation models like Segment Anything, we first explore their assembly (SAA)
to leverage diverse multi-modal prior knowledge for anomaly localization.
Subsequently, we further introduce multimodal prompts (SAA$+$) derived from
domain expert knowledge and target image context to enable the non-parameter
adaptation of foundation models to anomaly segmentation. The proposed SAA$+$
model achieves state-of-the-art performance on several anomaly segmentation
benchmarks, including VisA and MVTec-AD, in the zero-shot setting. We will
release the code of our winning solution for the CVPR2023 VAN. |
---|---|
DOI: | 10.48550/arxiv.2306.09067 |