Fully Unsupervised Anomaly Detection in Industrial Images with Unknown Data Contamination

Bibliographic Details
Published in Swiss Conference on Data Science (Online), pp. 40-47
Main Authors Wuest, Matthias; Huber, Lilach Goren
Format Conference Proceeding
Language English
Published IEEE 26.06.2025
Abstract AI algorithms for the automatic detection of unusual or abnormal patterns in image data have become increasingly important in industrial quality inspection, improving product quality and operational efficiency. Most state-of-the-art Image Anomaly Detection (IAD) methods are unsupervised: they learn normal patterns from anomaly-free training data. In real-world applications, however, the assumption of anomaly-free training data is often unrealistic, because labeling anomalies in historical data can be expensive, error-prone, or even impossible. Anomalies that contaminate the training data typically degrade anomaly detection (AD) performance at deployment, yet this issue remains largely overlooked in research. Some studies have attempted to mitigate the problem through data refinement methods, but these approaches often require prior knowledge of the anomaly ratio (AR) in the training data, which is rarely available in practice. In this paper, we introduce Overlapping Subsets Data Refinement (OSDR), a simple, fully unsupervised, and model-agnostic refinement framework that addresses IAD under data contamination with no prior assumptions about the AR. OSDR assigns a refinement score to each training sample using an ensemble of models trained on partially overlapping data subsets, then removes anomalies robustly through an adaptive thresholding technique. Evaluations on two widely used industrial image datasets demonstrate that OSDR effectively recovers the performance lost to contamination and outperforms existing refinement frameworks. Our approach provides a flexible, practical, and easy-to-deploy solution for IAD in real-world settings, where data contamination or mislabeling is often inevitable and the anomaly ratio is unknown.
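The refinement idea in the abstract (score every training sample with an ensemble trained on partially overlapping subsets, then drop samples above an adaptive threshold) can be illustrated with a toy sketch. This is not the authors' implementation. The abstract does not say how subsets are built, which detector is used, or how the threshold is chosen, so everything below is an assumption made for illustration: the fold-pairing scheme in `make_overlapping_subsets`, the median/MAD stand-in detector `fit_toy_model`, averaging scores over all models, and the median + 3 scaled-MAD cutoff in `adaptive_threshold`. The data are 1-D numbers rather than images.

```python
import random
import statistics


def make_overlapping_subsets(n_samples, n_folds=4, seed=0):
    """Hypothetical scheme: shuffle indices into folds; subset k = fold k + fold k+1 (wrapping)."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    folds = [idx[k::n_folds] for k in range(n_folds)]
    return [folds[k] + folds[(k + 1) % n_folds] for k in range(n_folds)]


def fit_toy_model(values):
    """Stand-in for a real anomaly detector: stores the median and MAD of its subset."""
    med = statistics.median(values)
    mad = statistics.median([abs(v - med) for v in values])
    return med, max(mad, 1e-9)  # guard against a zero MAD


def refinement_scores(data, subsets):
    """Score each sample by its mean robust deviation over all ensemble members (assumed rule)."""
    models = [fit_toy_model([data[i] for i in s]) for s in subsets]
    return [statistics.mean([abs(x - med) / mad for med, mad in models]) for x in data]


def adaptive_threshold(scores, k=3.0):
    """Assumed cutoff: median + k * scaled MAD of the scores, so no anomaly ratio is needed."""
    med = statistics.median(scores)
    mad = statistics.median([abs(s - med) for s in scores])
    return med + k * 1.4826 * mad


def refine(data, n_folds=4, seed=0):
    """Return (kept, removed) index lists after one refinement pass."""
    subsets = make_overlapping_subsets(len(data), n_folds, seed)
    scores = refinement_scores(data, subsets)
    thr = adaptive_threshold(scores)
    kept = [i for i, s in enumerate(scores) if s <= thr]
    removed = [i for i, s in enumerate(scores) if s > thr]
    return kept, removed


# Toy contaminated training set: 40 "normal" values in [-1, 1] plus 4 outliers.
normal = [((i * 37) % 41 - 20) / 20 for i in range(40)]
data = normal + [25.0, 30.0, 35.0, 40.0]
kept, removed = refine(data)
print("removed indices:", removed)
```

In this sketch the threshold is derived from the score distribution itself, which mirrors the abstract's point that no prior anomaly ratio is required. A real pipeline would swap the toy model for an image anomaly detector and retrain it on the kept samples.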
Author Huber, Lilach Goren
Wuest, Matthias
Author_xml – sequence: 1
  givenname: Matthias
  surname: Wuest
  fullname: Wuest, Matthias
  email: matthias.wueest@zhaw.ch
  organization: School of Engineering, Zurich University of Applied Sciences,Winterthur,Switzerland
– sequence: 2
  givenname: Lilach Goren
  surname: Huber
  fullname: Huber, Lilach Goren
  email: lilach.gorenhuber@zhaw.ch
  organization: School of Engineering, Zurich University of Applied Sciences,Winterthur,Switzerland
CODEN IEEPAD
ContentType Conference Proceeding
DBID 6IE
6IL
CBEJK
RIE
RIL
DOI 10.1109/SDS66131.2025.00013
DatabaseName IEEE Electronic Library (IEL) Conference Proceedings
IEEE Proceedings Order Plan All Online (POP All Online) 1998-present by volume
IEEE Xplore All Conference Proceedings
IEEE Electronic Library (IEL)
IEEE Proceedings Order Plans (POP All) 1998-Present
DatabaseTitleList
Database_xml – sequence: 1
  dbid: RIE
  name: IEEE Electronic Library (IEL)
  url: https://proxy.k.utb.cz/login?url=https://ieeexplore.ieee.org/
  sourceTypes: Publisher
DeliveryMethod fulltext_linktorsrc
EISBN 9798331594671
EISSN 2835-3420
EndPage 47
ExternalDocumentID 11081497
Genre orig-research
GroupedDBID 6IE
6IL
6IN
AAWTH
ABLEC
ADZIZ
ALMA_UNASSIGNED_HOLDINGS
BEFXN
BFFAM
BGNUA
BKEBE
BPEOZ
CBEJK
CHZPO
IEGSK
OCL
RIE
RIL
ID FETCH-LOGICAL-i93t-2d6173b7b16a6208a29ce5d5cd9088e0aee5eee27ba6c810070439054141a74c3
IEDL.DBID RIE
IngestDate Wed Jul 23 05:50:31 EDT 2025
IsPeerReviewed false
IsScholarly false
Language English
LinkModel DirectLink
MergedId FETCHMERGED-LOGICAL-i93t-2d6173b7b16a6208a29ce5d5cd9088e0aee5eee27ba6c810070439054141a74c3
PageCount 8
ParticipantIDs ieee_primary_11081497
PublicationCentury 2000
PublicationDate 2025-June-26
PublicationDateYYYYMMDD 2025-06-26
PublicationDate_xml – month: 06
  year: 2025
  text: 2025-June-26
  day: 26
PublicationDecade 2020
PublicationTitle Swiss Conference on Data Science (Online)
PublicationTitleAbbrev SDS
PublicationYear 2025
Publisher IEEE
Publisher_xml – name: IEEE
SSID ssj0003211748
Score 1.9156159
SourceID ieee
SourceType Publisher
StartPage 40
SubjectTerms Adaptation models
Anomaly detection
Contamination
Data models
Detectors
fully unsupervised learning
image anomaly detection
Security
Testing
Training
Training data
Unsupervised learning
Title Fully Unsupervised Anomaly Detection in Industrial Images with Unknown Data Contamination
URI https://ieeexplore.ieee.org/document/11081497
hasFullText 1
inHoldings 1