Anomaly detection under data scarcity and uncertainty using zero-shot and few-shot approaches
https://doi.org/10.17586/2226-1494-2025-25-4-684-693
Abstract
Anomaly detection with limited data is a pressing challenge in many applied domains, including medical diagnostics. Machine learning methods typically rely on annotated anomalous samples for training, which are often impractical to obtain. Existing anomaly detection techniques designed for few-shot or zero-shot scenarios suffer from various limitations. In particular, the common assumption of normally distributed data reduces the accuracy of anomaly classification. This study addresses the task of improving the accuracy and completeness of anomaly detection in previously unseen images by combining the Contrastive Language-Image Pretraining (CLIP) model with the domain-specific transformer BERT Pre-Training of Image Transformers (BeiT). Integrating the CLIP and BeiT models enables simultaneous binary segmentation and anomaly classification. Anomaly detection is improved by using weighted embeddings from each module. In addition, automated generation of textual representations with a Large Language Model significantly improves the generalization capacity of the system. The performance of the proposed models was evaluated on the Benchmarks for Medical Anomaly Detection test set. For the dermatological domain, a test set was constructed from the ISIC-18, ISIC-19, SD-198, and 7-point criteria databases. Compared with existing state-of-the-art solutions, the proposed method demonstrated an average improvement in the ROC-AUC metric of 10.95 % at the image level and 0.66 % at the pixel level. Experimental results confirm the effectiveness of the proposed approach in anomaly classification and segmentation tasks, showing superior average metric values. Inference analysis revealed that incorporating a variational autoencoder for centroid generation within the CLIP+BeiT architecture improves model stability in few-shot scenarios.
The practical significance of the proposed method lies in its adaptability and robustness to changing data distributions, making it a promising solution for automated anomaly analysis in medical diagnostics, industrial monitoring, and other domains characterized by high data uncertainty.
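The weighted-embedding idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify how the weights are chosen or how the CLIP and BeiT outputs are aligned, so the function names, the single scalar weight, the temperature value, and the assumption that both embeddings already share one dimension are all hypothetical. Plain arrays stand in for real model outputs.

```python
import numpy as np


def l2_normalize(x, axis=-1, eps=1e-12):
    """Scale vectors to unit length along the given axis."""
    norm = np.linalg.norm(x, axis=axis, keepdims=True)
    return x / np.maximum(norm, eps)


def fuse_embeddings(emb_a, emb_b, weight_a=0.5):
    """Weighted sum of two embeddings (hypothetical fusion rule).

    Assumes emb_a and emb_b have the same shape, e.g. after each
    encoder's output has been projected to a common dimension.
    """
    fused = weight_a * l2_normalize(emb_a) + (1.0 - weight_a) * l2_normalize(emb_b)
    return l2_normalize(fused)


def anomaly_score(image_emb, text_normal, text_anomalous, temperature=0.07):
    """Zero-shot style score: softmax over similarities to two prompt embeddings.

    Returns the probability assigned to the 'anomalous' prompt.
    The temperature value is an assumption, not taken from the paper.
    """
    sims = np.stack(
        [image_emb @ text_normal, image_emb @ text_anomalous], axis=-1
    ) / temperature
    sims -= sims.max(axis=-1, keepdims=True)  # numerical stability
    exp = np.exp(sims)
    return (exp / exp.sum(axis=-1, keepdims=True))[..., 1]


# Toy usage: one image, two stand-in encoder outputs, two stand-in prompts.
clip_like = np.array([[1.0, 0.0]])
beit_like = np.array([[0.0, 2.0]])
fused = fuse_embeddings(clip_like, beit_like, weight_a=0.5)
score = anomaly_score(fused, np.array([1.0, 0.0]), np.array([0.0, 1.0]))
```

In this toy case the fused vector lies exactly between the two prompt directions, so the score is 0.5. Shifting `weight_a` toward either encoder moves the score accordingly.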
About the Authors
S. A. Milantev
Russian Federation
Sergey A. Milantev, PhD Student
197101; Saint Petersburg
Scopus Author ID: 57225127274
P. D. Mikhailova
Russian Federation
Polina D. Mikhailova, Magister
197022; Saint Petersburg
I. A. Bessmertny
Russian Federation
Igor A. Bessmertny, D.Sc., Full Professor
197101; Saint Petersburg
Scopus Author ID: 36661767800
For citations:
Milantev S.A., Mikhailova P.D., Bessmertny I.A. Anomaly detection under data scarcity and uncertainty using zero-shot and few-shot approaches. Scientific and Technical Journal of Information Technologies, Mechanics and Optics. 2025;25(4):684-693. (In Russ.) https://doi.org/10.17586/2226-1494-2025-25-4-684-693