Deep Learning for Video Anomaly Detection: A Review

This is the official repository for the paper entitled "Deep Learning for Video Anomaly Detection: A Review".

📖 Table of contents

Reviews

| Reference | Year | Venue | Main Focus | Main Categorization | UVAD | WVAD | SVAD | FVAD | OVAD | LVAD | IVAD |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Ramachandra et al. | 2020 | IEEE TPAMI | Semi-supervised single-scene VAD | Methodology | × | × | √ | × | × | × | × |
| Santhosh et al. | 2020 | ACM CSUR | VAD applied on road traffic | Methodology | √ | × | √ | √ | × | × | × |
| Nayak et al. | 2021 | IMAVIS | Deep learning driven semi-supervised VAD | Methodology | × | × | √ | × | × | × | × |
| Tran et al. | 2022 | ACM CSUR | Semi&weakly supervised VAD | Architecture | × | × | √ | × | × | × | × |
| Chandrakala et al. | 2023 | Artif. Intell. Rev. | Deep model-based one&two-class VAD | Methodology&Architecture | × | √ | √ | √ | × | × | × |
| Liu et al. | 2023 | ACM CSUR | Deep models for semi&weakly supervised VAD | Model Input | √ | √ | √ | √ | × | × | × |
| Our survey | 2024 | - | Comprehensive VAD taxonomy and deep models | Methodology, Architecture, Refinement, Model Input, Model Output | √ | √ | √ | √ | √ | √ | √ |

UVAD=Unsupervised VAD, WVAD=Weakly supervised VAD, SVAD=Semi-supervised VAD, FVAD=Fully supervised VAD, OVAD=Open-set supervised VAD, LVAD=Large-model based VAD, IVAD=Interpretable VAD

Taxonomy

1. Semi-Supervised Video Anomaly Detection

1.1 Model Input

1.1.1 RGB

Frame-Level RGB

🗓️ 2016

  • 📄 ConvAE: Learning temporal regularity in video sequences, 📰 CVPR code homepage

🗓️ 2017

  • 📄 ConvLSTM-AE: Remembering history with convolutional LSTM for anomaly detection, 📰 ICCV code

  • 📄 STAE: Spatio-temporal autoencoder for video anomaly detection, 📰 ACM MM

  • 📄 AnomalyGAN: Abnormal event detection in videos using generative adversarial nets, 📰 ICIP

🗓️ 2019

  • 📄 AMC: Anomaly detection in video sequence with appearance-motion correspondence, 📰 ICCV code

Patch-Level RGB

🗓️ 2015

  • 📄 AMDN: Learning deep representations of appearance and motion for anomalous event detection, 📰 BMVC

🗓️ 2017

  • 📄 AMDN2: Detecting anomalous events in videos by learning deep representations of appearance and motion, 📰 CVIU

  • 📄 Deep-cascade: Deep-cascade: Cascading 3d deep neural networks for fast anomaly detection and localization in crowded scenes, 📰 TIP

🗓️ 2018

  • 📄 S$^2$-VAE: Generative neural networks for anomaly detection in crowded scenes, 📰 TIFS

🗓️ 2019

  • 📄 DeepOC: A deep one-class neural network for anomalous event detection in complex scenes, 📰 TNNLS

🗓️ 2020

  • 📄 GM-VAE: Video anomaly detection and localization via gaussian mixture fully convolutional variational autoencoder, 📰 CVIU

Object-Level RGB

🗓️ 2017

  • 📄 FRCN: Joint detection and recounting of abnormal events by learning deep generic knowledge, 📰 ICCV

🗓️ 2019

  • 📄 ObjectAE: Object-centric auto-encoders and dummy anomalies for abnormal event detection in video, 📰 CVPR code

🗓️ 2021

  • 📄 HF$^2$-VAD: A hybrid video anomaly detection framework via memory-augmented flow reconstruction and flow-guided frame prediction, 📰 ICCV code

🗓️ 2022

  • 📄 HSNBM: Hierarchical scene normality-binding modeling for anomaly detection in surveillance videos, 📰 ACM MM code

  • 📄 BDPN: Comprehensive regularization in a bi-directional predictive network for video anomaly detection, 📰 AAAI

  • 📄 ER-VAD: Evidential reasoning for video anomaly detection, 📰 ACM MM

🗓️ 2023

  • 📄 HSC: Hierarchical semantic contrast for scene-aware video anomaly detection, 📰 CVPR code

1.1.2 Optical Flow

Frame Level

🗓️ 2018

  • 📄 FuturePred: Future frame prediction for anomaly detection–a new baseline, 📰 CVPR code

🗓️ 2020

  • 📄 FSCN: Fast sparse coding networks for anomaly detection in videos, 📰 PR code

🗓️ 2021

  • 📄 F$^2$PN: Future frame prediction network for video anomaly detection, 📰 TPAMI code

  • 📄 AMMC-Net: Appearance-motion memory consistency network for video anomaly detection, 📰 AAAI code

🗓️ 2022

  • 📄 STA-Net: Learning task-specific representation for video anomaly detection with spatial-temporal attention, 📰 ICASSP

🗓️ 2023

  • 📄 AMSRC: A video anomaly detection framework based on appearance-motion semantics representation consistency, 📰 ICASSP

Patch Level

🗓️ 2019

  • 📄 DeepOC: A deep one-class neural network for anomalous event detection in complex scenes, 📰 TNNLS

🗓️ 2020

  • 📄 ST-CaAE: Spatial-temporal cascade autoencoder for video anomaly detection in crowded scenes, 📰 TMM

  • 📄 Siamese-Net: Learning a distance function with a siamese network to localize anomalies in videos, 📰 WACV

Object Level

🗓️ 2021

  • 📄 HF$^2$-VAD: A hybrid video anomaly detection framework via memory-augmented flow reconstruction and flow-guided frame prediction, 📰 ICCV code

🗓️ 2022

  • 📄 ER-VAD: Evidential reasoning for video anomaly detection, 📰 ACM MM

  • 📄 Accurate-Interpretable-VAD: Attribute-based representations for accurate and interpretable video anomaly detection, 📰 arXiv code

🗓️ 2023

  • 📄 AMSRC: A video anomaly detection framework based on appearance-motion semantics representation consistency, 📰 ICASSP

1.1.3 Skeleton

🗓️ 2019

  • 📄 MPED-RNN: Learning regularity in skeleton trajectories for anomaly detection in videos, 📰 CVPR code

🗓️ 2020

  • 📄 GEPC: Graph embedded pose clustering for anomaly detection, 📰 CVPR code

  • 📄 MTTP: Multi-timescale trajectory prediction for abnormal human activity detection, 📰 WACV homepage

🗓️ 2021

  • 📄 NormalGraph: Normal graph: Spatial temporal graph convolutional networks based prediction network for skeleton based video anomaly detection, 📰 Neurocomputing

  • 📄 HSTGCNN: A hierarchical spatio-temporal graph convolutional neural network for anomaly detection in videos, 📰 TCSVT code

🗓️ 2022

  • 📄 TSIF: A two-stream information fusion approach to abnormal event detection in video, 📰 ICASSP

  • 📄 STGCAE-LSTM: Human-related anomalous event detection via spatial-temporal graph convolutional autoencoder with embedded long short-term memory network, 📰 Neurocomputing

  • 📄 STGformer: Hierarchical graph embedded pose regularity learning via spatiotemporal transformer for abnormal behavior detection, 📰 ACM MM

🗓️ 2023

  • 📄 STG-NF: Normalizing flows for human pose anomaly detection, 📰 ICCV code

  • 📄 MoPRL: Regularity learning via explicit distribution modeling for skeletal video anomaly detection, 📰 TCSVT

  • 📄 MoCoDAD: Multimodal motion conditioned diffusion model for skeleton-based video anomaly detection, 📰 ICCV code

🗓️ 2024

  • 📄 TrajREC: Holistic representation learning for multitask trajectory anomaly detection, 📰 WACV

1.1.4 Hybrid

🗓️ 2018

  • 📄 FuturePred: Future frame prediction for anomaly detection–a new baseline, 📰 CVPR code

🗓️ 2019

  • 📄 DeepOC: A deep one-class neural network for anomalous event detection in complex scenes, 📰 TNNLS

🗓️ 2021

  • 📄 HF$^2$-VAD: A hybrid video anomaly detection framework via memory-augmented flow reconstruction and flow-guided frame prediction, 📰 ICCV code

🗓️ 2024

  • 📄 EOGT: Eogt: Video anomaly detection with enhanced object information and global temporal dependency, 📰 TOMM

1.2 Methodology

1.2.1 Self-Supervised Learning

Reconstruction

🗓️ 2016

  • 📄 ConvAE: Learning temporal regularity in video sequences, 📰 CVPR code homepage

🗓️ 2017

  • 📄 ConvLSTM-AE: Remembering history with convolutional LSTM for anomaly detection, 📰 ICCV code

🗓️ 2018

  • 📄 S$^2$-VAE: Generative neural networks for anomaly detection in crowded scenes, 📰 TIFS

🗓️ 2019

  • 📄 AMC: Anomaly detection in video sequence with appearance-motion correspondence, 📰 ICCV code

🗓️ 2020

  • 📄 ClusterAE: Clustering driven deep autoencoder for video anomaly detection, 📰 ECCV

  • 📄 SIGnet: Anomaly detection with bidirectional consistency in videos, 📰 TNNLS

🗓️ 2021

  • 📄 SSR-AE: Self-supervision-augmented deep autoencoder for unsupervised visual anomaly detection, 📰 TCYB

🗓️ 2023

  • 📄 MoPRL: Regularity learning via explicit distribution modeling for skeletal video anomaly detection, 📰 TCSVT

Prediction

🗓️ 2018

  • 📄 FuturePred: Future frame prediction for anomaly detection–a new baseline, 📰 CVPR code

🗓️ 2019

🗓️ 2020

  • 📄 Multispace: Normality learning in multispace for video anomaly detection, 📰 TCSVT

🗓️ 2021

  • 📄 HF$^2$-VAD: A hybrid video anomaly detection framework via memory-augmented flow reconstruction and flow-guided frame prediction, 📰 ICCV code

  • 📄 AMMC-Net: Appearance-motion memory consistency network for video anomaly detection, 📰 AAAI code

  • 📄 ROADMAP: Robust unsupervised video anomaly detection by multipath frame prediction, 📰 TNNLS

  • 📄 AEP: Abnormal event detection and localization via adversarial event prediction, 📰 TNNLS

🗓️ 2022

  • 📄 STGformer: Hierarchical graph embedded pose regularity learning via spatiotemporal transformer for abnormal behavior detection, 📰 ACM MM

  • 📄 OGMRA: Object-guided and motion-refined attention network for video anomaly detection, 📰 ICME

🗓️ 2023

  • 📄 STGCN: Spatial-temporal graph convolutional network boosted flow-frame prediction for video anomaly detection, 📰 ICASSP

  • 📄 AMP-NET: Amp-net: Appearance-motion prototype network assisted automatic video anomaly detection system, 📰 TII

Visual Cloze Test

🗓️ 2020

  • 📄 VEC: Cloze test helps: Effective video anomaly detection via learning to complete video events, 📰 ACM MM code

🗓️ 2023

  • 📄 USTN-DSC: Video event restoration based on keyframes for video anomaly detection, 📰 CVPR

  • 📄 VCC: Video anomaly detection via visual cloze tests, 📰 TIFS

Jigsaw Puzzles

🗓️ 2022

  • 📄 STJP: Video anomaly detection by solving decoupled spatio-temporal jigsaw puzzles, 📰 ECCV code

🗓️ 2023

  • 📄 MPT: Video anomaly detection via sequentially learning multiple pretext tasks, 📰 ICCV

  • 📄 SSMTL++: Ssmtl++: Revisiting self-supervised multi-task learning for video anomaly detection, 📰 CVIU

Contrastive Learning

🗓️ 2020

  • 📄 CAC: Cluster attention contrast for video anomaly detection, 📰 ACM MM

🗓️ 2021

  • 📄 TAC-Net: Abnormal event detection using deep contrastive learning for intelligent video surveillance system, 📰 TII

🗓️ 2022

  • 📄 LSH: Learnable locality-sensitive hashing for video anomaly detection, 📰 TCSVT

Denoising

🗓️ 2020

  • 📄 Adv-AE: Adversarial 3d convolutional autoencoder for abnormal event detection in videos, 📰 TMM

🗓️ 2021

  • 📄 NM-GAN: Nm-gan: Noise-modulated generative adversarial network for video anomaly detection, 📰 PR

Deep Sparse Coding

🗓️ 2017

  • 📄 Stacked-RNN: A revisit of sparse coding based anomaly detection in stacked RNN framework, 📰 ICCV code

🗓️ 2019

  • 📄 Anomalynet: Anomalynet: An anomaly detection network for video surveillance, 📰 TIFS code

  • 📄 sRNN-AE: Video anomaly detection with sparse coding inspired deep neural networks, 📰 TPAMI code

🗓️ 2020

  • 📄 FSCN: Fast sparse coding networks for anomaly detection in videos, 📰 PR code

Patch Inpainting

🗓️ 2021

  • 📄 RIAD: Reconstruction by inpainting for visual anomaly detection, 📰 PR code

🗓️ 2022

  • 📄 SSPCAB: Self-supervised predictive convolutional attentive block for anomaly detection, 📰 CVPR code

🗓️ 2023

  • 📄 SSMCTB: Self-supervised masked convolutional transformer block for anomaly detection, 📰 TPAMI code

🗓️ 2024

  • 📄 AED-MAE: Self-distilled masked auto-encoders are efficient video anomaly detectors, 📰 CVPR code

Multiple Task

🗓️ 2017

  • 📄 STAE: Spatio-temporal autoencoder for video anomaly detection, 📰 ACM MM

🗓️ 2019

  • 📄 MPED-RNN: Learning regularity in skeleton trajectories for anomaly detection in videos, 📰 CVPR

  • 📄 AnoPCN: Anopcn: Video anomaly detection via deep predictive coding network, 📰 ACM MM

🗓️ 2021

  • 📄 Multitask: Anomaly detection in video via self-supervised and multi-task learning, 📰 CVPR homepage

🗓️ 2022

  • 📄 HSNBM: Hierarchical scene normality-binding modeling for anomaly detection in surveillance videos, 📰 ACM MM code

  • 📄 LSH: Learnable locality-sensitive hashing for video anomaly detection, 📰 TCSVT

  • 📄 AMAE: Appearance-motion united auto-encoder framework for video anomaly detection, 📰 TCAS-II

  • 📄 STM-AE: Learning appearance-motion normality for video anomaly detection, 📰 ICME

  • 📄 SSAGAN: Self-supervised attentive generative adversarial networks for video anomaly detection, 📰 TNNLS

🗓️ 2023

  • 📄 MPT: Video anomaly detection via sequentially learning multiple pretext tasks, 📰 ICCV

  • 📄 SSMTL++: Ssmtl++: Revisiting self-supervised multi-task learning for video anomaly detection, 📰 CVIU

🗓️ 2024

  • 📄 MGSTRL: Multi-scale video anomaly detection by multi-grained spatiotemporal representation learning, 📰 CVPR

1.2.2 One-Class Learning

One-Class Classifier

🗓️ 2015

  • 📄 AMDN: Learning deep representations of appearance and motion for anomalous event detection, 📰 BMVC

🗓️ 2018

  • 📄 Deep SVDD: Deep one-class classification, 📰 PMLR code

🗓️ 2019

  • 📄 DeepOC: A deep one-class neural network for anomalous event detection in complex scenes, 📰 TNNLS

  • 📄 GODS: Gods: Generalized one-class discriminative subspaces for anomaly detection, 📰 ICCV

🗓️ 2021

  • 📄 FCDD: Explainable deep one-class classification, 📰 ICLR code

Gaussian Classifier

🗓️ 2017

  • 📄 Deep-cascade: Deep-cascade: Cascading 3d deep neural networks for fast anomaly detection and localization in crowded scenes, 📰 TIP

🗓️ 2018

  • 📄 Deep-anomaly: Deep-anomaly: Fully convolutional neural network for fast anomaly detection in crowded scenes, 📰 CVIU

🗓️ 2020

  • 📄 GM-VAE: Video anomaly detection and localization via gaussian mixture fully convolutional variational autoencoder, 📰 CVIU

Adversarial Classifier

🗓️ 2018

  • 📄 ALOCC: Adversarially learned one-class classifier for novelty detection, 📰 CVPR code

  • 📄 AVID: Avid: Adversarial visual irregularity detection, 📰 ACCV code

🗓️ 2020

  • 📄 ALOCC2: Deep end-to-end one-class classifier, 📰 TNNLS

  • 📄 OGNet: Old is gold: Redefining the adversarially learned one-class classifier training paradigm, 📰 CVPR code

🗓️ 2022

  • 📄 OGNet+: Stabilizing adversarially learned one-class novelty detection using pseudo anomalies, 📰 TIP

1.2.3 Interpretable Learning

🗓️ 2017

  • 📄 FRCN: Joint detection and recounting of abnormal events by learning deep generic knowledge, 📰 ICCV

🗓️ 2022

🗓️ 2023

  • 📄 InterVAD: Towards interpretable video anomaly detection, 📰 WACV

  • 📄 EVAL: Eval: Explainable video anomaly localization, 📰 CVPR

🗓️ 2024

  • 📄 AnomalyRuler: Follow the rules: Reasoning for video anomaly detection with large language models, 📰 ECCV code

1.3 Network Architecture

1.3.1 Auto-Encoder

🗓️ 2016

  • 📄 Conv-LSTM: Anomaly detection in video using predictive convolutional long short-term memory networks, 📰 arXiv

🗓️ 2017

  • 📄 STAE: Spatio-temporal autoencoder for video anomaly detection, 📰 ACM MM

  • 📄 ConvLSTM-AE: Remembering history with convolutional LSTM for anomaly detection, 📰 ICCV code

🗓️ 2019

  • 📄 DeepOC: A deep one-class neural network for anomalous event detection in complex scenes, 📰 TNNLS

  • 📄 sRNN-AE: Video anomaly detection with sparse coding inspired deep neural networks, 📰 TPAMI

  • 📄 MPED-RNN: Learning regularity in skeleton trajectories for anomaly detection in videos, 📰 CVPR

🗓️ 2021

  • 📄 NormalGraph: Normal graph: Spatial temporal graph convolutional networks based prediction network for skeleton based video anomaly detection, 📰 Neurocomputing

🗓️ 2022

  • 📄 STGCAE-LSTM: Human-related anomalous event detection via spatial-temporal graph convolutional autoencoder with embedded long short-term memory network, 📰 Neurocomputing

🗓️ 2023

  • 📄 USTN-DSC: Video event restoration based on keyframes for video anomaly detection, 📰 CVPR

🗓️ 2024

  • 📄 AED-MAE: Self-distilled masked auto-encoders are efficient video anomaly detectors, 📰 CVPR code

1.3.2 GAN

🗓️ 2018

  • 📄 FuturePred: Future frame prediction for anomaly detection–a new baseline, 📰 CVPR code

  • 📄 ALOCC: Adversarially learned one-class classifier for novelty detection, 📰 CVPR code

🗓️ 2019

  • 📄 AD-VAD: Training adversarial discriminators for cross-channel abnormal event detection in crowds, 📰 WACV

  • 📄 VAD-GAN: Robust anomaly detection in videos using multilevel representations, 📰 AAAI code

  • 📄 Ada-Net: Learning normal patterns via adversarial attention-based autoencoder for abnormal event detection in videos, 📰 TMM

🗓️ 2020

  • 📄 OGNet: Old is gold: Redefining the adversarially learned one-class classifier training paradigm, 📰 CVPR code

🗓️ 2021

  • 📄 CT-D2GAN: Convolutional transformer based dual discriminator generative adversarial networks for video anomaly detection, 📰 ACM MM

1.3.3 Diffusion

🗓️ 2023

  • 📄 FPDM: Feature prediction diffusion model for video anomaly detection, 📰 ICCV

  • 📄 MoCoDAD: Multimodal motion conditioned diffusion model for skeleton-based video anomaly detection, 📰 ICCV code

1.4 Model Refinement

1.4.1 Pseudo Anomalies

🗓️ 2021

  • 📄 LNRA: Learning not to reconstruct anomalies, 📰 BMVC code

  • 📄 G2D: G2d: Generate to detect anomaly, 📰 WACV code

  • 📄 BAF: A background-agnostic framework with adversarial training for abnormal event detection in video, 📰 TPAMI code

🗓️ 2022

  • 📄 OGNet+: Stabilizing adversarially learned one-class novelty detection using pseudo anomalies, 📰 TIP

  • 📄 MBPA: Limiting reconstruction capability of autoencoders using moving backward pseudo anomalies, 📰 UR

🗓️ 2023

  • 📄 DSS-NET: Dss-net: Dynamic self-supervised network for video anomaly detection, 📰 TMM

  • 📄 PseudoBound: Pseudobound: Limiting the anomaly reconstruction capability of one-class classifiers using pseudo anomalies, 📰 Neurocomputing

  • 📄 PFMF: Generating anomalies for video anomaly detection with prompt-based feature mapping, 📰 CVPR

1.4.2 Memory Bank

🗓️ 2019

  • 📄 MemAE: Memorizing normality to detect anomaly: Memory-augmented deep autoencoder for unsupervised anomaly detection, 📰 ICCV code

🗓️ 2020

  • 📄 MNAD: Learning memory-guided normality for anomaly detection, 📰 CVPR code homepage

🗓️ 2021

  • 📄 MPN: Learning normal dynamics in videos with meta prototype network, 📰 CVPR code

🗓️ 2022

  • 📄 EPAP-Net: Anomaly warning: Learning and memorizing future semantic patterns for unsupervised ex-ante potential anomaly prediction, 📰 ACM MM

  • 📄 CAFE: Effective video abnormal event detection by learning a consistency-aware high-level feature extractor, 📰 ACM MM

  • 📄 DLAN-AC: Dynamic local aggregation network with adaptive clusterer for anomaly detection, 📰 ECCV code

🗓️ 2023

  • 📄 DMAD: Diversity-measurable anomaly detection, 📰 CVPR code

  • 📄 SVN: Stochastic video normality network for abnormal event detection in surveillance videos, 📰 KBS

  • 📄 LERF: Learning event-relevant factors for video anomaly detection, 📰 AAAI

  • 📄 MAAM-Net: Memory-augmented appearance-motion network for video anomaly detection, 📰 PR

🗓️ 2024

  • 📄 STU-Net: Context recovery and knowledge retrieval: A novel two-stream framework for video anomaly detection, 📰 TIP homepage

1.5 Model Output

1.5.1 Frame Level

1.5.2 Pixel Level

🗓️ 2022

  • 📄 UPformer: Pixel-level anomaly detection via uncertainty-aware prototypical transformer, 📰 ACM MM

2. Weakly Supervised Video Anomaly Detection

🗓️ 2018

  • 📄 DeepMIL: Real-world anomaly detection in surveillance videos, 📰 CVPR code homepage

2.1 Model Input

2.1.1 RGB

🗓️ 2018

🗓️ 2019

  • 📄 GCN: Graph convolutional label noise cleaner: Train a plug-and-play action classifier for anomaly detection, 📰 CVPR code

🗓️ 2020

  • 📄 CLAWS: Claws: Clustering assisted weakly supervised learning with normalcy suppression for anomalous event detection, 📰 ECCV code

  • 📄 HLNet: Not only look, but also listen: Learning multimodal violence detection under weak supervision, 📰 ECCV code homepage

🗓️ 2022

  • 📄 S3R: Self-supervised sparse representation for video anomaly detection, 📰 ECCV code

  • 📄 GCN+: Weakly-supervised anomaly detection in video surveillance via graph convolutional label noise cleaning, 📰 Neurocomputing

  • 📄 MSL: Self-training multi-sequence learning with transformer for weakly supervised video anomaly detection, 📰 AAAI

🗓️ 2023

  • 📄 BN-WVAD: Batchnorm-based weakly supervised video anomaly detection, 📰 arXiv code

  • 📄 LSTC: Long-short temporal co-teaching for weakly supervised video anomaly detection, 📰 ICME code

🗓️ 2024

  • 📄 AlMarri Salem et al.: A multi-head approach with shuffled segments for weakly-supervised video anomaly detection, 📰 WACV

  • 📄 OVVAD: Open-vocabulary video anomaly detection, 📰 CVPR

2.1.2 Optical Flow

🗓️ 2019

  • 📄 GCN: Graph convolutional label noise cleaner: Train a plug-and-play action classifier for anomaly detection, 📰 CVPR code

🗓️ 2020

  • 📄 AR-NET: Weakly supervised video anomaly detection via center-guided discriminative learning, 📰 ICME code

2.1.3 Audio

🗓️ 2021

  • 📄 FVAL: Violence detection in videos based on fusing visual and audio information, 📰 ICASSP

🗓️ 2023

  • 📄 HyperVD: Learning weakly supervised audio-visual violence detection in hyperbolic space, 📰 arXiv code

2.1.4 Text

🗓️ 2023

  • 📄 PEL4VAD: Learning prompt-enhanced context features for weakly-supervised video anomaly detection, 📰 arXiv code

  • 📄 TEVAD: Tevad: Improved video anomaly detection with captions, 📰 CVPRW code

🗓️ 2024

  • 📄 LAP: Learn suspected anomalies from event prompts for video anomaly detection, 📰 arXiv

  • 📄 ALAN: Toward video anomaly retrieval from video anomaly detection: New benchmarks and model, 📰 TIP

2.1.5 Hybrid

🗓️ 2020

  • 📄 AR-NET: Weakly supervised video anomaly detection via center-guided discriminative learning, 📰 ICME code

🗓️ 2022

  • 📄 ACF_MMVD: Look, listen and pay more attention: Fusing multi-modal information for video violence detection, 📰 ICASSP code

  • 📄 MSAF: Msaf: Multimodal supervise-attention enhanced fusion for video anomaly detection, 📰 SPL homepage

  • 📄 MACIL-SD: Modality-aware contrastive instance learning with self-distillation for weakly-supervised audio-visual violence detection, 📰 ACM MM code

  • 📄 HL-Net+: Weakly supervised audio-visual violence detection, 📰 TMM

🗓️ 2024

  • 📄 UCA: Towards surveillance video-and-language understanding: New dataset baselines and challenges, 📰 CVPR homepage

2.2 Methodology

2.2.1 One-Stage MIL

🗓️ 2018

🗓️ 2019

  • 📄 MAF: Motion-aware feature for improved video anomaly detection, 📰 BMVC

  • 📄 TCN-IBL: Temporal convolutional network with complementary inner bag loss for weakly supervised anomaly detection, 📰 ICIP

🗓️ 2020

  • 📄 HLNet: Not only look, but also listen: Learning multimodal violence detection under weak supervision, 📰 ECCV code

🗓️ 2022

  • 📄 CNL: Collaborative normality learning framework for weakly supervised video anomaly detection, 📰 TCAS-II

2.2.2 Two-Stage Self-Training

🗓️ 2019

  • 📄 GCN: Graph convolutional label noise cleaner: Train a plug-and-play action classifier for anomaly detection, 📰 CVPR

🗓️ 2021

  • 📄 MIST: Mist: Multiple instance self-training framework for video anomaly detection, 📰 CVPR code homepage

🗓️ 2022

  • 📄 MSL: Self-training multi-sequence learning with transformer for weakly supervised video anomaly detection, 📰 AAAI

🗓️ 2023

  • 📄 CUPL: Exploiting completeness and uncertainty of pseudo labels for weakly supervised video anomaly detection, 📰 CVPR code

🗓️ 2024

  • 📄 TPWNG: Text prompt with normality guidance for weakly supervised video anomaly detection, 📰 CVPR

2.3 Refinement Strategy

2.3.1 Temporal Modeling

🗓️ 2020

  • 📄 HLNet: Not only look, but also listen: Learning multimodal violence detection under weak supervision, 📰 ECCV code

🗓️ 2021

  • 📄 CTR: Learning causal temporal relation and feature discrimination for anomaly detection, 📰 TIP

  • 📄 RTFM: Weakly-supervised video anomaly detection with robust temporal feature magnitude learning, 📰 ICCV code

  • 📄 CA-Net: Contrastive attention for video anomaly detection, 📰 TMM code

  • 📄 CRF: Dance with self-attention: A new look of conditional random fields on anomaly detection in videos, 📰 ICCV

🗓️ 2022

  • 📄 MSL: Self-training multi-sequence learning with transformer for weakly supervised video anomaly detection, 📰 AAAI

  • 📄 DAR: Decouple and resolve: transformer-based models for online anomaly detection from weakly labeled videos, 📰 TIFS

  • 📄 WAGCN: Adaptive graph convolutional networks for weakly supervised anomaly detection in videos, 📰 SPL

  • 📄 SGTDT: Weakly supervised video anomaly detection via self-guided temporal discriminative transformer, 📰 TCYB

  • 📄 MLAD: Weakly supervised anomaly detection in videos considering the openness of events, 📰 TITS

🗓️ 2023

  • 📄 CMRL: Look around for anomalies: weakly-supervised anomaly detection via context-motion relational learning, 📰 CVPR

  • 📄 CBCG: Weakly supervised video anomaly detection based on cross-batch clustering guidance, 📰 ICME

  • 📄 DMU: Dual memory units with uncertainty regulation for weakly supervised video anomaly detection, 📰 AAAI code

2.3.2 Spatio-Temporal Modeling

🗓️ 2022

  • 📄 STA-Net: Learning task-specific representation for video anomaly detection with spatial-temporal attention, 📰 ICASSP

  • 📄 SSRL: Scale-aware spatio-temporal relation learning for video anomaly detection, 📰 ECCV

🗓️ 2023

  • 📄 LSTC: Long-short temporal co-teaching for weakly supervised video anomaly detection, 📰 ICME code

🗓️ 2024

  • 📄 MSIP: Learning spatio-temporal relations with multi-scale integrated perception for video anomaly detection, 📰 ICASSP

2.3.3 MIL-Based Refinement

🗓️ 2019

  • 📄 Social-MIL: Social mil: Interaction-aware for crowd anomaly detection, 📰 AVSS

🗓️ 2022

  • 📄 MCR: Multiscale continuity-aware refinement network for weakly supervised video anomaly detection, 📰 ICME

  • 📄 BN-SVP: Bayesian nonparametric submodular video partition for robust anomaly detection, 📰 CVPR code

🗓️ 2023

  • 📄 NGMIL: Normality guided multiple instance learning for weakly supervised video anomaly detection, 📰 WACV

  • 📄 UMIL: Unbiased multiple instance learning for weakly supervised video anomaly detection, 📰 CVPR code

  • 📄 MGFN: Mgfn: Magnitude-contrastive glance-and-focus network for weakly-supervised video anomaly detection, 📰 AAAI code

🗓️ 2024

  • 📄 LAP: Learn suspected anomalies from event prompts for video anomaly detection, 📰 arXiv

  • 📄 PE-MIL: Prompt-enhanced multiple instance learning for weakly supervised video anomaly detection, 📰 CVPR

2.3.4 Feature Metric Learning

🗓️ 2019

  • 📄 TCN-IBL: Temporal convolutional network with complementary inner bag loss for weakly supervised anomaly detection, 📰 ICIP

🗓️ 2021

  • 📄 CTR: Learning causal temporal relation and feature discrimination for anomaly detection, 📰 TIP

🗓️ 2022

  • 📄 SGTDT: Weakly supervised video anomaly detection via self-guided temporal discriminative transformer, 📰 TCYB

🗓️ 2023

  • 📄 BN-WVAD: Batchnorm-based weakly supervised video anomaly detection, 📰 arXiv code

  • 📄 PEL4VAD: Learning prompt-enhanced context features for weakly-supervised video anomaly detection, 📰 arXiv code

  • 📄 TeD-SPAD: Ted-spad: Temporal distinctiveness for self-supervised privacy-preservation for video anomaly detection, 📰 ICCV code

  • 📄 CLAWS+: Clustering aided weakly supervised training to detect anomalous events in surveillance videos, 📰 TNNLS

🗓️ 2024

  • 📄 LAP: Learn suspected anomalies from event prompts for video anomaly detection, 📰 arXiv

2.3.5 Knowledge Distillation

🗓️ 2022

  • 📄 MACIL-SD: Modality-aware contrastive instance learning with self-distillation for weakly-supervised audio-visual violence detection, 📰 ACM MM code

🗓️ 2023

  • 📄 DPK: Distilling privileged knowledge for anomalous event detection from weakly labeled videos, 📰 TNNLS

2.3.6 Leveraging Large Models

🗓️ 2023

  • 📄 TEVAD: Tevad: Improved video anomaly detection with captions, 📰 CVPRW

  • 📄 CLIP-TSA: Clip-tsa: Clip-assisted temporal self-attention for weakly-supervised video anomaly detection, 📰 ICIP code

🗓️ 2024

  • 📄 UCA: Towards surveillance video-and-language understanding: New dataset baselines and challenges, 📰 CVPR homepage

  • 📄 VadCLIP: Vadclip: Adapting vision-language models for weakly supervised video anomaly detection, 📰 AAAI code

  • 📄 Holmes-VAD: Holmes-vad: Towards unbiased and explainable video anomaly detection via multi-modal llm, 📰 arXiv code homepage

  • 📄 VADor w LSTC: Video anomaly detection and explanation via large language models, 📰 arXiv

  • 📄 LAVAD: Harnessing large language models for training-free video anomaly detection, 📰 CVPR code homepage

  • 📄 STPrompt: Weakly supervised video anomaly detection and localization with spatio-temporal prompts, 📰 ACM MM

2.4 Model Output

2.4.1 Frame Level

2.4.2 Pixel Level

🗓️ 2019

  • 📄 Background-bias: Exploring background-bias for anomaly detection in surveillance videos, 📰 ACM MM code

🗓️ 2021

  • 📄 WSSTAD: Weakly-supervised spatio-temporal anomaly detection in surveillance video, 📰 IJCAI

3. Fully Supervised Video Anomaly Detection

3.1 Appearance Input

🗓️ 2016

  • 📄 TS-LSTM: Multi-stream deep networks for person to person violence detection in videos, 📰 CCPR

🗓️ 2017

  • 📄 FightNet: Violent interaction detection in video based on deep learning, 📰 JPCS

🗓️ 2019

  • 📄 Sub-Vio: Toward subjective violence detection in videos, 📰 ICASSP

  • 📄 CCTV-Fights: Detection of real-world fights in surveillance videos, 📰 ICASSP homepage

3.2 Motion Input

🗓️ 2016

  • 📄 TS-LSTM: Multi-stream deep networks for person to person violence detection in videos, 📰 CCPR

🗓️ 2017

  • 📄 ConvLSTM: Learning to detect violent videos using convolutional long short-term memory, 📰 AVSS code

🗓️ 2018

  • 📄 BiConvLSTM: Bidirectional convolutional lstm for the detection of violence in videos, 📰 ECCVW

🗓️ 2020

  • 📄 MM-VD: Multimodal violence detection in videos, 📰 ICASSP

3.3 Skeleton Input

🗓️ 2018

  • 📄 DSS: Eye in the sky: Real-time drone surveillance system for violent individuals identification using scatternet hybrid deep learning network, 📰 CVPRW

🗓️ 2020

  • 📄 SPIL: Human interaction learning on 3d skeleton point clouds for video violence recognition, 📰 ECCV

3.4 Audio Input

🗓️ 2020

  • 📄 MM-VD: Multimodal violence detection in videos, 📰 ICASSP

3.5 Hybrid Input

🗓️ 2021

  • 📄 FlowGatedNet:RWF-2000: An open large scale video database for violence detection, 📰 ICPR code

🗓️ 2022

  • 📄 MutualDis:Multimodal violent video recognition based on mutual distillation, 📰 PRCV

🗓️ 2023

  • 📄 HSCD: Human skeletons and change detection for efficient violence detection in surveillance videos, 📰 CVIU code

4. Unsupervised Video Anomaly Detection

4.1 Pseudo Label Based Paradigm

🗓️ 2018

  • 📄 DAW:Detecting abnormality without knowing normality: A two-stage approach for unsupervised video abnormal event detection, 📰 ACM MM

🗓️ 2020

  • 📄 STDOR:Self-trained deep ordinal regression for end-to-end video anomaly detection, 📰 CVPR

🗓️ 2022

  • 📄 GCL:Generative cooperative learning for unsupervised video anomaly detection, 📰 CVPR

🗓️ 2024

  • 📄 C2FPL:A coarse-to-fine pseudo-labeling (C2FPL) framework for unsupervised video anomaly detection, 📰 WACV code

4.2 Change Detection Based Paradigm

🗓️ 2016

  • 📄 ADF:A discriminative framework for anomaly detection in large videos, 📰 ECCV code

🗓️ 2017

  • 📄 Unmasking:Unmasking the abnormal events in video, 📰 ICCV

🗓️ 2018

  • 📄 MC2ST:Classifier two sample test for video anomaly detections, 📰 BMVC code

🗓️ 2022

  • 📄 TMAE:Detecting anomalous events from unlabeled videos via temporal masked autoencoding, 📰 ICME

4.3 Others

🗓️ 2021

  • 📄 DUAD:Deep unsupervised anomaly detection, 📰 WACV

🗓️ 2022

  • 📄 CIL:A causal inference look at unsupervised video anomaly detection, 📰 AAAI

  • 📄 LBR-SPR:Deep anomaly discovery from unlabeled videos via normality advantage and self-paced refinement, 📰 CVPR code

5. Open Set Supervised Video Anomaly Detection

5.1 Open-Set VAD

🗓️ 2019

  • 📄 MLEP:Margin learning embedded prediction for video anomaly detection with a few anomalies, 📰 IJCAI code

🗓️ 2022

  • 📄 UBnormal:UBnormal: New benchmark for supervised open-set video anomaly detection, 📰 CVPR code

  • 📄 OSVAD:Towards open set video anomaly detection, 📰 ECCV

🗓️ 2024

  • 📄 OVVAD:Open-vocabulary video anomaly detection, 📰 CVPR

5.2 Few-Shot VAD

🗓️ 2020

  • 📄 FSSA:Few-shot scene-adaptive anomaly detection, 📰 ECCV code

🗓️ 2021

  • 📄 AADNet:Adaptive anomaly detection network for unseen scene without fine-tuning, 📰 PRCV

🗓️ 2022

  • 📄 VADNet:Boosting variational inference with margin learning for few-shot scene-adaptive anomaly detection, 📰 TCSVT code

🗓️ 2023

  • 📄 zxVAD:Cross-domain video anomaly detection without target domain adaptation, 📰 WACV

Performance Comparison

The following tables compare the performance of semi-supervised, weakly supervised, fully supervised, and unsupervised VAD methods as reported in the literature. For semi-supervised, weakly supervised, and unsupervised VAD methods, the evaluation metric is AUC (%), except on XD-Violence, where AP (%) is used; for fully supervised VAD methods, the metric is Accuracy (%).
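
As a rough illustration of the three metrics named above, the sketch below computes AUC, AP, and accuracy from per-frame anomaly scores in plain Python. The labels, scores, and the 0.5 threshold are hypothetical toy values, and the function names are our own. Individual papers may differ in details such as how scores are pooled across videos or how ties are handled, so treat this as a minimal sketch and not as the evaluation protocol of any listed method.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (anomalous, normal) pairs ranked correctly."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one anomalous and one normal frame")
    # A pair counts 1 if the anomalous frame scores higher, 0.5 on a tie.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(labels, scores):
    """AP as the mean precision at the rank of each anomalous frame.

    Simplification: tied scores are ordered arbitrarily by the sort.
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    hits, precisions = 0, []
    for rank, i in enumerate(order, start=1):
        if labels[i] == 1:
            hits += 1
            precisions.append(hits / rank)
    if not precisions:
        raise ValueError("need at least one anomalous frame")
    return sum(precisions) / len(precisions)


def accuracy(labels, predictions):
    """Fraction of samples whose predicted class matches the label."""
    return sum(y == p for y, p in zip(labels, predictions)) / len(labels)


# Hypothetical toy data: 1 = anomalous frame, 0 = normal frame.
labels = [0, 0, 1, 1, 0, 1]
scores = [0.1, 0.4, 0.35, 0.8, 0.2, 0.9]
# Assumed decision threshold, used only for the accuracy example.
predictions = [int(s >= 0.5) for s in scores]

print(f"AUC (%): {100 * roc_auc(labels, scores):.2f}")
print(f"AP (%): {100 * average_precision(labels, scores):.2f}")
print(f"Accuracy (%): {100 * accuracy(labels, predictions):.2f}")
```

AUC and AP are threshold-free ranking metrics, which suits methods that output a continuous anomaly score, whereas accuracy requires hard class predictions, as in the fully supervised violence detection setting.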

  • Quantitative Performance Comparison of Semi-supervised Methods on Public Datasets.

    | Method | Publication | Methodology | Ped1 | Ped2 | Avenue | ShanghaiTech | UBnormal |
    |---|---|---|---|---|---|---|---|
    | AMDN | BMVC 2015 | One-class classifier | 92.1 | 90.8 | - | - | - |
    | ConvAE | CVPR 2016 | Reconstruction | 81.0 | 90.0 | 72.0 | - | - |
    | STAE | ACMMM 2017 | Hybrid | 92.3 | 91.2 | 80.9 | - | - |
    | StackRNN | ICCV 2017 | Sparse coding | - | 92.2 | 81.7 | 68.0 | - |
    | FuturePred | CVPR 2018 | Prediction | 83.1 | 95.4 | 85.1 | 72.8 | - |
    | DeepOC | TNNLS 2019 | One-class classifier | 83.5 | 96.9 | 86.6 | - | - |
    | MemAE | ICCV 2019 | Reconstruction | - | 94.1 | 83.3 | 71.2 | - |
    | AnoPCN | ACMMM 2019 | Prediction | - | 96.8 | 86.2 | 73.6 | - |
    | ObjectAE | CVPR 2019 | One-class classifier | - | 97.8 | 90.4 | 84.9 | - |
    | BMAN | TIP 2019 | Prediction | - | 96.6 | 90.0 | 76.2 | - |
    | sRNN-AE | TPAMI 2019 | Sparse coding | - | 92.2 | 83.5 | 69.6 | - |
    | ClusterAE | ECCV 2020 | Reconstruction | - | 96.5 | 86.0 | 73.3 | - |
    | MNAD | CVPR 2020 | Reconstruction | - | 97.0 | 88.5 | 70.5 | - |
    | VEC | ACMMM 2020 | Cloze test | - | 97.3 | 90.2 | 74.8 | - |
    | AMMC-Net | AAAI 2021 | Prediction | - | 96.6 | 86.6 | 73.7 | - |
    | MPN | CVPR 2021 | Prediction | 85.1 | 96.9 | 89.5 | 73.8 | - |
    | HF$^2$-VAD | ICCV 2021 | Hybrid | - | 99.3 | 91.1 | 76.2 | - |
    | BAF | TPAMI 2021 | One-class classifier | - | 98.7 | 92.3 | 82.7 | 59.3 |
    | Multitask | CVPR 2021 | Multiple tasks | - | 99.8 | 92.8 | 90.2 | - |
    | F$^2$PN | TPAMI 2022 | Prediction | 84.3 | 96.2 | 85.7 | 73.0 | - |
    | DLAN-AC | ECCV 2022 | Reconstruction | - | 97.6 | 89.9 | 74.7 | - |
    | BDPN | AAAI 2022 | Prediction | - | 98.3 | 90.3 | 78.1 | - |
    | CAFÉ | ACMMM 2022 | Prediction | - | 98.4 | 92.6 | 77.0 | - |
    | STJP | ECCV 2022 | Jigsaw puzzle | - | 99.0 | 92.2 | 84.3 | 56.4 |
    | MPT | ICCV 2023 | Multiple tasks | - | 97.6 | 90.9 | 78.8 | - |
    | HSC | CVPR 2023 | Hybrid | - | 98.1 | 93.7 | 83.4 | - |
    | LERF | AAAI 2023 | Prediction | - | 99.4 | 91.5 | 78.6 | - |
    | DMAD | CVPR 2023 | Reconstruction | - | 99.7 | 92.8 | 78.8 | - |
    | EVAL | CVPR 2023 | Interpretable learning | - | - | 86.0 | 76.6 | - |
    | FBSC-AE | CVPR 2023 | Prediction | - | - | 86.8 | 79.2 | - |
    | FPDM | ICCV 2023 | Prediction | - | - | 90.1 | 78.6 | 62.7 |
    | PFMF | CVPR 2023 | Multiple tasks | - | - | 93.6 | 85.0 | - |
    | STG-NF | ICCV 2023 | Gaussian classifier | - | - | - | 85.9 | 71.8 |
    | AED-MAE | CVPR 2024 | Patch inpainting | - | 95.4 | 91.3 | 79.1 | 58.5 |
    | SSMCTB | TPAMI 2024 | Patch inpainting | - | - | 91.6 | 83.7 | - |

  • Quantitative Performance Comparison of Weakly Supervised Methods on Public Datasets.

    | Method | Publication | Feature | UCF-Crime | XD-Violence | ShanghaiTech | TAD |
    |---|---|---|---|---|---|---|
    | DeepMIL | CVPR 2018 | C3D(RGB) | 75.40 | - | - | - |
    | GCN | CVPR 2019 | TSN(RGB) | 82.12 | - | 84.44 | - |
    | HLNet | ECCV 2020 | I3D(RGB) | 82.44 | 75.41 | - | - |
    | CLAWS | ECCV 2020 | C3D(RGB) | 83.03 | - | 89.67 | - |
    | MIST | CVPR 2021 | I3D(RGB) | 82.30 | - | 94.83 | - |
    | RTFM | ICCV 2021 | I3D(RGB) | 84.30 | 77.81 | 97.21 | - |
    | CTR | TIP 2021 | I3D(RGB) | 84.89 | 75.90 | 97.48 | - |
    | MSL | AAAI 2022 | VideoSwin(RGB) | 85.62 | 78.59 | 97.32 | - |
    | S3R | ECCV 2022 | I3D(RGB) | 85.99 | 80.26 | 97.48 | - |
    | SSRL | ECCV 2022 | I3D(RGB) | 87.43 | - | 97.98 | - |
    | CMRL | CVPR 2023 | I3D(RGB) | 86.10 | 81.30 | 97.60 | - |
    | CUPL | CVPR 2023 | I3D(RGB) | 86.22 | 81.43 | - | 91.66 |
    | MGFN | AAAI 2023 | VideoSwin(RGB) | 86.67 | 80.11 | - | - |
    | UMIL | CVPR 2023 | CLIP | 86.75 | - | - | 92.93 |
    | DMU | AAAI 2023 | I3D(RGB) | 86.97 | 81.66 | - | - |
    | PE-MIL | CVPR 2024 | I3D(RGB) | 86.83 | 88.05 | 98.35 | - |
    | TPWNG | CVPR 2024 | CLIP | 87.79 | 83.68 | - | - |
    | VadCLIP | AAAI 2024 | CLIP | 88.02 | 84.51 | - | - |
    | STPrompt | ACMMM 2024 | CLIP | 88.08 | - | 97.81 | - |

  • Quantitative Performance Comparison of Fully Supervised Methods on Public Datasets.

    | Method | Publication | Model Input | Hockey Fights | Violent-Flows | RWF-2000 | Crowd Violence |
    |---|---|---|---|---|---|---|
    | TS-LSTM | PR 2016 | RGB+Flow | 93.9 | - | - | - |
    | FightNet | JPCS 2017 | RGB+Flow | 97.0 | - | - | - |
    | ConvLSTM | AVSS 2017 | Frame Difference | 97.1 | 94.6 | - | - |
    | BiConvLSTM | ECCVW 2018 | Frame Difference | 98.1 | 96.3 | - | - |
    | SPIL | ECCV 2020 | Skeleton | 96.8 | - | 89.3 | 94.5 |
    | FlowGatedNet | ICPR 2020 | RGB+Flow | 98.0 | - | 87.3 | 88.9 |
    | X3D | AVSS 2022 | RGB | - | 98.0 | 94.0 | - |
    | HSCD | CVIU 2023 | Skeleton+Frame Difference | 94.5 | - | 90.3 | 94.3 |

  • Quantitative Performance Comparison of Unsupervised Methods on Public Datasets.

    | Method | Publication | Methodology | Avenue | Subway Exit | Ped1 | Ped2 | ShanghaiTech | UMN |
    |---|---|---|---|---|---|---|---|---|
    | ADF | ECCV 2016 | Change detection | 78.3 | 82.4 | - | - | - | 91.0 |
    | Unmasking | ICCV 2017 | Change detection | 80.6 | 86.3 | 68.4 | 82.2 | - | 95.1 |
    | MC2ST | BMVC 2018 | Change detection | 84.4 | 93.1 | 71.8 | 87.5 | - | - |
    | DAW | ACMMM 2018 | Pseudo label | 85.3 | 84.5 | 77.8 | 96.4 | - | - |
    | STDOR | CVPR 2020 | Pseudo label | - | 92.7 | 71.7 | 83.2 | - | 97.4 |
    | TMAE | ICME 2022 | Change detection | 89.8 | - | 75.7 | 94.1 | 71.4 | - |
    | CIL | AAAI 2022 | Others | 90.3 | 97.6 | 84.9 | 99.4 | - | 100 |
    | LBR-SPR | CVPR 2022 | Others | 92.8 | - | 81.1 | 97.2 | 72.6 | - |

Citation

If you find our work useful, please cite our paper:

@article{wu2024deep,
  title={Deep Learning for Video Anomaly Detection: A Review},
  author={Wu, Peng and Pan, Chengyu and Yan, Yuting and Pang, Guansong and Wang, Peng and Zhang, Yanning},
  journal={arXiv preprint arXiv:xxxxx},
  year={2024}
}