Please feel free to open a pull request.

Note: ActivityNet v1.3, Kinetics-600, Moments in Time, and AVA will be used in the ActivityNet Challenge 2018.

Dataset | Paper | Website | Category | #Examples | #Classes | Duration | Organizer | SOTA performance |
---|---|---|---|---|---|---|---|---|
UCF101 | Link | human action | 13,320 | 101 | <10s | UCF | 98% (DeepMind I3D) | |
HMDB51 | Link | human action | 6,766 | 51 | <10s | Brown | 80.7% (DeepMind I3D) | |
ActivityNet v1.3 | Link | human activities | ~20,000 | 200 | - | ActivityNet | 8.83% err (iBUG) | |
Charades | Link | daily human activities | 9,848 | 157 | - | AI2 | 0.3441 mAP (DeepMind I3D) | |
Kinetics | Link | human action | ~500,000 | 600 | 10s | DeepMind | - | |
Sports-1M | Link | sports | ~1 million | 478 | 5m36s | Google & Stanford | - | |
YouTube-8M | Link | visual contents | ~7 million | 4,716 | 120-500s | Google Cloud | 85% GAP (WILLOW) | |
FCVID | Link | visual contents | 91,223 | 239 | 100s+ | Fudan-Columbia | - | |
Something-Something | Link | action with objects | 108,499 | 174 | ~4s | TwentyBN | - | |
Moments in Time | Link | action or activity | ~1 million | 339 | 3s | MIT-IBM Watson | - | |
SLAC | arXiv | Link | recognition and localization | 520K | 200 | ~30.6s | MIT and Facebook | - |

Dataset | Paper | Website | #Examples | Organizer | SOTA performance |
---|---|---|---|---|---|
THUMOS2014 | Link | 9.682 | UCF | - | |
ActivityNet v1.3 | Link | ~20,000 | ActivityNet | 0.344 (SJTU & Columbia) | |
Broad Video Highlights | - | Link | 18,000 | Baidu | - |

Dataset | Paper | Website | #Examples | #Classes | Organizer | SOTA performance |
---|---|---|---|---|---|---|
AVA | arXiv | Link | 57.6k | 80 | Google & Berkeley | - |

Dataset | Paper | Website | #Examples | #Classes | Organizer | SOTA performance |
---|---|---|---|---|---|---|
Jester | - | Link | 148,092 | 27 | TwentyBN | 95.34% (Ke Yang, NUDT_PDL) |

Dataset | Paper | Website | Context | #Examples | Organizer | SOTA performance |
---|---|---|---|---|---|---|
MPII-MD | Link | movie | 68,337 clips with 68,375 sentences | MPII | - | |
MSR-VTT | Link | 20 categories | 10,000 clips with 200,000 sentences | MSR | - | |
Charades | Link | human activity | 9,848 clips with 27,847 sentences | AI2 | - | |
Densevid | Link | event | 20k clips and 100k sentences | Stanford, ActivityNet | - |

Dataset | Paper | Website | Task | #Examples | Organizer | SOTA performance |
---|---|---|---|---|---|---|
MovieQA | Link | question-answering in movies | 408 movies & 14,944 QAs | UToronto | - | |
MarioQA | Link | reasoning events in game videos | 187,757 examples with 92,874 QAs | POSTECH | - |