<!DOCTYPE html>
<html lang="en">
<title>IIC</title>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1">
<link rel="stylesheet" href="https://www.w3schools.com/w3css/4/w3.css">
<link rel="stylesheet" href="https://fonts.googleapis.com/css?family=Lato">
<link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/font-awesome/4.7.0/css/font-awesome.min.css">
<style>
body {font-family: "Lato", sans-serif}
.mySlides {display: none}
</style>
<body>
<!-- The Band Section -->
<div class="w3-container w3-content w3-center w3-padding-64" style="max-width:900px" id="band">
<h2 class="w3-wide">Self-supervised Video Representation Learning
Using Inter-intra Contrastive Framework</h2>
<div class="w3-row w3-padding-4">
<div class="w3-third">
<p><font size=4>Li TAO </font></p>
</div>
<div class="w3-third">
<p><font size=4>Xueting Wang</font></p>
</div>
<div class="w3-third">
<p><font size=4>Toshihiko Yamasaki </font></p>
</div>
</div>
<div class="w3-row w3-padding-4">
<div class="w3-half">
<p><font size=4>Paper <a href="http://arxiv.org/abs/2008.02531" target="view_window">[arXiv]</a></font></p>
</div>
<div class="w3-half">
<p><font size=4>Code <a href="https://github.com/BestJuly/Inter-intra-video-contrastive-learning" target="view_window">[github]</a></font></p>
</div>
</div>
<div class="w3-row w3-padding-32">
<img src="./fig/general.png" class="w3-round w3-margin-bottom" alt="General idea of the proposed method" style="width:60%">
<p class="w3-justify">Figure 1. General idea of the proposed method. Given a video 𝑥𝑖,
different views of this video are treated as positives, and
their features are constrained to be close to each other. Data
from other videos are treated as negatives. Temporal relations
in the anchor view are broken to generate
intra-negative samples, which are also treated as negatives
to help the model learn temporal information.</p>
</div>
<h2 class="w3-wide">Abstract</h2>
<p class="w3-justify">We propose a self-supervised method to learn feature representations
from videos. A standard approach in traditional self-supervised
methods uses positive-negative data pairs to train with a contrastive
learning strategy. In such a case, different modalities of the same
video are treated as positives and video clips from a different video
are treated as negatives. Because spatio-temporal information is
important for video representation, we extend the negative samples
by introducing intra-negative samples, which are transformed from
the same anchor video by breaking temporal relations in video
clips. With the proposed inter-intra contrastive framework, we
can train spatio-temporal convolutional networks to learn video
representations. Our framework offers many flexible options,
and we conduct experiments with several different
configurations. Evaluations are conducted on video retrieval and
video recognition tasks using the learned video representation. Our
proposed methods outperform current state-of-the-art results by a
large margin, with improvements of 16.7 and 9.5 percentage points in
top-1 accuracy on the UCF101 and HMDB51 datasets for video retrieval,
respectively. For video recognition, improvements are also obtained
on these two benchmark datasets.</p>
<div class="w3-row w3-padding-32">
<img src="./fig/generate_intra.png" class="w3-round w3-margin-bottom" alt="Two ways to generate intra-negative samples" style="width:40%">
<p class="w3-center">Figure 2. Two ways to generate intra-negative samples.</p>
</div>
<div class="w3-row w3-padding-32">
<img src="./fig/framework.png" class="w3-round w3-margin-bottom" alt="Inter-intra contrastive learning framework" style="width:90%">
<p class="w3-center">Figure 3. Inter-intra contrastive learning framework.</p>
</div>
</div>
</body>
</html>