adaptive.log
begin_time: 1582645783.547678
| epoch 1 | 200/ 2983 batches | lr 20.00 | ms/batch 9.18 | loss 7.37 | ppl 1581.53
| epoch 1 | 400/ 2983 batches | lr 20.00 | ms/batch 8.96 | loss 6.67 | ppl 784.97
| epoch 1 | 600/ 2983 batches | lr 20.00 | ms/batch 8.97 | loss 6.34 | ppl 569.51
| epoch 1 | 800/ 2983 batches | lr 20.00 | ms/batch 8.96 | loss 6.19 | ppl 489.07
| epoch 1 | 1000/ 2983 batches | lr 20.00 | ms/batch 9.01 | loss 6.06 | ppl 429.07
| epoch 1 | 1200/ 2983 batches | lr 20.00 | ms/batch 8.90 | loss 6.02 | ppl 410.11
| epoch 1 | 1400/ 2983 batches | lr 20.00 | ms/batch 8.74 | loss 5.92 | ppl 374.21
| epoch 1 | 1600/ 2983 batches | lr 20.00 | ms/batch 8.93 | loss 5.94 | ppl 381.50
| epoch 1 | 1800/ 2983 batches | lr 20.00 | ms/batch 8.95 | loss 5.79 | ppl 326.52
| epoch 1 | 2000/ 2983 batches | lr 20.00 | ms/batch 9.03 | loss 5.78 | ppl 325.16
| epoch 1 | 2200/ 2983 batches | lr 20.00 | ms/batch 9.04 | loss 5.68 | ppl 292.24
| epoch 1 | 2400/ 2983 batches | lr 20.00 | ms/batch 8.99 | loss 5.67 | ppl 288.95
| epoch 1 | 2600/ 2983 batches | lr 20.00 | ms/batch 8.95 | loss 5.65 | ppl 284.40
| epoch 1 | 2800/ 2983 batches | lr 20.00 | ms/batch 9.00 | loss 5.55 | ppl 256.46
-----------------------------------------------------------------------------------------
| end of epoch 1 | time: 28.68s | valid loss 5.56 | valid ppl 259.24
-----------------------------------------------------------------------------------------
| epoch 2 | 200/ 2983 batches | lr 20.00 | ms/batch 9.67 | loss 5.52 | ppl 249.33
| epoch 2 | 400/ 2983 batches | lr 20.00 | ms/batch 9.60 | loss 5.48 | ppl 240.66
| epoch 2 | 600/ 2983 batches | lr 20.00 | ms/batch 9.59 | loss 5.34 | ppl 207.96
| epoch 2 | 800/ 2983 batches | lr 20.00 | ms/batch 9.60 | loss 5.36 | ppl 211.90
| epoch 2 | 1000/ 2983 batches | lr 20.00 | ms/batch 9.67 | loss 5.33 | ppl 207.30
| epoch 2 | 1200/ 2983 batches | lr 20.00 | ms/batch 9.63 | loss 5.33 | ppl 206.66
| epoch 2 | 1400/ 2983 batches | lr 20.00 | ms/batch 9.63 | loss 5.34 | ppl 208.92
| epoch 2 | 1600/ 2983 batches | lr 20.00 | ms/batch 9.63 | loss 5.40 | ppl 220.61
| epoch 2 | 1800/ 2983 batches | lr 20.00 | ms/batch 9.64 | loss 5.26 | ppl 191.61
| epoch 2 | 2000/ 2983 batches | lr 20.00 | ms/batch 9.66 | loss 5.27 | ppl 194.02
| epoch 2 | 2200/ 2983 batches | lr 20.00 | ms/batch 9.66 | loss 5.18 | ppl 178.47
| epoch 2 | 2400/ 2983 batches | lr 20.00 | ms/batch 9.68 | loss 5.21 | ppl 182.31
| epoch 2 | 2600/ 2983 batches | lr 20.00 | ms/batch 9.65 | loss 5.22 | ppl 184.32
| epoch 2 | 2800/ 2983 batches | lr 20.00 | ms/batch 9.65 | loss 5.13 | ppl 169.78
-----------------------------------------------------------------------------------------
| end of epoch 2 | time: 30.67s | valid loss 5.34 | valid ppl 208.12
-----------------------------------------------------------------------------------------
| epoch 3 | 200/ 2983 batches | lr 20.00 | ms/batch 9.63 | loss 5.16 | ppl 173.63
| epoch 3 | 400/ 2983 batches | lr 20.00 | ms/batch 9.60 | loss 5.16 | ppl 174.50
| epoch 3 | 600/ 2983 batches | lr 20.00 | ms/batch 9.61 | loss 5.01 | ppl 149.47
| epoch 3 | 800/ 2983 batches | lr 20.00 | ms/batch 9.58 | loss 5.05 | ppl 155.86
| epoch 3 | 1000/ 2983 batches | lr 20.00 | ms/batch 9.57 | loss 5.05 | ppl 155.96
| epoch 3 | 1200/ 2983 batches | lr 20.00 | ms/batch 9.57 | loss 5.06 | ppl 157.06
| epoch 3 | 1400/ 2983 batches | lr 20.00 | ms/batch 9.56 | loss 5.09 | ppl 161.79
| epoch 3 | 1600/ 2983 batches | lr 20.00 | ms/batch 9.59 | loss 5.15 | ppl 171.77
| epoch 3 | 1800/ 2983 batches | lr 20.00 | ms/batch 9.58 | loss 5.02 | ppl 151.51
| epoch 3 | 2000/ 2983 batches | lr 20.00 | ms/batch 9.60 | loss 5.04 | ppl 155.23
| epoch 3 | 2200/ 2983 batches | lr 20.00 | ms/batch 9.58 | loss 4.96 | ppl 142.48
| epoch 3 | 2400/ 2983 batches | lr 20.00 | ms/batch 9.58 | loss 4.99 | ppl 146.20
| epoch 3 | 2600/ 2983 batches | lr 20.00 | ms/batch 9.57 | loss 5.01 | ppl 149.56
| epoch 3 | 2800/ 2983 batches | lr 20.00 | ms/batch 9.59 | loss 4.94 | ppl 139.34
-----------------------------------------------------------------------------------------
| end of epoch 3 | time: 30.43s | valid loss 5.21 | valid ppl 183.21
-----------------------------------------------------------------------------------------
| epoch 4 | 200/ 2983 batches | lr 20.00 | ms/batch 9.69 | loss 4.97 | ppl 144.18
| epoch 4 | 400/ 2983 batches | lr 20.00 | ms/batch 9.58 | loss 4.98 | ppl 146.15
| epoch 4 | 600/ 2983 batches | lr 20.00 | ms/batch 9.60 | loss 4.83 | ppl 125.41
| epoch 4 | 800/ 2983 batches | lr 20.00 | ms/batch 9.65 | loss 4.88 | ppl 132.25
| epoch 4 | 1000/ 2983 batches | lr 20.00 | ms/batch 9.66 | loss 4.89 | ppl 132.58
| epoch 4 | 1200/ 2983 batches | lr 20.00 | ms/batch 9.65 | loss 4.90 | ppl 134.17
| epoch 4 | 1400/ 2983 batches | lr 20.00 | ms/batch 9.65 | loss 4.93 | ppl 138.52
| epoch 4 | 1600/ 2983 batches | lr 20.00 | ms/batch 9.64 | loss 4.99 | ppl 147.39
| epoch 4 | 1800/ 2983 batches | lr 20.00 | ms/batch 9.65 | loss 4.88 | ppl 131.40
| epoch 4 | 2000/ 2983 batches | lr 20.00 | ms/batch 9.64 | loss 4.91 | ppl 135.39
| epoch 4 | 2200/ 2983 batches | lr 20.00 | ms/batch 9.65 | loss 4.82 | ppl 123.74
| epoch 4 | 2400/ 2983 batches | lr 20.00 | ms/batch 9.59 | loss 4.84 | ppl 127.00
| epoch 4 | 2600/ 2983 batches | lr 20.00 | ms/batch 9.08 | loss 4.87 | ppl 130.44
| epoch 4 | 2800/ 2983 batches | lr 20.00 | ms/batch 9.07 | loss 4.81 | ppl 123.07
-----------------------------------------------------------------------------------------
| end of epoch 4 | time: 30.32s | valid loss 5.10 | valid ppl 164.43
-----------------------------------------------------------------------------------------
| epoch 5 | 200/ 2983 batches | lr 20.00 | ms/batch 9.09 | loss 4.85 | ppl 127.60
| epoch 5 | 400/ 2983 batches | lr 20.00 | ms/batch 9.01 | loss 4.87 | ppl 130.76
| epoch 5 | 600/ 2983 batches | lr 20.00 | ms/batch 9.08 | loss 4.71 | ppl 111.35
| epoch 5 | 800/ 2983 batches | lr 20.00 | ms/batch 9.08 | loss 4.77 | ppl 118.07
| epoch 5 | 1000/ 2983 batches | lr 20.00 | ms/batch 9.15 | loss 4.79 | ppl 119.81
| epoch 5 | 1200/ 2983 batches | lr 20.00 | ms/batch 9.12 | loss 4.79 | ppl 120.78
| epoch 5 | 1400/ 2983 batches | lr 20.00 | ms/batch 9.07 | loss 4.83 | ppl 125.24
| epoch 5 | 1600/ 2983 batches | lr 20.00 | ms/batch 9.09 | loss 4.90 | ppl 134.17
| epoch 5 | 1800/ 2983 batches | lr 20.00 | ms/batch 9.07 | loss 4.78 | ppl 118.81
| epoch 5 | 2000/ 2983 batches | lr 20.00 | ms/batch 9.10 | loss 4.82 | ppl 124.09
| epoch 5 | 2200/ 2983 batches | lr 20.00 | ms/batch 9.15 | loss 4.72 | ppl 112.42
| epoch 5 | 2400/ 2983 batches | lr 20.00 | ms/batch 9.13 | loss 4.74 | ppl 114.95
| epoch 5 | 2600/ 2983 batches | lr 20.00 | ms/batch 9.06 | loss 4.78 | ppl 119.10
| epoch 5 | 2800/ 2983 batches | lr 20.00 | ms/batch 8.99 | loss 4.72 | ppl 112.35
-----------------------------------------------------------------------------------------
| end of epoch 5 | time: 28.89s | valid loss 5.09 | valid ppl 162.26
-----------------------------------------------------------------------------------------
| epoch 6 | 200/ 2983 batches | lr 20.00 | ms/batch 9.20 | loss 4.76 | ppl 116.96
| epoch 6 | 400/ 2983 batches | lr 20.00 | ms/batch 9.12 | loss 4.79 | ppl 120.43
| epoch 6 | 600/ 2983 batches | lr 20.00 | ms/batch 9.23 | loss 4.63 | ppl 102.45
| epoch 6 | 800/ 2983 batches | lr 20.00 | ms/batch 9.16 | loss 4.69 | ppl 108.63
| epoch 6 | 1000/ 2983 batches | lr 20.00 | ms/batch 9.19 | loss 4.71 | ppl 110.59
| epoch 6 | 1200/ 2983 batches | lr 20.00 | ms/batch 9.21 | loss 4.71 | ppl 111.34
| epoch 6 | 1400/ 2983 batches | lr 20.00 | ms/batch 9.18 | loss 4.76 | ppl 116.33
| epoch 6 | 1600/ 2983 batches | lr 20.00 | ms/batch 9.18 | loss 4.82 | ppl 124.24
| epoch 6 | 1800/ 2983 batches | lr 20.00 | ms/batch 9.14 | loss 4.71 | ppl 110.71
| epoch 6 | 2000/ 2983 batches | lr 20.00 | ms/batch 9.12 | loss 4.75 | ppl 115.72
| epoch 6 | 2200/ 2983 batches | lr 20.00 | ms/batch 9.13 | loss 4.65 | ppl 104.74
| epoch 6 | 2400/ 2983 batches | lr 20.00 | ms/batch 9.16 | loss 4.67 | ppl 107.22
| epoch 6 | 2600/ 2983 batches | lr 20.00 | ms/batch 9.17 | loss 4.71 | ppl 111.51
| epoch 6 | 2800/ 2983 batches | lr 20.00 | ms/batch 9.22 | loss 4.65 | ppl 104.99
-----------------------------------------------------------------------------------------
| end of epoch 6 | time: 29.30s | valid loss 5.04 | valid ppl 154.52
-----------------------------------------------------------------------------------------
end_time: 1582645963.4035714
time_cost: 179.85589337348938
=========================================================================================
| End of training | test loss 4.97 | test ppl 143.32
=========================================================================================