forked from samtools/htscodecs
-
Notifications
You must be signed in to change notification settings - Fork 1
/
NEWS
157 lines (104 loc) · 5.35 KB
/
NEWS
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
Release 1.2.2: 1st April 2022
-----------------------------
This release contains some fixes found during fuzzing with Clang's
memory-sanitizer. None of these are involving writing memory so there
is no possibility for code execution vulnerabilities. However some do
could access uninitialised elements in locally allocated memory, which
could leak private data if the library was used in conjunction with
other tools which don't zero sensitive data before freeing.
Bug fixes:
- The name tokeniser now validates the stored length in the data
stream matches the actual decoded length. Discovered by Taotao Gu.
- Fixed an endless loop in arith_dynamic and rans4x16pr involving
X_STRIPE with 0 stripes.
- Avoid a harmless (and wrong?) undefined behaviour sanitizer error
when calling memcpy(ptr, NULL, 0) in the name tokeniser.
- Fixed possible uninitialised memory access in
rans_uncompress_O1_4x16. If the frequency table didn't add up to
the correct amount, parts of the "fb" table were left unpopulated.
It was then possible to use these array elements in some of the rANS
calculations.
- Similarly rans_uncompress_O0 could access an uninitialised element
4095 of the decoder tables if the frequencies summed to 4095 instead
of the expected 4096.
- Improved error detection from fqzcomp's read_array function.
- Reject fqzcomp parameters with inconsistent "sel" parameters, which
could lead to uninitialised access to the model.sel range coder.
Release 1.2.1: 15th February 2022
---------------------------------
The only change in this release is a minor adjustment to the histogram
code so it works on systems with small stacks. This was detected on
Windows Mingw builds.
Release 1.2: 10th February 2022
-------------------------------
This release contains the following minor changes.
Please see the "git log" for the full details.
Improvements / changes:
- Speed up of rANS4x16 order-0. We now use a branchless encoder
renormalisation step. For complex data it's between 13 and 50%
speed up depending on compiler.
- Improve rANS4x16 compute_shift estimates. The entropy calculation
is now more accurate. This leads to more frequent use of the 10-bit
frequency mode, at an expense of up to 1% size growth.
- Speed improvements to the striped rANS mode, both encoding and
decoding. Encoder gains ~8% and decoder ~5%, but varies
considerably by compiler and data.
- Added new var_put_u64_safe and var_put_u32_safe interfaces.
These are automatically used by var_put_u64 and var_put_u32 when
near the end of the buffer, but may also be called directly.
- Small speed ups to the hist8 and hist1_4 functions.
- Minor speed up to RLE decoding.
Bug fixes:
- Work around an icc-2021 compiler bug, but also speed up the varint
encoding too (#29).
- Fix an off-by-one error in the initial size check in arith_dynamic.
This meant the very smallest of blocks could fail to decode.
Reported by Divon Lan.
- Fixed hist1_4 to also count the last byte when computing T0[].
- Fixed overly harsh bounds checking in the fqzcomp read_array
function, which meant it failed to decode some configurations.
Release 1.1.1: 6th July 2021
----------------------------
This release contains the following minor changes.
Please see the "git log" for the full details.
Improvements / changes:
- Modernised autoconf usage to avoid warnings with newer versions.
(John Marshall)
- Avoid using awk with large records, due to some systems
(e.g. Solaris / OpenIndiana) with line length limits .
(John Marshall)
- Applied Debian patch to make the library link against -lm.
Bug fixes:
- Fixed an issue with the name tokeniser when a slice (name_context)
has exactly 1 more name than the previous call. (James Bonfield)
- Removed access to an uninitialised variable in the name tokeniser
decode when given malformed data. This occurs when we use delta
encoding for the very first name. (James Bonfield, OSS-Fuzz)
- Minor fixes to distcheck and distclean targets
Release 1.0: 23rd Feb 2021
--------------------------
This marks the first non-beta release of htscodecs, following a
perioid of integration with Htslib and automated fuzzing by Google's
OSS-Fuzz program.
[Note this testing only applies to the C implementation. The
JavaScript code should still be considered as examples of the codecs,
more for purposes of understanding and clarity than as a fully
optimised and tested release.]
Since the last release (0.5) the key changes are:
- Improved support for big endian platforms
- Speed improvements to CRAM 3.0 4x8 rANS order-1 encoding.
It's between 10 and 50% faster at encoding, based on input data.
- Improved autoconf bzip2 checks and tidy up "make test" output.
- Added some more files into "make install", so that "make distcheck"
now passes.
- Replaced Travis with Cirrus-CI testing.
- Removed various C undefined behaviour, such as left shifting of
negative values and integer overflows. As far as we know these were
currently harmless on the supported platforms, but may break future
compiler optimisations.
- Fixed numerous OSS-Fuzz identified flaws. Some of these were
potential security issues such as small buffer overruns.
- Tidied up some code to prevent warnings.
- The name tokeniser now has a limit on the size of data it can encode
(10 million records). This may still be too high given the memory
it will require, so it may be reduced again.