fix: initialize bytes up to `len` of `AlignedBuf`. #301

ritchie46 · 2023-07-26T13:33:27Z

It is UB to create Vec with unitialized bytes.

This fixes that by ensuring we only set_len to the length we have written.

Adding a few dbg! statements showing the len and capacity whilst reading this file showed they where off.

[/home/ritchie46/Downloads/simd-json/src/lib.rs:492] len = 1432
[/home/ritchie46/Downloads/simd-json/src/lib.rs:492] input_buffer.capacity = 1496
[/home/ritchie46/Downloads/simd-json/src/lib.rs:492] len = 1431
[/home/ritchie46/Downloads/simd-json/src/lib.rs:492] input_buffer.capacity = 1495

Reading that file in valgrind also showed that we read the uninitialized bytes confirming UB.

Valgrind output:

==180409== Memcheck, a memory error detector
==180409== Copyright (C) 2002-2022, and GNU GPL'd, by Julian Seward et al.
==180409== Using Valgrind-3.20.0 and LibVEX; rerun with -h for copyright info
==180409== Command: ./target/debug/memcheck
==180409== 
==180409== Conditional jump or move depends on uninitialised value(s)
==180409==    at 0x363B96: parse_str_ (deser.rs:130)
==180409==    by 0x363B96: simd_json::stage2::<impl simd_json::Deserializer>::build_tape (stage2.rs:414)
==180409==    by 0x38391C: simd_json::Deserializer::from_slice_with_buffers (lib.rs:507)
==180409==    by 0x383425: simd_json::Deserializer::from_slice_with_buffer (lib.rs:463)
==180409==    by 0x383312: simd_json::Deserializer::from_slice (lib.rs:443)
==180409==    by 0x35145B: simd_json::value::borrowed::to_value (borrowed.rs:49)
==180409==    by 0x31596D: polars_json::ndjson::file::parse_value (file.rs:91)
==180409==    by 0x2DCB58: polars_json::ndjson::file::infer (file.rs:117)
==180409==    by 0x2CD455: polars_io::ndjson::core::CoreJsonReader::new (core.rs:175)
==180409==    by 0x2B5911: <polars_io::json::JsonReader<R> as polars_io::SerReader<R>>::finish (mod.rs:272)
==180409==    by 0x2B4024: memcheck::issue::run (issue.rs:11)
==180409==    by 0x2B3E2A: memcheck::main::{{closure}} (main.rs:16)
==180409==    by 0x2B3D8D: memcheck::main (main.rs:15)
==180409==

Background

I got to this because I have very strange bugs upstream in polars which I cannot replicate and only occur when we compile with fat linking and all optimizations. This made me suspect UB.

pola-rs/polars#9791

pola-rs/polars#10034

ritchie46 · 2023-07-26T13:37:45Z

Looking more into it.

Licenser · 2023-07-26T13:43:14Z

I think the way around this would be to zero the data from len to len + SIMDJSON_PADDING * 2

OTOH it's while reading uninitialized data is UB we don't use the uninitialized data beyond len, the bytes are only read because we read a full register at a time and that might go beyond len but ignore data beyond len.

Not initializing the extra data is an optimisation (it removes a few memory writes)

ritchie46 · 2023-07-26T13:48:45Z

OTOH it's while reading uninitialized data is UB we don't use the uninitialized data beyond len

This doesn't matter. As this talk of Ralf Jung explains. Anything that is created as UB can lead to anything because the compiler is allowed to do so. We must just follow the rules: https://www.youtube.com/watch?v=svR0p6fSUYY

Valgrind also confirmed the UB in this case.

I think the way around this would be to zero the data from len to len + SIMDJSON_PADDING * 2

Yes, that is what I did now by alloc_zeroed. There are two strategies we can use.

alloc_zeroed and let the OS initialize memory.
we initialize all memory behind len ourselves.

In any case, we must initialize memory as now we have UB and there are strange bugs. :')

ritchie46 · 2023-07-26T13:52:58Z

@Licenser, I went for 2.

ritchie46 · 2023-07-26T13:54:11Z

Happy valgrind run now:

==189407== Memcheck, a memory error detector
==189407== Copyright (C) 2002-2022, and GNU GPL'd, by Julian Seward et al.
==189407== Using Valgrind-3.20.0 and LibVEX; rerun with -h for copyright info
==189407== Command: ./target/debug/memcheck
==189407==

Licenser · 2023-07-26T13:59:08Z

zeroing the entire buffer is something we really want to avoid, it's not needed and will really slow down the parsing. Let me explain what happens:

When reading the JSON data we read a SIMD full register at a time, (usually 256 bit) for that two things are important:

the data is correctly aligned
the memory we have is a multiple of the register size

if 1) isn't true reads will be slower as memory regions and cache don't align nicely meaning we'll have more rights.
if 2) isn't true we're at the risk of reading "foreign" memory or overlapping on a page border and crash

The way we deal with that is by allocating a new region of memory that is guaranteed to be aligned and while not being exactly a multiple has at least one extra register worth of space.

The algorithm, being coming from C-land terminates on a zero-byte, so when allocating the new buffer, we terminate it with that zero-byte.

So for example, if the JSON is 40 bytes long, we'd allocate a buffer of say 64 bytes (32 extra bytes to get over the 32 of the register) then write a zero at byte 41 (behind the end of).

The option of learning all memory behind 40 (in this example) with 0's seems like a good approach, the prob with the clear all is that we'll effectively cause ~ 2x as many memory writes as needed once for zeroing the entire data, once for then copying over the original.

Sorry for the long ramble before I saw your comments :D, I'll leave it in. before we merge this lets see I there is a better option then writing a number of 0 bytes, we know the data is aligned and we know we've 2 SIMD registers with to fill with 0's so perhaps we can reduce the writes larger chunks? ('ve not thrown this in godbolt to see how rust optimizes the writes)

ritchie46 · 2023-07-26T14:06:28Z

Right, I understand your issue with alloc_zeroed, I removed that. That would indeed do a double initialize.

The current PR only write the remaining bytes. If the capacity is correctly created, this is only a few bytes.

We could also say that we write the minimum of capacity - len and simd_register_width?

we know the data is aligned and we know we've 2 SIMD registers with to fill with 0's so perhaps we can reduce the writes larger chunks? ('ve not thrown this in godbolt to see how rust optimizes the writes)

Sorry, what do you mean?

ritchie46 · 2023-07-26T17:43:26Z

The failing tests seem related to SIMD (unrelated to this PR).

Licenser · 2023-07-26T19:19:32Z

I checked. It defaults to memset instead of a loop which should optimize to bigger writes (https://rust.godbolt.org/z/edWzh33ss) so 👍 two small optimisations, we don't need the if around the loop, and we don't need to set 0 for len separately :) (see the link)

edit: with removing the extra write and if I'm good to merge :) thanks!

ritchie46 · 2023-07-26T19:36:56Z

Great, shall remove the branch tomorrow.

Licenser · 2023-07-26T19:39:06Z

yay! I'll merge and release it then :)

Licenser

Awesome, thank you :) nice catch, nice fix 👍

Licenser · 2023-07-27T14:21:12Z

released as v0.10.4 :) thanks again for the contribution

fix: don't create Vec with unitialized bytes

86c6dee

ritchie46 closed this Jul 26, 2023

alloc_zeroed

bb976d9

ritchie46 reopened this Jul 26, 2023

manually init

c3cc2de

typo

3ba7c9c

ritchie46 mentioned this pull request Jul 26, 2023

fix(rust, python): fix ub in simd-json pola-rs/polars#10093

Merged

ritchie46 changed the title ~~fix: don't create Vec with unitialized bytes~~ fix: initialize bytes up to len of AlignedBuf. Jul 26, 2023

single memset

946b316

Licenser approved these changes Jul 27, 2023

View reviewed changes

Licenser merged commit cb53426 into simd-lite:main Jul 27, 2023
108 of 180 checks passed

ritchie46 deleted the initialize branch July 31, 2023 11:24

ritchie46 restored the initialize branch July 31, 2023 16:56

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix: initialize bytes up to `len` of `AlignedBuf`. #301

fix: initialize bytes up to `len` of `AlignedBuf`. #301

ritchie46 commented Jul 26, 2023

ritchie46 commented Jul 26, 2023

Licenser commented Jul 26, 2023

ritchie46 commented Jul 26, 2023 •

edited

Loading

ritchie46 commented Jul 26, 2023

ritchie46 commented Jul 26, 2023

Licenser commented Jul 26, 2023

ritchie46 commented Jul 26, 2023

ritchie46 commented Jul 26, 2023

Licenser commented Jul 26, 2023 •

edited

Loading

ritchie46 commented Jul 26, 2023

Licenser commented Jul 26, 2023

Licenser left a comment

Licenser commented Jul 27, 2023

fix: initialize bytes up to len of AlignedBuf. #301

fix: initialize bytes up to len of AlignedBuf. #301

Conversation

ritchie46 commented Jul 26, 2023

Background

ritchie46 commented Jul 26, 2023

Licenser commented Jul 26, 2023

ritchie46 commented Jul 26, 2023 • edited Loading

ritchie46 commented Jul 26, 2023

ritchie46 commented Jul 26, 2023

Licenser commented Jul 26, 2023

ritchie46 commented Jul 26, 2023

ritchie46 commented Jul 26, 2023

Licenser commented Jul 26, 2023 • edited Loading

ritchie46 commented Jul 26, 2023

Licenser commented Jul 26, 2023

Licenser left a comment

Choose a reason for hiding this comment

Licenser commented Jul 27, 2023

fix: initialize bytes up to `len` of `AlignedBuf`. #301

fix: initialize bytes up to `len` of `AlignedBuf`. #301

ritchie46 commented Jul 26, 2023 •

edited

Loading

Licenser commented Jul 26, 2023 •

edited

Loading