Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: initialize bytes up to len of AlignedBuf. #301

Merged
merged 5 commits into from
Jul 27, 2023

Conversation

ritchie46
Copy link
Contributor

It is UB to create Vec with unitialized bytes.

This fixes that by ensuring we only set_len to the length we have written.

Adding a few dbg! statements showing the len and capacity whilst reading this file showed they where off.

[/home/ritchie46/Downloads/simd-json/src/lib.rs:492] len = 1432
[/home/ritchie46/Downloads/simd-json/src/lib.rs:492] input_buffer.capacity = 1496
[/home/ritchie46/Downloads/simd-json/src/lib.rs:492] len = 1431
[/home/ritchie46/Downloads/simd-json/src/lib.rs:492] input_buffer.capacity = 1495

Reading that file in valgrind also showed that we read the uninitialized bytes confirming UB.

Valgrind output:

==180409== Memcheck, a memory error detector
==180409== Copyright (C) 2002-2022, and GNU GPL'd, by Julian Seward et al.
==180409== Using Valgrind-3.20.0 and LibVEX; rerun with -h for copyright info
==180409== Command: ./target/debug/memcheck
==180409== 
==180409== Conditional jump or move depends on uninitialised value(s)
==180409==    at 0x363B96: parse_str_ (deser.rs:130)
==180409==    by 0x363B96: simd_json::stage2::<impl simd_json::Deserializer>::build_tape (stage2.rs:414)
==180409==    by 0x38391C: simd_json::Deserializer::from_slice_with_buffers (lib.rs:507)
==180409==    by 0x383425: simd_json::Deserializer::from_slice_with_buffer (lib.rs:463)
==180409==    by 0x383312: simd_json::Deserializer::from_slice (lib.rs:443)
==180409==    by 0x35145B: simd_json::value::borrowed::to_value (borrowed.rs:49)
==180409==    by 0x31596D: polars_json::ndjson::file::parse_value (file.rs:91)
==180409==    by 0x2DCB58: polars_json::ndjson::file::infer (file.rs:117)
==180409==    by 0x2CD455: polars_io::ndjson::core::CoreJsonReader::new (core.rs:175)
==180409==    by 0x2B5911: <polars_io::json::JsonReader<R> as polars_io::SerReader<R>>::finish (mod.rs:272)
==180409==    by 0x2B4024: memcheck::issue::run (issue.rs:11)
==180409==    by 0x2B3E2A: memcheck::main::{{closure}} (main.rs:16)
==180409==    by 0x2B3D8D: memcheck::main (main.rs:15)
==180409== 

Background

I got to this because I have very strange bugs upstream in polars which I cannot replicate and only occur when we compile with fat linking and all optimizations. This made me suspect UB.

pola-rs/polars#9791

pola-rs/polars#10034

@ritchie46 ritchie46 closed this Jul 26, 2023
@ritchie46
Copy link
Contributor Author

Looking more into it.

@Licenser
Copy link
Member

I think the way around this would be to zero the data from len to len + SIMDJSON_PADDING * 2

OTOH it's while reading uninitialized data is UB we don't use the uninitialized data beyond len, the bytes are only read because we read a full register at a time and that might go beyond len but ignore data beyond len.

Not initializing the extra data is an optimisation (it removes a few memory writes)

@ritchie46 ritchie46 reopened this Jul 26, 2023
@ritchie46
Copy link
Contributor Author

ritchie46 commented Jul 26, 2023

OTOH it's while reading uninitialized data is UB we don't use the uninitialized data beyond len

This doesn't matter. As this talk of Ralf Jung explains. Anything that is created as UB can lead to anything because the compiler is allowed to do so. We must just follow the rules: https://www.youtube.com/watch?v=svR0p6fSUYY

Valgrind also confirmed the UB in this case.

I think the way around this would be to zero the data from len to len + SIMDJSON_PADDING * 2

Yes, that is what I did now by alloc_zeroed. There are two strategies we can use.

  1. alloc_zeroed and let the OS initialize memory.
  2. we initialize all memory behind len ourselves.

In any case, we must initialize memory as now we have UB and there are strange bugs. :')

@ritchie46
Copy link
Contributor Author

@Licenser, I went for 2.

@ritchie46
Copy link
Contributor Author

Happy valgrind run now:

==189407== Memcheck, a memory error detector
==189407== Copyright (C) 2002-2022, and GNU GPL'd, by Julian Seward et al.
==189407== Using Valgrind-3.20.0 and LibVEX; rerun with -h for copyright info
==189407== Command: ./target/debug/memcheck
==189407== 

@Licenser
Copy link
Member

zeroing the entire buffer is something we really want to avoid, it's not needed and will really slow down the parsing. Let me explain what happens:

When reading the JSON data we read a SIMD full register at a time, (usually 256 bit) for that two things are important:

  1. the data is correctly aligned
  2. the memory we have is a multiple of the register size
  • if 1) isn't true reads will be slower as memory regions and cache don't align nicely meaning we'll have more rights.
  • if 2) isn't true we're at the risk of reading "foreign" memory or overlapping on a page border and crash

The way we deal with that is by allocating a new region of memory that is guaranteed to be aligned and while not being exactly a multiple has at least one extra register worth of space.

The algorithm, being coming from C-land terminates on a zero-byte, so when allocating the new buffer, we terminate it with that zero-byte.

So for example, if the JSON is 40 bytes long, we'd allocate a buffer of say 64 bytes (32 extra bytes to get over the 32 of the register) then write a zero at byte 41 (behind the end of).

The option of learning all memory behind 40 (in this example) with 0's seems like a good approach, the prob with the clear all is that we'll effectively cause ~ 2x as many memory writes as needed once for zeroing the entire data, once for then copying over the original.

Sorry for the long ramble before I saw your comments :D, I'll leave it in. before we merge this lets see I there is a better option then writing a number of 0 bytes, we know the data is aligned and we know we've 2 SIMD registers with to fill with 0's so perhaps we can reduce the writes larger chunks? ('ve not thrown this in godbolt to see how rust optimizes the writes)

@ritchie46
Copy link
Contributor Author

Right, I understand your issue with alloc_zeroed, I removed that. That would indeed do a double initialize.

The current PR only write the remaining bytes. If the capacity is correctly created, this is only a few bytes.

We could also say that we write the minimum of capacity - len and simd_register_width?

we know the data is aligned and we know we've 2 SIMD registers with to fill with 0's so perhaps we can reduce the writes larger chunks? ('ve not thrown this in godbolt to see how rust optimizes the writes)

Sorry, what do you mean?

@ritchie46 ritchie46 changed the title fix: don't create Vec with unitialized bytes fix: initialize bytes up to len of AlignedBuf. Jul 26, 2023
@ritchie46
Copy link
Contributor Author

The failing tests seem related to SIMD (unrelated to this PR).

@Licenser
Copy link
Member

Licenser commented Jul 26, 2023

I checked. It defaults to memset instead of a loop which should optimize to bigger writes (https://rust.godbolt.org/z/edWzh33ss) so 👍 two small optimisations, we don't need the if around the loop, and we don't need to set 0 for len separately :) (see the link)

edit: with removing the extra write and if I'm good to merge :) thanks!

@ritchie46
Copy link
Contributor Author

Great, shall remove the branch tomorrow.

@Licenser
Copy link
Member

yay! I'll merge and release it then :)

Copy link
Member

@Licenser Licenser left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome, thank you :) nice catch, nice fix 👍

@Licenser Licenser merged commit cb53426 into simd-lite:main Jul 27, 2023
108 of 180 checks passed
@Licenser
Copy link
Member

released as v0.10.4 :) thanks again for the contribution

@ritchie46 ritchie46 deleted the initialize branch July 31, 2023 11:24
@ritchie46 ritchie46 restored the initialize branch July 31, 2023 16:56
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants