clarify what is UB #149

Merged
merged 21 commits into from
Aug 16, 2019
Changes from 11 commits
28 changes: 23 additions & 5 deletions src/what-unsafe-does.md
@@ -16,18 +16,35 @@ to your program. You definitely *should not* invoke Undefined Behavior.
Unlike C, Undefined Behavior is pretty limited in scope in Rust. All the core
language cares about is preventing the following things:

* Dereferencing null, dangling, or unaligned pointers
* Dereferencing (using the `*` operator on) null, dangling, or unaligned
pointers
* Reading [uninitialized memory][]
* Breaking the [pointer aliasing rules][]
* Producing invalid primitive values:
* dangling/null references
* null `fn` pointers
* Producing invalid primitive values (either alone or as a field of a compound
type such as `enum`/`struct`/array/tuple):
Review comment (Contributor):

All of our bickering about unions does actually wrap back around to this point:

is union { a: bool; b: bool; } = 3 "producing an invalid value as a field of a compound type"?

(we can probably gloss over this, but it is something to make clearer when we have a better answer)

Review comment (Member Author):

The answer is "we don't know yet, see rust-lang/unsafe-code-guidelines#73".

So yes, this is a good question, and one that I would prefer we could skip over for now.

* a `bool` that isn't 0 or 1
* an undefined `enum` discriminant
* null `fn` pointers
* a `char` outside the ranges [0x0, 0xD7FF] and [0xE000, 0x10FFFF]
* A non-utf8 `str`
* a `!` (all values are invalid for this type)
* dangling/null/unaligned references, references that do themselves point to
invalid values, or fat references (to a dynamically sized type) with
invalid metadata
* a non-utf8 `str`
Review comment (Contributor):

reword for consistency:

  • a str that isn't valid utf8

Review comment (Member Author):

Or maybe we want to skip this entirely because this is just a library invariant?

* an uninitialized integer (`i*`/`u*`) or floating point value (`f*`)
* an invalid library type with custom invalid values, such as a `NonNull` or
`NonZero*` that is 0
* Unwinding into another language
* Causing a [data race][race]
* Executing code compiled with platform features that the current platform does
not support (see [`target_feature`])
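
As an illustration of the invalid-value bullets above, here is a minimal sketch of how safe code avoids producing such values: it goes through checked constructors that refuse bad bit patterns. The helper `bool_from_byte` is hypothetical and not part of the text under review; the other calls are standard library functions.

```rust
use std::num::NonZeroU32;

/// Hypothetical helper: turn a raw byte into a `bool` only if it is 0 or 1.
fn bool_from_byte(b: u8) -> Option<bool> {
    match b {
        0 => Some(false),
        1 => Some(true),
        _ => None, // any other bit pattern is not a valid `bool`
    }
}

fn main() {
    // 3 is not a valid `bool`, so the checked helper refuses it.
    assert_eq!(bool_from_byte(1), Some(true));
    assert_eq!(bool_from_byte(3), None);

    // Surrogates sit in the gap between 0xD7FF and 0xE000.
    assert!(char::from_u32(0xD7FF).is_some());
    assert!(char::from_u32(0xD800).is_none());
    // Anything above 0x10FFFF is rejected as well.
    assert!(char::from_u32(0x11_0000).is_none());

    // The byte 0xFF never appears in valid UTF-8.
    assert!(std::str::from_utf8(&[0xFF]).is_err());

    // `NonZeroU32` has no value for 0.
    assert!(NonZeroU32::new(0).is_none());
}
```

Bypassing these checks with `unsafe` (for example via `transmute` or an `_unchecked` constructor) is where the responsibility for validity shifts to the programmer.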

"Producing" a value happens any time a value is assigned, passed to a
function/primitive operation or returned from a function/primitive operation.
Review comment (Contributor):

suggested reword and massive clarification:

Many have trouble accepting the consequences of invalid values, so they merit some extra discussion. The claim being made here is a very strong one, so read carefully.

A value is produced whenever it is assigned, passed to something, or returned from something. Keep in mind references get to assume their referents are valid, so you can't even create a reference to an invalid value. Additionally, uninitialized memory is always invalid, so you can't assign it to anything, pass it to anything, return it from anything, or take a reference to it. (Padding bytes are not technically part of a value's memory, and so may be left uninitialized.)

In simple and blunt terms: you cannot ever even suggest the existence of an invalid value. No, it's not ok if you "don't use" or "don't read" the value. Invalid values are instant Undefined Behaviour. The only correct way to manipulate memory that could be invalid is with raw pointers using methods like `write` and `copy`. If you want to leave a local variable or struct field uninitialized (or otherwise invalid), you must use a union or enum which clearly indicates at the type level that this memory may contain no values (see `MaybeUninit` for details).

Review comment (Member Author):

I applied most of your suggestions but this one is big enough that it is probably easier to hand the PR off to you. ;) I'd love to do a pass over what you got when you are done, if you don't mind.

I like this new text, as usual in your very pointed style! One comment though:

Additionally, uninitialized memory is always invalid, so you can't assign it to anything

That's not true for MaybeUninit.
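
To make the `MaybeUninit` exception concrete, here is a minimal sketch, assuming a Rust version where `MaybeUninit::write` is stable. The function name `init_later` is hypothetical and only serves the illustration.

```rust
use std::mem::MaybeUninit;

/// Hypothetical helper: reserve a slot first, initialize it later.
fn init_later() -> u32 {
    // Uninitialized memory may live inside a `MaybeUninit<u32>`;
    // the same bytes typed as a plain `u32` would be an invalid value.
    let slot: MaybeUninit<u32> = MaybeUninit::uninit();

    // Assigning the still-uninitialized wrapper to another binding is allowed.
    let mut moved = slot;

    // Initialize the slot without ever reading the old contents.
    moved.write(42);

    // SAFETY: `moved` was fully initialized by the `write` above.
    unsafe { moved.assume_init() }
}

fn main() {
    assert_eq!(init_later(), 42);
}
```

Calling `assume_init` before the slot has been written would produce an uninitialized `u32`, which is exactly the case the list in the diff rules out.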


A reference/pointer is "dangling" if not all of the bytes it points to are part
of the same allocation. The span of bytes it points to is determined by the
pointer value and the size of the pointee type.
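
The span-based definition above can be sketched as a plain address comparison. This is an illustration only, not how the compiler or any tool checks it: the helper `span_in_allocation` is hypothetical, it compares addresses without dereferencing anything, and it ignores alignment, which is a separate requirement.

```rust
use std::mem::size_of;

/// Hypothetical helper: does the span `ptr .. ptr + size_of::<T>()` lie
/// entirely inside the allocation `base .. base + len`?
/// Only addresses are compared; no pointer is dereferenced.
fn span_in_allocation<T>(ptr: *const T, base: *const u8, len: usize) -> bool {
    let start = ptr as usize;
    let alloc_start = base as usize;
    // For a real allocation this addition does not overflow.
    let alloc_end = alloc_start + len;
    match start.checked_add(size_of::<T>()) {
        Some(end) => start >= alloc_start && end <= alloc_end,
        None => false,
    }
}

fn main() {
    let buf = [0u8; 4];
    let base = buf.as_ptr();

    // A `u32` starting at byte 0 spans bytes 0..4, all inside `buf`.
    let whole = base as *const u32;
    // A `u32` starting at byte 2 would span bytes 2..6; two of those bytes
    // fall outside `buf`, so a pointer of this type here is dangling.
    let past = base.wrapping_add(2) as *const u32;

    assert!(span_in_allocation(whole, base, buf.len()));
    assert!(!span_in_allocation(past, base, buf.len()));
}
```

Note that the same address is fine as a `*const u8` and dangling as a `*const u32`: the pointee type's size is part of the definition.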

That's it. That's all the causes of Undefined Behavior baked into Rust. Of
course, unsafe functions and traits are free to declare arbitrary other
@@ -58,3 +75,4 @@ these problems are considered impractical to categorically prevent.
[pointer aliasing rules]: references.html
[uninitialized memory]: uninitialized.html
[race]: races.html
[`target_feature`]: ../reference/attributes/codegen.html#the-target_feature-attribute