Unified identifer for entities & relations #9797

Bluefinger · 2023-09-13T14:27:25Z

Objective

The purpose of this PR is to begin putting together a unified identifier structure that can be used by entities and later components (as entities) as well as relationship pairs for relations, to enable all of these to be able to use the same storages. For the moment, to keep things small and focused, only Entity is being changed to make use of the new Identifier type, keeping Entity's API and serialization/deserialization the same. Further changes are for follow-up PRs.

Solution

Identifier is a wrapper around u64 split into two u32 segments with the idea of being generalised to not impose restrictions on variants. That is for Entity to do. Instead, it is a general API for taking bits to then merge and map into a u64 integer. It exposes low/high methods to return the two value portions as u32 integers, with then the MSB masked for usage as a type flag, enabling entity kind discrimination and future activation/deactivation semantics.

The layout in this PR for Identifier is described as below, going from MSB -> LSB.

|F| High value                    | Low value                      |
|_|_______________________________|________________________________|
|1| 31                            | 32                             |

F = Bit Flags

The high component in this implementation has only 31 bits, but that still leaves 2^31 or 2,147,483,648 values that can be stored still, more than enough for any generation/relation kinds/etc usage. The low part is a full 32-bit index. The flags allow for 1 bit to be used for entity/pair discrimination, as these have different usages for the low/high portions of the Identifier. More bits can be reserved for more variants or activation/deactivation purposes, but this currently has no use in bevy.

More bits could be reserved for future features at the cost of bits for the high component, so how much to reserve is up for discussion. Also, naming of the struct and methods are also subject to further bikeshedding and feedback.

Also, because IDs can have different variants, I wonder if Entity::from_bits needs to return a Result instead of potentially panicking on receiving an invalid ID.

PR is provided as an early WIP to obtain feedback and notes on whether this approach is viable.

Changelog

Added

New Identifier struct for unifying IDs.

Changed

Entity changed to use new Identifier/IdentifierMask as the underlying ID logic.

maniwani · 2023-09-13T16:25:30Z

I don't think Entity should wrap an Identifier. It makes the Entity implementation overly complicated. I would leave Entity untouched and write an impl From<Entity> for Identifier.

Entity, Pair (subject to bikeshedding), and Identifier should have the same structure. Entity is a u32 pair right now, so that's what the other two should be.

#[derive(Clone, Copy, PartialEq, Eq, PartialOrd, Ord, Hash)]
pub struct Identifier {
    hi: u32,
    lo: u32,
}

If we want them to be u64 (e.g. because 8-byte alignment increases performance or something), that should be the focus of another PR. Likewise, the "active/inactive" flag should be saved for another PR. It's out-of-scope here.

Also, the viability of using a bit pattern like this has already been proven by flecs. We know it works.

Bluefinger · 2023-09-13T16:39:32Z

Having From implementations between the different types so Entity -> Identifier -> Entity etc is a good idea. However, there's some concerns here as to keep in mind:

Guaranteeing layout: AFAIK, Rust does not guarantee the layout of a struct unless you slap a #[repr(C)] on it and it then assumes C semantics for struct layout. Rust will reorder fields as it sees fit. So, either we use u64 and mask things directly, or we slap #[repr(C)] on the likes of Entity/Identifier/etc. Either way, we guarantee the output, which is important.
Not repeating logic: Masking the bits for the high/flag components will have to be repeated across Entity/Identifier, so that the From conversions between them are compatible and lossless. The current implementation would still work with From implementations. We shouldn't rely on internal details of Entity in the first place, but on its API surface, so that it doesn't matter if it is one u64, two u32s, or three kids in a trenchcoat.
Future layout changes will break the serialised format. Hence why we probably should reserve some bits for future proofing/forward compatibility, even if active/inactive flags are not needed right now.

maniwani · 2023-09-13T17:12:27Z

Guaranteeing layout: AFAIK, Rust does not guarantee the layout of a struct unless you slap a #[repr(C)] on it and it then assumes C semantics for struct layout. Rust will reorder fields as it sees fit. So, either we use u64 and mask things directly, or we slap #[repr(C)] on the likes of Entity/Identifier/etc. Either way, we guarantee the output, which is important.

None of these structs need #[repr(C)] unless we need them to be stable for FFI. There's no need to transmute an Identifier into an Entity (or the reverse).

Not repeating logic: Masking the bits for the high/flag components will have to be repeated across Entity/Identifier, so that the From conversions between them are compatible and lossless. The current implementation would still work with From implementations. We shouldn't rely on internal details of Entity in the first place, but on its API surface, so that it doesn't matter if it is one u64, two u32s, or three kids in trenchcoat.

From is not allowed to fail. We can't have an impl From<Identifier> for Entity since an identifier could be for a Pair.

I'm specifically placing importance on keeping the Entity implementation as easy to follow as possible. (I'm also trying to help you make this PR as uncontroversial as possible.) Duplicating a couple lines of code masking some bits is fine, but you don't even need to do that. If the Entity API needs a method that returns its flags, we can just use the Identifier API.

impl Entity {
    pub fn flags(&self) -> IdFlags
        let id: Identifier = self.into();
        id.flags()
    }
}

And checking specific flags shouldn't be part of either API. We should use the bitflags crate to make an IdFlags struct for that.

Future layout changes will break the serialized format. Hence why we probably should reserve some bits for future proofing/forward compatibility, even if active/inactive flags are not needed right now.

Any other flag bits are outside the scope of creating a unified identifier for entities and relationships. Adding this Identifier type and reserving space for more flags it could have should be separate PRs.

Bluefinger · 2023-09-13T19:59:08Z

I'm working on things already (thanks for the feedback!), but I've got some things to raise as I've worked on this so far:

From is not allowed to fail. We can't have an impl From for Entity since an identifier could be for a Pair.

Indeed, we can only go from Entity to Identifier with From, but the other way round requires TryFrom, which I'll be providing as well (which will be useful for deserialization).

I'm specifically placing importance on keeping the Entity implementation as easy to follow as possible. (I'm also trying to help you make this PR as uncontroversial as possible.) Duplicating a couple lines of code masking some bits is fine, but you don't even need to do that. If the Entity API needs a method that returns its flags, we can just use the Identifier API.

And thanks for that! Though I don't think I can leave Entity untouched entirely because a unified ID imposes additional logic & constraints on the internals of Entity. So I'll be leaving the changes I put in with ::new() and the index() & generation() methods in place. I think it makes for a cleaner experience, and also helps keep logic with masking/validation/etc within Identifier. Basically, Entity is becoming an abstraction over Identifier, or specifically, a subset of an Identifier, and so will Pair, etc. I think this is ultimately unavoidable, but we can still take care in how we approach it so to keep it parsable. But the fact that so many methods of Entity are const makes things tricky (can't use From/TryFrom inside const methods).

Also, do we expose Identifier at all? Or keep it as only a restricted type that you can convert to/from, and that's about it.

And checking specific flags shouldn't be part of either API. We should use the bitflags crate to make an IdFlags struct for that.

As for bitflags, I think bringing that in is probably for a different PR. If we are going to expand beyond just one flag, then that will fit in with discussions on how many flags/bits we want to reserve and that's where bitflags will be more useful for. Right now, there's only one bit to toggle for Entity/Pair discrimination, so bringing in a whole dependency seems a bit overkill right at this point.

Bluefinger · 2023-09-14T14:44:39Z

Right, current state of the PR:

Moved masking logic out of `Identifier/Entity` and into `IdentifierMask`

To prep for future work with bitflags, etc, masking/packing logic now lives in IdentifierMask so that it can be shared between Identifier/Entity and in the future Pair and whatever else that needs it without duplicating code. It also means all our packing/masking rules live in one place to be tested/documented. Plus it gets around the const restrictions for Entity/Identifier by not relying on From/TryFrom traits for sharing logic.

`Identifier`/`Entity` `From`/`TryFrom` implementations

From/TryFrom impls are now provided, though because Entity methods are all const, they can't really be used for Entity internals, but most of the masking logic was moved into IdentifierMask anyway so not so much an issue. We now have a failure mode for deserialising Entity, which a u64 value is not a valid Entity id. I've provided an Error type for ID conversion failures, which will be extended once we have need for more variants.

Various tidy-ups

Removed redundant types, organised code a bit better, etc. Should be neater now.

Everything else like naming, conventions, etc, subject to further feedback & bikeshedding.

crates/bevy_ecs/src/entity/mod.rs

crates/bevy_ecs/src/identifier/error.rs

crates/bevy_ecs/src/identifier/kinds.rs

crates/bevy_ecs/src/identifier/masks.rs

crates/bevy_ecs/src/identifier/mod.rs

alice-i-cecile · 2023-09-14T15:35:51Z

Plenty of nits, but to be clear: I'm on board with this direction. This is an abstraction that we're going to want to reuse, and building out a tested, documented API for it now is a good way to split the work.

Bluefinger · 2023-09-14T17:23:52Z

crates/bevy_ecs/src/identifier/masks.rs

+
+        assert_eq!(
+            IdentifierMask::pack_kind_into_high(high, IdKind::Placeholder),
+            0xC0FF_EEEE // Milk and no sugar, please.


I may have had a bit too much fun coming up with test cases... 😄

alice-i-cecile · 2023-09-14T17:59:06Z

Alright, this is looking good now :) Have I missed something, or should this come out of draft now?

Bluefinger · 2023-09-14T18:06:49Z

Alright, this is looking good now :) Have I missed something, or should this come out of draft now?

I just amended the changelog for the PR, but otherwise if everything more or less looks 'good' then I'll set the PR as ready for full review.

alice-i-cecile

Code looks good to me now, and I think that this is a step in the right direction. I'll defer to @maniwani about when we should merge this though.

In the meantime, can you run some benchmarks to make sure that this didn't regress performance? It should all just be transparently in-lined, but compilers are fickle.

Bluefinger · 2023-09-15T17:14:27Z

I did a bunch of benchmarks, but my machine doesn't seem to be very... reliable at outputting numbers without huge variations between runs. My laptop is a Thinkpad A285, AMD Ryzen 2500U 4c/8t processor, 16GB of RAM, etc. In summary though, I see regressions on spawn commands between like 2-5% when my machine is not creating runs with like 50%-100% variations, and the spawn world ones are consistently like 20-30% regression when not having variations of like 100% every run. I basically redid as many of the benchmarks, closing as many apps until I could get something more consistent, but some of these still vary quite a bit per run. Here's what I've found:

group                           master                                 unified_identifier
-----                           ------                                 -------
get_or_spawn/batched            1.04   349.5±37.08µs        ? ?/sec    1.00   335.6±13.15µs        ? ?/sec
get_or_spawn/individual         1.01   548.8±45.00µs        ? ?/sec    1.00   544.8±25.84µs        ? ?/sec
spawn_commands/2000_entities    1.00    206.0±6.15µs        ? ?/sec    1.03   211.8±15.97µs        ? ?/sec
spawn_commands/4000_entities    1.00   415.6±13.13µs        ? ?/sec    1.03   428.0±31.00µs        ? ?/sec
spawn_commands/6000_entities    1.00   628.6±24.46µs        ? ?/sec    1.02   641.0±36.88µs        ? ?/sec
spawn_commands/8000_entities    1.01   840.8±34.00µs        ? ?/sec    1.00   833.0±22.66µs        ? ?/sec
spawn_world/10000_entities      1.00   648.8±49.50µs        ? ?/sec    1.19   774.4±95.25µs        ? ?/sec
spawn_world/1000_entities       1.00     64.0±5.32µs        ? ?/sec    1.28    81.8±14.08µs        ? ?/sec
spawn_world/100_entities        1.00      6.4±0.55µs        ? ?/sec    1.25      8.0±1.39µs        ? ?/sec
spawn_world/10_entities         1.00   631.5±49.78ns        ? ?/sec    1.27  803.9±124.91ns        ? ?/sec
spawn_world/1_entities          1.00     64.2±5.91ns        ? ?/sec    1.29    82.6±13.98ns        ? ?/sec

EDIT: I redid some of the benchmarks to get values that are more consistent than the weird spikes of 100% that I keep getting from my laptop.

Bluefinger · 2023-09-16T17:30:27Z

I reconfigured my laptop a bit to be able to get more consistent result without the weird perf spikes, so I redid both the main/branch benches to see if I still got the same perf results as before. It seems things are not showing the same 20-30% perf regression any more:

group                           main                                   unified
-----                           ----                                   -------
get_or_spawn/batched            1.00   338.1±11.89µs        ? ?/sec    1.00   337.9±11.36µs        ? ?/sec
get_or_spawn/individual         1.00   544.0±51.60µs        ? ?/sec    1.01   549.7±28.57µs        ? ?/sec
spawn_commands/2000_entities    1.00   213.3±14.96µs        ? ?/sec    1.01   214.8±16.76µs        ? ?/sec
spawn_commands/4000_entities    1.00   429.9±31.75µs        ? ?/sec    1.01   433.8±32.14µs        ? ?/sec
spawn_commands/6000_entities    1.00   643.6±31.53µs        ? ?/sec    1.02   658.2±41.09µs        ? ?/sec
spawn_commands/8000_entities    1.00   828.5±19.50µs        ? ?/sec    1.03   856.1±60.16µs        ? ?/sec
spawn_world/10000_entities      1.00   776.1±86.64µs        ? ?/sec    1.04   805.4±93.23µs        ? ?/sec
spawn_world/1000_entities       1.02    81.2±11.74µs        ? ?/sec    1.00    79.3±12.49µs        ? ?/sec
spawn_world/100_entities        1.00      7.8±1.21µs        ? ?/sec    1.00      7.9±1.28µs        ? ?/sec
spawn_world/10_entities         1.00  789.5±121.77ns        ? ?/sec    1.00  792.6±129.18ns        ? ?/sec
spawn_world/1_entities          1.01    81.7±13.00ns        ? ?/sec    1.00    80.7±11.77ns        ? ?/sec

Since my machine is not exactly super reliable, I ask someone else to perhaps benchmark this PR as well, just to confirm what I am seeing.

atlv24 · 2023-09-23T19:14:27Z

Current main vs this branch:

group                           main                                    unified
-----                           ----                                    -------
get_or_spawn/batched            1.00   302.0±34.44µs        ? ?/sec     1.11  336.4±218.86µs        ? ?/sec
get_or_spawn/individual         1.00   463.6±51.26µs        ? ?/sec     1.02   472.9±51.34µs        ? ?/sec
spawn_commands/2000_entities    1.03   176.3±22.37µs        ? ?/sec     1.00    171.1±5.93µs        ? ?/sec
spawn_commands/4000_entities    1.00   339.1±45.48µs        ? ?/sec     1.00   337.5±14.30µs        ? ?/sec
spawn_commands/6000_entities    1.00   496.2±38.05µs        ? ?/sec     1.08  536.8±204.47µs        ? ?/sec
spawn_commands/8000_entities    1.00   670.4±44.49µs        ? ?/sec     1.01   676.9±51.99µs        ? ?/sec
spawn_world/10000_entities      1.00  1129.2±179.11µs        ? ?/sec    1.01  1134.9±173.03µs        ? ?/sec
spawn_world/1000_entities       1.00   115.0±19.69µs        ? ?/sec     1.07   122.7±27.47µs        ? ?/sec
spawn_world/100_entities        1.00     11.4±2.06µs        ? ?/sec     1.00     11.4±2.13µs        ? ?/sec
spawn_world/10_entities         1.00  1118.8±217.67ns        ? ?/sec    1.09  1219.7±343.26ns        ? ?/sec
spawn_world/1_entities          1.01   116.2±19.79ns        ? ?/sec     1.00   114.9±21.12ns        ? ?/sec

atlv24

I like the consistent ordering of index and generation that the constructor ensures but im a bit turned off by the loss of explicitly naming the fields. In practice I dont think this is an issue, as getting the order wrong will be very obviously broken most of the time. Performance seems unaffected (up to noise)

alice-i-cecile · 2024-01-09T19:45:30Z

Going to see if we can get a fresh round of reviews to validate the merge with entity niching.

crates/bevy_ecs/src/entity/mod.rs

crates/bevy_ecs/src/identifier/masks.rs

Bluefinger · 2024-01-10T08:59:17Z

To add further info for review: The generation increment now has to take into account masking in order to prevent overflowing into the flag bits. As such, there is additional overhead being introduced with increment to ensure this operation is safe and doesn't try to do any funny business should values greater than the HIGH_MASK be introduced, either as the index or the increment.

ASM example link: https://rust.godbolt.org/z/4T9o4Gdc6

inc_generation_by:
        and     edi, 2147483647
        and     esi, 2147483647
        lea     ecx, [rsi + rdi]
        mov     eax, ecx
        shr     eax, 31
        add     eax, ecx
        and     eax, 2147483647
        ret

inc_generation_by_one:
        and     edi, 2147483647
        lea     eax, [rdi + 1]
        shr     eax, 31
        add     eax, edi
        inc     eax
        and     eax, 2147483647
        ret

old_inc_generation_by:
        mov     eax, edi
        add     eax, esi
        adc     eax, 0
        ret

Basically, at its core, the instruction count goes from 3 instructions to 7, but avoiding any branches in the process. For increments of one, this becomes 6 instructions.

In general though, perf seems unaffected much by this, since generation incrementing does not appear to be on a hot path. I tested with the many_foxes and perf is more or less the same (ish). This trace is the branch, external trace is main:

Other benches show that the hashing and spawning remains more or less the same, with no noticeable regressions and everything within noise thresholds:

iiYese

LGTM

crates/bevy_ecs/src/entity/mod.rs

crates/bevy_ecs/src/identifier/mod.rs

Co-authored-by: vero <[email protected]>

alice-i-cecile added A-ECS Entities, components, systems, and events C-Enhancement A new feature C-Code-Quality A section of code that is hard to understand or change labels Sep 13, 2023

alice-i-cecile requested review from maniwani and iiYese September 13, 2023 14:32