Automatic Event Tracking #17

Open
wants to merge 2 commits into main

Conversation

@jihiggins jihiggins commented Apr 23, 2021

@jihiggins jihiggins changed the title Create Automatic event tracking.md Automatic Event Tracking Apr 23, 2021

tower120 commented Aug 26, 2021

I'm not sure this is the right place to discuss it, but I'm currently working on an implementation of a semi-lockless event queue (fast lock-free reads; writes under a lock). https://github.com/tower120/lockless_event_queue/blob/master/src/event.rs

It is in a somewhat draft state right now. It is a linked list of fixed-size chunks. Each chunk has a read_count, which is the number of readers that have read the chunk completely to the end. Whenever read_count becomes equal to the total reader count, it is safe to delete that chunk. Of course, chunks can be reused (not done yet). Because the queue is chunk-based, the overhead of tracking read_count is barely observable, and erasure happens on a per-chunk basis. Because of this, in order to safely read from a chunk we only need to know its length; that is essentially the only atomic operation that occurs during a chunk read. In-chunk data/array synchronization is based on an acquire/release barrier on the length load/store.
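
For illustration only, here is a rough sketch of that chunk layout in Rust. It is simplified, not the actual lockless_event_queue code, and names like CHUNK_SIZE, readable_len, and mark_fully_read are made up here:

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

const CHUNK_SIZE: usize = 512; // hypothetical chunk capacity

// One fixed-size chunk in the linked list. The real queue keeps item storage
// behind interior mutability so a locked writer can append while readers
// iterate; that detail is omitted here.
struct Chunk<T> {
    // Written by the (locked) writer, read with Acquire so that every item
    // below `len` is visible to readers.
    len: AtomicUsize,
    // Number of readers that have fully consumed this chunk; once it equals
    // the total reader count, the chunk can be freed or reused.
    read_count: AtomicUsize,
    items: [Option<T>; CHUNK_SIZE],
    next: Option<Box<Chunk<T>>>,
}

impl<T> Chunk<T> {
    // Reader side: this Acquire load is essentially the only atomic
    // operation needed per chunk while reading.
    fn readable_len(&self) -> usize {
        self.len.load(Ordering::Acquire)
    }

    // Writer side (called while holding the write lock): store the item,
    // then publish the new length with a Release store.
    fn push(&mut self, value: T) {
        let len = self.len.load(Ordering::Relaxed);
        self.items[len] = Some(value);
        self.len.store(len + 1, Ordering::Release);
    }

    // Called by a reader after it has consumed the whole chunk; returns the
    // new read_count so the caller can decide whether the chunk is dead.
    fn mark_fully_read(&self) -> usize {
        self.read_count.fetch_add(1, Ordering::AcqRel) + 1
    }
}
```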

Each EventReader is not thread-safe (they're cheap to create, and you should just have one per thread/job). The Event itself is thread-safe.

I haven't really dug into the Bevy internals yet, but I think it is possible to integrate lockless_event_queue as Bevy's Events.
For example, without in-engine integration, I plan to have one Event of each type as a resource, and an EventReader as a local resource for each system.

The caveats I have encountered so far with this design:

  1. In order to completely destroy an Event, all associated EventReaders must be dropped too (this is implementation-specific).
  2. Because there is actually no lock in EventReader, it is somewhat tricky to clean/truncate the unread part of the event queue. [You may want that, for example, on a new level load, or when some readers are too sleepy and you want to cut the whole queue down a bit.] But if it is possible to guarantee that no EventReader is reading during the cut, it is doable (for example, after the systems that touch that Event have run). I think it could be done more ergonomically with in-engine integration, namely by postponing the actual clean call to a moment when the affected systems are not running.

P.S. It is also possible to speed up writes with a mass_push: during the system run, gather all event messages in a system-local Vec, then push them all to the Event with a single lock at the end of the system run.
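
A minimal sketch of that batching idea, with a Mutex<Vec> standing in for the chunked queue; mass_push and the per-system buffer are hypothetical names here, not the crate's API:

```rust
use std::sync::Mutex;

// Stand-in for the shared event queue; the real one is chunked and lockless
// on the read side.
struct Event<T> {
    queue: Mutex<Vec<T>>,
}

impl<T> Event<T> {
    // One lock acquisition per system run instead of one per pushed event.
    fn mass_push(&self, batch: impl IntoIterator<Item = T>) {
        let mut queue = self.queue.lock().unwrap();
        queue.extend(batch);
    }
}

fn system_run(event: &Event<u32>) {
    // Per-system local buffer (in Bevy this would be a local resource).
    let mut local: Vec<u32> = Vec::new();
    for i in 0..100 {
        local.push(i); // "sending" an event is just a Vec push
    }
    // Single locked append at the end of the system run.
    event.mass_push(local.drain(..));
}
```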

tower120 commented Sep 1, 2021

I've made some progress with my EventQueue.

According to benchmarks (https://github.com/tower120/lockless_event_queue/blob/91383baff8aeda56bdd80ab60984daae6f3fd992/benches/read_bench.rs#L8), read performance is stunning:

EventQueue worst case scenario: 100ms
EventQueue best case scenario: 40ms
Deque: 60ms
Vec: 28ms

The benchmark is single-threaded, to show the overhead. Since there are no locks on read operations, in the multithreaded case it is as fast as a Vec/VecDeque read (namely, memory-bandwidth bound).
I haven't benchmarked write overhead yet.

I think I have found how to deal with the clean operation without additional locks.
The Event contains a start_point = *chunk + index and a start_point_epoch. On a clear call, start_point_epoch is incremented and start_point is updated with the queue head position.
Each Reader also contains a start_point_epoch. At the start of a read session (when the iterator is obtained), the reader's epoch is compared with the atomically read event start_point_epoch. If they mismatch and the event's start_point is greater than the reader's current position, the reader's position is updated and all chunks in between are marked as read.

Thus, a reader (within read-session quanta) always works with the actual queue.
The only remaining problem is sleepy readers. To deal with them, it is enough to simply update their positions, just like at a read-session start. In Bevy, it should be possible to access a system's local resources (event readers) once in a while without actually running the systems. Just calling update on the readers after a clear will force the queue to drop its tail.
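
A simplified sketch of that epoch mechanism, with positions flattened to plain indices purely to illustrate the idea (the real queue stores a chunk pointer plus an in-chunk index, and the names below are made up):

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

struct EventQueue {
    start_point: AtomicUsize,       // first position still considered "live"
    start_point_epoch: AtomicUsize, // bumped every time `clear` is called
}

struct Reader {
    pos: usize,
    start_point_epoch: usize,
}

impl EventQueue {
    // Truncate everything currently in the queue without taking a reader lock.
    fn clear(&self, head: usize) {
        self.start_point.store(head, Ordering::Release);
        self.start_point_epoch.fetch_add(1, Ordering::AcqRel);
    }
}

impl Reader {
    // Called at the start of every read session (when the iterator is built),
    // or explicitly to wake up a "sleepy" reader after a clear.
    fn update(&mut self, queue: &EventQueue) {
        let epoch = queue.start_point_epoch.load(Ordering::Acquire);
        if epoch != self.start_point_epoch {
            let start = queue.start_point.load(Ordering::Acquire);
            if start > self.pos {
                // Jump over the truncated region; the real queue would also
                // mark the skipped chunks as read so they can be freed.
                self.pos = start;
            }
            self.start_point_epoch = epoch;
        }
    }
}
```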

So, I see it working as follows:
Events run in AUTO_CLEANUP = false mode, which means the queue does not automatically clean up chunks that are no longer needed; we call shrink_to_fit manually instead. (A rough sketch of this schedule follows the list below.)

  • An Arc<Event> is constructed and shared as a local resource among all systems that push events. Each pushing system also has a local Vec resource. When you push an event, you actually push to that Vec first; after the system runs, the Vec is mass_pushed to the Event with just one lock.
  • Each pulling system contains an EventReader constructed from the Event. Before the system runs, an iterator is acquired and sent to the system; afterwards the iterator is destroyed.
  • [Every N frames] After all systems have run:
    • Any events whose queues have grown too long are cut.
    • update is called on all readers.
  • [Every frame] After all systems have run:
    • shrink_to_fit is called on each event.
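
A plain-Rust sketch of that per-frame schedule; no Bevy APIs are used, and EventQueue, Reader, and their methods below are hypothetical stand-ins for the operations listed above:

```rust
// Stub types: stand-ins for the queue and reader described earlier.
struct EventQueue;
struct Reader;

impl EventQueue {
    fn len(&self) -> usize { 0 }
    fn truncate(&self, _keep_last: usize) {} // cut the queue to a bounded length
    fn shrink_to_fit(&self) {}               // free fully-read chunks (manual cleanup mode)
}

impl Reader {
    fn update(&mut self) {} // fast-forward past any truncated region
}

fn end_of_frame(frame: u64, queues: &[EventQueue], readers: &mut [Reader]) {
    const N: u64 = 60;           // hypothetical cleanup interval in frames
    const MAX_LEN: usize = 4096; // hypothetical "too big" threshold

    // [Every N frames] cut oversized queues, then poke every reader so even
    // sleepy ones release the truncated chunks.
    if frame % N == 0 {
        for queue in queues {
            if queue.len() > MAX_LEN {
                queue.truncate(MAX_LEN);
            }
        }
        for reader in readers.iter_mut() {
            reader.update();
        }
    }

    // [Every frame] drop chunks that every reader has already consumed.
    for queue in queues {
        queue.shrink_to_fit();
    }
}
```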

So... what do you say? To me it looks like this deals with both the performance and memory issues, while still being able to pass messages reliably between frames.

@colepoirier

@tower120 Are you sure your units are correct? Numbers in the tens of milliseconds seem outrageously high for the time per operation, unless this is the total time taken? If so, I'm sorry, I'm not sure I understand the benchmark. Can you please explain it further?

tower120 commented Sep 1, 2021

@colepoirier That's the time to read 100,000 items. Look at Vec as the baseline; nothing can be faster than that.
EventQueue also reads the full queue, but in several read sessions. In the worst-case scenario it reads 4 items per session.

@colepoirier

@tower120 Ah, thanks, that makes more sense. I would suggest adding information like this and some further context, because as it stands your explanation doesn't convey what makes this a good solution and relies on the reader to properly understand your benchmark and implementation.

alice-i-cecile commented Sep 1, 2021

@tower120 Impressive! I'd be interested in seeing a dedicated RFC (or PR with nice clear explanatory text) for this. This type of tool may be very useful to have in place as we tackle UI next, so I'd love a proper write-up.

tower120 commented Sep 2, 2021

I would like to make this EventQueue a separate crate, because I think it has general-purpose functionality.
I'll describe the principle of operation in the documentation.
I'll make a separate RFC describing EventQueue integration into Bevy.

@alice-i-cecile

> I would like to make this EventQueue a separate crate, because I think it has general-purpose functionality.
> I'll describe the principle of operation in the documentation.
> I'll make a separate RFC describing EventQueue integration into Bevy.

Awesome, sounds like a great choice. It's always nice to help grow the whole Rust ecosystem.
