refactor: simd swar #134
Not completely in love with the name
Bench dump
x64: Decent -15% improvements on full req parsing:
M1 Air: You'll
Builds off seanmonstar#134 (swar), seanmonstar#138 (Bytes cursor)

Cleaner, faster and less macros!

## Key changes
- Broke down header-parsing into clean conceptual steps (whilst being faster!)
- Added InnerResult allowing for idiomatic `?` early-exits, removing need for parsing-helper macros
- Removed macros.rs, leaving only `byte_map!` (response header-parsing macros should become functions)

### TODO
- convert request header-parser, supporting its quirks
Builds off seanmonstar#134 (swar), seanmonstar#138 (Bytes cursor)

Cleaner, faster and less macros!

- Broke down header-parsing into clean conceptual steps (whilst being faster!)
- Added InnerResult allowing for idiomatic `?` early-exits, removing need for parsing-helper macros
- Removed macros.rs, leaving only `byte_map!` (response header-parsing macros should become functions)
- convert response header-parser, supporting its quirks
Builds off seanmonstar#134 (swar), seanmonstar#138 (Bytes cursor)

Cleaner, faster and less macros!

- Broke down header-parsing into clean conceptual steps (whilst being faster!)
- Added InnerResult allowing for idiomatic `?` early-exits, removing need for parsing-helper macros
- Removed macros.rs, leaving only `byte_map!` (response header-parsing macros should become functions)
After looking at what this is, I don't think it needs to be disabled when SIMD is disabled. The point of that config was mostly so that we could test the scalar code even if build detection wanted to enable SIMD. Otherwise it'd be too hard to test it.
This was easier to review than I thought when seeing the initial email. Looks great to me, one conflict to fixup. Thanks!
Moves the block-wise validators to a "swar" SIMD backend.
The core logic of validate => extract => chain is now more evident.
2397ccd to 1f3619b
Head branch was pushed to by a user without write access
This refactor moves the block-wise validators to a "swar" SIMD backend.
The core logic of validate => extract => chain is (IMO) now more evident, as is how we stack validators: use the largest SIMD validator available, then finish with the scalar one if we reach the end of the buffer (uncommon).
Perf-wise, this is roughly on par with master. It is slightly faster in some regards, in part because it avoids duplicating validation work between the SIMD and scalar validators, but it is not 100% fine-tuned or fully benched with neon or avx/sse.
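For readers unfamiliar with the approach, here is a minimal sketch of how a SWAR block validator can be stacked on top of a scalar one. It is not the code from this PR: the names `swar_valid_prefix`, `is_valid` and `valid_prefix_len` are hypothetical, and the validity rule (printable ASCII, `0x20..=0x7E`) is an assumption that is simpler than the crate's real header-value tables.

```rust
// Hypothetical sketch, not the PR's code. Assumed rule: a byte is valid
// if it is printable ASCII (0x20..=0x7E).

const LOW: u64 = 0x0101_0101_0101_0101; // 0x01 in every byte lane
const HIGH: u64 = LOW * 0x80; // 0x80 in every byte lane

/// Scalar validator, one byte at a time.
fn is_valid(b: u8) -> bool {
    (0x20..=0x7e).contains(&b)
}

/// SWAR validator: checks 8 bytes with plain u64 arithmetic and returns
/// how many leading bytes are valid (8 means the whole block is valid).
fn swar_valid_prefix(block: [u8; 8]) -> usize {
    // Little-endian load so the first byte in memory is the lowest lane.
    let x = u64::from_le_bytes(block);
    // Lanes below 0x20. Borrows can flag lanes above a real hit, but the
    // lowest flagged lane is always a real hit, which is all we read.
    let ctl = x.wrapping_sub(LOW * 0x20) & !x;
    // Lanes whose low 7 bits are all ones (0x7F, or 0xFF which is invalid too).
    let del = (x & (LOW * 0x7f)) + LOW;
    // `x` itself contributes lanes with the high bit set (>= 0x80).
    let invalid = (x | ctl | del) & HIGH;
    // No flagged lane: trailing_zeros() is 64, so the result is 8.
    (invalid.trailing_zeros() / 8) as usize
}

/// Chain: block-wise SWAR first, scalar only for a tail shorter than 8 bytes.
fn valid_prefix_len(buf: &[u8]) -> usize {
    let mut pos = 0;
    while let Some(chunk) = buf.get(pos..pos + 8) {
        let mut block = [0u8; 8];
        block.copy_from_slice(chunk);
        let n = swar_valid_prefix(block);
        pos += n;
        if n < 8 {
            // SWAR already located the invalid byte, so the scalar
            // validator does not re-check this block.
            return pos;
        }
    }
    while pos < buf.len() && is_valid(buf[pos]) {
        pos += 1;
    }
    pos
}
```

With these assumptions, `valid_prefix_len(b"text/html; charset=utf-8\r\n")` consumes three full blocks through the SWAR path and stops at the `\r` in the scalar tail. Returning as soon as a block reports fewer than 8 valid bytes is what keeps the two validators from checking the same bytes twice.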