Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support simd_json as an optional alternative to serde_json #934

Draft
wants to merge 44 commits into
base: main
Choose a base branch
from

Conversation

bassmanitram
Copy link
Contributor

@bassmanitram bassmanitram commented Oct 18, 2024

THIS IS A DRAFT PR TO MAKE KNOWN WHAT I'M PLAYING AT - PLEASE DO COMMENT AS YOU SEE FIT

The approach is to abstract all original serde_json references to a new lib crate in the workspace - json-impl.

That crate as a feature swicth between serde_json and simd_json. In

The initial phase, is to to get all the other crates to depend on json-impl with the serde feature, instead of serde_json directly, so in the end nothing should change.

The next phase is to activate the simd feature and see what fails, then start mitigating the failures.

📬 Issue #, if available:

✍️ Description of changes:

🔏 By submitting this pull request

  • I confirm that I've ran cargo +nightly fmt.
  • I confirm that I've ran cargo clippy --fix.
  • I confirm that I've made a best effort attempt to update all relevant documentation.
  • I confirm that my contribution is made under the terms of the Apache 2.0 license.

The approach is to abstract _all_  original serde_json references to a new lib crate
in the workspace - json-impl.

That crate as a feature swicth between serde_json and simd_json. In

The initial phase, is to to get all the other crates to depend on json-impl with the serde feature,
instead of serde_json directly, so in the end nothing should change.

The next phase is to activate the simd feature and see what fails, then start mitigating
the failures.
@bassmanitram
Copy link
Contributor Author

I have run NO tests yet, even for the serde_json abstraction.

Martin Bartlett added 2 commits October 19, 2024 17:58
The _problem_ with simd_json is that it requires _mutable_  slices to operate while serde_json does not.

This FIRST PASS (certainly not the end solution) is to provide &mut adapter of the from_slice function for serde_json,
a pseudo-safe adapter of from_str for simd_json, and then change all the places where stuff is flagged as needing to
be mutable to be so - MOSTLY, however, this is in TESTS -  so eventually we'll clean that up.
Now we see the REAL problems:

* No "std::cmp::Eq" available on simd::OwnedValue (at least as it is now)
* SOME use of &str passed to a deser visitor and then fed back into simd_json::from_str ... which needs a mut reference (and is technically unsafe)
@bassmanitram
Copy link
Contributor Author

bassmanitram commented Oct 21, 2024

(Where's the bang-head-against-wall emoji when you need it!) - so, up to now, the mut and unsafe stuff - well we can deal with it and it's nicely encapsulated to the json-impl crate (and probable feature propagation as this matures).

Oh but simd_json isn't done with us yet, and this one requires some serious mods to simd_json - simd-lite/simd-json#398 - so we're kinda blocked until that's solved.

(But, then again, one of my suggestions in that issue is a feature pulling simd_json Value representation closer to serde_json, so PERHAPS we can encapsulate ALL - or most - of the issue mitigations in such a feature)

Martin Bartlett added 25 commits October 22, 2024 13:57
... all we have a couple of missing functions and a mutability
difference/error (in the base crates - haven't gotten onto the main
client interface crates yet.
…ambda_http

Of course, lambda_http IS the target for all this and the thing that will expose
the mut and unsafe aspects of using simd to the world - that's for another day!
…references

And add a very useful "from_bytes" combinator to aid in parsing from
stuff like HTTP bodies (though from_reader is probably cheaper).
Plus the from_vec using an owned vec gets us around
some awkward lifetime issues when trying to write "generic" code for both simd_json and serde_json
THAT is NOT easy to create for simd_json.
We've removed Deserialize from of lambda_http::request::LambdaRequest ...
that makes EVERYTHING compile in the simd_json context ... with the exception
of th calls to lambda_runtime::run which require the expected request type to be
serde::Deserialize.

So now it's back to that issue - which, if I recall well, was the blocker back when I
first started this!
At the moment the only way around the deserializer-must-be-simd issue is
an unsafe "cast" ... and even the type equivalence check that I tried before that
cast will itself not compile, so that is currently commented out.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant