Fast hash/equality for Model objects #129
Some ideas:
Hmmm, when the query finishes on the host, the host could then hash the whole buffer and send the result. It'd be nice to have it as a field on the response. That's assuming that hashing the buffer as raw bytes turns out to be fast, I hope so :) Hashing raw bytes will only do what we want if the bytes are sufficiently deterministic; I don't know if that will be the case, but it's a lot easier than hashing objects, so I guess it's worth a try :)
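To make the idea concrete, here is a minimal sketch of the host side, assuming `package:crypto`; `QueryResponse` and its fields are placeholder names for illustration, not the actual protocol types:

```dart
import 'dart:typed_data';

import 'package:crypto/crypto.dart' show Digest, sha256;

/// Hypothetical response wrapper: the host hashes the serialized buffer once
/// and ships the digest alongside it, so the other side can cheaply check
/// "same as last time" without re-reading the whole buffer.
class QueryResponse {
  QueryResponse(this.buffer) : digest = sha256.convert(buffer);

  /// The serialized model data, as raw bytes.
  final Uint8List buffer;

  /// Hash of [buffer], computed on the host.
  final Digest digest;
}
```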
I did give this a shot and it was much worse, fwiw (about 3x worse).
I tried this as well and it seems a bit faster but not enough to meaningfully change the result.
It looks like just hacking something in to grab the raw bytes and hash those is definitely faster: 150ms or so per edit for the large json example, cumulatively spent doing hash computations. I also edited my branch to save the hash instead of the Model objects for the cached responses, and compare against that, so we do about half as many total hashes. I haven't played around with the different hash algorithms much to see which is fastest; I tried sha1, md5, and sha256, and sha1/md5 were close but sha256 was about twice as slow. That is still a lot of time to spend hashing, and I don't love that it is dependent on the specific ordering in which the buffer is built up (technically, my other approach was too, but it would have been relatively trivial to make it not so). Given that this now only ever hashes a given Model object once, I don't believe it is worth trying to compute the hash on the fly and store it in the buffer.
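For reference, the "save the hash instead of the Model objects" shape might look roughly like this; a sketch using `package:crypto`, with names (`_cachedDigests`, `isUnchanged`) that are mine rather than the branch's:

```dart
import 'dart:typed_data';

import 'package:crypto/crypto.dart' show Digest, sha1;

/// One digest per cached response, instead of retaining the full Model.
final Map<String, Digest> _cachedDigests = {};

/// Returns true if the serialized model bytes for [cacheKey] are unchanged
/// since the last run, so the previous macro output can be reused.
bool isUnchanged(String cacheKey, Uint8List modelBytes) {
  final digest = sha1.convert(modelBytes);
  final previous = _cachedDigests[cacheKey];
  _cachedDigests[cacheKey] = digest;
  return previous == digest;
}
```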
I read a bit about what the google3 build does; there is a public doc about a specialized hash, PSHA2, which is based on SHA256 with tweaks to add parallelization so it can run using SIMD instructions, i.e. it uses parallelization within a single CPU. The parallel part of psha256 kicks in for message sizes of 1024 bytes and up, so it looks like it won't be too hard to hit it. Comparing the command line versions of md5sum, sha256sum, sha1sum and psha2sum on my machine using 1GB of input: md5 hits 0.54GB/s, sha256 is 1.08GB/s, sha1 is 1.23GB/s and psha256 is 1.56GB/s. Based on these numbers, I wonder how much of that native-performance boost we could get with ffi? But actually these hashes are so important that I could see an argument for directly supporting them in the platform.
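For comparison with those native numbers, a rough way to measure the pure-Dart `package:crypto` implementations is a small throughput check along these lines (the input size and methodology here are arbitrary, not figures from this thread):

```dart
import 'dart:typed_data';

import 'package:crypto/crypto.dart' show Hash, md5, sha1, sha256;

void main() {
  // 64MB of zeros; big enough to dwarf per-call overhead.
  final data = Uint8List(64 * 1024 * 1024);
  final hashes = <String, Hash>{'md5': md5, 'sha1': sha1, 'sha256': sha256};
  for (final entry in hashes.entries) {
    final stopwatch = Stopwatch()..start();
    entry.value.convert(data);
    stopwatch.stop();
    final seconds = stopwatch.elapsedMicroseconds / 1e6;
    final gbPerSecond = data.length / (1024 * 1024 * 1024) / seconds;
    print('${entry.key}: ${gbPerSecond.toStringAsFixed(2)} GB/s');
  }
}
```

If the Dart numbers come out well below the native ones, that would strengthen the case for ffi or direct platform support.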
Some bazel discussion in bazelbuild/bazel#22011 mentions blake3, which looks even faster.
Do you really need to compute a hash to compare things? Can't you just (for example) do a simple byte-by-byte comparison of the incoming data? It seems you are just using it as a proxy for equality anyway. That being said: I am puzzled as to why we are trying to solve all these problems externally to the tools which are supposed to have all the information necessary to short-cut the what-has-changed computation. The CFE is supposed to have fine-grained incremental recompilation of dependencies; the analyzer does not, but should eventually implement it anyway. It seems like we are reinventing the same calculation with macro-specific twists.
If a match means there is no more work to do, then it's nice to get a match based on hashes, because you can get a match without having to store all possible matches as full data.
Neither the analyzer nor the CFE has a data model that's immediately useful to macros, because those models are private to the analyzer and CFE, which means you can't code against them; they change. So, macros have their own data model that is public and stable (the JSON representation and the corresponding binary format and extension types). Macros describe what data they need as a query, and the host (analyzer or CFE) converts its own data model to the macro model and sends it in response. Macros usually only care about a part of the code, for example fields in classes with a particular annotation and their types, so what each macro receives is significantly cut down from the full host model. This also means that it should be very common that when a file changes the macro does not have to rerun: something changed, but it wasn't something the macro cares about. This investigation is about noticing that the data being sent to a macro is the same as last time, so the output from last time can be reused. "The same as last time" is easy to check by keeping a hash from last time and comparing (see the sketch below). It's true that we could perhaps optimize further by pushing some part of the "same as last time" check before the conversion to the macro data model, so that for example the CFE could compare what changed against the macro query before it even starts the conversion. But this would be a lot more work to do, and it's possible that convert-then-compare gets us most of the performance, so we obviously check that first.
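Here is a high-level sketch of that check, under the assumption that the serialized query response is available as bytes; `runCachedMacroPhase`, `CachedResult`, and `executeMacro` are placeholder names, not the real host plumbing:

```dart
import 'dart:typed_data';

import 'package:crypto/crypto.dart' show Digest, sha256;

/// The previous run's input digest and output for one macro application.
class CachedResult {
  CachedResult(this.inputDigest, this.output);
  final Digest inputDigest;
  final String output;
}

final Map<String, CachedResult> _cache = {};

/// Reuses the previous output when the data sent to the macro is byte-for-byte
/// the same as last time; otherwise runs the macro and updates the cache.
String runCachedMacroPhase(
  String macroId,
  Uint8List serializedResponse,
  String Function(Uint8List) executeMacro,
) {
  final digest = sha256.convert(serializedResponse);
  final cached = _cache[macroId];
  if (cached != null && cached.inputDigest == digest) return cached.output;
  final output = executeMacro(serializedResponse);
  _cache[macroId] = CachedResult(digest, output);
  return output;
}
```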
I hooked up the CPU profiler (running from head, but with #134 applied). Here are some noteworthy things from a profile spanning a single incremental edit:
Fwiw, this is my launch_config.json, which assumes you have already generated a benchmark to run:

```json
{
  "version": "0.2.0",
  "configurations": [
    {
      "name": "benchmark_debug",
      "request": "launch",
      "type": "dart",
      "program": "pkgs/_macro_tool/bin/main.dart",
      "args": [
        "--workspace=goldens/foo",
        "--packageConfig=.dart_tool/package_config.json",
        "--script=goldens/foo/lib/generated/large/a0.dart",
        "--host=analyzer",
        "--watch"
      ],
      "cwd": "${workspaceFolder}"
    }
  ]
}
```
One of the previously recorded trees of performance operations in dart-lang/sdk#55784 (comment) provides details of what we do.
Similar data internally.
See also my previous benchmarks for hashing, Dart vs. Rust.
Fwiw, the actual hashing is not the problem in this particular case; it is the work to pull out the interesting bits of the objects that we want to hash that is expensive.
As a part of performance work, I have been looking into generating hash functions for Model objects (see my WIP branch). It isn't too bad to make a basic implementation, but it is very slow, taking almost an entire second cumulatively computing hashes for each edit in the large JSON benchmark.

My first approach here is to generate "identityHash" functions which do lookups on the `node` object for each known property, recursively calling "identityHash" on all the nested objects, for example `Interface.identityHash` (a sketch of the general shape follows below).

Ultimately the result of this is that even cached macro phases take an unacceptable amount of time (multiple milliseconds), so we will need to come up with something faster and evaluate exactly what is making this so slow.
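As an illustration, here is a hedged sketch of what such a generated "identityHash" might look like, assuming nodes are JSON-style maps; the property names (`name`, `metadata`, `members`, `typeParameters`) are placeholders, not the actual schema or generated code:

```dart
/// Sketch of a generated hash for an `Interface` node: look up each known
/// property on the underlying map and hash it, recursing into nested nodes.
int interfaceIdentityHash(Map<String, Object?> node) => Object.hash(
      valueIdentityHash(node['name']),
      valueIdentityHash(node['metadata']),
      valueIdentityHash(node['members']),
      valueIdentityHash(node['typeParameters']),
    );

/// Shared helper the generator would also emit (or reuse) for nested values.
int valueIdentityHash(Object? value) => switch (value) {
      final Map<String, Object?> map => Object.hashAll(map.entries
          .map((e) => Object.hash(e.key, valueIdentityHash(e.value)))),
      final List<Object?> list => Object.hashAll(list.map(valueIdentityHash)),
      _ => value.hashCode,
    };
```

Every property here costs at least a map lookup plus a recursive call, which is consistent with the observation above that pulling the interesting bits out of the objects, rather than the hashing itself, is what gets expensive.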