Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Automatically derive a cache descriminator from the wasmer version and selected functions #2250

Merged
merged 30 commits into from
Sep 30, 2024

Conversation

aumetra
Copy link
Member

@aumetra aumetra commented Sep 11, 2024

Closes #2205

The hash is produced as follows:

BLAKE3(
  manual module version,
  wasmer version requirement,
  BLAKE3(function1),
  BLAKE3(function2),
  ...,
  BLAKE3(function_n)
)

And the hashes of the functions are lexicographically sorted since the inventory crate doesn't give us any guarantees about the order of hashes yielded.

@aumetra
Copy link
Member Author

aumetra commented Sep 11, 2024

That is awkward.. for some reason the CI has a different discriminator value than I have on my local machine?

That's not right..

@chipshort
Copy link
Collaborator

Interesting, on my machine, the test also passes.

Copy link
Contributor

@kulikthebird kulikthebird left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like that we have a mechanism that prevents errors during deserialization of modules in cache. I see some room for potential improvements/simplification of the current approach. If the comments I suggested are implemented, potentially we'll be able to move the only thing that is needed - hash_function - to cw-derive crate. Let's discuss that approach.

packages/vm-derive/src/lib.rs Outdated Show resolved Hide resolved
packages/vm/src/modules/file_system_cache.rs Outdated Show resolved Hide resolved
packages/vm/src/modules/file_system_cache.rs Outdated Show resolved Hide resolved
packages/vm-derive/src/lib.rs Outdated Show resolved Hide resolved
packages/vm/src/modules/file_system_cache.rs Show resolved Hide resolved
packages/vm/src/modules/file_system_cache.rs Outdated Show resolved Hide resolved
Copy link
Collaborator

@chipshort chipshort left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just changelog entry and the one open comment and then it's good to merge.

@aumetra
Copy link
Member Author

aumetra commented Sep 27, 2024

Ugh, I forgot GitHub Actions always tests a fictional merge commit.. that behaviour throws me off so bad

@chipshort
Copy link
Collaborator

Yup, that also threw me off a few times before, but it's definitely the correct thing to do, since that's what you want to test.

Copy link
Collaborator

@chipshort chipshort left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM
I'm wondering if we also want to backport this?

@aumetra aumetra merged commit b6b0591 into main Sep 30, 2024
38 checks passed
@aumetra aumetra deleted the aw/cacher-hasher branch September 30, 2024 12:57
@webmaster128
Copy link
Member

This looks good at first glance. However, how does a node operator know which directories to delete and which ones to keep? Invalidated caches are not deleted automatically.

@aumetra
Copy link
Member Author

aumetra commented Oct 14, 2024

The idea here was to have them look at the least-recently-used directory and delete the others.
Of course we could add something like a CLI flag to wasmd to output the current ID, something like wasmd --cache-version, which then outputs the hash of the cache version.

Depends on how much we want to proof this impl. But I feel like the LRU approach is fine, not perfect but fine.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add cost fn hash to caching path
4 participants