Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow usage as a library, not just as a binary. #67

Closed
drahnr opened this issue Aug 12, 2022 · 20 comments
Closed

Allow usage as a library, not just as a binary. #67

drahnr opened this issue Aug 12, 2022 · 20 comments
Labels
C-enhancement Category: raise the bar on expectations E-help-wanted Call for participation: Help is requested to fix this issue. E-mentor Call for participation: Mentorship is available for this issue.

Comments

@drahnr
Copy link

drahnr commented Aug 12, 2022

Providing a library alongside the cli tool would be much appreciated for tools like https://GitHub.com/paritytech/cargo-unleash even before it becomes part of cargo

Related: #61

CC @Xanewok @gnunicorn

@epage
Copy link
Collaborator

epage commented Aug 12, 2022

The cargo-release issue for this is crate-ci/cargo-release#62 (though I suspect we won't auto-select version but check if the version is appropriate).

My main concern for cargo-release will be the performance of cargo-semver-check due to the time it takes to generate the rustdoc json.

btw @drahnr any interest in collaborating on release tools?

@drahnr
Copy link
Author

drahnr commented Aug 12, 2022

Hey @epage - I'd love to, but I am not sure I can carve out any kind of significant time for the remainder of at least this year.

@obi1kenobi
Copy link
Owner

My main concern for cargo-release will be the performance of cargo-semver-check due to the time it takes to generate the rustdoc json.

A few notes to help bring up to speed folks coming across this issue, since @epage and I discussed it in person. TL;DR: JSON generation is likely always going to be the long pole in the runtime, and there's plenty of low-hanging fruit to speed up the checking itself that we haven't bothered with yet because the JSON is the slow part.

When rustdoc JSON is stabilized and docs.rs starts hosting the JSON we'll be able to cut that time in half: we'll download the baseline version's JSON instead of running a git checkout + generating it again. It's also possible that once the JSON format is stabilized, we might be able to get some speedups there as well. Right now, the focus seems to rightly be on stabilization, not performance.

To make the checking itself faster, with fairly trivial effort I think we can get small multiples of 10x speedup on the checking time. With a bit more effort I think we can get more orders of magnitude. Consider that an E-help-wanted + E-mentor opportunity :)

Providing a library alongside the cli tool would be much appreciated

A few other folks have reached out to ask for the same thing, and not necessarily just for use in 3rd party cargo plugins. I'd be happy to do it as time allows, and would appreciate help. I hope you won't mind me renaming the issue to generalize it to the library aspect rather than the cargo plugins.

@obi1kenobi obi1kenobi changed the title Allow usage in 3rd party cargo plugins Allow usage as a library, not just as a binary. Aug 12, 2022
@obi1kenobi obi1kenobi added C-enhancement Category: raise the bar on expectations E-help-wanted Call for participation: Help is requested to fix this issue. E-mentor Call for participation: Mentorship is available for this issue. labels Aug 12, 2022
@MarcoIeni
Copy link
Contributor

Hey Predrag! As we discussed on Twitter, I'm happy to be a "beta tester" for this feature and integrate it in https://github.com/MarcoIeni/release-plz :)
Also, release-plz is meant to be run mainly in CI, so I don't mind the slowdown too much.

@almindor
Copy link

+1 from me on library-fication. I'm working on cargo-vcs a "workspace member summary/checker" to help with managing repository (git) versioning of workspace members.

I've added some rudimentary checks like MSRV but it'd be real nice to integrate cargo-semver-checks into cargo vcs directly.

@epage
Copy link
Collaborator

epage commented Aug 31, 2022

+1 from me on library-fication. I'm working on cargo-vcs a "workspace member summary/checker" to help with managing repository (git) versioning of workspace members.

Is this something to help when a workspace is spread across submodules? It might be useful to explicitly list that as I was a bit confused as to what its purpose was at first.

I've added some rudimentary checks like MSRV but it'd be real nice to integrate cargo-semver-checks into cargo vcs directly.

I'm curious why this needs to be supported directly in cargo-vcs when cargo-semver-checks can work directly on a workspaace.

@almindor
Copy link

+1 from me on library-fication. I'm working on cargo-vcs a "workspace member summary/checker" to help with managing repository (git) versioning of workspace members.

Is this something to help when a workspace is spread across submodules? It might be useful to explicitly list that as I was a bit confused as to what its purpose was at first.

It's for workspaces consisting of members each with separate repositories. The main use-case I was aiming for was switching between "checkout targets" (e.g. branches) saved as profiles to test regressions in cases such as deep API changes where multiple member crates need to be modified for the new API.

Switch all to master in one go, switch back to whatever setup you modified things in one go as well.

Get a nice summary of the workspace repo status etc.

I've added some rudimentary checks like MSRV but it'd be real nice to integrate cargo-semver-checks into cargo vcs directly.

I'm curious why this needs to be supported directly in cargo-vcs when cargo-semver-checks can work directly on a workspaace.

It doesn't. I just thought it'd be a nice bonus to show in cargo vcs status if there was a mismatch issue, similar to showing a MSRV issue between the workspace members. It's more of a "nice to have".

@obi1kenobi
Copy link
Owner

Off-topic but I'd love to hear (not in this issue, perhaps over Twitter DM or email?) about other pieces of data you're interested in getting and reporting / other checks you might want to perform. I'm very interested in the kinds of use cases that might come up from making data more interconnectible, especially if that data is "non-standard" i.e. not just a database or an API designed for the purpose but instead things like "crate MSRV" for example.

@almindor
Copy link

almindor commented Sep 1, 2022

I'd love to discuss this. There's nothing but twitter on your profile though and I don't do twitter, is there any other way I can reach you?

@MarcoIeni
Copy link
Contributor

Can I give this a try?

@obi1kenobi
Copy link
Owner

Of course, that'd be great! If you'd like, I'd be happy to chat about this work throughout the process so that we don't end up in a situation where there's a single big scary PR to merge at the very end.

How can I best help you out and set this effort up for success?

@MarcoIeni
Copy link
Contributor

I plan to work on this tomorrow.
If I have doubts I will ask you, and I will try to make small prs :)
Do you have some hints? Or any idea of how the API should look like?

@obi1kenobi
Copy link
Owner

Sounds great!

Broadly, I'd recommend three things:

  • Let's aim to make the binary just another user of the library, via the thinnest reasonable layer (clap arg parsing, and not too much else). That way all users of the library are on equal footing, and we can't have "it works in the binary but not through the library API" cases.
  • Since cargo-semver-checks is planned to merge into cargo, it would be awesome to make this refactor as compatible with that goal as possible. I'd recommend reading some of the cargo source code and aiming for a similar code structure and compatible APIs as much as reasonably viable.
  • Be conservative with what APIs and types are exposed publicly, and generous with #[non_exhaustive] on public types.

Thanks again for your help with this! I'm excited by the new things this will enable 🙌

@epage
Copy link
Collaborator

epage commented Jan 20, 2023

All first-party cargo commands follow a pattern of a thin CLI on top of an "op". While refactoring to this style would be useful for cargo-semver-check,

  • This extra boundary adds friction to a potentially evolving API
  • This isn't a great API boundary for others to use

So I'm assuming any "checks-as-a-lib" work would be independent of the cargo upstreaming goal.

A challenge we have in cargo code which this could also run into is clear separation of library behavior and CLI policy, like outputting data.

@obi1kenobi
Copy link
Owner

I defer to @epage on the design of the CLI / library interface since he's a cargo maintainer. It sounds like we can have more freedom to design our APIs than I had assumed.

@epage
Copy link
Collaborator

epage commented Jan 20, 2023

Yes, while there is a standard pattern for ops that is fairly limited, a lot of other functionality is exposed in the cargo library below the ops layer.

Personally, I feel using this as a library isn't worth it until the performance is better than I recently saw in #296. Few cases of integration can handle 5-20s delays.

@MarcoIeni
Copy link
Contributor

For my use case, 5 or 20 seconds are fine because the app needs to run primarily in CI.

I'm excited by the new things this will enable 🙌

Me too, let's see what I accomplish tomorrow :)

@obi1kenobi
Copy link
Owner

I think most users would probably use cargo-semver-checks primarily on CI right now. But I also think cargo-semver-checks can, should, and will be faster :)

@MarcoIeni if you're looking for an easy first issue to start getting familiarized with the CLI side of the codebase, try #298?

I'm also hoping that the refactor to expose a library API will make #303 easier to prototype, which is exciting!

@obi1kenobi
Copy link
Owner

Personally, I feel using this as a library isn't worth it until the performance is better than I recently saw in #296. Few cases of integration can handle 5-20s delays.

After looking into this together with @epage, it turns out that the numbers were from a debug build. Semver-checking the same two libraries on my machine took 1.56s and 0.25s respectively, 10.7-14.4x faster than initially reported.

There certainly are massive crates that would take 20s to check, but they'd need to be 5-10x the size of toml_edit for that to be the case. And we still haven't bothered to parallelize with rayon which should buy us another roughly order of magnitude speedup.

I'm back to being not that worried about performance 😄

@obi1kenobi
Copy link
Owner

Initial release in https://github.com/obi1kenobi/cargo-semver-checks/releases/tag/v0.18.2 though obviously we'll continue polishing the library APIs over time.

Regarding performance, there are massive perf gains on the horizon and I'm working on Trustfall to make them happen: https://predr.ag/blog/speeding-up-rust-semver-checking-by-over-2000x/

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C-enhancement Category: raise the bar on expectations E-help-wanted Call for participation: Help is requested to fix this issue. E-mentor Call for participation: Mentorship is available for this issue.
Projects
None yet
Development

No branches or pull requests

5 participants