Rust Port: Part 1 #671

Merged
merged 304 commits into from
Feb 17, 2022

fosskers
Owner

@fosskers fosskers commented Nov 13, 2020

This mega-PR ports Aura to Rust as per the discussion in #657 and the proof-of-concept in PR #662.

Growth of Stripped Release Binary

Notes:

  • Current Haskell-Aura is 8979kb.
  • A number of unrelated commits usually exist between each measured growth point, so the increase isn't always due solely to the new dependency.
| Stage | Binary Size | Increase | Dependencies |
|---|---|---|---|
| clap + alpm | 886kb | N/A | |
| ubyte | 911kb | +3% | |
| rustyline | 1114kb | +22% | |
| termcolor | 1118kb | +0.4% | |
| i18n-embed | 1366kb | +22% | 98 |
| -L complete | 1394kb | +2% | 99 |
| simplelog | 1534kb | +10% | 99 |
| rayon | 1650kb | +8% | 109 |
| pbr | 1678kb | +2% | 113 |
| panic = "abort" | 1538kb | -8% | 113? |
| open complete | 1558kb | +1% | 123 |
| pacmanconf | 1590kb | +2% | 125 |
| Error instances + no exec-lib | 1586kb | | 125 |
| deps complete | 1598kb | +0.7% | 127 |
| itertools | 1622kb | +2% | 128 |
| -Cc, -Cy and curl | 1726kb | | 137 |
| -C complete | 1774kb | | 140 |
| Rust 2021 | 2229kb | | |

Other notes:

  • versions (pulling nom) only adds ~30kb.
  • Dec 10: opt-level = "s" reduced final stripped size from 1710 -> 1410
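
The Increase column reads as the relative growth of the stripped binary from one measured stage to the next. A minimal sketch of that arithmetic, assuming sizes in whole kilobytes (the helper name `growth_percent` is invented for illustration and is not part of Aura):

```rust
/// Relative change in size from `prev_kb` to `next_kb`, as a percentage.
/// Hypothetical helper for illustration only.
fn growth_percent(prev_kb: u32, next_kb: u32) -> f64 {
    let prev = f64::from(prev_kb);
    let next = f64::from(next_kb);
    (next - prev) / prev * 100.0
}
```

For example, `growth_percent(886, 911)` is roughly 2.8, in line with the +3% listed for ubyte, and `growth_percent(1710, 1410)` is roughly -17.5 for the opt-level = "s" change.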

Tracked Issues

Tracked PRs

Before Release

  • Reconsider panic = abort

@Morganamilo
Contributor

Getting the command line working right is gonna be fun ;)

[Image]

@Morganamilo
Contributor

By the way, why is this split up into so many different crates?

aura aura-arch aura-common aura-core

I do feel this is a bit unnecessary. You could easily have it all as one lib with different modules. Any reason you don't do this?

@fosskers
Owner Author

fosskers commented Nov 13, 2020

Getting the command line working right is gonna be fun ;)

Heh, yes. Well even in current Haskell-based Aura, those flags must occur after the capital letter command. Nobody has complained about it up to this point, which either means it hasn't been an issue, or nobody has told me ;)

I think for the flag compat, as with the current implementation, I'm going with a best-effort approach. Will there likely remain some funky edge case that pacman would accept that Aura barfs on? Probably. Would any human actually try to use pacman that way? Probably not. And if they do, and have a serious issue with Aura's inability to handle it, then we can revisit then.

At the moment, this Rust port can already display usage info (i.e. -Sh) for each Pacman command identical to how Pacman itself displays it (modulo the options/flags sections that clap splits them into). This is already much better than what Haskell Aura does, so I'm thrilled.

In the specific case of --dbpath, since it exists for all Pacman commands (and would be relevant for a number of Aura ones too), it's easy enough to set Clap's global = true if we need to.

By the way, why is this split up into so many different crates?

This is by design. Haskell Aura is one-lib-one-exec, and many things were added to it organically over the years. There are often cross-cutting concerns and organizational problems that make the library hard to use. The Rust crate layout is still preliminary, but this would allow us to streamline:

  1. Errors
  2. Log messages
  3. Localization
  4. Dependencies

Then, if anyone ever needs a particular piece of Aura, they aren't bogged down by the concerns of the other pieces. Critically, aura-arch will be the only piece with a direct dependency on alpm. The aura executable will be the only piece directly concerned with how error/log messages are localized. The others could act independently if they need to.
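
The separation described here can be sketched roughly as follows. This is only an illustration under assumptions: the `Error` variants, `require_installed`, and `render` are invented names, and real localization would go through fluent rather than hard-coded strings.

```rust
// --- Library side: no printing and no localization, only `Result`. ---

/// Hypothetical error type; the variants are invented for illustration.
#[derive(Debug, PartialEq)]
enum Error {
    PackageNotFound(String),
    NoPackagesGiven,
}

/// Hypothetical library call that reports failure through `Result` alone.
fn require_installed(installed: &[&str], wanted: &str) -> Result<(), Error> {
    if wanted.is_empty() {
        return Err(Error::NoPackagesGiven);
    }
    if installed.iter().any(|p| *p == wanted) {
        Ok(())
    } else {
        Err(Error::PackageNotFound(wanted.to_string()))
    }
}

// --- Executable side: the only place that turns errors into user text. ---

/// Stand-in for the localization step; a real version would look the
/// message up in a translation bundle instead of hard-coding English.
fn render(err: &Error) -> String {
    match err {
        Error::NoPackagesGiven => "No packages were given.".to_string(),
        Error::PackageNotFound(name) => format!("Package not found: {}", name),
    }
}
```

With this split, another frontend could reuse `require_installed` and supply its own `render`.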

@Morganamilo
Contributor

Morganamilo commented Nov 13, 2020

Would any human actually try to use pacman that way? Probably not

I do -_-

While I get the idea of splitting up into so many crates, it seems a bit overboard, don't you think? Too many crates also becomes a bit of a pain to develop and maintain.

How about aura as the binary? This would contain everything aura specific. aur-lib as a general purpose aur lib?

There's also alpm-utils, which I maintain. It's basically what you'd expect: helper functions to make doing alpm stuff easier.

The general-purpose aur-lib is also something I've wanted to do. The goal would be to split it out of paru and make paru depend on it.

So if you wanna work together to make it a common dependency I'd be all for that.

@fosskers
Owner Author

fosskers commented Nov 13, 2020

I do -_-

😅 By "edge case" I was imagining very strange arg orders + mixtures of long and short options + positional args. I'm confident we can arrive at something workable with clap.

Too many crates also becomes a bit of a pain to develop and maintain.

Yes this is definitely a concern. Of course this port is still in early days, so let me share with you my thought process. By the way, it's quite nice to be discussing these things with someone else who understands the whole scope of the problem 😄

My first step was to take Aura's current layout and ask how this could be better organized, so that logic could be better shared with other tools/frontends/libraries/etc. Here's the first pass:

[Image: modules]

I elaborate below. The names above mostly match Aura's current Haskell module names, so don't worry if it's not clear what they do. Some probably aren't even needed anymore!

A number of the points below are obvious when read, but they're that way because at some point they weren't (or still aren't!) obvious to Haskell Aura. So we have:

  1. The CLI-based aura executable itself.
     • All user interaction occurs here.
     • Providing a logging backend occurs here (whereas the libs use log).
     • All fluent-based localisation occurs here (but only English is baked into the executable).
     • The interpretation/handling of all errors from lower libs occurs here.
  2. Some "core" library that handles common package manager tasks.
     • No user-visible IO occurs here. All results given as Result<Foo, Error>.
     • This library could be used by some other UI (ncurses? gtk?) that wants everything that Aura "does" without any of the specific CLI concerns or localization.
     • It's true that some aspects of this would be Aura specific, while some would be applicable to any such AUR-compatible package manager. Perhaps there's a better way to split them up without just shoving everything into the executable?
  3. A layer over alpm concerned with core concerns of an Arch Linux system.
     • Two simple examples already present in this PR: help with creating Alpm connections, and detecting orphan packages.
     • Could have any name, and doesn't need to be branded with "aura".
     • There is almost certainly some overlap here with alpm-utils. Thank you for all the groundwork you've already done in these Rust libs.
  4. Other modules which are pure Rust and need no libalpm binding.
  5. Core types in pure Rust common to multiple components.
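
Orphan detection, one of the alpm-layer examples mentioned in the list above, can be sketched without libalpm. This is a simplified model under assumptions: `Package`, `Reason`, and `orphans` are invented for illustration, only hard dependencies are considered (optional dependencies are ignored), and a real implementation would read this data from alpm.

```rust
/// Why a package was installed: asked for by the user, or pulled in as a
/// dependency of something else.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Reason {
    Explicit,
    Dependency,
}

/// Simplified stand-in for an installed package.
struct Package {
    name: &'static str,
    reason: Reason,
    depends: Vec<&'static str>,
}

/// Names of packages that were installed as dependencies but that no
/// installed package depends on any more.
fn orphans(installed: &[Package]) -> Vec<&'static str> {
    installed
        .iter()
        .filter(|p| p.reason == Reason::Dependency)
        .filter(|p| !installed.iter().any(|q| q.depends.contains(&p.name)))
        .map(|p| p.name)
        .collect()
}
```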

I'm sure this will evolve as the port continues. Either way, I would be thrilled to collaborate with you on some common components. The more eyes we have on these problems the better - which is one of the reasons I celebrate that there are so many AUR tools, even if they're all quite similar and written in the same languages.

@Morganamilo
Contributor

As far as a generic aur-lib goes, I can see maybe moving the split_ functions and the comments/news grabbing.

@Morganamilo
Contributor

Also worth noting the split functions need better names.

@Morganamilo
Contributor

So I made a repo, libaur, and added the mentioned functions to it. If you want to work on it together and add stuff, please do.

https://github.com/Morganamilo/libaur.rs

@fosskers
Owner Author

fosskers commented Nov 16, 2020

Amazing, yes I will. So that repo is the place where we intend AUR-specific functionality (say, pulling a PKGBUILD) to live? It would contain things that any AUR tool would need? Or better yet: what shouldn't it contain?

@Morganamilo
Contributor

say, pulling a PKGBUILD

In theory, if I didn't already have https://github.com/Morganamilo/aur-fetch.rs :P

what shouldn't it contain?

Anything to do with formatting, or anything that forces formatting one way. If you look at the news code, I made a special effort to still allow you to format it however you want.

@fosskers
Owner Author

In theory, if I didn't already have https://github.com/Morganamilo/aur-fetch.rs :P

Heh, thank you, I may very well borrow that too. I wrote the Haskell variants long ago, but can't exactly use them here.

@Morganamilo
Contributor

Hmm, that lib seems to be for the aur-rpc. The Rust lib for that is raur; aur-fetch is for cloning and diffing PKGBUILDs.

@fosskers
Owner Author

fosskers commented Nov 18, 2020

Ah, gotcha. In that case the PKGBUILD fetching/diffing has always lived directly in Aura, I suppose I didn't think it big enough to break out.

@Morganamilo
Contributor

The library handles concurrent downloading, diffing, and marking as seen. There's quite a bit to it.

@fosskers
Owner Author

fosskers commented Nov 19, 2020

A sample of improved possibilities now that we have alpm:

[Image: 2020-11-18-170120_956x511_scrot]

@Morganamilo
Contributor

Morganamilo commented Nov 19, 2020

I'm still a fan of just using pacman. Alpm is useful for dealing with packages: what is installed, what's available, etc. But using pacman for actual installs/removes means you don't have to do the output formatting or worry about error handling and so on.

@fosskers
Owner Author

fosskers commented Nov 19, 2020

I'm still a fan of just using pacman.

Me too, actually. This latest code is mostly practice, and there's a good chance I'll rip out the manual forming of a transaction and replace it with a call down to pacman -Rsu (as Haskell Aura does). So far the writing of this has informed me of:

  • further alpm usage (and pacman's C internals)
  • some aspects of module organization
  • text colouring
  • user input

Next I'm going to move forward on localizing the messages with fluent. I also plan to rip out rustyline today; I don't think it's useful here.

@fosskers
Owner Author

fosskers commented Nov 19, 2020

In general I think we should be careful about "stealing pacman's thunder". If something is easier / safer for pacman to do, then pacman should do it. If some Aura-based process can be enhanced by alpm, then full-steam-ahead (I'm very excited for these). If a traditionally pacman-based process can be significantly improved in some way via Rust, then we might consider it. For now I don't intend to touch the native Pacman commands (-S, -Q, etc).

@fosskers
Owner Author

fosskers commented Nov 20, 2020

Actually I'll keep rustyline - it prevents issues with the arrow keys and Ctrl+C, etc.

@Morganamilo
Contributor

Actually I'll keep rustyline - it prevents issues with the arrow keys and Ctrl+C, etc.

Honestly I prefer to keep those issues just to be consistent with Pacman.

@fosskers
Owner Author

Oh hah, I just noticed for the first time that Pacman freaks out when arrow keys are pressed too. I'm going to call that a bug on their end.

@fosskers
Owner Author

I've achieved the combination of localization + colouring, so we're pretty much full steam ahead. I also got good news today regarding the double flags: clap-rs/clap#2192 (comment)

@fosskers
Owner Author

@Morganamilo So clap has been great for generating structs for all our flags, and the help output is wonderful, but do you have a suggestion for how to cleanly convert the parsed struct back into string args, for when pacman is called on the command line? You can see in flags.rs the "dumb" solution, namely the ToArgs trait and writing out a huge string of if statements. Haskell Aura has an easier time with this, since the Pacman flags are parsed a different way (but help output is heavily sacrificed).

Is there some Rusty solution that comes to mind? Otherwise, iterating through a raw ArgMatches would be nice, but how exactly to do that is an ongoing discussion among the clap folks.
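
For readers without flags.rs at hand, the "dumb" approach might look something like the sketch below. Only the `ToArgs` trait name comes from this thread; the `SyncFlags` struct, its fields, and the method name `to_args` are assumptions for illustration, and the struct is a heavily reduced stand-in for whatever clap derives for `-S`.

```rust
/// Converts a parsed flag struct back into the strings pacman expects.
trait ToArgs {
    fn to_args(&self) -> Vec<String>;
}

/// Hypothetical, heavily reduced stand-in for a clap-derived `-S` struct.
#[derive(Default)]
struct SyncFlags {
    refresh: bool,
    sysupgrade: bool,
    dbpath: Option<String>,
    packages: Vec<String>,
}

impl ToArgs for SyncFlags {
    fn to_args(&self) -> Vec<String> {
        let mut args = vec!["-S".to_string()];
        // One `if` per flag: simple, but it grows with every new option.
        if self.refresh {
            args.push("--refresh".to_string());
        }
        if self.sysupgrade {
            args.push("--sysupgrade".to_string());
        }
        if let Some(path) = &self.dbpath {
            args.push("--dbpath".to_string());
            args.push(path.clone());
        }
        args.extend(self.packages.iter().cloned());
        args
    }
}
```

The cost is visible even at this size: every flag pacman supports needs its own branch, and each branch has to be kept in sync with the struct definition by hand.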

@Morganamilo
Contributor

What I do is define an array of pacman args. With clap you could then just iterate over them and call values_of on each flag.

Also, I've never actually used clap v3; my attempts were with v2.
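
The table-driven idea can be sketched without clap. Everything here is an assumption for illustration: `PACMAN_FLAGS` lists only three of pacman's long options, and the `Parsed` map stands in for clap's `ArgMatches`, with `parsed.get(flag)` playing the role of `values_of`.

```rust
use std::collections::HashMap;

/// Long names of pacman flags worth forwarding (a tiny illustrative subset).
const PACMAN_FLAGS: [&str; 3] = ["dbpath", "root", "noconfirm"];

/// Stand-in for parsed arguments: flag name -> values given on the command
/// line. An empty list means the flag was present but takes no value.
type Parsed = HashMap<String, Vec<String>>;

/// Rebuilds the pacman-relevant part of the command line by walking the
/// table of known flags and looking each one up in the parsed arguments.
fn forwarded_args(parsed: &Parsed) -> Vec<String> {
    let mut args = Vec::new();
    for flag in PACMAN_FLAGS {
        if let Some(values) = parsed.get(flag) {
            args.push(format!("--{}", flag));
            args.extend(values.iter().cloned());
        }
    }
    args
}
```

Flags that are not in the table are simply dropped, which is how tool-specific options would be kept away from pacman.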

@Morganamilo
Contributor

Not sure if you have any use for this: https://github.com/Morganamilo/paru/blob/master/src/args.rs

@fosskers
Owner Author

A thought just occurred to me. If the clap parse succeeded, then by definition whatever is in the std::env::Args iterator is exactly what could be passed to pacman, as is. All I have to do is collect() on that Iterator!
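
A minimal sketch of that idea, assuming the first argument is the program name and everything after it is forwarded unchanged (the function name `pacman_passthrough` is invented for illustration):

```rust
use std::process::Command;

/// Builds a `pacman` invocation from an argument list, dropping the first
/// element (the program name) and forwarding the rest untouched.
fn pacman_passthrough<I>(args: I) -> Command
where
    I: IntoIterator<Item = String>,
{
    let mut cmd = Command::new("pacman");
    cmd.args(args.into_iter().skip(1));
    cmd
}
```

In an executable this would be called as `pacman_passthrough(std::env::args())`, followed by `.status()` to actually run pacman. Taking any `IntoIterator` rather than `std::env::Args` directly keeps the function testable.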

@fosskers
Owner Author

fosskers commented Nov 22, 2020

At the moment I don't plan to release Aura v4 until clap gets out of beta. I think v3 is necessary though, given all the Trait auto-deriving I'm doing. Either way, I do plan a beta period for Aura itself where people can try out the Rust version and report any issues before I officially scrap the old version.

@Morganamilo
Contributor

A thought just occurred to me. If the clap parse succeeded, then by definition whatever is in the std::env::Args iterator is exactly what could be passed to pacman, as is. All I have to do is collect() on that Iterator!

In paru at least there's a ton of extra flags, for example paru -Syu --devel. Does aura not do that?

@fosskers
Owner Author

fosskers commented Nov 22, 2020

All AUR-based functionality is put under an -A command. The pacman commands are left totally untouched. Some people don't like that they're separated, but it's one of the things that makes Aura what it is. I decided in the very beginning to do it this way so that I'd never have to "fight" with the Pacman namespace.
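
The namespace split could be sketched as a dispatch on the capital-letter operation. This is an illustration under assumptions: `Target` and `target` are invented names, only short-form operations are handled, and the letter sets are limited to operations named in this thread.

```rust
/// Where a command line should be handled.
#[derive(Debug, PartialEq)]
enum Target {
    Aura,
    Pacman,
}

/// Classifies a command line by the capital letter of its first argument.
/// Long forms such as `--sync` are not covered by this sketch.
fn target(first_arg: &str) -> Option<Target> {
    let op = first_arg.strip_prefix('-')?.chars().next()?;
    match op {
        'A' | 'C' | 'L' => Some(Target::Aura),
        'S' | 'Q' | 'R' => Some(Target::Pacman),
        _ => None,
    }
}
```

Because the two sets of letters never overlap, anything classified as `Target::Pacman` can be handed over without Aura interpreting its flags.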

@fosskers
Owner Author

^ Works like a charm.

@Morganamilo
Contributor

Looking at the latest commits: a quick reminder that I have libs for downloading AUR packages and dep solving. They are async-based, but I did make versions that were optionally async.

It looks like I may have never pushed the branches.

Let me know if you're interested in the APIs.

https://github.com/Morganamilo/aur-fetch.rs
https://github.com/Morganamilo/aur-depends

@fosskers
Owner Author

fosskers commented Feb 8, 2022

Thanks I'll take a look! Yes keeping async out of things is important for me. Nothing against async per se, I just want to keep things simple and binary size down.

@fosskers
Owner Author

fosskers commented Feb 8, 2022

@Morganamilo when you have a chance, also please let me know about Morganamilo/pacmanconf.rs#4. If that's merged, I can release r2d2-alpm generally, and then I plan to merge this branch here into master and continue the work in new, smaller branches.

@Morganamilo
Contributor

By the way, I don't get what r2d2-alpm is for. Alpm has parallel downloads itself.

@fosskers
Owner Author

It's not for package downloads, it's for when you want multiple Alpm handles within a multithreaded computation like a map (etc.) from a ParallelIterator from rayon.
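
The general pattern, a fixed set of handles shared by more worker threads than there are handles, can be sketched with the standard library alone. This is not how r2d2-alpm or rayon work internally: `Handle`, `Pool`, and `lookup_all` are invented stand-ins, plain scoped threads replace the ParallelIterator, and the pool size is assumed to be at least 1.

```rust
use std::sync::{Condvar, Mutex};
use std::thread;

/// Hypothetical stand-in for an alpm handle; a real one would hold an
/// open package database.
struct Handle;

impl Handle {
    /// Placeholder for a real database query.
    fn query(&self, pkg: &str) -> String {
        format!("{}: ok", pkg)
    }
}

/// A tiny blocking pool: idle handles wait in the Vec until borrowed.
struct Pool {
    idle: Mutex<Vec<Handle>>,
    available: Condvar,
}

impl Pool {
    fn new(size: usize) -> Self {
        let handles = (0..size).map(|_| Handle).collect();
        Pool { idle: Mutex::new(handles), available: Condvar::new() }
    }

    /// Lends a handle to `f`, blocking while every handle is in use.
    fn with_handle<T>(&self, f: impl FnOnce(&Handle) -> T) -> T {
        let mut idle = self.idle.lock().unwrap();
        let handle = loop {
            match idle.pop() {
                Some(h) => break h,
                None => idle = self.available.wait(idle).unwrap(),
            }
        };
        drop(idle); // release the lock while the handle is in use
        let result = f(&handle);
        self.idle.lock().unwrap().push(handle);
        self.available.notify_one();
        result
    }
}

/// Queries every name on its own thread while sharing `pool_size` handles.
/// Results come back in the same order as `names`.
fn lookup_all(names: &[&str], pool_size: usize) -> Vec<String> {
    let pool = Pool::new(pool_size);
    let pool = &pool;
    thread::scope(|s| {
        let workers: Vec<_> = names
            .iter()
            .map(|&name| s.spawn(move || pool.with_handle(|h| h.query(name))))
            .collect();
        workers
            .into_iter()
            .map(|w| w.join().unwrap())
            .collect::<Vec<String>>()
    })
}
```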

@fosskers fosskers changed the title from "Rust Port" to "Rust Port: Part 1" Feb 17, 2022
@fosskers fosskers marked this pull request as ready for review February 17, 2022 03:14
@fosskers fosskers merged commit 795d6f0 into master Feb 17, 2022
@fosskers fosskers deleted the colin/rust branch February 17, 2022 03:16
@Morganamilo
Contributor

Hmm okay. Is that actually needed though?

Pacman tends to cache everything, so multiple handles means more cache misses and more time spent loading from disk. Plus the synchronisation overhead and all that.

Maybe you used to use this in aura which uses the pacman command? But when using alpm I don't see it making much sense.

@fosskers
Owner Author

Perhaps the amount of concurrency I'm doing won't pay off in the end. I'm measuring overall resolution times, and will write a strictly serial version afterwards to see how it compares. If it's just as good, then I can run off a single Alpm or look into aur-depends (which I haven't forgotten about).
