This repository has been archived by the owner on Jan 8, 2024. It is now read-only.

Speed up sources download #13

Open
gasinvein opened this issue Nov 10, 2020 · 14 comments
Labels
enhancement New feature or request help wanted Extra attention is needed

Comments

@gasinvein
Member

gasinvein commented Nov 10, 2020

Proton is a git repository with numerous git submodules, some of which are huge (namely wine and gstreamer). We also have many flatpak-builder modules (two for each Proton component).

flatpak-builder fetches the whole Proton repo with all git submodules for each module, which results in heavy I/O and incredibly long download/checkout times (in fact, on Flathub checkouts take even more time than the actual compilation).

We should do something about it. The only solution I see is splitting the single git source into multiple archive sources. Does anyone have other ideas?
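
To make the "multiple archive sources" idea concrete, here is a minimal sketch of what the manifest's source list could look like when one git source is replaced by one archive source per component. The component names come from the text above, but the URLs, checksums and `dest` paths are placeholders invented for illustration, not the real Proton layout.

```python
import json


def archive_source(url: str, sha256: str, dest: str) -> dict:
    """Build one flatpak-builder source entry of type "archive"."""
    return {"type": "archive", "url": url, "sha256": sha256, "dest": dest}


# Hypothetical components: every URL and checksum below is a placeholder.
components = {
    "wine": "https://example.invalid/wine-snapshot.tar.gz",
    "gstreamer": "https://example.invalid/gstreamer-snapshot.tar.gz",
}

sources = [
    archive_source(url, sha256="0" * 64, dest=name)
    for name, url in components.items()
]

print(json.dumps(sources, indent=2))
```

With a layout like this, each flatpak-builder module would only list the archives it actually needs instead of the full repository with every submodule.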

@gasinvein gasinvein added enhancement New feature or request help wanted Extra attention is needed labels Dec 10, 2020
@fabianhjr

fabianhjr commented Dec 19, 2020

@gasinvein
Member Author

gasinvein commented Dec 19, 2020

flatpak-builder should make shallow clones by default when possible (there is an option to disable it explicitly). Do you think it might not be using them for submodules?

@fabianhjr

I think it isn't, but I am not familiar with the codebase, and shallow submodules are a separate setting from the --depth shallow fetching of the main repo.

@gasinvein
Member Author

As far as I understand, flatpak-builder doesn't clone git repos with submodules, but instead extracts the submodule list from the main repo and mirrors each submodule individually. So any recursion options should be irrelevant in this case. I could be wrong, though.

@gasinvein
Member Author

@barthalion My local tests suggest that running flatpak-builder with --disable-updates almost completely removes the issue. I'm guessing Flathub's build bot doesn't use this option? If so, maybe it should, given that it downloads sources prior to starting the build?

@barthalion
Member

Not really sure. Sources are pre-downloaded, but build machines also have a local cache. What happens if the requested commit is not available in the local clone?

@gasinvein
Member Author

I'm not sure what the local cache is in this context. Aren't sources downloaded anew on each build? If so, how could the requested commit be unavailable?
I mean, if we run something like flatpak-builder --download-only, and then flatpak-builder --disable-updates, everything should be in place?
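
A minimal sketch of that two-phase idea, assuming both phases run on the same machine so that they share flatpak-builder's local source cache. The helper names, the build directory and the manifest path are hypothetical; only the `--download-only`, `--disable-updates` and `--force-clean` flags are taken from flatpak-builder itself.

```python
import subprocess
from typing import List


def two_phase_commands(build_dir: str, manifest: str) -> List[List[str]]:
    """Return the download command followed by the offline-ish build command."""
    # Phase 1: fetch all sources into the local cache, build nothing.
    download = ["flatpak-builder", "--download-only", build_dir, manifest]
    # Phase 2: build, but never update sources that are already cached.
    build = ["flatpak-builder", "--disable-updates", "--force-clean",
             build_dir, manifest]
    return [download, build]


def run_two_phase(build_dir: str, manifest: str) -> None:
    """Run both phases in order, stopping at the first failure."""
    for cmd in two_phase_commands(build_dir, manifest):
        subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Hypothetical paths; print instead of executing to keep this side-effect free.
    for cmd in two_phase_commands("builddir", "manifest.yml"):
        print(" ".join(cmd))
```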

@barthalion
Member

I've looked at this again and yes, sources are downloaded as a separate step, but on a mirror node, not on the runners. So passing --disable-updates will just cause f-b to fail due to missing source code on the actual builders.

@gasinvein
Member Author

gasinvein commented Nov 13, 2021

This is getting worse over time as new components are added to Proton (increasing the number of modules in this flatpak).
Basically, we do git fetch m*s times, where m is the number of flatpak-builder modules and s is the number of git submodules in the source repo, so each addition to either one increases build times significantly.
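
As a rough illustration of how the m*s product grows, here is a small sketch. The counts are made up for the example and are not measurements of this flatpak.

```python
def total_git_fetches(fb_modules: int, git_submodules: int) -> int:
    """Fetch count when every flatpak-builder module fetches every submodule."""
    return fb_modules * git_submodules


# Hypothetical counts, chosen only for illustration.
before = total_git_fetches(fb_modules=10, git_submodules=20)
after = total_git_fetches(fb_modules=12, git_submodules=20)

# Adding one Proton component adds two flatpak-builder modules,
# and each new module repeats the fetch of every submodule.
print(before, after, after - before)  # prints "200 240 40"
```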

@barthalion Can we run f-b --download-only followed by f-b --disable-updates on runners as the build step?

@barthalion
Member

I know we talked about it, but I still fail to understand what exactly a separate --download-only step would solve here. We no longer have a sources worker, so the only "build command" that is executed is this:

            command = ['flatpak-builder', '-v', '--force-clean', '--sandbox', '--delete-build-dirs',
                       '--user', fb_deps_args,
                       util.Property('extra_fb_args'),
                       '--mirror-screenshots-url=https://dl.flathub.org/repo/screenshots', '--repo', 'repo',
                       util.Interpolate('--extra-sources=%(prop:builddir)s/../downloads'),
                       '--default-branch', util.Property('flathub_default_branch'),
                       '--subject', util.Property('flathub_subject'),
                       '--add-tag=upstream-maintained' if builds.is_upstream_maintained(id) else '--remove-tag=upstream-maintained',
                       'builddir', util.Interpolate('%(prop:flathub_manifest)s')]

How is --download-only in a separate step going to help?

@gasinvein
Member Author

--download-only by itself isn't going to help; it's --disable-updates that makes the difference here. If the build is run with --disable-updates, flatpak-builder skips fetching git sources from remotes and just copies whatever is already cached.

@barthalion
Member

But executing --download-only is still going to take a significant amount of time, isn't it?

@gasinvein
Member Author

Yeah, I just re-checked that and it seems so. So my proposal to run --download-only followed by --disable-updates probably doesn't make sense.

@gasinvein
Member Author

But still, maybe we could run builds with --disable-updates on Flathub? If it's not an option to enable it for all builds, maybe it could be gated by some flathub.json option?
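
A sketch of how such a gate might look on the build bot side. Note that the `disable-updates` key is purely hypothetical (nothing in this thread says flathub.json supports it), and the helper below is not part of the existing Flathub code; it only shows how an opt-in flag could turn into an extra flatpak-builder argument.

```python
import json
from typing import List


def extra_fb_args(flathub_json_text: str) -> List[str]:
    """Translate an opt-in flathub.json setting into extra flatpak-builder args."""
    # An empty or missing flathub.json means "use the defaults".
    config = json.loads(flathub_json_text) if flathub_json_text.strip() else {}
    # "disable-updates" is a hypothetical key used only for this illustration.
    if config.get("disable-updates") is True:
        return ["--disable-updates"]
    return []


print(extra_fb_args('{"disable-updates": true}'))
print(extra_fb_args("{}"))
```

Keeping the option opt-in would leave every other build unchanged while letting this flatpak skip the repeated fetches.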

3 participants