pre-fetch all dependencies for build envs without network #5175

Closed
vrothberg opened this issue May 8, 2018 · 5 comments

@vrothberg

This is a follow-up discussion from the newly introduced gvisor project: google/gvisor#31

The problem context is that most Linux distributions (in my case openSUSE/SUSE) disable network connections when building a package (e.g., with https://openbuildservice.org/). Fetching external dependencies at build time, however, is an integral part of Bazel, which, in my humble opinion, may explain why many distributions do not yet package Bazel. For some projects that are built with Bazel, we could find workarounds, but the more dependencies a project has, the more complicated it becomes to untangle them.

The problem I am trying to solve in order to package gvisor is to pre-fetch all of gvisor's dependencies into a specific folder and then compress them into a tarball, which can be used in the build context (no network). To do so, I used the --repository_cache option (for both fetch and build), but did not succeed, as Bazel was still missing some dependencies. Hence the question: is this a supported use case, and if so, how can I make it work?

@aehlig
Contributor

aehlig commented May 15, 2018

First of all, I'm well aware that bazel's story for external dependencies is not in good shape. (Historically, bazel was used within google, where all dependencies are in one single repository, so external dependencies were added as an afterthought.) We're working on it, but there is a long way to go, given all the quick hacks that were added before a systematic approach was attempted. Also, the tendency of a lot of projects to insist on building all dependencies themselves is unfortunate, but probably also motivated by that historical background.

Concerning your question, a simple bazel build of your target will certainly fetch all needed dependencies (for that architecture). Note, however, that a lookup in the --repository_cache will only happen

  • for downloads of files (not, e.g., external git repositories), and
  • only if a sha256 checksum is specified.

So, most probably, patching the WORKSPACE file is necessary. (I don't know in detail how go_repositories work and how they can be made to fetch from a cache.)
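To illustrate why the checksum matters, here is a minimal Python sketch of seeding a hash-indexed cache directory from an already downloaded file. The directory layout used (`content_addressable/sha256/<digest>/file`) and the helper names `sha256_of` and `seed_repository_cache` are assumptions made for this example, not something stated in this thread; check them against the bazel version in use.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the hex SHA-256 digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def seed_repository_cache(download: Path, cache_root: Path) -> Path:
    """Copy `download` into the cache under its own SHA-256 and return the new path."""
    # Assumed layout: <cache_root>/content_addressable/sha256/<hex digest>/file
    entry = cache_root / "content_addressable" / "sha256" / sha256_of(download)
    entry.mkdir(parents=True, exist_ok=True)
    target = entry / "file"
    shutil.copyfile(download, target)
    return target
```

Because the entry is indexed purely by digest, a download rule that declares no sha256 has no key to look up, which matches the restriction described above.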

Besides --repository_cache, the following options might also be interesting for distributions.

  • Using --override_repository or patching the WORKSPACE file to use local_repository, you can make bazel look at a local directory. Having a local checkout of dependencies might be nicer for a source-package than some hash-sum indexed cache directory.
  • The option --experimental_distdir lets you specify a directory in which bazel looks first for files it would otherwise download (by name, but again, a file is only taken if a checksum is specified and matches). In this way, the file names are a bit more meaningful, and it might be easier to share common dependencies between projects.
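The distdir lookup rule described in the second bullet can be modelled roughly as follows. This is a Python sketch of the described behaviour only, not bazel's actual implementation; `find_in_distdirs` is a hypothetical helper name.

```python
import hashlib
from pathlib import Path
from typing import Iterable, Optional


def find_in_distdirs(
    filename: str,
    expected_sha256: Optional[str],
    distdirs: Iterable[Path],
) -> Optional[Path]:
    """Return the first file named `filename` whose SHA-256 matches, else None."""
    # Without a declared checksum, nothing from a distdir is ever used.
    if not expected_sha256:
        return None
    for directory in distdirs:
        candidate = Path(directory) / filename
        if not candidate.is_file():
            continue
        actual = hashlib.sha256(candidate.read_bytes()).hexdigest()
        # Both the file name and the checksum have to match.
        if actual == expected_sha256.lower():
            return candidate
    return None
```

A packager could run a check like this over every archive a project declares before starting an offline build, to see which files would still have to come from the network.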

But, in summary: no, there is no supported use case of "just fetch everything and cache it". An unsupported way is to build and then look into the external subdirectory of the directory given by bazel info execution_root. There, at least, you have all the external sources that were used for that build.
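As a rough sketch of that unsupported route, the following Python function packs the `external` subdirectory into a tarball. It assumes the caller passes in the path printed by `bazel info execution_root`; the function name `pack_external_sources` is made up for this example, and symlinks are dereferenced on the assumption that some entries may be links rather than real files.

```python
import tarfile
from pathlib import Path


def pack_external_sources(execution_root: Path, archive: Path) -> Path:
    """Pack <execution_root>/external into a gzip-compressed tarball."""
    external = Path(execution_root) / "external"
    if not external.is_dir():
        raise FileNotFoundError(f"no external directory under {execution_root}")
    # dereference=True stores the content that symlinks point at,
    # so the archive stays usable on a machine without the original tree.
    with tarfile.open(archive, "w:gz", dereference=True) as tar:
        tar.add(external, arcname="external")
    return archive
```

The resulting archive only contains the sources used for that particular build, so a different target or architecture may need further dependencies.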

Given the even more urgent problems we currently have with external dependencies, I also doubt that a satisfying solution will happen soon. Sorry for not having any better news. I hope you can get along with one of the work-arounds described.

@vrothberg
Author

@aehlig, thanks a lot for your detailed reply. I will look into it again and provide feedback here.

@aehlig
Contributor

aehlig commented May 17, 2018

https://bazel-review.googlesource.com/c/bazel/+/56490 shows how a set of dependencies can be fetched and put into an archive. d703a5e added a new option --experimental_repository_resolved_file that records all the invocations of (Skylark) repository rules that actually did happen. That, together, provides everything to fetch and pack all the needed files (not convenient, but at least possible); making go rules use that distdir is still a different task. (Note that d703a5e will only be available in bazel 0.15.0, so the release after the one currently in flight; but you can already try it by building bazel at head.)

Also, it doesn't help the underlying problem that projects like to bundle random snapshots of random other projects; but it is not the task of a build system to change upstream's development model.

bazel-io pushed a commit that referenced this issue May 24, 2018
...and point --experimental_distdir there, so that offline builds
are again possible out of the distribution archive.

Related #5175.
Fixes #5202.
To be cherry-picked for #5056.

Change-Id: I634296e9d83e4e18ed966b42f35acc63061259d9
PiperOrigin-RevId: 197866998
bazel-io pushed a commit that referenced this issue Jun 7, 2018
The option --experimental_distdir has been introduced 4 months
ago and was completely unproblematic ever since. Moreover, it
is now used productively, both in our own bootstrapping process[1],
as well as in external packaging of projects using bazel[2]. So
make this option non-experimental. We still keep the old name as
an alternative to not break existing uses.

Related: #5175.

RELNOTES: The --distdir option is no longer experimental. This
  option allows to specify additional directories to look for
  files before trying to fetch them from the network. Files from
  any of the distdirs are only used if a checksum for the file
  is specified and both, the filename and the checksum, match.

[1] Commit 3c9cd82
[2] https://github.com/gentoo/gentoo/blob/7379cdb578b0c070c846c3fa9f71470e2c5d1320/sci-libs/tensorflow/tensorflow-1.8.0-r1.ebuild#L168

Change-Id: I536238f9bdbad6b4f7222b4f6a1464d70d9f3be3
PiperOrigin-RevId: 199637265
bazel-io pushed a commit that referenced this issue Jun 14, 2018
Make all external repositories depend on an additional SkyValue controllable
via commands, to support unconditional fetching of all external repositories,
as it is needed by the `sync` command.

Improves on #5175, provides a work around for #4907.

Change-Id: I30033614c1a2fad3f1363b85ff69cf92f697c255
PiperOrigin-RevId: 200543985
@vrothberg
Author

According to google/gvisor#31 (comment) the issue has been resolved. Thank you very much!

@SoftwareApe

SoftwareApe commented Mar 2, 2023

@vrothberg According to google/gvisor#31 (comment) the issue is being worked on, with a reference to this issue here. So I think it's not resolved, at least that doesn't follow from that comment.
