Brainstorming: Running reproducible-build checks just before publishing #2666

Closed
Dentrax opened this issue Nov 12, 2021 · 13 comments
Labels
enhancement (New feature or request), wontfix (This will not be worked on)

Comments

@Dentrax
Member

Dentrax commented Nov 12, 2021

Is your feature request related to a problem? Please describe.
A few days ago, @naveensrinivasan opened a noteworthy issue against cosign: sigstore/cosign#1019. cosign is released with GoReleaser, so we (@developer-guy, @erkanzileli) decided to move the discussion here for further details and to brainstorm: "What kind of automated check could GoReleaser run to avoid this inconsistency?"

Keep in mind that, if I understand correctly, GoReleaser already supports deterministic / reproducible builds: 0d4f605 [1]. The motivation behind this issue is to extend the power of those deterministic builds.

Describe the solution you'd like

I think we should run these tests just BEFORE the publishing pipeline. In other words, we should NOT publish (send the release) to any distribution platform if any test fails. Here is a small user story that defines what these tests actually mean:

As a builder (i.e., user, developer, maintainer, etc.),
I want to build the entire project using the build system provided by the project (i.e., makefile, justfile, custom shell script automation, etc.),
So that I can compare the checksum of the output binary against the binary that I downloaded from the releases page
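
The comparison in this user story can be sketched as follows. This is a minimal illustration, not a GoReleaser feature: `sha256_file`, `expected_digest`, and `matches_release` are hypothetical helper names, and the release is assumed to publish its checksums in the `sha256sum`-style `<digest>  <filename>` layout.

```python
import hashlib


def sha256_file(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def expected_digest(checksums_text, artifact_name):
    """Look up an artifact in sha256sum-style text ("<digest>  <filename>" per line).

    Assumption: the release publishes its checksums in this layout.
    Returns None when the artifact is not listed.
    """
    for line in checksums_text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[1].lstrip("*") == artifact_name:
            return parts[0].lower()
    return None


def matches_release(local_binary, checksums_text, artifact_name):
    """True when the locally built binary has the digest published for the release."""
    published = expected_digest(checksums_text, artifact_name)
    return published is not None and sha256_file(local_binary) == published
```

A mismatch here only says that the two binaries differ; it does not say why. Tracking down the cause is what the tools mentioned later in this thread are for.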

Between the Signing and Publish stages, we could add a brand-new pipe, named something like check or sanity, that hosts a reproducibility test. (A separate pipe might not be necessary, since we could run these tests at the end of the Build stage.) Here are my two cents:

High-level Overview

[Image: demo]

# proposed (hypothetical) pipe, not an existing GoReleaser option
sanity:
  reproducible_build:
    # note: this calls the project's own build script
    # e.g. build.sh, just build, etc.
    cmd: make build
    envs:
      - CGO_ENABLED: 0
    # how many times should we run the build?
    runs: 5
    checks:
      - goos: linux
        goarch: arm64
        goarm: 7
        # assert the checksum against the output defined in the
        # "binary" field of the matching entry under "builds"
        # note that "amd64" is not asserted here!
        # every ARCH combination we want to release must be listed
        # see the trade-off warning below!
        expect_build: myproject-linux-arm64-v7

- id: linux-arm64-v7
  binary: myproject-linux-{{ .Arch }}
  main: ./cmd/main
  goos:
    - linux
  goarch:
    - amd64
    - arm64
  goarm:
    - 7

^ Et voilà!

Trade-off warning: we would probably build the project twice, once with GoReleaser and once with the custom build system, and we would do so for each combination of:

  • GOOS
  • GOARCH
  • LDFLAGS
  • CGO_ENABLED
  • ... you name it!

PROS:

  • easy to write custom checks

CONS:

  • aforementioned trade-off concern
  • hard to maintain the whole build matrix (all combinations)

Describe alternatives you've considered
Since GoReleaser already supports deterministic builds, do we still want to add post-checks? Would they bring unnecessary complexity? Is this overengineering?

If the repository already uses a custom build system or scripts, wouldn't it be easier to plug the build system's entry point into the builds stage, as in the following spec?

- id: linux-arm64-v7
  binary: myproject-linux-{{ .Arch }}
  # notice the brand new "cmd" field
  cmd: make build
  # some envs to pass into Makefile
  envs:
   - CGO_ENABLED: 0
   - GOOS: linux
   - GOARCH: arm64
   - GOARM: 7

PROS:

  • no duplicate (double) building
  • no need to write additional post-check steps
  • consistent build system for both developer and GoReleaser (guaranteeing same checksum?)

CONS:

  • unnecessary complexity for building such a "simple" Go app?

Additional context

On reproducible builds: it is generally desirable that building the same source code with the same set of tools is reproducible, i.e. the output is always exactly the same. This makes it possible to verify that the build infrastructure for a binary distribution or embedded system has not been subverted. This can also make it easier to verify that a source or tool change does not make any difference to the resulting binaries. [2]

Any feedback would be greatly appreciated!

cc: @cpanato @shibumi @dekkagaijin

Footnotes

  1. https://github.com/goreleaser/goreleaser/blob/master/www/docs/customization/build.md#reproducible-builds

  2. https://www.kernel.org/doc/html/latest/kbuild/reproducible-builds.html#reproducible-builds

@Dentrax Dentrax added the enhancement New feature or request label Nov 12, 2021
@naveensrinivasan
Contributor

@Dentrax This is cool!

@shibumi

shibumi commented Nov 12, 2021

Hi @Dentrax, very cool idea. I am not sure about the exact purpose of this. Do you want to make goreleaser build reproducible binaries or do you want to test go programs for reproducible builds via goreleaser? The second one is very very difficult.

If we speak about the latter, we have to address a few issues:

  1. You have to provide an isolated build environment: setting a constant system time, isolating the build processes from each other so that they do not interfere, etc.

  2. You have to inject "chaos" into the build process for testing purposes. I think the reproducible builds people call this "reproducible build fuzzing". During that process you try to interfere with the build environment, and at the end of the process the system should produce the same binary in two totally different build environments.

The reproducible builds crew wrote a few tools for this purpose: https://reproducible-builds.org/tools/
For example:

  • disorderfs: Problems with unstable order of inputs or other variations introduced by filesystems can sometimes be hard to track down. disorderfs is an overlay FUSE filesystem that deliberately introduces non-determinism into filesystem metadata. For example, it can randomize the order in which directory entries are read.

I am not 100% sure if goreleaser wants to go down this path. If goreleaser really wants to do this, I would suggest adding it as a plugin, because this might be a bigger project. Also, there are nicer tools for testing reproducibility, such as reprotest, diffoscope, or rebuilderd.
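
To make the "inject chaos" idea concrete, here is a toy sketch that runs one build command under two deliberately different environments and checks that the artifact digest stays the same. It is not a substitute for reprotest, which varies much more (for example file ordering, umask, and build path). `VARIATIONS`, `build_digest`, and `reproducible_under` are hypothetical names, and the build command is assumed to write the artifact into its current working directory.

```python
import hashlib
import os
import subprocess
import tempfile
from pathlib import Path

# Hypothetical variations chosen for illustration only.
VARIATIONS = [
    {"TZ": "UTC", "LC_ALL": "C"},
    {"TZ": "Pacific/Kiritimati", "LC_ALL": "C.UTF-8"},
]


def build_digest(build_argv, artifact, extra_env):
    """Run the build in a fresh temporary directory with extra_env applied.

    The temporary directory differs on every call, so the build path
    varies too. Returns the SHA-256 hex digest of the produced artifact.
    """
    with tempfile.TemporaryDirectory() as workdir:
        env = {**os.environ, **extra_env}
        subprocess.run(build_argv, cwd=workdir, env=env, check=True)
        return hashlib.sha256(Path(workdir, artifact).read_bytes()).hexdigest()


def reproducible_under(build_argv, artifact, variations=VARIATIONS):
    """True when every varied environment yields the same artifact digest."""
    digests = {build_digest(build_argv, artifact, v) for v in variations}
    return len(digests) == 1
```

Because each run happens in an empty temporary directory, a real build command would have to refer to the sources by absolute path or copy them in first.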

I am far from being an expert on this topic, so I might be totally wrong, too :D

Let's invite @kpcyrd to this thread. @kpcyrd might know more.

@kpcyrd

kpcyrd commented Nov 13, 2021

I think @shibumi summarized it very well. If you want to detect possible problems, you could use:

reprotest -vv --vary=-domain_host --source-pattern 'Makefile src/' 'make build' ./a.out

This would:

  • copy Makefile and the contents of src/ into a temporary build folder
  • run make build
  • compare ./a.out of both builds for differences

To implement this:

So that I can compare the [checksum of the] output binary against the binary that I downloaded from releases page

You'd need to consider that the system you're rebuilding on might be using a different operating system, a different version of goreleaser, a different go compiler, or different binutils. This is a problem you can't solve with "build multiple times on the same computer and see if the result is always the same"; you need some kind of plan for how to set up a similar environment. Linux distributions generally approach this problem with buildinfo files (which are a kind of SBOM), and I wrote about using docker for this.
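
As a rough illustration of the "record the environment" idea, the sketch below captures a few facts about the build machine and reports what differs on a rebuilder. It does not follow any distribution's buildinfo format; the record layout and the names `tool_version`, `describe_environment`, and `environment_drift` are hypothetical, and only the `go version` and `goreleaser --version` commands are assumed to be available.

```python
import platform
import subprocess


def tool_version(argv):
    """Return the trimmed output of a version command, or None if it cannot run."""
    try:
        result = subprocess.run(argv, capture_output=True, text=True, check=True)
    except (OSError, subprocess.CalledProcessError):
        return None
    return result.stdout.strip()


def describe_environment():
    """Collect a small, hypothetical record of the build environment."""
    return {
        "os": platform.system(),
        "machine": platform.machine(),
        "go": tool_version(["go", "version"]),
        "goreleaser": tool_version(["goreleaser", "--version"]),
    }


def environment_drift(recorded, current):
    """Map each differing key to a (recorded, current) pair of values."""
    keys = recorded.keys() | current.keys()
    return {
        key: (recorded.get(key), current.get(key))
        for key in sorted(keys)
        if recorded.get(key) != current.get(key)
    }
```

A release job could store the output of `describe_environment()` next to the checksums, and a rebuilder could call `environment_drift(recorded, describe_environment())` before comparing binaries, so that a checksum mismatch caused by a different toolchain is recognized as such.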

@Dentrax
Member Author

Dentrax commented Nov 15, 2021

Thanks for such great explanations! I now understand the concerns and, for the most part, the idea. One point I have been wondering about:

You'd need to consider that the system you're rebuilding on might be using a different operating system, different version of goreleaser, a different go compiler or also different binutils. This is a problem you can't solve with "build multiple times on the same computer and see if the result is always the same"

I am slightly confused here. Please correct me if I'm wrong: to verify that my build is reproducible, I have to use the same environment to compile the source. If goreleaser uses Linux ARM v7 with Go 1.17 to build my source, then I need to recompile the source using exactly the same environment, which is linux, armv7, and Go 1.17. I should do this both on the same physical computer and on a different one. In the end, I expect all the checksums to be exactly the same, right? This is where we see that this problem is not so easy to solve.

I think we should find a way to create different build environments in an automated way. Wouldn't simply creating a pipeline build matrix on GitHub provide that build consistency and ground truth?

@caarlos0
Member

If the idea is to create reproducible builds, goreleaser can already do this.

For instance, goreleaser builds themselves are reproducible:

[Image]

That's after building 2 times with go run . build --single-target --rm-dist

The key points are here: https://goreleaser.com/customization/build/?h=repro#reproducible-builds
You can also enable go mod proxying to be extra sure: https://goreleaser.com/customization/gomod/

Verifying the builds are reproducible would be a bit out of scope I think... one can use goreleaser build multiple times and compare the checksums before actually releasing if they want to ensure that.
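
A minimal sketch of that "build multiple times and compare the checksums" check, written as a standalone script and not as part of goreleaser. `hash_tree` and `unstable_artifacts` are hypothetical helper names; the build command is passed in as an argument list and is assumed to recreate the output directory on every run.

```python
import hashlib
import subprocess
from pathlib import Path


def hash_tree(root, pattern="*"):
    """Map each matching file under root (by relative path) to its SHA-256 hex digest."""
    root = Path(root)
    return {
        str(path.relative_to(root)): hashlib.sha256(path.read_bytes()).hexdigest()
        for path in sorted(root.rglob(pattern))
        if path.is_file()
    }


def unstable_artifacts(build_argv, dist_dir, runs=2, pattern="*"):
    """Run the build `runs` times; return artifacts whose digest differs from the first run.

    Files that appear or disappear between runs are reported as well.
    """
    baseline = None
    unstable = set()
    for _ in range(runs):
        subprocess.run(build_argv, check=True)
        current = hash_tree(dist_dir, pattern)
        if baseline is None:
            baseline = current
            continue
        for name in baseline.keys() | current.keys():
            if baseline.get(name) != current.get(name):
                unstable.add(name)
    return sorted(unstable)
```

With the command from this comment, a call could look like `unstable_artifacts(["goreleaser", "build", "--single-target", "--rm-dist"], "dist")`, where an empty list means every file hashed the same on each run. The output directory may also contain metadata files that legitimately change between runs, so narrowing `pattern` to the binaries is probably wise.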

In any case, will open a PR to cosign doing the necessary changes...

@caarlos0
Member

sigstore/cosign#1053

The other bits were already there, just go mod proxy was missing

@caarlos0
Member

Closing, as explained in the previous comments.

Thanks for the writeup @Dentrax !

@caarlos0 caarlos0 added the wontfix This will not be worked on label Nov 21, 2021
@Dentrax
Member Author

Dentrax commented Jul 29, 2022

For anyone stopping by here, I want to drop the following repo as a reference: https://github.com/capnspacehook/gorepro

@github-actions
Contributor

github-actions bot commented Dec 5, 2022

This issue has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Dec 5, 2022
@Dentrax Dentrax reopened this Dec 6, 2022
@Dentrax
Member Author

Dentrax commented Dec 6, 2022

Reopening so that people who have ideas on this can contribute!

@caarlos0
Member

caarlos0 commented Dec 6, 2022

it was already closed as wontfix though 🤔 and the issue is locked...

did something change? whats the goal?

@Dentrax
Member Author

Dentrax commented Dec 6, 2022

Oh, I was on mobile and thought that it had been closed by the stale bot. My bad. 🤞

@Dentrax Dentrax closed this as completed Dec 6, 2022
@caarlos0
Member

caarlos0 commented Dec 6, 2022

no prob!

@caarlos0 caarlos0 closed this as not planned Dec 6, 2022