Brainstorming: Running reproducible-build checks just before publishing #2666

Dentrax · 2021-11-12T21:49:56Z

Is your feature request related to a problem? Please describe.
Several days ago, @naveensrinivasan opened a notable issue on cosign: sigstore/cosign#1019. That project uses GoReleaser, so we (@developer-guy @erkanzileli) decided to move the discussion here for further details and some brainstorming: "What kind of automated action could we take in GoReleaser to avoid this inconsistency?"

Let's keep in mind that GoReleaser already supports deterministic / reproducible builds, IIUC: 0d4f605 ¹ The motivation behind this issue is to extend the power of these deterministic builds:

Describe the solution you'd like

I think we should run these tests just BEFORE the publishing pipeline. That said, we had better NOT publish (send the release) to any distribution platform if any test fails. So, here is a small user story that defines what the tests actually mean and refer to:

As a builder (i.e., user, developer, maintainer, etc.),
I want to build the entire project using the build system that is provided by the project (i.e., makefile, justfile, custom shell script automation, etc.),
So that I can compare the checksum of the output binary against the binary that I downloaded from the releases page
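
A minimal sketch of the comparison in this user story, assuming SHA-256 as the checksum. The function names `sha256_of` and `same_binary` are hypothetical and only for illustration; they are not part of GoReleaser.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_binary(local_build: Path, downloaded: Path) -> bool:
    """True when the locally built file and the downloaded file are byte-identical."""
    return sha256_of(local_build) == sha256_of(downloaded)
```

A builder would call `same_binary(Path("dist/myproject"), Path("downloads/myproject"))`, where both paths are placeholders for the local build output and the release asset.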

Between the Signing and Publish stages, we could consider bringing a brand-new friend to the platter of pipelines, something like: check, sanity, etc., to add a Reproducible test pipe. (A separate stage might not be necessary, since we could run these tests at the end of the Build stage.) Here are my two cents:

High-level Overview

sanity:
  reproducible_build:
    # notice this! i am calling build script here
    # i.e. build.sh, just build, etc.
    cmd: make build
    envs:
      - CGO_ENABLED: 0
    # how many times should we run this?
    runs: 5
    checks:
     - goos: linux
       goarch: arm64
       goarm: 7
       # assert the checksum against the output binary
       # that we defined in the "binary" field of each "builds" entry
       # notice that we did not assert "amd64" here!
       # we should list every ARCH combination that we want to release
       # see the trade-off warning below!
       expect_build: myproject-linux-arm64-v7

- id: linux-arm64-v7
  binary: myproject-linux-{{ .Arch }}
  main: ./cmd/main
  goos:
    - linux
  goarch:
    - amd64
    - arm64
  goarm:
    - 7

^ Et voilà!
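
The `runs` idea in the proposed spec could be sketched roughly as follows. This is an illustration only, not GoReleaser code: `check_reproducible` is a hypothetical helper, and it only detects nondeterminism between repeated builds on one machine.

```python
import hashlib
import subprocess
from pathlib import Path
from typing import Sequence, Set


def check_reproducible(cmd: Sequence[str], artifact: Path, runs: int = 2) -> Set[str]:
    """Run the build command `runs` times and collect the artifact's digests.

    The build looks reproducible (on this machine) when the returned set
    contains exactly one digest.
    """
    digests = set()
    for _ in range(runs):
        # Fail loudly if the build itself fails.
        subprocess.run(list(cmd), check=True)
        digests.add(hashlib.sha256(artifact.read_bytes()).hexdigest())
    return digests
```

For the spec above, the call would look something like `check_reproducible(["make", "build"], Path("myproject-linux-arm64-v7"), runs=5)`, with the publish step skipped whenever more than one digest comes back.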

Trade-off warning: We would probably build the project twice: once with GoReleaser and once with the custom build system, and that for each combination of:

GOOS
GOARCH
LDFLAGS
CGO_ENABLED
... you name it!
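
To make the cost concrete, here is a small sketch that enumerates a build matrix and counts the builds when every combination is built by two build systems. The matrix values are example assumptions, not taken from any real project.

```python
from itertools import product

# Example matrix (assumed values, for illustration only).
matrix = {
    "GOOS": ["linux"],
    "GOARCH": ["amd64", "arm64"],
    "CGO_ENABLED": ["0"],
}


def combinations(matrix):
    """Return every combination of the matrix as a list of env-style dicts."""
    keys = list(matrix)
    return [dict(zip(keys, values)) for values in product(*(matrix[k] for k in keys))]


def total_builds(matrix, build_systems=2):
    """Number of builds when each combination is built once per build system."""
    return len(combinations(matrix)) * build_systems
```

With this example matrix there are 2 combinations and therefore 4 builds; every additional axis value multiplies that number.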

PROS:

easy to write custom checks

CONS:

aforementioned trade-off concern
hard to maintain the whole build matrix (all combinations)

Describe alternatives you've considered
Since GoReleaser already supports deterministic builds, are we still willing to add post-checks? Would it bring unnecessary complexity? Overengineering?

If we already use a custom build system or scripts in the repository, wouldn't it be easier to pass the build system's entry point to the builds stage, as in the following spec:

- id: linux-arm64-v7
  binary: myproject-linux-{{ .Arch }}
  # notice the brand new "cmd" field
  cmd: make build
  # some envs to pass into Makefile
  envs:
   - CGO_ENABLED: 0
   - GOOS: linux
   - GOARCH: arm64
   - GOARM: 7

PROS:

no duplicate building (nothing is built twice)
no need to write additional post-check steps
consistent build system for both developer and GoReleaser (guaranteeing same checksum?)

CONS:

unnecessary complexity for building such a "simple" Go app?

Additional context

Reproducible builds: it is generally desirable that building the same source code with the same set of tools is reproducible, i.e. the output is always exactly the same. This makes it possible to verify that the build infrastructure for a binary distribution or embedded system has not been subverted. This can also make it easier to verify that a source or tool change does not make any difference to the resulting binaries. ²

Any feedback would be greatly appreciated!

cc: @cpanato @shibumi @dekkagaijin


naveensrinivasan · 2021-11-12T22:12:30Z

@Dentrax This is cool!

shibumi · 2021-11-12T22:40:03Z

Hi @Dentrax, very cool idea. I am not sure about the exact purpose of this. Do you want to make goreleaser build reproducible binaries or do you want to test go programs for reproducible builds via goreleaser? The second one is very very difficult.

If we speak about the latter, we have to address a few issues:

You have to provide an isolated build environment: setting a constant system time, isolating the build processes from each other so that they do not interfere, etc.
You have to inject "chaos" into the build process for testing purposes. I think the reproducible builds people call this "reproducible build fuzzing". During that process you try to interfere with the build environment, and at the end of the process the system should build the same binary in two totally different build environments.
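
As a small illustration of the "constant system time" point: the Reproducible Builds project specifies the `SOURCE_DATE_EPOCH` environment variable (a Unix timestamp) for pinning embedded timestamps. The sketch below is an assumption about how a build script might honor it; `build_timestamp` is a hypothetical helper and not something GoReleaser exposes.

```python
import os
import time
from datetime import datetime, timezone


def build_timestamp(environ=os.environ) -> str:
    """Return the timestamp to embed in a build.

    Uses SOURCE_DATE_EPOCH when it is set, so that two builds embed the same
    value; falls back to the current time, which is not reproducible.
    """
    epoch = int(environ.get("SOURCE_DATE_EPOCH", time.time()))
    return datetime.fromtimestamp(epoch, tz=timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
```

Two builds that both see `SOURCE_DATE_EPOCH=0` embed the same string, while two builds without it will usually embed different ones.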

The reproducible builds crew wrote a few tools for this purpose: https://reproducible-builds.org/tools/
For example:

disorderfs: Problems with unstable order of inputs or other variations introduced by filesystems can sometimes be hard to track down. disorderfs is an overlay FUSE filesystem that deliberately introduces non-determinism into filesystem metadata. For example, it can randomize the order in which directory entries are read.
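
The kind of bug that randomized directory order exposes can be simulated without disorderfs. The sketch below is only an illustration (it does not use disorderfs, and `digest_of_entries` is a hypothetical helper): a step that hashes inputs in listing order is order-dependent, while sorting first makes it stable.

```python
import hashlib


def digest_of_entries(names, sort_entries):
    """Hash a sequence of directory entry names, optionally sorting them first."""
    ordered = sorted(names) if sort_entries else list(names)
    digest = hashlib.sha256()
    for name in ordered:
        # A NUL separator keeps ("ab", "c") distinct from ("a", "bc").
        digest.update(name.encode() + b"\0")
    return digest.hexdigest()


# Two listings of the same directory, as a filesystem might return them.
listing_a = ["main.go", "util.go", "go.mod"]
listing_b = ["go.mod", "util.go", "main.go"]
```

Hashing `listing_a` and `listing_b` unsorted gives different digests; with `sort_entries=True` both give the same one.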

I am not 100% sure if goreleaser wants to go down this path. If goreleaser really wants to do this, I would suggest adding it as a plugin, because this might be a bigger project. Also, there are nicer tools for testing reproducibility, such as reprotest, diffoscope or rebuilderd.

I am far from being an expert on this topic, so I might be totally wrong, too :D

Let's invite @kpcyrd to this thread. @kpcyrd might know more.

kpcyrd · 2021-11-13T01:34:39Z

I think @shibumi summarized it very well. If you want to detect possible problems, you could use:

reprotest -vv --vary=-domain_host --source-pattern 'Makefile src/' 'make build' ./a.out

This would:

copy Makefile and the contents of src/ into a temporary build folder
run make build
compare ./a.out of both builds for differences

To implement this:

So that I can compare the [checksum of the] output binary against the binary that I downloaded from releases page

You'd need to consider that the system you're rebuilding on might be using a different operating system, a different version of goreleaser, a different go compiler or different binutils. This is a problem you can't solve with "build multiple times on the same computer and see if the result is always the same"; you need some kind of plan for how to set up a similar environment. Linux distributions generally approach this problem with buildinfo files (which are a kind of SBOM), and I wrote about using docker for this.
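
A rough sketch of the idea behind recording the build environment, so that a rebuilder can see what differs before comparing binaries. This is not the buildinfo format used by any distribution; the field names and both helpers are hypothetical.

```python
import platform


def environment_fingerprint(tool_versions):
    """Describe the current build environment as a flat dict.

    `tool_versions` is supplied by the caller, e.g. {"go": "go1.17"}; this
    sketch does not try to detect installed tools itself.
    """
    info = {"os": platform.system(), "machine": platform.machine()}
    info.update(tool_versions)
    return info


def environment_diff(recorded, current):
    """Return {key: (recorded_value, current_value)} for every differing key."""
    keys = recorded.keys() | current.keys()
    return {
        key: (recorded.get(key), current.get(key))
        for key in sorted(keys)
        if recorded.get(key) != current.get(key)
    }
```

A non-empty diff means a checksum mismatch may simply reflect a different toolchain rather than a tampered binary.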

Dentrax · 2021-11-15T08:49:06Z

Thanks for such great explanations! I clearly understood the concerns and pretty much the whole idea. There is one point that I have been wondering about:

You'd need to consider that the system you're rebuilding on might be using a different operating system, different version of goreleaser, a different go compiler or also different binutils. This is a problem you can't solve with "build multiple times on the same computer and see if the result is always the same"

I am slightly confused here. Please correct me if I'm wrong: to verify that my build is reproducible, I have to use the same environment to compile the source. If goreleaser uses Linux ARM v7 with Go 1.17 to build my source, then I need to recompile the source using exactly the same environment, which is linux, armv7 and Go 1.17. I should do this both on the same physical computer and on a different one. Eventually, I expect all the checksums to be exactly the same, right? This is where we see that this problem is not so easy to solve.

I think we should find a way to create different build environments in an automated way. Wouldn't simply creating a pipeline build matrix on GitHub provide that build consistency and truth?

caarlos0 · 2021-11-15T13:56:55Z

If the idea is to create reproducible builds, goreleaser can already do this.

For instance, goreleaser builds themselves are reproducible:

That's after building 2 times with go run . build --single-target --rm-dist

The key points are here: https://goreleaser.com/customization/build/?h=repro#reproducible-builds
You can also enable go mod proxying to be extra sure: https://goreleaser.com/customization/gomod/

Verifying the builds are reproducible would be a bit out of scope I think... one can use goreleaser build multiple times and compare the checksums before actually releasing if they want to ensure that.
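
A minimal sketch of such a pre-release comparison, assuming each run produced a checksums file in the common `<digest>  <file name>` layout used by `sha256sum`. The helper names are hypothetical and this is not something goreleaser does itself.

```python
def parse_checksums(text):
    """Parse sha256sum-style lines into a {file name: digest} dict."""
    entries = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        digest, name = line.split(maxsplit=1)
        # sha256sum marks binary mode with a leading "*" on the file name.
        entries[name.lstrip("*")] = digest
    return entries


def mismatches(first, second):
    """Return the sorted file names whose digests differ or exist in only one run."""
    a, b = parse_checksums(first), parse_checksums(second)
    return sorted(name for name in a.keys() | b.keys() if a.get(name) != b.get(name))
```

An empty list from `mismatches(run1_text, run2_text)` means the two runs agreed on every artifact; anything else is a reason to stop before releasing.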

In any case, will open a PR to cosign doing the necessary changes...

caarlos0 · 2021-11-15T14:23:25Z

sigstore/cosign#1053

The other bits were already there, just go mod proxy was missing

caarlos0 · 2021-11-21T15:53:57Z

Closing as explained in the previous comments.

Thanks for the writeup @Dentrax !

Dentrax · 2022-07-29T10:59:24Z

For anyone stopping by here, I want to drop the following repo as a reference: https://github.com/capnspacehook/gorepro

github-actions · 2022-12-05T14:14:18Z

This issue has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

Dentrax · 2022-12-06T08:54:16Z

Reopening so that people who have ideas on this can contribute!

caarlos0 · 2022-12-06T12:13:23Z

it was already closed as wontfix though 🤔 and the issue is locked...

did something change? whats the goal?

Dentrax · 2022-12-06T12:26:23Z

Oh, I was on mobile and thought that it had been closed by the stale bot. My bad. 🤞

caarlos0 · 2022-12-06T12:56:52Z

no prob!

Dentrax added the enhancement New feature or request label Nov 12, 2021

caarlos0 closed this as completed Nov 21, 2021

caarlos0 added the wontfix This will not be worked on label Nov 21, 2021

github-actions bot locked as resolved and limited conversation to collaborators Dec 5, 2022

Dentrax reopened this Dec 6, 2022

Dentrax closed this as completed Dec 6, 2022

caarlos0 closed this as not planned Dec 6, 2022
