Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Move some setup out of the docker file to fix taskcluster problems. #10762

Merged
merged 4 commits into from
May 3, 2018

Conversation

jgraham
Copy link
Contributor

@jgraham jgraham commented May 2, 2018

This change is Reviewable

@jgraham jgraham requested review from gsnedders and foolip May 2, 2018 12:15
Copy link
Member

@foolip foolip left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM % documentation for everything I don't understand :)

RUN mkdir -p /home/test/.fonts && \
cp web-platform-tests/fonts/Ahem.ttf ~/.fonts && \
fc-cache -f -v
RUN mkdir -p /home/test/web-platform-tests
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add a comment explaining why this isn't just git clone https://github.com/w3c/web-platform-tests + WORKDIR /home/test/web-platform-tests?

And also comment that the reason for having a checkout as part of the image is to save on network traffic or speed things up?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Having the checkout as part of the image was the previous state, but now we don't. It probably did make things a little faster, but I imagine the gains would decline as the image gets older and on the other hand it does make it much slower to iterate on the docker image. In the future we could maybe consider a task to build a base image containing a checkout, updated e.g. weekly.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, so the whole point of this now is to not run git fetch. I did not spot that, comment would be good :)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have the comment say "the remote is added but not fetched, the right revision is fetched inside the container".

But... then I have to wonder why bother setting this up in the Dockerfile at all, a plain git clone inside the container would be fewer lines of code and easier to understand.


cd web-platform-tests
git pull --depth=50
git fetch origin -n -q --depth=50
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't the --depth=50 a bit pointless now? Presumably we'll be updating the image once in a while, and does it even work to do a shallow fetch from within a full clone?


cd web-platform-tests
git pull --depth=50
git fetch origin -n -q --depth=50
git checkout -b master origin/master
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a separate step that checks out the right revision?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, although I have a followup to do this more obviously.

In particular:

* Don't try to install Ahem into the docker image since this fails if
  Ahem ever changes
* Do the souce checkout in the startup script. This should ensure that
  subsequent wpt commands work with the latest code, and fixes some of
  the intermittent errors we have seen.
Copy link
Member

@foolip foolip left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Asking for more comments and wondering about --depth=50 :)

wget https://dl.google.com/linux/direct/$deb_archive
if [[ $BROWSER == "chrome"* ]] || [[ "$BROWSER" == all ]]
then
# Install Chome dev
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Preexisting typo: Chome


cd ~/web-platform-tests
git fetch ${REPO} ${BRANCH} -n -q --depth=50
git checkout -b build ${REV}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why create a build branch?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not? It avoids a useless warning about checking out a refless revision, avoids having to suppress that warning, and potentially makes things a little easier to work with for interactive debugging.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, so there's a warning to suppress :) If not for that, it'd just be more code.

# Install Chome dev
deb_archive=google-chrome-unstable_current_amd64.deb
wget https://dl.google.com/linux/direct/$deb_archive
if [[ $BROWSER == "chrome"* ]] || [[ "$BROWSER" == all ]]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why the asterix after "chrome"?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It makes it a prefix match. Otherwise chrome-dev isn't going to match :)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, but if "chrome" is given we'll still use Chrome Dev... I guess this doesn't matter unless we want to also run something other than Chrome Dev though.


cd web-platform-tests
git pull --depth=50
REPO=${1:-origin}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/REPO/REMOTE/

git pull --depth=50
REPO=${1:-origin}
BRANCH=${2:-master}
REV=${3:-master}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think FETCH_HEAD or origin/master would be better here. Otherwise, when you get rid of the "build" branch as I've suggested, the checkout step will sometimes create a new branch called master and sometimes not.

RUN mkdir -p /home/test/.fonts && \
cp web-platform-tests/fonts/Ahem.ttf ~/.fonts && \
fc-cache -f -v
RUN mkdir -p /home/test/web-platform-tests
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have the comment say "the remote is added but not fetched, the right revision is fetched inside the container".

But... then I have to wonder why bother setting this up in the Dockerfile at all, a plain git clone inside the container would be fewer lines of code and easier to understand.

BROWSER=${4:-all}

cd ~/web-platform-tests
git fetch ${REPO} ${BRANCH} -n -q --depth=50
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The --depth=50 still seems like a bit of a liability, making it impossible to restart a failed job if it failed more than 50 commits ago, and if we merge in other repos (like csswg-test) that might be just the previous commit.

How much time does it save to use --depth=50?

An alternative, which seems to work, is to make Taskcluster depend on the merge_pr_* tags. Then we could do git fetch origin merge_pr_10791 --depth=1 here.

At minimum, a comment explaining why --depth=50 is a good idea would be appreciated :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It takes like 7 minutes longer on my machine to do a full clone.

I'm not sure that we want to make it only work on tagged revisions, because that makes testing it harder. I could probably arrange to keep pulling more commits until rev-parse on the target succeeds, but I think we should avoid that until we have a way to retry builds since getting that right seems error-prone.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, let's call it a "later" problem.

Also optimise things so we don't download Chrome unless actually required.
Copy link
Member

@foolip foolip left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hope it works now :)

git pull --depth=50
if [[ ! `git rev-parse --verify -q ${REV}` ]];
then
# But if for some reason the commit under test isn't in that range, we give in and
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice fix, thanks!

cd ~

# Initially we just fetch 50 commits in order to save several minutes of fetching
git clone ${REMOTE} --single-branch --branch ${BRANCH} --no-checkout -q --depth=50 web-platform-tests
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Optional nit: could call the directory wpt since it will be called that eventually, and shorter paths are nice in logs.

@jgraham jgraham merged commit 843547c into master May 3, 2018
@foolip foolip deleted the tc_less_dockerfile branch May 7, 2018 09:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants