Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

💫 Continuous deployment of dev builds, wheels etc. #1843

Closed
ines opened this issue Jan 15, 2018 · 13 comments
Closed

💫 Continuous deployment of dev builds, wheels etc. #1843

ines opened this issue Jan 15, 2018 · 13 comments
Labels
enhancement Feature requests and improvements help wanted Contributions welcome! meta Meta topics, e.g. repo organisation and issue management

Comments

@ines
Copy link
Member

ines commented Jan 15, 2018

This issue details some of the changes we want to make to the spaCy build process, to be able to ship updates faster, and make it easier for developers to test new changes. Introducing spacy-nightly for v2.0 last year was the first step in that direction. We've also started using a better system of dedicated feature branches and pull requests (e.g. #1408, #1424, #1792), with our 💫 emoji indicating issues and PRs by @honnibal or me.

Continuous deployment of dev builds

The new build process should trigger on each push to master, run the tests, build spaCy and, if successful, upload it to spacy-nightly.

For example, let's say you've contributed a bug fix or a new feature. Once your PR is merged and the master build is successful, it will be automatically published as a .dev version on spacy-nightly. This lets you – and others – use and test it immediately, without having to build spaCy from source. Production users can test their application against the latest state of development at any time. It also makes it easier to debug new features and helps ensure stability of the official releases.

Development releases are specified by adding .dev[N] to the version identifier, for example, 2.0.6.dev0. They'll only be installed if you tell pip to do so explicitly:

pip install spacy-nightly==2.0.6.dev0
pip install spacy-nightly --pre

Development versions sort before all other versions – so 2.0.6.dev0 < 2.0.6. Also see the PEP 440 spec on public version identifiers.

Publishing the dev builds on spacy-nightly should be easy to achieve by adding a condition to the setup.py, which replaces the package name if it's run with a certain flag indicating a dev build. The model compatibility check is also set up to support development versions.

Wheels

Travis, Appveyor and Circle CI also support building and uploading binary wheels. We'd like to have wheel deployment also enabled both for the nightly releases and full versions. Wheels are much faster to install and do not require a compiler to be installed on the target machine.

Required changes

Apart from the advantages of being able to install the latest master version easily, nothing will change for spaCy users and contributors. But here's what we're going to be doing:

  • Ensure all updates to the documentation and other non-source parts of the repository are committed with [ci skip] to avoid builds.
  • Keep using feature branches and pull requests for updates we publish – never push directly to master (duh).
  • Integrate the dev builds into our current CI workflow, make all required changes to the setup.py and add documentation.
@ines ines added enhancement Feature requests and improvements meta Meta topics, e.g. repo organisation and issue management labels Jan 15, 2018
@willprice
Copy link
Contributor

I'd be happy to help out with some of this, especially pushing wheels as any downstream projects using spacy have to endure the time spent compiling spacy for their CI!

@honnibal
Copy link
Member

@willprice Thanks!

The question is, what's the best way to get you working on this? Maybe a good start would be if you fork one of our small libraries such as cymem, and get the wheels building on your fork?

If this subgoal seems like a waste of time and you can just get a fork of spaCy working, by all means go ahead!

@willprice
Copy link
Contributor

willprice commented Jan 21, 2018

Hi @honnibal, I think forking sounds like a good idea, I need a safe proving ground where I can fiddle without causing issues ;). I won't be able to do a proper 'github fork' as you can't easily have separate CI for forks (at least on travis), so I'll just clone the repo and push up my own version and link it here.

Questions:

  1. I'm not sure it makes sense for each CI platform to push wheels to PyPI, so do we just want one platform to push wheels? However in the event that the chose CI platform goes down then we lose wheel pushes, so we might want a way of having multiple platforms push wheels. Thoughts?

  2. To add the .dev[N] suffix to versions we'll need to store the current N version in the repo, are you happy having CI committing the change to the repo?

  3. With the [ci skip] tag in commit messages, do you want anything done at all, or just skip everything (including building docs)?

  4. Can you elaborate on the last point "
    Integrate the dev builds into our current CI workflow, make all required changes to the setup.py and add documentation." as I'm not sure what this means.

In the mean time I'll get stuck in...

@willprice
Copy link
Contributor

Forked here: https://github.com/willprice/spacy

@willprice
Copy link
Contributor

Some of the external dependencies also need to build a binary dist for spacy also need to be built:

Starred packages are those managed by explosion so could feasibly be changed to build bdist wheels too. thinc seems to have one of the longest compile times.

@honnibal
Copy link
Member

Great action on this -- thanks.

  1. I'm not sure it makes sense for each CI platform to push wheels to PyPi

Unfortunately I think it's necessary to use at least appveyor and Travis. I'm not sure why conda-forge uses CircleCI as well, but I wouldn't be surprised if there's a good reason based on some missing requirement on Travis. Another potential problem is that sometimes builds queue for days on Travis for OSX.

For now I think it probably makes sense to start with Travis, and then do Appveyor for the Windows wheels.

  1. To add the .dev[N] suffix to versions we'll need to store the current N version in the repo, are you happy having CI committing the change to the repo?

Could the CI query PyPi to find the latest version number? Assuming the CI has to make at least a git tag for each release, we could also use that to find the next version.

These are possibly bad ideas --- I just want to enumerate the options.

  1. With the [ci skip] tag in commit messages, do you want anything done at all, or just skip everything (including building docs)?

@ines ?

  1. Can you elaborate on the last point "
    Integrate the dev builds into our current CI workflow, make all required changes to the setup.py and add documentation." as I'm not sure what this means.

@ines ?

@willprice
Copy link
Contributor

Windows might in fact be the easiest to do as the appveyor config is nice and standalone and building binary wheels is substantially easier on windows than it is for linux (which requires this whole docker centos 5 setup). Probably won't have time until the weekend to make much progress but will keep you updated :)

@honnibal
Copy link
Member

Re dependencies:

I think if we reach out to the developers of wrapt and cytoolz they would be keen to adopt a solution we develop. I'm not sure how actively maintained ujson is. I'd like to remove the dependency on the regex library to improve the speed of the tokenizer.

About that msgpack library: we need an extension library msgpack-numpy. I think this was why I used the msgpack-python library instead of msgpack, but maybe I just messed up?

@ines
Copy link
Member Author

ines commented Jan 23, 2018

With the [ci skip] tag in commit messages, do you want anything done at all, or just skip everything (including building docs)?

This should already be handled by default (at least by Travis, and I'm pretty sure Appveyor as well). So if we use a build as the trigger or first step, we shouldn't have to do anything here. (@honnibal and I just have to remember to tag the commit messages of docs PRs accordingly when we merge.)

Can you elaborate on the last point " Integrate the dev builds into our current CI workflow, make all required changes to the setup.py and add documentation." as I'm not sure what this means.

Don't worry about the documentation at this point – we'll take care of this later. Aside from that, I mostly meant changing the CI config files and setting up the dev builds (e.g. @honnibal's suggestions above) and changing the setup.py to allow an additional flag (or similar) to indicate the dev builds. If it's a dev build, spacy-nightly is used as the package name instead of spacy, so it'll be available as a different package on pip.

@willprice
Copy link
Contributor

This should already be handled by default (at least by Travis, and I'm pretty sure Appveyor as well). So if we use a build as the trigger or first step, we shouldn't have to do anything here. (@honnibal and I just have to remember to tag the commit messages of docs PRs accordingly when we merge.)

You're absolutely right, both Travis and Appveyor support [skip ci] tags. I've noted this above.

Don't worry about the documentation at this point – we'll take care of this later. Aside from that, I mostly meant changing the CI config files and setting up the dev builds (e.g. @honnibal's suggestions above) and changing the setup.py to allow an additional flag (or similar) to indicate the dev builds. If it's a dev build, spacy-nightly is used as the package name instead of spacy, so it'll be available as a different package on pip.

Excellent, I've not found an easy way of overriding the package name in setuptools yet, I might have to use sed to update spacy/about.py for dev builds.

@ines
Copy link
Member Author

ines commented Jan 23, 2018

Excellent, I've not found an easy way of overriding the package name in setuptools yet,

I thought we might just be able to do this here?

spaCy/setup.py

Lines 173 to 174 in 7e6dc28

setup(
name=about['__title__'],

And then do something like this:

package_name = about['__title_nightly__'] if IS_DEV_BUILD else about['__title__']

But maybe I'm wrong and haven't thought this through – I'm definitely not a setuptools expert.

@ines ines added the help wanted Contributions welcome! label Apr 29, 2018
@ines
Copy link
Member Author

ines commented Dec 6, 2018

@ines ines closed this as completed Dec 6, 2018
@lock
Copy link

lock bot commented Jan 5, 2019

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

@lock lock bot locked as resolved and limited conversation to collaborators Jan 5, 2019
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
enhancement Feature requests and improvements help wanted Contributions welcome! meta Meta topics, e.g. repo organisation and issue management
Projects
None yet
Development

No branches or pull requests

3 participants