Weasel lets you manage and share end-to-end workflows for
different use cases and domains, and orchestrate training, packaging and
serving your custom pipelines. You can start off by cloning a pre-defined
project template, adjust it to fit your needs, load in your data, train a
pipeline, export it as a Python package, upload your outputs to remote storage
and share your results with your team. Weasel can be used via the `weasel`
command, and we provide templates in our projects repo.
The easiest way to get started is to clone a project template and run it – for example, this end-to-end template that lets you train a spaCy part-of-speech tagger and dependency parser on a Universal Dependencies treebank.
python -m weasel clone pipelines/tagger_parser_ud
Note: Our projects repo includes various project templates for different NLP tasks, models, workflows and integrations that you can clone and run. The easiest way to get started is to pick a template, clone it and start modifying it!
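After cloning, a typical session downloads the project's assets and then runs one of its workflows. The following is a minimal Python sketch of that sequence, not part of Weasel itself. It assumes that the template defines a workflow named `all` (check the template's `project.yml`) and that `weasel clone` places the project in a directory named after the last component of the template path. The helper names `weasel_commands` and `run_template` are hypothetical.

```python
import subprocess
import sys
from pathlib import PurePosixPath


def weasel_commands(template, workflow="all"):
    """Return (argv, cwd) pairs for cloning a template and running a workflow.

    cwd is None for commands that run from the current directory.
    """
    # Assumption: the clone lands in a directory named after the last
    # component of the template path (here: tagger_parser_ud).
    project_dir = PurePosixPath(template).name
    weasel = [sys.executable, "-m", "weasel"]
    return [
        (weasel + ["clone", template], None),
        # Download the assets declared in the project file.
        (weasel + ["assets"], project_dir),
        # Assumption: the template defines a workflow with this name.
        (weasel + ["run", workflow], project_dir),
    ]


def run_template(template, workflow="all", dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for argv, cwd in weasel_commands(template, workflow):
        location = f" (in {cwd})" if cwd else ""
        print("python " + " ".join(argv[1:]) + location)
        if not dry_run:
            subprocess.run(argv, cwd=cwd, check=True)
```

Calling `run_template("pipelines/tagger_parser_ud")` only prints the three commands. Pass `dry_run=False` to execute them, which requires Weasel to be installed.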
Get started with the documentation:
- Learn how to create a Weasel workflow
- Working with directory and assets
- Running custom scripts
- Using remote storage
- Weasel integrations
- Command line interface description
Weasel is a standalone replacement for spaCy Projects. There are a few backward incompatibilities that you should be aware of:
- The `SPACY_CONFIG_OVERRIDES` environment variable is no longer checked. You can set configuration overrides using `WEASEL_CONFIG_OVERRIDES`.
- Support for the `spacy_version` configuration key has been dropped.
- Support for the `check_requirements` configuration key has been dropped.
- Support for the `SPACY_PROJECT_USE_GIT_VERSION` environment variable has been dropped.
- Error codes are now Weasel-specific, and do not follow spaCy error codes.
Weasel checks for the first three incompatibilities and will issue a warning if you're using it with spaCy-specific configuration options.
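When migrating a project, it can help to scan for these leftovers yourself before running anything. The sketch below is a hypothetical helper, not Weasel's own check. It takes an already-parsed project configuration as a dict and reports the three spaCy-specific settings named above.

```python
import os

# Names taken from the list of incompatibilities; the helper is hypothetical.
DROPPED_CONFIG_KEYS = ("spacy_version", "check_requirements")
RENAMED_ENV_VARS = {"SPACY_CONFIG_OVERRIDES": "WEASEL_CONFIG_OVERRIDES"}


def find_spacy_leftovers(config, environ=None):
    """Return human-readable notes about spaCy-specific settings.

    config: the parsed project configuration as a dict.
    environ: mapping of environment variables (defaults to os.environ).
    """
    environ = os.environ if environ is None else environ
    notes = []
    # Environment variables come first, then configuration keys.
    for old, new in RENAMED_ENV_VARS.items():
        if old in environ:
            notes.append(f"{old} is no longer checked; use {new} instead")
    for key in DROPPED_CONFIG_KEYS:
        if key in config:
            notes.append(f"config key '{key}' is no longer supported")
    return notes
```

An empty list means none of the three settings were found. Passing `environ` explicitly keeps the function easy to test without touching the real environment.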