This documentation contains a set of guidelines to help you during the contribution process. We welcome contributions from anyone willing to fix bugs or resolve issues in this repository. You can also help us develop new features by raising relevant issues, which will be reviewed by the PySyft Team.
If you're going to be contributing to PySyft, you'll want to have a fast developer cycle and the ability to experience how your code will be used in a data science context. For this, we recommend the following development flow.
Step 1) Uninstall your default version of PySyft
pip uninstall syft
Step 2) Install Syft as a folder-backed (editable) reference. From the root of the PySyft repository, run the following.
pip install -e .
Step 3) Launch Jupyter Lab
jupyter lab
Step 4) Experiment within notebooks with what you want the end data scientist experience to be, and merge code into the codebase as you go.
All our development is done using Git and GitHub. If you're not too familiar with Git and GitHub, start by reviewing this guide.
On https://github.com/OpenMined/PySyft/issues you can find all open Issues. You can find a detailed explanation on how to work with issues below under Issue Allocation.
To contribute to PySyft you will need to fork the OpenMined/PySyft repository. Then you can work risk-free on your fork.
Fork this repository. This will create a copy of the repository under your GitHub profile. Then clone your fork locally and keep a reference to the original project in the upstream remote.
$ git clone https://github.com/<your-username>/<repo-name>
$ cd <repo-name>
$ git remote add upstream https://github.com/<upstream-owner>/<repo-name>
To sync your fork (remote) with the OpenMined/PySyft (upstream) repository please see this Guide on how to sync your fork or follow the given commands.
$ git remote update
$ git checkout <branch-name>
$ git rebase upstream/<branch-name>
PySyft uses the Python package pre-commit to make sure the correct formatting (black & flake8) is applied.
You can install it via pip install pre-commit
Then you just need to call pre-commit install
All of this can also be done by running make install_hooks
To install the development version of the package, once the dev version of the requirements has been satisfied, do the following:
- Follow the instructions as laid out in INSTALLATION.md to complete the installation process.
- Make a clone of the PySyft repo on your local machine at the terminal.
- Set up the pre-commit hook as described above in Setting up Pre-Commit Hook.
- Run the following two commands:
cd PySyft
pip install -e .
NOTE: If you are using a virtual environment, please be sure to use the correct executable for pip or python.
You can follow along this example to learn how to deploy PySyft workers and start playing around.
If you are new to the project and want to get into the code, we recommend picking an issue with the label "good first issue". These issues should only require general programming knowledge and little to no insight into the project.
Each issue someone is currently working on should have an assignee. If you want to contribute to an issue someone else is already working on, please get in contact with that person via Slack or GitHub and coordinate with them.
If you want to work on an open issue, please post a comment saying that you will work on it; we will then assign it to you.
Caution: We try our best to keep the assignee up to date, but since we are all humans with our own schedules, delays are possible. Make sure to check the comments before you start working on an issue, even when no one is assigned to it.
Always make sure to create the necessary tests and keep test coverage at 100%. You can always ask for help in Slack or via GitHub if you don't feel confident about your tests.
We aim to have a 100% test coverage, and the GitHub Actions CI will fail if the coverage is below this value. You can evaluate your coverage using the following commands.
coverage run --omit=*/venv/*,setup.py,.eggs/* setup.py test
coverage report --fail-under 100 -m
PySyft uses pytest to execute the test cases.
Sometimes you want to test functions that take multiple arguments, each of which can take multiple values. To test this, please parametrize your tests.
Example:
@pytest.mark.parametrize(
    "compress, compressScheme", [(True, "lz4"), (False, "lz4")]
)
def test_hooked_tensor(self, compress, compressScheme):
    TorchHook(torch)
    t = Tensor(numpy.random.random((100, 100)))
    t_serialized = serialize(t, compress=compress, compressScheme=compressScheme)
    t_serialized_deserialized = deserialize(
        t_serialized, compressed=compress, compressScheme=compressScheme
    )
    assert (t == t_serialized_deserialized).all()
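For readers who want to try parametrization without a PySyft installation, here is a minimal, self-contained sketch. It is not taken from the PySyft test suite: the test name and the use of the standard library's zlib module are illustrative assumptions. It shows that stacking two parametrize decorators makes pytest run every combination of the listed values.

```python
import zlib

import pytest


# Stacked decorators: pytest generates one test case per (payload, level)
# combination, i.e. 3 payloads x 3 levels = 9 cases.
@pytest.mark.parametrize("payload", [b"", b"syft", b"\x00" * 1024])
@pytest.mark.parametrize("level", [1, 6, 9])
def test_zlib_roundtrip(payload, level):
    # Hypothetical example test: compressing and decompressing must
    # return the original bytes for every compression level.
    compressed = zlib.compress(payload, level)
    assert zlib.decompress(compressed) == payload
```

Saved in a file whose name starts with test_, this is collected by pytest like any other test. A single list of tuples, as in the example above, is the better fit when only specific combinations should be tested.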
Constants related to the PySyft Serde protocol are located in a separate repository: OpenMined/syft-proto.
All classes that need to be serialized have to be listed in the proto.json file and have a unique code value.
Updating the lists of simplifiers and detailers in syft/serde/native_serde.py, syft/serde/serde.py, or syft/serde/torch_serde.py, or renaming/moving related classes, can make unit tests fail because proto.json won't be in sync with the PySyft code anymore.
Use the following process:
- Fork OpenMined/syft-proto and create a new branch.
- In your PySyft branch, update the pip-dep/requirements.txt file to have git+git://github.com/<your_account>/syft-proto@<branch>#egg=syft-proto instead of syft-proto>=*.
- Make the required changes in your PySyft and syft-proto branches. helpers/update_types.py can help update proto.json automatically.
- Create PRs in the PySyft and syft-proto repos.
- PRs should pass CI checks.
- After the syft-proto PR is merged, a new version of syft-proto will be published automatically. You can look up the new version on PyPI.
- Before merging the PySyft PR, update pip-dep/requirements.txt to revert from git+git://github.com/<your_account>/syft-proto@<branch>#egg=syft-proto to syft-proto>=<new version>.
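The two edits to pip-dep/requirements.txt in the process above are easy to get wrong by hand, so here is a rough sketch of how they could be scripted. The helper names (point_to_fork, point_to_release) are hypothetical and not part of PySyft or syft-proto; the sketch only assumes that the syft-proto requirement occupies a single line of the file.

```python
def _swap_syft_proto(requirements: str, new_line: str) -> str:
    """Replaces the single syft-proto requirement line with `new_line`."""
    lines = []
    for line in requirements.splitlines():
        stripped = line.strip()
        # The requirement appears either as a version pin ("syft-proto>=...")
        # or as a git reference ending in "#egg=syft-proto".
        is_proto = stripped.startswith("syft-proto") or stripped.endswith(
            "#egg=syft-proto"
        )
        lines.append(new_line if is_proto else line)
    return "\n".join(lines) + "\n"


def point_to_fork(requirements: str, account: str, branch: str) -> str:
    """Hypothetical helper: use a fork's branch while both PRs are open."""
    git_ref = f"git+git://github.com/{account}/syft-proto@{branch}#egg=syft-proto"
    return _swap_syft_proto(requirements, git_ref)


def point_to_release(requirements: str, version: str) -> str:
    """Hypothetical helper: revert to the published package before merging."""
    return _swap_syft_proto(requirements, f"syft-proto>={version}")
```

Both helpers take and return the file contents as a string, so they can be tried on a copy of the requirements before anything is written back to disk.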
To ensure code quality and make sure other people can understand your changes, you have to document your code. For documentation we are using the Google Python Style Rules, which can be found here. A well-written example can be viewed here.
Your documentation should not describe the obvious, but explain the intention behind the code and how you tried to realize it.
You should also document code fragments that are not self-explanatory, e.g. complicated for-loops. Again, please do not just describe what each line is doing, but also explain the idea behind the code fragment and why you decided to use that exact solution.
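As an illustration of these rules, the sketch below documents a small function in the Google docstring style. The function itself (chunk) is a made-up example and does not exist in PySyft; the point is that the docstring and the inline comment explain intent rather than restate the code.

```python
from typing import List


def chunk(items: List[int], size: int) -> List[List[int]]:
    """Splits a list into consecutive chunks of at most `size` items.

    Chunking keeps each piece small enough to be handled on its own. The
    last chunk may be shorter so that no item is dropped or padded.

    Args:
        items: The values to split, in the order they should be kept.
        size: Maximum number of items per chunk. Must be positive.

    Returns:
        A list of chunks that, concatenated, reproduce `items`.

    Raises:
        ValueError: If `size` is not positive.
    """
    if size <= 0:
        raise ValueError("size must be positive")
    # Step through the list in strides of `size`. Slicing past the end of a
    # list is safe in Python, which is what allows a shorter final chunk.
    return [items[i : i + size] for i in range(0, len(items), size)]
```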
For better merge compatibility, each import is on a separate line. Multiple imports from one package are each written on their own line.
Example:
from syft.serde import serialize
from syft.serde import deserialize
sphinx-apidoc -f -o docs/modules/ syft/
The codebase contains static type hints for code clarity and catching errors prior to runtime. If you're adding type hints, please run the static type checker to ensure the type annotations you added are correct via:
mypy syft
Due to issue #2323 you can ignore existing type issues found by mypy.
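To make the benefit of type hints concrete, here is a small sketch with hypothetical function names that are not taken from the PySyft codebase. It shows the kind of mistake a static type checker is designed to catch: using a value that may be None as if it were always an int.

```python
from typing import Dict, Optional


def lookup_worker(workers: Dict[str, int], name: str) -> Optional[int]:
    """Returns the id registered for `name`, or None if it is unknown."""
    return workers.get(name)


def describe(workers: Dict[str, int], name: str) -> str:
    """Returns a short human-readable description of a worker."""
    worker_id = lookup_worker(workers, name)
    # Because the return type is Optional[int], a type checker such as mypy
    # is expected to reject `worker_id + 1` unless None is ruled out first.
    if worker_id is None:
        return f"{name}: not registered"
    return f"{name}: next id {worker_id + 1}"
```

Removing the `is None` check leaves code that still works for known workers but fails at runtime for unknown ones, which is the class of error the annotations help surface before the code is run.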
As with any software project, it's important to keep the amount of code to a minimum, so avoid code duplication wherever possible!
If you are contributing a notebook, please ensure you install the requirements for testing notebooks locally: pip install -r pip-dep/requirements_notebooks.txt.
Also, please add tests for it in the tests/notebook/test_notebooks.py file. There are plenty of examples there. For questions about the notebook tests, please feel free to reference https://github.com/fdroessler.
At any point in time you can create a pull request, so others can see your changes and give you feedback.
Please create all pull requests to the master branch.
If your PR is still a work in progress and not ready to be merged, please add [WIP] at the start of the title.
Example: [WIP] Serialization of PointerTensor
After each commit, GitHub Actions will check your new code against the formatting guidelines (this should not cause any problems if you have set up your pre-commit hook) and execute the tests to check whether the test coverage is high enough.
We will only merge PRs that pass the GitHub Actions checks.
If your check fails, don't worry, you will still be able to make changes and make your code pass the checks.
For support in contributing to this project, or if you would like to follow along with any code changes to the library, please join the #code_pysyft Slack channel. Click here to join our Slack community!