Define a data privacy and usage policy #388
Comments
This is an excellent issue and I am excited to develop this further. Here are a few quick reactions at a high level:
The PIA process will likely allow 2i2c to define an ontology for the various data in scope. That ontology will include things like intellectual property created by the user, raw data from public or private sensor sources, personally identifiable information, and riskier data sets like medical or financial records. My view is that 2i2c should take a leading and opinionated approach here aligned with "open science" best practices.
This is super helpful, thanks for this extra information (I know very little about organizational considerations for data privacy). I think it will take a while to go through the full exercise that you describe, and in the meantime there are organizations asking us what our policy is right now. Should we just say "we have no policy"? Or perhaps we can agree upon an informal language that at least conveys our values and approach even if it is not a rigorous policy?
We should ask those organizations for a PIA and to collaborate with us. We want to know from them what they want our data policy to be. For Syzygy, we mention non-profit, hosted on Compute Canada, and minimal PII retention and get approved right away. The transparency on some PII as part of the 2i2c plan will likely need to be addressed with "open science" values.
@colliand that's a good idea - @ericvd-ucb do you think one or more of the community colleges would be willing to brainstorm with us what their ideal user privacy agreement would be?
Update: one-off policy being used
@sgibson91 needs a data policy to cover the data collected for an SSI fellowship project she's working on, so I've gotten approval from CS&S to have a one-off use of the policy defined here (adapted from SSI). We should then define a more long-term policy for 2i2c that we can use with the hubs as well.
We now have a privacy policy defined here: https://docs.2i2c.org/user/topics/policy/privacy/ Can this be closed? @jnywong would this work for your needs right now? It also feels like this page is not discoverable if you weren't able to find it, so do you have thoughts on a better place to link it?
Thanks, Chris! I did manage to find this page before but I don't think it quite works for my needs for now, since it refers mainly to data that is held on hub infrastructure rather than the type of data I will be collecting from the training feedback surveys. I could expand https://docs.2i2c.org/user/topics/policy/privacy/ to incorporate what I need since I would prefer to link upstream to a SSOT in the Hub Service Guide. |
Context
Many communities want a guarantee that we will not abuse our control over their data. In some cases, this may be a legal requirement (for example, working with communities that follow GDPR guidelines).
We should define a policy that gives communities confidence that we will not use their data in any way that they do not wish.
Reference policies
Here are a few policies that we could use for inspiration:
Proposed language
2i2c Pilot Hubs user data policy
User data generated by using a 2i2c Hub is controlled by the users, not 2i2c. 2i2c does not retain any ownership or privileges for user data on the hubs that it deploys as a part of this pilot. The infrastructure that 2i2c deploys (e.g., JupyterHub and Kubernetes) does log some information about user behavior, such as sign-on timestamps and aggregated usage over time. This information may be used by 2i2c in diagnostics to improve hub deployments, or as aggregated statistics in order to demonstrate usage and interest for the purposes of grants, etc. However, 2i2c will not share this data or any derivatives of this data (beyond aggregate statistics or visualizations) with any third parties.
Task and updates