Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Executing Consent Requests via the Privacy Request Execution Layer [#2146] #2125

Merged
merged 17 commits into from
Jan 10, 2023

Conversation

pattisdr
Copy link
Contributor

@pattisdr pattisdr commented Jan 4, 2023

Closes #2146
👉 Note: Requesting to merge into consent-propagation feature branch, not main
✅ Note: Half of this PR are new tests, actual code change here is about ~500 lines.

Description Of Changes

This PR adds the foundation to execute consent requests via the privacy request execution framework. When consent preferences are changed, a PrivacyRequest is created (with a Policy > Consent Rule attached) to propagate those consent preferences to third party services, server-side.

ConsentRequest and Consent records are still created per usual.

POC Changes

  • Updates config.json to reflect new consent defaults: Adds default_consent_policy as the policy_key for consent requests, and also adds executable=True to all the default consentOptions
  • Seeds the db with a default consent Policy and a default Consent Rule (action_type=consent) on startup
  • Adds a DB schema migration that adds an optional PrivacyRequest.executable_consent_preferences column where we store the user's desired consent preferences that are executable (useful for record keeping/retries, etc, storing the current state at the time the request was submitted). Also add an optional consentrequest.privacy_request_id FK to link whether a PrivacyRequest was queued to execute consent preferences
  • PATCH {{host}}/consent-request/{{consent_request_id}}/preferences now loads the config.json file as a last step to see which preferences are executable and filters out consent preferences that are not. It passes on those executable consent preferences to create and queue a PrivacyRequest to propagate executable consent preferences to third-party systems.
  • Update frequently-used method _get_consent_request_and_provided_identity to return a tuple containing both the ConsentRequest and ProvidedIdentity since there are cases where we also want the original ConsentRequest
  • Add new method DatasetConfig.get_dataset_with_stubbed_collection that creates a src.fides.api.ops.graph.config.Dataset from a DatasetConfig. It ignores any collections defined and just creates a single Collection representing the Dataset as a whole. This will be used to build the Consent Graph where we just want one node per Dataset instead of one node per Collection.
  • Request execution has multiple steps. We add a consent step after access and erasure, and only run consent if the attached Policy has a consent rule. We're also more careful about only running access and erasure steps if we have access and erasure rules. I'm also limiting the graph so nodes must be of SaasConnector-type, but this is subject to change as we further refine how the graph is going to be built.
  • Execution layer adds multiple consent-related methods to parallel existing methods for access and erasure. Namely, run_consent_request builds a very simple graph, where each Dataset has one node representing it. None of the nodes are dependent upon each other. Every node just takes in identity_data and every node outputs whether the consent request on that node was successful.
  • Saas connector mocks making a consent request but it only creates logs for now.

Steps to Confirm

  • nox -s test_env
  • Login to the admin app
  • In addition to the mongo and postgres connector that have been generated for you, create a Mailchimp (or other connector of type Saas).
    • It is not important that your credentials are correct, because currently the saas connector will not actually make a consent request
  • Go the Privacy Center (have identify verification off for now)
  • Click Manage your consent
  • Type in your email
  • On the Manage your Consent screen, toggle a few of the options.
  • Click Save
  • Return to admin app privacy requests panel http://localhost:3000/privacy-requests
  • Notice new request with Default Consent Policy attached
  • Approve Request
  • View details
    • Note that there are no mongo or postgres logs because they are attached to saas connectors.
    • Note that the mailchimp collection has an in processing and a complete log.
  • Check fides container for logs. Mailchimp logs should look like:
2023-01-10 01:21:47.708 [INFO] (saas_connector:run_consent_request:421): Mocking consent request - actual logic should go here for saas connectors
2023-01-10 01:21:47.709 [INFO] (saas_connector:run_consent_request:426): Demo only! Testing available consent params! identity_data: {'email': '[email protected]'}, executable_preferences: [{'use': 'advertising', 'opt_in': False}, {'use': 'advertising.first_party', 'opt_in': True}, {'use': 'improve', 'opt_in': False}]
  • Update the config.json in the privacy center to set some of the executable options to False. And follow the steps above. When updating consent preferences, note the log:
2023-01-10 01:24:30.649 [INFO] (consent_request_endpoints:load_executable_consent_options:246): Consent option: 'advertising' is not executable.

The queued privacy request also doesn't have the advertising preference saved.

Screenshots

Screenshot 2023-01-05 at 13 06 57

Screenshot 2023-01-05 at 13 07 38

Pre-Merge Checklist

Illustrating that we can execute consent requests as privacy requests and share the same execution layer.
@pattisdr pattisdr added the do not merge Please don't merge yet, bad things will happen if you do label Jan 4, 2023
src/fides/api/ctl/database/seed.py Outdated Show resolved Hide resolved
output: bool = self.connector.run_consent_request(
self.traversal_node, self.resources.policy, self.resources.request, identity
)
self.log_end(ActionType.consent)
Copy link
Contributor Author

@pattisdr pattisdr Jan 4, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The parallel GraphTask.erasure_request (and access_request) also caches that the erasure was completed on this node so if we restart from failure, we don't re-attempt this node. This is used to update the graph on retry.

self.resources.cache_erasure(
            f"{self.key}", output
        ) 

There's a lot of good reasons to not try to re-run an erasure request on a collection if it's already been completed, but I assume it is okay to retry a request for consent.

We can revisit if needed later.

src/fides/api/ops/models/policy.py Outdated Show resolved Hide resolved
src/fides/api/ops/models/privacy_request.py Show resolved Hide resolved
src/fides/api/ctl/database/seed.py Outdated Show resolved Hide resolved
src/fides/api/ops/service/connectors/saas_connector.py Outdated Show resolved Hide resolved
# Conflicts:
#	src/fides/api/ops/api/v1/endpoints/consent_request_endpoints.py
… this at the GraphTask level instead of the individual connector level.
…s where there is at most one node per Dataset. This is dependent on knowing how consent request details would be stored on saas configs. I am assuming they are stored at the dataset level somewhere, not the collection level.

- Instead of passing in a DatasetGraph with all the Datasets and all the Collections to run_consent_request, instead pass in a Datasets with a single stubbed Collection with no links to other collections.
- Update `run_consent_request` to not use traverse.traverse.  Instead, built a custom dictionary by hand that just has one node for every dataset, mapped to a tuple of the function to run (GraphTask.consent_request and the identity data).  The nodes have no dependencies.  Each node just takes in identity data.

There are more details to be worked out here.
@pattisdr
Copy link
Contributor Author

pattisdr commented Jan 4, 2023

Note: commit 4fc9ec7 experiments with how to create a simpler sort of graph for ConsentRequests. It creates a graph with at most one stubbed collection per dataset type, as I imagine consent requests will largely be requests against the larger service instead of to multiple specific collections.

I may be wrong on these assumptions so the easiest way forward here is to understand where we're going to define how to talk to a third party service. Once I know how/where that's defined, then I can build a graph with that information. The larger point here is that we're flexible on how we build the graph for consent requests. (It would be useful if we can still get them into the Dataset > Collection structure)

If you want to play around with a graph that has all the nodes instead of one node per dataset, try the previous commit.

@pattisdr pattisdr changed the title [POC][Draft]Executing Consent Requests via the Execution Layer [POC][Draft]Executing Consent Requests via the Execution Layer [#2099] Jan 4, 2023
@seanpreston
Copy link
Contributor

seanpreston commented Jan 5, 2023

Some initial comments:

Note postgres and mongo collections have skipped logs

I'm not sure where I stand with this — if the collection is skipped on one hand it's good to show that it was visited, but in a pure consent request these aren't relevant to the overall traversal and so if the user is expecting them to be skipped entirely, it could be more confusing to be showing them. We could always investigate display options to mitigate, e.g. greying them out or highlighting they were skipped more obviously from the view somehow.

Screenshot 2023-01-05 at 12 34 55

PATCH {{host}}/consent-request/{{consent_request_id}}/preferences now creates and queues a separate PrivacyRequest to execute those consent preferences

This will need to happen for specific consent prefs when a flag is set in the post data.

where we're going to define how to talk to a third party service

I think it's safe to say this will be inside the a saas connector yaml, whether that's consent only for one connector, or interlinked with an existing saas connector for now.

@pattisdr
Copy link
Contributor Author

pattisdr commented Jan 8, 2023

I'm not sure where I stand with this — if the collection is skipped on one hand it's good to show that it was visited, but in a pure consent request these aren't relevant to the overall traversal and so if the user is expecting them to be skipped entirely, it could be more confusing to be showing them.

@seanpreston totally agree, it is not a problem to make the graph more targeted. Adam and I have been talking about different ways to do this, what can immediately be eliminated from the graph.

EDIT: newer commits are only adding DatasetConfigs that are of saas type, but this is subject to change. we have lots of flexibility here.

@pattisdr
Copy link
Contributor Author

pattisdr commented Jan 8, 2023

PATCH {{host}}/consent-request/{{consent_request_id}}/preferences now creates and queues a separate PrivacyRequest to execute those consent preferences

This will need to happen for specific consent prefs when a flag is set in the post data.

@seanpreston currently this is checked at the GraphTask level - when we go to execute the task, we filter out which preferences are not executable.

EDIT: Latest commit is filtering out consent preferences that are not executable before the the privacy request is queued only being given executable preferences

…t creation.

Remove test that consent actions are not supported.
…e executable.

- Instead, store in the config.json file.
- Load whether consent options are executable directly from config.json file in the backend
- Additionally, only persist the consent preferences that are executable to the PrivacyRequest.
-  Rename privacyrequest.consent_preferences to privacyrequest.executable_consent_preferences
- Allow an optional Policy key to be passed in when setting consent preferences.  If no policy key is passed in, we use the default consent policy.
…ent preferences into its own method as I've made "set_consent_preferences" very long.
- Adjust how I'm creating a graph with one node to return a Dataset with one stubbed Collection regardless of whether Collections exist on the current node.
- Only add saas datasets to the graph for now, but this is subject to change.
@pattisdr pattisdr changed the base branch from main to consent-propagation January 9, 2023 19:14
@pattisdr pattisdr removed the do not merge Please don't merge yet, bad things will happen if you do label Jan 9, 2023
…ivacy request execution framework as a means to propagate consent preferences. Subject to change as certain details are tweaked.

- Break some code out into individual functions that are more self-contained and more easily tested.
- Add dataset annotations for new db fields.
- Break out consent model tests into their own file, separate from privacy request model tests.
@pattisdr pattisdr added the run unsafe ci checks Runs fides-related CI checks that require sensitive credentials label Jan 10, 2023
@pattisdr pattisdr marked this pull request as ready for review January 10, 2023 01:17
@pattisdr pattisdr changed the title [POC][Draft]Executing Consent Requests via the Execution Layer [#2099] Executing Consent Requests via the Privacy Request Execution Layer [#2099][#2146] Jan 10, 2023
Comment on lines 1286 to 1288
- name: executable_consent_preferences
data_categories: [ system.operations ]
data_qualifier: aggregated.anonymized.unlinked_pseudonymized.pseudonymized.identified
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here and above we're just adding new annotations for our fides dataset

Comment on lines 66 to +70
"url": "https://example.com/privacy#analytics",
"default": true,
"highlight": false,
"cookieKeys": []
"cookieKeys": [],
"executable": true
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Going ahead and adding executable:true here for all the consentOptions. The backend loads this file directly to have one source of truth.

Copy link
Contributor

@adamsachs adamsachs Jan 10, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • agreed offline that we should be setting executable:false as a default
  • agreed offline that in initial implementation, we'll only really support a single data use being executable:true and its preference that will drive whether an opt_in or opt_out request is executed for all connectors with consent requests enabled
  • agreed offline that the backend will not load this file, and instead the front-end (i.e. privacy center) will fully determine which data uses are executable

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks for summarizing Adam 👍

@@ -37,6 +37,7 @@
"email": "required",
"phone": "optional"
},
"policy_key": "default_consent_policy",
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not yet being used and should be passed to set_consent_preferences in #2122

Comment on lines +284 to +307

log.info("Creating: Default Consent Policy")
consent_policy = Policy.create_or_update(
db=db_session,
data={
"name": "Default Consent Policy",
"key": DEFAULT_CONSENT_POLICY,
"execution_timeframe": 45,
"client_id": client_id,
},
)

log.info("Creating: Default Consent Rule")
Rule.create_or_update(
db=db_session,
data={
"action_type": ActionType.consent.value,
"name": "Default Consent Rule",
"key": DEFAULT_CONSENT_RULE,
"policy_id": consent_policy.id,
"client_id": client_id,
},
)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here I just seed the database with a new default Policy with a Consent Rule attached. The primary purpose of this resource is to let the privacy request execution workflow know that this is a consent request! No RuleTargets are required for ConsentRules. Relevant configuration is pulled from the config.json file.

op.add_column(
"privacyrequest",
sa.Column(
"executable_consent_preferences",
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This optional list of consent preferences persists to the PrivacyRequest what consent options we're updating, and in particular, just the executable ones. That way the execution workflow doesn't have to worry about what is and isn't executable, it can use the whole list.

Comment on lines +102 to +116
def get_dataset_with_stubbed_collection(self) -> Dataset:
"""
Return a Dataset with a single mock Collection for use in building a graph
where we only want one node per dataset, instead of one node per collection. Note that
the expectation is that there would be no dependencies between nodes on the eventual graph, and the graph
doesn't require information stored at the collection-level.

The single Collection will be the resource that gets practically added to the graph, but the intent
is that this single node represents the overall Dataset, and will execute Dataset-level requests,
not Collection-level requests.
"""
dataset_graph: Dataset = self.get_graph()
stubbed_collection = Collection(name=dataset_graph.name, fields=[], after=set())

dataset_graph.collections = [stubbed_collection]
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For now, I am creating a Dataset with a single stubbed Collection, assuming all the information to create the request will exist at the Dataset level, and I don't need information on Collections.

This is what allows us to later build a graph with one node per dataset.

Comment on lines -100 to +102
if action_type in [ActionType.consent.value, ActionType.update.value]:
if action_type in [ActionType.update.value]:
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We already technically had consent rules, we're just allowing ourselves to use them!

Comment on lines +407 to +435
def run_consent_request(
self,
node: TraversalNode,
policy: Policy,
privacy_request: PrivacyRequest,
identity_data: Dict[str, Any],
executable_preferences: List[Consent],
) -> bool:
"""Execute a consent request. Return whether the consent request to the third party succeeded.

Executable_preferences have already been filtered to just consent preferences the customer has deemed executable.

Return True if 200 OK
"""
logger.info(
"Mocking consent request - actual logic should go here for saas connectors",
node.address.value,
)

logger.info(
"Demo only! Testing available consent params! identity_data: {}, executable_preferences: {}",
identity_data,
[
{"use": preference.data_use, "opt_in": preference.opt_in}
for preference in executable_preferences
],
)
return True

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently I am adding a run_consent_request method to the SaasConnector. This is where we'd make requests to third-party endpoints to propagate consent. Extra logging is added here to demonstrate what is available but should be removed before release.

Comment on lines 424 to 439
if policy.get_rules_for_action(
action_type=ActionType.consent
) and can_run_checkpoint(
request_checkpoint=CurrentStep.consent,
from_checkpoint=resume_step,
):

await run_consent_request(
privacy_request=privacy_request,
policy=policy,
graph=build_consent_dataset_graph(datasets),
connection_configs=connection_configs,
identity=identity_data,
session=session,
)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adding an additional step to the existing request runner, similar to how we have steps above for access and erasure.

Comment on lines +263 to +268
identity = Identity()
setattr(
identity,
provided_identity.field_name.value, # type:ignore[attr-defined]
provided_identity.encrypted_value["value"], # type:ignore[index]
) # Pull the information on the ProvidedIdentity for the ConsentRequest to pass along to create a PrivacyRequest
Copy link
Contributor Author

@pattisdr pattisdr Jan 10, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that a separate ProvidedIdentity is created for the PrivacyRequest for now. So there's one for the ConsentRequest too but I think this is okay to start.

@pattisdr pattisdr changed the title Executing Consent Requests via the Privacy Request Execution Layer [#2099][#2146] Executing Consent Requests via the Privacy Request Execution Layer [#2146] Jan 10, 2023
@pattisdr pattisdr self-assigned this Jan 10, 2023
…ilures are properly caught cause the privacy request status to update.
@pattisdr pattisdr removed the run unsafe ci checks Runs fides-related CI checks that require sensitive credentials label Jan 10, 2023
@adamsachs
Copy link
Contributor

i left some comments inline that mainly summarize what's been discussed offline as next steps. generally, though, this looks really good to me! i think once those follow up points have been addressed, i'll re-review, but i expect this should be just about ready to go.

really nice work finding a relatively quick/simple framework integration here that gives us a solid foundation for future consent propagation support.

…preferences to be more generic as things are evolving here.

- Adjust rule validation logic for when trying to create a new consent rule and adding to a Policy that already has a consent rule.
@pattisdr
Copy link
Contributor Author

Discussed with Adam, we're going to go ahead and merge this, as it is the foundation for the consent propagation work. (merging into consent-propagation, not main.) Further changes needed are reticketed.

@pattisdr pattisdr merged commit fd0c6cf into consent-propagation Jan 10, 2023
@pattisdr pattisdr deleted the consent-as-privacy-request-poc branch January 10, 2023 17:58
@pattisdr pattisdr mentioned this pull request Jan 17, 2023
14 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
3 participants