This repository has been archived by the owner on Nov 30, 2022. It is now read-only.

Python overrides for SaaS config endpoints #815

Closed
galvana opened this issue Jul 6, 2022 · 0 comments · Fixed by #986
Labels
enhancement New feature or request

galvana commented Jul 6, 2022

Is your feature request related to a specific problem?

We currently denote the logic needed to access a SaaS endpoint by configuring a request inside of a SaaS config:

endpoints:
    - name: messages
      requests:
        read:
          method: GET
          path: /3.0/conversations/<conversation_id>/messages
          param_values:
            - name: conversation_id
              references:
                - dataset: mailchimp_connector_example
                  field: conversations.id
                  direction: from
          data_path: conversation_messages
          postprocessors:
            - strategy: filter
              configuration:
                field: from_email
                value:
                  identity: email
...

This allows us to abstract out a few common use cases, such as how to get the data needed to make the request, how to post-process an API response, and how to paginate to get more results. The problem arises when an endpoint requires logic that cannot be described by our SaaS config (for example, building a non-standard request or performing more complex post-processing or pagination). The SaaS config could be extended to account for these non-standard scenarios, but that would complicate the main SaaS request workflow and make it more difficult to support and troubleshoot.
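For orientation, here is a rough Python sketch of what the data_path and filter postprocessor settings in the config above express declaratively. The helper name unwrap_and_filter and the sample response body are hypothetical, and data_path is treated as a single top-level key for simplicity; this is not the framework's actual implementation.

```python
from typing import Any, Dict, List


def unwrap_and_filter(
    response_body: Dict[str, Any],
    data_path: str,
    field: str,
    value: Any,
) -> List[Dict[str, Any]]:
    """Hypothetical helper: unwrap `data_path`, then keep rows where `field` equals `value`."""
    # Simplification: data_path is assumed to be one top-level key.
    rows = response_body.get(data_path) or []
    return [row for row in rows if row.get(field) == value]


# Made-up response body shaped like the Mailchimp messages example.
body = {
    "conversation_messages": [
        {"id": 1, "from_email": "jane@example.com"},
        {"id": 2, "from_email": "other@example.com"},
    ]
}
matches = unwrap_and_filter(
    body, "conversation_messages", "from_email", "jane@example.com"
)
```

Anything that does not reduce to this kind of unwrap-then-filter step is the sort of logic the config format cannot easily describe.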

Describe the solution you'd like

Add functionality to be able to define a Python override function (i.e. a request_override) for any endpoint in a SaaS config, e.g.:

endpoints:
    - name: messages
      requests:
        read:
          request_override: mailchimp_messages_access
          param_values:
            - name: conversation_id
              references:
                - dataset: mailchimp_override_connector_example
                  field: conversations.id
                  direction: from

Note that the param_values field is still needed, since it is used for graph traversal and denotes which values are needed to make the request. The grouped_inputs field may also be defined and used. When a request_override field is specified, the config will be rejected as invalid if any request fields other than param_values or grouped_inputs are defined. When a request_override field is not specified, config validation works as it did before: the path and method fields are required, and all other fields are optional.
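The validation rule described above could be sketched roughly as follows. This is a minimal illustration operating on a plain dict; the function name validate_request_config and the error messages are hypothetical and do not reflect how the real config schema is validated.

```python
from typing import Any, Dict, List

# Fields that may accompany request_override, per the rule described above.
ALLOWED_WITH_OVERRIDE = {"request_override", "param_values", "grouped_inputs"}


def validate_request_config(request: Dict[str, Any]) -> List[str]:
    """Return a list of validation errors (empty when the config is valid)."""
    errors: List[str] = []
    if "request_override" in request:
        # With an override, only param_values and grouped_inputs are allowed.
        extra = sorted(set(request) - ALLOWED_WITH_OVERRIDE)
        if extra:
            errors.append(
                f"fields not allowed with request_override: {', '.join(extra)}"
            )
    else:
        # Without an override, path and method remain required.
        for required in ("path", "method"):
            if required not in request:
                errors.append(f"missing required field: {required}")
    return errors
```

Returning a list of errors rather than raising on the first problem is just a convenience for this sketch.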

The request_override field will point to a function that has been defined with the appropriate input parameters and returns the appropriate data type; this function will need to be registered with a central override function registry, using a function decorator. We will likely want to add contributor/developer documentation to provide a clear reference for how these overrides can be properly implemented and registered, as it's meant to be a true extension point.
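To illustrate the registration pattern in isolation, here is a minimal sketch of a decorator-based registry. The names OverrideRegistry, RequestType, and get_override are hypothetical stand-ins chosen for this example; the actual factory may be structured differently.

```python
from enum import Enum
from typing import Callable, Dict, List, Tuple


class RequestType(Enum):  # hypothetical stand-in for SaaSRequestType
    READ = "read"
    UPDATE = "update"


class OverrideRegistry:
    """Hypothetical registry keyed by (override id, request type)."""

    _overrides: Dict[Tuple[str, RequestType], Callable] = {}

    @classmethod
    def register(
        cls, name: str, request_types: List[RequestType]
    ) -> Callable[[Callable], Callable]:
        def decorator(func: Callable) -> Callable:
            for request_type in request_types:
                key = (name, request_type)
                if key in cls._overrides:
                    raise ValueError(
                        f"override '{name}' already registered for {request_type.value}"
                    )
                cls._overrides[key] = func
            return func  # the decorated function itself is left unchanged

        return decorator

    @classmethod
    def get_override(cls, name: str, request_type: RequestType) -> Callable:
        try:
            return cls._overrides[(name, request_type)]
        except KeyError:
            raise KeyError(
                f"no override '{name}' registered for {request_type.value}"
            ) from None


@OverrideRegistry.register("example_read_override", [RequestType.READ])
def example_read_override(input_data):
    return [{"echo": input_data}]
```

Keying on both the id and the request type lets one id be registered for several request types while still rejecting accidental duplicates.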

Using the above example, the mailchimp_messages_access function may look something like this:

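from typing import Any, Dict, List

import pydash
from requests import get

# TraversalNode, Policy, PrivacyRequest, Row, SaaSRequestOverrideFactory,
# SaaSRequestType, config, ConnectionException and ClientUnsuccessfulException
# are provided by the fidesops codebase; their imports are omitted here.

@SaaSRequestOverrideFactory.register(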
    "mailchimp_messages_access", [SaaSRequestType.READ]
)
def mailchimp_messages_access(
    node: TraversalNode,
    policy: Policy,
    privacy_request: PrivacyRequest,
    input_data: Dict[str, List[Any]],
    secrets: Dict[str, Any],
) -> List[Row]:

    # gather request params
    conversation_ids = input_data.get("conversation_id")

    # build and execute request for each input data value
    processed_data = []
    if conversation_ids:
        for conversation_id in conversation_ids:
            try:
                response = get(
                    url=f'https://{secrets["domain"]}/3.0/conversations/{conversation_id}/messages',
                    auth=(secrets["username"], secrets["api_key"]),
                )

            # Here we mimic the sort of error handling done in the core framework
            # by the AuthenticatedClient. Extenders can choose to handle errors within
            # their implementation as they wish.
            except Exception as exc:  # pylint: disable=W0703
                if config.dev_mode:  # pylint: disable=R1720
                    raise ConnectionException(
                        f"Operational Error connecting to Mailchimp API with error: {exc}"
                    )
                else:
                    raise ConnectionException(
                        "Operational Error connecting to Mailchimp API."
                    )
            if not response.ok:
                raise ClientUnsuccessfulException(status_code=response.status_code)

            # unwrap and post-process response
            response_data = pydash.get(response.json(), "conversation_messages")
            filtered_data = pydash.filter_(
                response_data,
                {"from_email": privacy_request.get_cached_identity_data().get("email")},
            )

            # build up final result
            processed_data.extend(filtered_data)

    return processed_data

Additional Information

A more detailed proof-of-concept can be found in #311. Note that this POC does not use the registration/factory decorator pattern that we've decided to use going forward.

Override Interface Specification (API)

For override functions of read requests, your method signature must have the following arguments and return type in order for the override to work as expected:

def my_read_override_function(
    node: TraversalNode,
    policy: Policy,
    privacy_request: PrivacyRequest,
    input_data: Dict[str, List[Any]],
    secrets: Dict[str, Any],
) -> List[Row]:

For override functions of update, delete, or data_protection_request requests, your method signature must have the following arguments and return type in order for the override to work as expected:

def my_update_override_function(
    param_values_per_row: List[Dict[str, Any]],
    policy: Policy,
    privacy_request: PrivacyRequest,
    secrets: Dict[str, Any],
) -> int:
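One way a registry could check these signatures at registration time is to compare parameter names using the standard library's inspect module. The sketch below is only an illustration; has_expected_params, READ_PARAMS, WRITE_PARAMS, and sample_update_override are hypothetical names, and it is an assumption that any such check exists in the framework.

```python
import inspect
from typing import Callable, Tuple

# Parameter names taken from the two signatures in the specification above.
READ_PARAMS: Tuple[str, ...] = (
    "node", "policy", "privacy_request", "input_data", "secrets",
)
WRITE_PARAMS: Tuple[str, ...] = (
    "param_values_per_row", "policy", "privacy_request", "secrets",
)


def has_expected_params(func: Callable, expected: Tuple[str, ...]) -> bool:
    """True when the function's parameter names match `expected`, in order."""
    return tuple(inspect.signature(func).parameters) == expected


def sample_update_override(param_values_per_row, policy, privacy_request, secrets):
    # Placeholder body: the specification only says such overrides return an int.
    return 0
```

Checking names only, rather than annotations, keeps the sketch short; a stricter check could also compare the annotated types and return type.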

To register your override function, you must include the following decorator, where the first argument specifies the "id" by which the function is referenced in your SaaS configs, and the second argument is a List of SaaSRequestType enum values:

@SaaSRequestOverrideFactory.register(
    "my_read_override_function", [SaaSRequestType.READ]
)

The SaaSRequestType enum values are:

class SaaSRequestType(Enum):
    """
    An `Enum` containing the different possible types of SaaS requests
    """

    READ = "read"
    UPDATE = "update"
    DATA_PROTECTION_REQUEST = "data_protection_request"
    DELETE = "delete"
@galvana galvana added the enhancement New feature or request label Jul 6, 2022
@Kelsey-Ethyca Kelsey-Ethyca changed the title [SaaS Connectors] Python overrides for SaaS config endpoints Python overrides for SaaS config endpoints Jul 19, 2022
@adamsachs adamsachs mentioned this issue Jul 28, 2022
@adamsachs adamsachs linked a pull request Jul 28, 2022 that will close this issue