[ETL-397] Create s3-event-config lambda and dependencies to add s3 notification configuration #50

Merged
merged 7 commits into from
May 16, 2023
1 change: 1 addition & 0 deletions Pipfile
@@ -16,3 +16,4 @@ synapseclient = "~=2.7"
 pandas = "<1.5"
 moto = "~=4.1"
 datacompy = "~=0.8"
+docker = "~=6.1"
Contributor

Why do we need this?

Contributor Author

It's a dependency of using `mock_lambda` from moto, but for some reason installing moto doesn't automatically install docker as a dependency.

7 changes: 7 additions & 0 deletions config/develop/namespaced/s3-event-config-lambda-role.yaml
@@ -0,0 +1,7 @@
template:
  path: s3-event-config-lambda-role.yaml
stack_name: "{{ stack_group_config.namespace }}-s3-event-config-lambda-role"
parameters:
  S3SourceBucketName: {{ stack_group_config.input_bucket_name }}
stack_tags:
  {{ stack_group_config.default_stack_tags }}
15 changes: 15 additions & 0 deletions config/develop/namespaced/s3-event-config-lambda.yaml
@@ -0,0 +1,15 @@
template:
  type: sam
  path: src/lambda_function/s3_event_config/template.yaml
  artifact_bucket_name: {{ stack_group_config.cloudformation_artifact_bucket_name }}
  artifact_prefix: '{{ stack_group_config.namespace }}/src/lambda'
dependencies:
  - develop/namespaced/s3-event-config-lambda-role.yaml
  - develop/namespaced/s3-to-glue-lambda.yaml
stack_name: '{{ stack_group_config.namespace }}-lambda-S3EventConfig'
stack_tags: {{ stack_group_config.default_stack_tags }}
parameters:
  Namespace: {{ stack_group_config.namespace }}
  S3ToGlueFunctionArn: !stack_output_external "{{ stack_group_config.namespace }}-lambda-S3ToGlue::S3ToGlueFunctionArn"
  S3EventConfigRoleArn: !stack_output_external "{{ stack_group_config.namespace }}-s3-event-config-lambda-role::RoleArn"
  S3SourceBucketName: {{ stack_group_config.input_bucket_name }}
3 changes: 2 additions & 1 deletion config/develop/namespaced/s3-to-glue-lambda.yaml
@@ -1,6 +1,6 @@
 template:
   type: sam
-  path: src/lambda_function/template.yaml
+  path: src/lambda_function/s3_to_glue/template.yaml
   artifact_bucket_name: {{ stack_group_config.cloudformation_artifact_bucket_name }}
   artifact_prefix: '{{ stack_group_config.namespace }}/src/lambda'
 dependencies:
@@ -9,5 +9,6 @@ dependencies:
 stack_name: '{{ stack_group_config.namespace }}-lambda-S3ToGlue'
 stack_tags: {{ stack_group_config.default_stack_tags }}
 parameters:
+  S3SourceBucketName: {{ stack_group_config.input_bucket_name }}
   S3ToGlueRoleArn: !stack_output_external s3-to-glue-lambda-role::RoleArn
   PrimaryWorkflowName: !stack_output_external "{{ stack_group_config.namespace }}-glue-workflow::WorkflowName"
7 changes: 7 additions & 0 deletions config/prod/namespaced/s3-event-config-lambda-role.yaml
@@ -0,0 +1,7 @@
template:
  path: s3-event-config-lambda-role.yaml
stack_name: "{{ stack_group_config.namespace }}-s3-event-config-lambda-role"
parameters:
  S3SourceBucketName: {{ stack_group_config.input_bucket_name }}
stack_tags:
  {{ stack_group_config.default_stack_tags }}
15 changes: 15 additions & 0 deletions config/prod/namespaced/s3-event-config-lambda.yaml
@@ -0,0 +1,15 @@
template:
  type: sam
  path: src/lambda_function/s3_event_config/template.yaml
  artifact_bucket_name: {{ stack_group_config.cloudformation_artifact_bucket_name }}
  artifact_prefix: '{{ stack_group_config.namespace }}/src/lambda'
dependencies:
  - prod/namespaced/s3-event-config-lambda-role.yaml
  - prod/namespaced/s3-to-glue-lambda.yaml
stack_name: '{{ stack_group_config.namespace }}-lambda-S3EventConfig'
stack_tags: {{ stack_group_config.default_stack_tags }}
parameters:
  Namespace: {{ stack_group_config.namespace }}
  S3ToGlueFunctionArn: !stack_output_external "{{ stack_group_config.namespace }}-lambda-S3ToGlue::S3ToGlueFunctionArn"
  S3EventConfigRoleArn: !stack_output_external "{{ stack_group_config.namespace }}-s3-event-config-lambda-role::RoleArn"
  S3SourceBucketName: {{ stack_group_config.input_bucket_name }}
3 changes: 2 additions & 1 deletion config/prod/namespaced/s3-to-glue-lambda.yaml
@@ -1,6 +1,6 @@
 template:
   type: sam
-  path: src/lambda_function/template.yaml
+  path: src/lambda_function/s3_to_glue/template.yaml
   artifact_bucket_name: {{ stack_group_config.cloudformation_artifact_bucket_name }}
   artifact_prefix: '{{ stack_group_config.namespace }}/src/lambda'
 dependencies:
@@ -9,5 +9,6 @@ dependencies:
 stack_name: '{{ stack_group_config.namespace }}-lambda-S3ToGlue'
 stack_tags: {{ stack_group_config.default_stack_tags }}
 parameters:
+  S3SourceBucketName: {{ stack_group_config.input_bucket_name }}
   S3ToGlueRoleArn: !stack_output_external s3-to-glue-lambda-role::RoleArn
   PrimaryWorkflowName: !stack_output_external "{{ stack_group_config.namespace }}-glue-workflow::WorkflowName"
42 changes: 42 additions & 0 deletions src/lambda_function/s3_event_config/README.md
@@ -0,0 +1,42 @@
# s3_event_config lambda

The s3_event_config lambda is triggered by a GitHub Action during deployment or manually through the AWS console.

It then puts an S3 event notification configuration on
the input data bucket. This allows the input data bucket to
trigger the S3 to Glue lambda with S3 new-object notifications whenever new objects are added
to it, which eventually starts the S3-to-JSON workflow.

## Event format

Events that trigger the s3-event-config lambda
should have the following form:

```
{
"RequestType": "Create"
}
```

Where the allowed RequestType values are:
- "Create"
- "Update"
- "Delete"
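
As a rough sketch of how such an event could be built and sent from Python: the helpers below validate the `RequestType` and hand the JSON payload to a Lambda client. The names `build_event` and `trigger_event_config` are hypothetical and not part of this PR, and `lambda_client` is assumed to be a `boto3.client("lambda")` or any object with a compatible `invoke` method.

```python
import json

# Mirrors REQUEST_TYPE_VALS in app.py
ALLOWED_REQUEST_TYPES = ("Create", "Update", "Delete")


def build_event(request_type: str) -> dict:
    """Return an event dict in the shape the lambda expects."""
    if request_type not in ALLOWED_REQUEST_TYPES:
        raise ValueError(
            f"RequestType must be one of {ALLOWED_REQUEST_TYPES}, got {request_type!r}"
        )
    return {"RequestType": request_type}


def trigger_event_config(lambda_client, function_name: str, request_type: str):
    """Invoke the lambda synchronously with a validated event.

    `lambda_client` is assumed to be boto3.client("lambda"); `function_name`
    is whatever name or ARN the deployed s3_event_config function has.
    """
    payload = json.dumps(build_event(request_type)).encode("utf-8")
    return lambda_client.invoke(
        FunctionName=function_name,
        InvocationType="RequestResponse",
        Payload=payload,
    )
```

Validating on the caller's side is optional, since the handler itself rejects unknown values, but it avoids a round trip for an obviously malformed request.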

## Launching Lambda stack in AWS

There are two main stacks involved in the s3_event_config lambda. They are the
`s3_event_config lambda role` stack and the `s3_event_config lambda` stack.

Note that they depend on the `s3 to glue` lambda stack.

### Sceptre

#### Launching in development

Run the following command to create the lambda stack in your AWS account. Note that this will
also create the lambda event config IAM role stack, as well as any other dependencies of this stack:

```shell script
sceptre --var namespace='test-namespace' launch develop/namespaced/s3-event-config-lambda.yaml
```
Empty file.
87 changes: 87 additions & 0 deletions src/lambda_function/s3_event_config/app.py
@@ -0,0 +1,87 @@
"""
This Lambda app responds to an external trigger (usually a GitHub Action or the AWS console) and
puts an S3 event notification configuration for the S3 to Glue lambda in the
input data S3 bucket set by the environment variable `S3_SOURCE_BUCKET_NAME`.

This Lambda app also has the option of deleting the notification configuration
from an S3 bucket.
"""
import os
import json
import logging

import boto3

logger = logging.getLogger()
logger.setLevel(logging.INFO)

REQUEST_TYPE_VALS = ["Delete", "Create", "Update"]


def lambda_handler(event, context):
    s3 = boto3.client("s3")
    logger.info(f"Received event: {json.dumps(event, indent=2)}")
    if event["RequestType"] == "Delete":
        logger.info(f'Request Type:{event["RequestType"]}')
        delete_notification(s3, bucket=os.environ["S3_SOURCE_BUCKET_NAME"])
        logger.info("Sending response to custom resource after Delete")
    elif event["RequestType"] in ["Update", "Create"]:
        logger.info(f'Request Type: {event["RequestType"]}')
        add_notification(
            s3,
            lambda_arn=os.environ["S3_TO_GLUE_FUNCTION_ARN"],
            bucket=os.environ["S3_SOURCE_BUCKET_NAME"],
            bucket_key_prefix=os.environ["BUCKET_KEY_PREFIX"],
        )
        logger.info("Sending response to custom resource")
    else:
        err_msg = f"The 'RequestType' key should have one of the following values: {REQUEST_TYPE_VALS}"
        raise KeyError(err_msg)


def add_notification(
    s3_client: boto3.client,
    lambda_arn: str,
    bucket: str,
    bucket_key_prefix: str,
):
    """Adds the S3 notification configuration to an existing bucket

    Args:
        s3_client (boto3.client) : s3 client to use for s3 event config
        lambda_arn (str): Arn of the S3 to Glue lambda function to be notified
        bucket (str): bucket name of the s3 bucket to add the config to
        bucket_key_prefix (str): bucket key prefix for where to look for s3 object notifications
    """
    s3_client.put_bucket_notification_configuration(
        Bucket=bucket,
        NotificationConfiguration={
            "LambdaFunctionConfigurations": [
                {
                    "LambdaFunctionArn": lambda_arn,
                    "Events": ["s3:ObjectCreated:*"],
                    "Filter": {
                        "Key": {
                            "FilterRules": [
                                {"Name": "prefix", "Value": bucket_key_prefix}
                            ]
                        }
                    },
                }
            ]
        },
    )
    logger.info("Put request completed....")


def delete_notification(s3_client: boto3.client, bucket: str):
    """Deletes the S3 notification configuration from an existing bucket

    Args:
        s3_client (boto3.client) : s3 client to use for s3 event config
        bucket (str): bucket name of the s3 bucket to delete the config in
    """
    s3_client.put_bucket_notification_configuration(
        Bucket=bucket, NotificationConfiguration={}
    )
    logger.info("Delete request completed....")
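
Since both helpers take the S3 client as a parameter, one way to exercise them without AWS or Docker is to pass in a stand-in client. The sketch below re-declares simplified copies of the two helpers (logging omitted) so that it runs on its own; `RecordingS3Client`, the example ARN, and the bucket name are hypothetical and not part of this PR or of moto.

```python
class RecordingS3Client:
    """Hypothetical test double that records every put call it receives."""

    def __init__(self):
        self.calls = []

    def put_bucket_notification_configuration(self, Bucket, NotificationConfiguration):
        self.calls.append((Bucket, NotificationConfiguration))


def add_notification(s3_client, lambda_arn, bucket, bucket_key_prefix):
    # Simplified copy of the helper in app.py
    s3_client.put_bucket_notification_configuration(
        Bucket=bucket,
        NotificationConfiguration={
            "LambdaFunctionConfigurations": [
                {
                    "LambdaFunctionArn": lambda_arn,
                    "Events": ["s3:ObjectCreated:*"],
                    "Filter": {
                        "Key": {
                            "FilterRules": [
                                {"Name": "prefix", "Value": bucket_key_prefix}
                            ]
                        }
                    },
                }
            ]
        },
    )


def delete_notification(s3_client, bucket):
    # Simplified copy of the helper in app.py: an empty configuration
    # clears every notification on the bucket
    s3_client.put_bucket_notification_configuration(
        Bucket=bucket, NotificationConfiguration={}
    )


client = RecordingS3Client()
add_notification(
    client,
    lambda_arn="arn:aws:lambda:us-east-1:123456789012:function:example",
    bucket="example-bucket",
    bucket_key_prefix="test-namespace",
)
delete_notification(client, bucket="example-bucket")
```

Note that `put_bucket_notification_configuration` replaces the bucket's whole notification configuration, so both helpers assume this lambda is the only thing managing notifications on the source bucket.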
53 changes: 53 additions & 0 deletions src/lambda_function/s3_event_config/template.yaml
@@ -0,0 +1,53 @@
AWSTemplateFormatVersion: '2010-09-09'

Transform: AWS::Serverless-2016-10-31

Description: >
  SAM Template for s3-event-config lambda function

Parameters:
  Namespace:
    Type: String
    Description: >-
      The namespace string used for the bucket key prefix

  S3ToGlueFunctionArn:
    Type: String
    Description: Arn for the S3 to Glue Lambda Function

  S3EventConfigRoleArn:
    Type: String
    Description: Arn for the S3 Event Config Lambda Role

  S3SourceBucketName:
    Type: String
    Description: Name of the S3 bucket where source data are stored.

  LambdaPythonVersion:
    Type: String
    Description: Python version to use for this lambda function
    Default: "3.9"


Resources:
  S3EventConfigFunction:
    Type: AWS::Serverless::Function
    Properties:
      PackageType: Zip
      CodeUri: ./
      Handler: app.lambda_handler
      Runtime: !Sub "python${LambdaPythonVersion}"
      Role: !Ref S3EventConfigRoleArn
      Timeout: 30
      Environment:
        Variables:
          S3_SOURCE_BUCKET_NAME: !Ref S3SourceBucketName
          S3_TO_GLUE_FUNCTION_ARN: !Ref S3ToGlueFunctionArn
          BUCKET_KEY_PREFIX: !Ref Namespace

Outputs:
  S3EventConfigFunctionArn:
    Description: Arn of the S3EventConfigFunction function
    Value: !GetAtt S3EventConfigFunction.Arn
    Export:
      Name: !Sub "${AWS::Region}-${AWS::StackName}-S3EventConfigFunctionArn"
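
Because `BUCKET_KEY_PREFIX` is set to the bare namespace, it may help to recall that an S3 `prefix` filter rule is a plain string-prefix match on the object key. The sketch below imitates that matching locally; `matches_prefix_rule` and the sample keys are illustrative assumptions, not part of this PR.

```python
def matches_prefix_rule(object_key: str, prefix: str) -> bool:
    """Imitate an S3 'prefix' filter rule: a plain starts-with check."""
    return object_key.startswith(prefix)


# Hypothetical object keys under three different namespaces
keys = [
    "main/2023-05-16/data.zip",
    "main-test/2023-05-16/data.zip",
    "other/2023-05-16/data.zip",
]

# Bare namespace as the prefix: also selects the sibling namespace "main-test"
bare_matches = [k for k in keys if matches_prefix_rule(k, "main")]

# Namespace with a trailing slash: selects only keys under "main/"
slash_matches = [k for k in keys if matches_prefix_rule(k, "main/")]

print(bare_matches)
print(slash_matches)
```

If namespaces can share a leading substring, a trailing `/` on the prefix would be one way to keep them apart; whether that matters depends on how namespaces are named in this project.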
@@ -27,18 +27,18 @@ Use the SAM CLI to build and test your lambda locally.
 Build your application with the `sam build` command.

 ```bash
-cd src/lambda_function
+cd src/lambda_function/s3_to_glue/
 sam build
 ```

 ## Test events

 ### Creating/modifying test events

-The file `single-record.json` in `src/lambda_function/events` contains a
+The file `single-record.json` in `src/lambda_function/s3_to_glue/events` contains a
 dummy event for an S3 event trigger. You can generate your own test events
 for single or multiple records with the script at
-`src/lambda_function/events/generate_test_event.py`.
+`src/lambda_function/s3_to_glue/events/generate_test_event.py`.

 ### Invoking test events

@@ -52,7 +52,7 @@ if you are testing a stack deployed as part of a feature branch.
 To invoke the lambda with the test event:

 ```bash
-cd src/lambda_function
+cd src/lambda_function/s3_to_glue
 sam local invoke -e events/single-record.json --env-vars test-env-vars.json
 ```

@@ -7,6 +7,10 @@ Description: >

 Parameters:

+  S3SourceBucketName:
+    Type: String
+    Description: Name of the S3 bucket where source data are stored.
+
   S3ToGlueRoleArn:
     Type: String
     Description: Arn for the S3 to Glue Lambda Role
@@ -27,7 +31,7 @@ Resources:
     Type: AWS::Serverless::Function
     Properties:
       PackageType: Zip
-      CodeUri: ./s3_to_glue
+      CodeUri: ./
       Handler: app.lambda_handler
       Runtime: !Sub "python${LambdaPythonVersion}"
       Role: !Ref S3ToGlueRoleArn
@@ -36,6 +40,16 @@ Resources:
         Variables:
           PRIMARY_WORKFLOW_NAME: !Ref PrimaryWorkflowName

+  LambdaInvokePermission:
+    Type: AWS::Lambda::Permission
+    Properties:
+      FunctionName: !GetAtt S3ToGlueFunction.Arn
+      Action: lambda:InvokeFunction
+      Principal: s3.amazonaws.com
+      SourceAccount: !Ref 'AWS::AccountId'
+      SourceArn: !Sub 'arn:aws:s3:::${S3SourceBucketName}'
+
+
 Outputs:
   S3ToGlueFunctionArn:
     Description: Arn of the S3ToGlueFunction function
57 changes: 57 additions & 0 deletions templates/s3-event-config-lambda-role.yaml
@@ -0,0 +1,57 @@
AWSTemplateFormatVersion: '2010-09-09'

Transform: AWS::Serverless-2016-10-31

Description: >
  An IAM Role for the S3 Event Config lambda allowing one to put
  s3 event notification configuration in the source bucket

Parameters:
  S3SourceBucketName:
    Type: String
    Description: Name of the S3 bucket where source data are stored.


Resources:
  S3EventConfigRole:
    Type: AWS::IAM::Role
    Properties:
      AssumeRolePolicyDocument:
        Version: '2012-10-17'
        Statement:
          - Effect: Allow
            Principal:
              Service:
                - lambda.amazonaws.com
            Action:
              - sts:AssumeRole
      ManagedPolicyArns:
        - arn:aws:iam::aws:policy/service-role/AWSLambdaBasicExecutionRole
      Policies:
        - PolicyName: PutS3NotificationConfiguration
          PolicyDocument:
            Version: '2012-10-17'
            Statement:
              - Effect: Allow
                Action:
                  - 's3:GetBucketNotification'
                  - 's3:PutBucketNotification'
                Resource:
                  - !Sub arn:aws:s3:::${S3SourceBucketName}
              - Effect: Allow
                Action:
                  - 'logs:CreateLogGroup'
                  - 'logs:CreateLogStream'
                  - 'logs:PutLogEvents'
                Resource: 'arn:aws:logs:*:*:*'

Outputs:
  RoleName:
    Value: !Ref S3EventConfigRole
    Export:
      Name: !Sub '${AWS::Region}-${AWS::StackName}-RoleName'

  RoleArn:
    Value: !GetAtt S3EventConfigRole.Arn
    Export:
      Name: !Sub '${AWS::Region}-${AWS::StackName}-RoleArn'