Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ETL-414] Add script to generate test S3 event notification #44

Merged
merged 2 commits into from
Apr 28, 2023

Conversation

philerooski
Copy link
Contributor

This script can generate a S3 event notification test event for a single S3 object or for all S3 objects which exist directly under a specific S3 key prefix. Using this test event with our lambda simulates an object create/update event in an S3 bucket.

@philerooski philerooski requested a review from a team as a code owner April 27, 2023 21:13
default="main/2023-01-12T22--02--17Z_77fefff8-b0e2-4c1b-b0c5-405554c92368",
help=("Optional. Name of the Synapse dataset containing test data. "
"Default is `main/2023-01-12T22--02--17Z_77fefff8-b0e2-4c1b-b0c5-405554c92368` "))
parser.add_argument("--input-key-prefix",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just realize for recover data, if we reprocess, we can only reprocess everything at this point or if we somehow know the specific filename? Are the individual recover zipped files grouped by anything in particular other than the datetime it was created?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are the individual recover zipped files grouped by anything

That's a discussion we could have with digital health. They are the ones copying the data from the ingestion bucket to the input bucket, so we could come up with a partitioning structure.

In practice we would only ever submit the entire dataset. We need to remove deleted samples and those could appear anywhere.

Copy link
Contributor

@rxu17 rxu17 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me!

Copy link
Member

@thomasyu888 thomasyu888 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome - just some questions/comments prior to approval.

src/lambda_function/events/generate_test_event.py Outdated Show resolved Hide resolved
src/lambda_function/events/generate_test_event.py Outdated Show resolved Hide resolved
@philerooski philerooski temporarily deployed to develop April 28, 2023 18:02 — with GitHub Actions Inactive
@philerooski philerooski temporarily deployed to develop April 28, 2023 18:04 — with GitHub Actions Inactive
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you planning on adding a bootstrap cron trigger that does these event triggers?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No. Are you wondering how we will submit these events as part of our tests or as part of the production pipeline?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @philerooski. ^ Both

Copy link
Member

@thomasyu888 thomasyu888 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🪐 LGTM! I'm going to pre-approve, but just had some minor comments/questions.

@philerooski philerooski temporarily deployed to develop April 28, 2023 18:28 — with GitHub Actions Inactive
@philerooski philerooski temporarily deployed to develop April 28, 2023 18:29 — with GitHub Actions Inactive
@philerooski philerooski merged commit 4704845 into main Apr 28, 2023
@philerooski philerooski deleted the etl-414 branch April 28, 2023 18:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants