Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Create ETL task to map columns #36

Merged
merged 6 commits into from
Jun 21, 2024

Conversation

Yun-Wu
Copy link
Contributor

@Yun-Wu Yun-Wu commented Jun 21, 2024

  • Add a column mapping task
  • Add the task to registry
  • Add unittest

#34

@cisaacstern
Copy link
Collaborator

I know you didn't ask for review yet, but just wanted to say this is looking awesome so far!!

Installing pre-commit and running pre-commit run --all-files should resolve the linting issues 🙏

@Yun-Wu
Copy link
Contributor Author

Yun-Wu commented Jun 21, 2024

@cisaacstern thanks for the prompt review haha! Can you take a look again?

Copy link
Collaborator

@cisaacstern cisaacstern left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

Commented on a few small nits.

The only thing from the linked issue I don't see explicitly covered here is:

"The function should provide a warning when critical columns (e.g., 'geometry') are being dropped."

Not sure if you'd like to address that now or in a follow-on. Approving this to indicate that either is fine for me.

In general, I really like the pattern of a full module of tests in for a single task and all its edge cases, that focuses on the behavior of the function in "pure python" mode (without any of the distributed behaviors interfering with the clarity of the logic). I think we should adopt this pattern for all of the task testing. The testing I have for the existing tasks is too opaque!

Copy link
Collaborator

@cisaacstern cisaacstern left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome, thanks Yun!!

One last typo noted below. Otherwise, if you're happy here, I'm good to merge!

ecoscope_workflows/tasks/transformation/_column_mapping.py Outdated Show resolved Hide resolved
@cisaacstern cisaacstern merged commit 2c9b03b into wildlife-dynamics:main Jun 21, 2024
4 checks passed
@cisaacstern cisaacstern mentioned this pull request Jun 21, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants