Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: Update SparkSource to have proper comparable that inspects SparkOptions #3819

Closed
wants to merge 3 commits into from

Conversation

thechopkins
Copy link

What this PR does / why we need it:
This updates the SparkSource data type to be properly comparable. The previous form defaults to using the __eq__ implementation from the DataSource class. This misses updates to additional properties on the SparkSource class (i.e. anything that feeds into the SparkOptions class).

This PR resolves this and adds tests to confirm the updated functionality. This means that FeatureStore.plan() and FeatureStore.apply() will properly pick up changes to SparkSources correctly.

@thechopkins thechopkins changed the title fix: [Update SparkSource to have proper comparable that inspects SparkOptions] fix: pdate SparkSource to have proper comparable that inspects SparkOptions Oct 25, 2023
@thechopkins thechopkins changed the title fix: pdate SparkSource to have proper comparable that inspects SparkOptions fix: Update SparkSource to have proper comparable that inspects SparkOptions Oct 25, 2023
Signed-off-by: Calvin Hopkins <[email protected]>
Signed-off-by: Calvin Hopkins <[email protected]>
# Note: Python requires redefining hash in child classes that override __eq__
def __hash__(self):

return super().__hash__()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this also include self.spark_options?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It doesn't necessarily need to as the two pieces of logic are used for different things. Take a look at the redshift_source.py example for how it is implemented elsewhere.

@franciscojavierarceo
Copy link
Member

looks like linter is failing here. @thechopkins can you update?

@jeremyary
Copy link
Collaborator

closing out as this was addressed via the mentioned PR 4028. Thanks!

@jeremyary jeremyary closed this Mar 29, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants