fix: Fix broken proto conversion methods for data sources #2603

Merged
merged 15 commits into from
Apr 24, 2022

felixwang9817 (Collaborator) commented Apr 23, 2022

What this PR does / why we need it: The Redshift and Snowflake data sources did not correctly implement conversion to and from protos. This PR fixes that logic and adds tests. It also cleans up several data source classes by removing unnecessary properties.

Which issue(s) this PR fixes:

Fixes #2581

codecov-commenter commented Apr 23, 2022

Codecov Report

Merging #2603 (0a0cc58) into master (0ca6297) will increase coverage by 0.11%.
The diff coverage is 91.78%.

@@            Coverage Diff             @@
##           master    #2603      +/-   ##
==========================================
+ Coverage   82.28%   82.40%   +0.11%     
==========================================
  Files         155      155              
  Lines       12854    12788      -66     
==========================================
- Hits        10577    10538      -39     
+ Misses       2277     2250      -27     
| Flag | Coverage Δ |
|---|---|
| integrationtests | 72.04% <70.58%> (-0.56%) ⬇️ |
| unittests | 60.45% <79.45%> (+0.33%) ⬆️ |

Flags with carried forward coverage won't be shown.

| Impacted Files | Coverage Δ |
|---|---|
| ..._stores/contrib/postgres_offline_store/postgres.py | 33.33% <0.00%> (ø) |
| ...ffline_stores/contrib/spark_offline_store/spark.py | 37.01% <0.00%> (ø) |
| ...ffline_stores/contrib/trino_offline_store/trino.py | 41.61% <0.00%> (ø) |
| ...stores/contrib/trino_offline_store/trino_source.py | 56.17% <0.00%> (ø) |
| ...python/feast/infra/offline_stores/offline_store.py | 83.33% <ø> (ø) |
| sdk/python/feast/infra/passthrough_provider.py | 100.00% <ø> (ø) |
| ...ion/feature_repos/universal/data_source_creator.py | 80.95% <ø> (ø) |
| ...tion/feature_repos/universal/data_sources/trino.py | 43.75% <ø> (ø) |
| ...n/tests/integration/registration/test_inference.py | 100.00% <ø> (ø) |
| sdk/python/tests/unit/test_feature_views.py | 100.00% <ø> (ø) |

... and 25 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 0ca6297...0a0cc58.

achals (Member) left a comment

/lgtm

@@ -183,6 +179,7 @@ def to_proto(self) -> DataSourceProto:
            A DataSourceProto object.
        """
        data_source_proto = DataSourceProto(
            name=self.name,
Member

Was this the main bug leading to duplicate data sources?

Collaborator Author

Yes, this was the root cause; see the comment on #2581.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch.
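One plausible way a missing `name` in `to_proto` can produce duplicate registry entries is sketched below. This is a toy model, not Feast's registry code: `SourceProto`, `Source`, `Registry`, and the `include_name` switch are all hypothetical names invented for the illustration, and the name-based matching rule is an assumption.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class SourceProto:
    """Hypothetical stand-in for a serialized data source message."""
    name: str = ""
    table: str = ""


@dataclass
class Source:
    """Hypothetical data source whose to_proto can omit the name."""
    name: str
    table: str

    def to_proto(self, include_name: bool = True) -> SourceProto:
        proto = SourceProto(table=self.table)
        if include_name:
            # The field the fix adds; without it the proto keeps name == "".
            proto.name = self.name
        return proto


@dataclass
class Registry:
    """Toy registry that replaces a stored entry only when names match."""
    protos: List[SourceProto] = field(default_factory=list)

    def apply(self, source: Source, include_name: bool = True) -> None:
        new_proto = source.to_proto(include_name=include_name)
        for i, existing in enumerate(self.protos):
            if existing.name == source.name:
                self.protos[i] = new_proto  # update in place
                return
        self.protos.append(new_proto)  # no match found: treated as new


def count_after_two_applies(include_name: bool) -> int:
    """Apply the same source twice and report how many entries are stored."""
    registry = Registry()
    source = Source(name="driver_stats", table="drivers")
    registry.apply(source, include_name=include_name)
    registry.apply(source, include_name=include_name)
    return len(registry.protos)
```

In this model, applying the same source twice leaves one entry when the name is serialized and two entries when it is dropped, because the stored proto's empty name never matches the source being applied.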

assert DataSource.from_proto(kafka_source.to_proto()) == kafka_source
assert DataSource.from_proto(kinesis_source.to_proto()) == kinesis_source
assert DataSource.from_proto(push_source.to_proto()) == push_source
assert DataSource.from_proto(request_source.to_proto()) == request_source
Member

Nice
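The assertions above follow a round-trip pattern: serialize through the subclass, deserialize through the base class, and compare with the original. A minimal self-contained sketch of that pattern follows. It assumes a dict in place of a real proto message, and `ToySource`, `ToyKafkaSource`, `ToyPushSource`, and `round_trips` are hypothetical names, not Feast classes.

```python
from dataclasses import asdict, dataclass
from typing import Any, Dict, Type

# Maps a type tag to the concrete class, so the base class can dispatch.
_SOURCE_TYPES: Dict[str, Type["ToySource"]] = {}


@dataclass
class ToySource:
    """Hypothetical base class; a dict stands in for the proto message."""
    name: str

    def __init_subclass__(cls, **kwargs: Any) -> None:
        super().__init_subclass__(**kwargs)
        _SOURCE_TYPES[cls.__name__] = cls

    def to_proto(self) -> Dict[str, Any]:
        # Serialize every field plus a type tag used for dispatch.
        return {"type": type(self).__name__, **asdict(self)}

    @staticmethod
    def from_proto(proto: Dict[str, Any]) -> "ToySource":
        fields = dict(proto)  # copy so the caller's dict is not mutated
        cls = _SOURCE_TYPES[fields.pop("type")]
        return cls(**fields)


@dataclass
class ToyKafkaSource(ToySource):
    topic: str = ""


@dataclass
class ToyPushSource(ToySource):
    batch_table: str = ""


def round_trips(source: ToySource) -> bool:
    """True when deserializing via the base class reproduces the source."""
    return ToySource.from_proto(source.to_proto()) == source
```

Going through the base-class `from_proto` rather than the subclass's own method also exercises the dispatch step, so a subclass that serializes correctly but is routed to the wrong type would fail the comparison.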

feast-ci-bot (Collaborator)

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: achals, felixwang9817


Needs approval from an approver in each of these files:
  • OWNERS [achals,felixwang9817]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

achals (Member) commented Apr 24, 2022

We should patch release this soon-ish.

achals (Member) commented Apr 24, 2022

/lgtm

@feast-ci-bot feast-ci-bot merged commit 00ed65a into feast-dev:master Apr 24, 2022
kevjumba pushed a commit that referenced this pull request Apr 28, 2022
* Fix Snowflake proto conversion and add test

Signed-off-by: Felix Wang <[email protected]>

* Add proto conversion test for FileSource

Signed-off-by: Felix Wang <[email protected]>

* Fix Redshift proto conversion and add test

Signed-off-by: Felix Wang <[email protected]>

* Add proto conversion test for BigQuerySource

Signed-off-by: Felix Wang <[email protected]>

* Fix tests to use DataSource.from_proto

Signed-off-by: Felix Wang <[email protected]>

* Add proto conversion test for KafkaSource

Signed-off-by: Felix Wang <[email protected]>

* Add proto conversion test for KinesisSource

Signed-off-by: Felix Wang <[email protected]>

* Add proto conversion test for PushSource

Signed-off-by: Felix Wang <[email protected]>

* Add proto conversion test for PushSource

Signed-off-by: Felix Wang <[email protected]>

* Add name and other fixes

Signed-off-by: Felix Wang <[email protected]>

* Fix proto conversion tests

Signed-off-by: Felix Wang <[email protected]>

* Add tags to test

Signed-off-by: Felix Wang <[email protected]>

* Fix BigQuerySource bug

Signed-off-by: Felix Wang <[email protected]>

* Fix bug in RedshiftSource and TrinoSource

Signed-off-by: Felix Wang <[email protected]>

* Remove references to event_timestamp_column

Signed-off-by: Felix Wang <[email protected]>
kevjumba pushed a commit that referenced this pull request Apr 28, 2022
## [0.20.2](v0.20.1...v0.20.2) (2022-04-28)

### Bug Fixes

* Feature with timestamp type is incorrectly interpreted by Go FS ([#2588](#2588)) ([3ec943a](3ec943a))
* Fix AWS bootstrap template ([#2604](#2604)) ([6df5a49](6df5a49))
* Fix broken proto conversion methods for data sources ([#2603](#2603)) ([c391216](c391216))
* Remove ci extra from the feature transformation server dockerfile ([#2618](#2618)) ([a7437fa](a7437fa))
* Update field api to add tag parameter corresponding to labels in Feature. ([#2610](#2610)) ([40962fc](40962fc))
* Use timestamp type when converting unixtimestamp feature type to arrow ([#2593](#2593)) ([a1c3ee3](a1c3ee3))
achals pushed a commit that referenced this pull request May 13, 2022
# [0.21.0](v0.20.0...v0.21.0) (2022-05-13)

### Bug Fixes

* Addresses ZeroDivisionError when materializing file source with same timestamps ([#2551](#2551)) ([1e398d9](1e398d9))
* Asynchronously refresh registry for the feast ui command ([#2672](#2672)) ([1b09ca2](1b09ca2))
* Build platform specific python packages with ci-build-wheel ([#2555](#2555)) ([b10a4cf](b10a4cf))
* Delete data sources from registry when using the diffing logic ([#2669](#2669)) ([fc00ca8](fc00ca8))
* Enforce kw args featureservice ([#2575](#2575)) ([160d7b7](160d7b7))
* Enforce kw args in datasources ([#2567](#2567)) ([0b7ec53](0b7ec53))
* Feature logging to Redshift is broken ([#2655](#2655)) ([479cd51](479cd51))
* Feature service to templates ([#2649](#2649)) ([1e02066](1e02066))
* Feature with timestamp type is incorrectly interpreted by Go FS ([#2588](#2588)) ([e3d9588](e3d9588))
* Fix `__hash__` methods ([#2556](#2556)) ([ebb7dfe](ebb7dfe))
* Fix AWS bootstrap template ([#2604](#2604)) ([c94a69c](c94a69c))
* Fix broken proto conversion methods for data sources ([#2603](#2603)) ([00ed65a](00ed65a))
* Fix case where on demand feature view tab is broken if no custom tabs are passed. ([#2682](#2682)) ([01d3568](01d3568))
* Fix DynamoDB fetches when there are entities that are not found ([#2573](#2573)) ([7076fe0](7076fe0))
* Fix Feast UI parser to work with new APIs ([#2668](#2668)) ([8d76751](8d76751))
* Fix java server after odfv update ([#2602](#2602)) ([0ca6297](0ca6297))
* Fix materialization with ttl=0 bug ([#2666](#2666)) ([ab78702](ab78702))
* Fix push sources and add docs / tests pushing via the python feature server ([#2561](#2561)) ([e8e418e](e8e418e))
* Fixed data mapping errors for Snowflake ([#2558](#2558)) ([53c2ce2](53c2ce2))
* Forcing ODFV udfs to be `__main__` module and fixing false positive duplicate data source warning ([#2677](#2677)) ([2ce33cd](2ce33cd))
* Include the ui/build directory, and remove package data ([#2681](#2681)) ([0384f5f](0384f5f))
* Infer features for feature services when they depend on feature views without schemas ([#2653](#2653)) ([87c194c](87c194c))
* Pin dependencies to nearest major version ([#2647](#2647)) ([bb72b7c](bb72b7c))
* Pin pip<22.1 to get around breaking change in pip==22.1 ([#2678](#2678)) ([d3e01bc](d3e01bc))
* Punt deprecation warnings and clean up some warnings. ([#2670](#2670)) ([f775d2e](f775d2e))
* Reject undefined features when using `get_historical_features` or `get_online_features` ([#2665](#2665)) ([36849fb](36849fb))
* Remove ci extra from the feature transformation server dockerfile ([#2618](#2618)) ([25613b4](25613b4))
* Remove incorrect call to logging.basicConfig ([#2676](#2676)) ([8cbf51c](8cbf51c))
* Small typo in CLI ([#2578](#2578)) ([f372981](f372981))
* Switch from `join_key` to `join_keys` in tests and docs ([#2580](#2580)) ([d66c931](d66c931))
* Teardown trino container correctly after tests ([#2562](#2562)) ([72f1558](72f1558))
* Update build_go_protos to use a consistent python path ([#2550](#2550)) ([f136f8c](f136f8c))
* Update data source timestamp inference error message to make sense ([#2636](#2636)) ([3eaf6b7](3eaf6b7))
* Update field api to add tag parameter corresponding to labels in Feature. ([#2610](#2610)) ([689d20b](689d20b))
* Update java integration tests and add more logging ([#2637](#2637)) ([10e23b4](10e23b4))
* Update on demand feature view api ([#2587](#2587)) ([38cd7f9](38cd7f9))
* Update RedisCluster to use redis-py official implementation ([#2554](#2554)) ([ce5606f](ce5606f))
* Use cwd when getting module path ([#2577](#2577)) ([b550e59](b550e59))
* Use ParquetDataset for Schema Inference ([#2686](#2686)) ([4f85e3e](4f85e3e))
* Use timestamp type when converting unixtimestamp feature type to arrow ([#2593](#2593)) ([c439611](c439611))

### Features

* Add hbase online store support in feast ([#2590](#2590)) ([c9eda79](c9eda79))
* Adding SSL options for Postgres ([#2644](#2644)) ([0e809c2](0e809c2))
* Allow Feast UI to be spun up with CLI command: feast ui ([#2667](#2667)) ([44ca9f5](44ca9f5))
* Allow to pass secrets and environment variables to transformation service ([#2632](#2632)) ([ffa33ad](ffa33ad))
* CLI command 'feast serve' should start go-based server if flag is enabled ([#2617](#2617)) ([f3ff812](f3ff812))
* Create stream and batch feature view abstractions ([#2559](#2559)) ([d1f76e5](d1f76e5))
* Postgres supported as Registry, Online store, and Offline store ([#2401](#2401)) ([ed2f979](ed2f979))
* Support entity fields in feature view `schema` parameter by dropping them ([#2568](#2568)) ([c8fcc35](c8fcc35))
* Write logged features to an offline store (Python API) ([#2574](#2574)) ([134dc5f](134dc5f))
* Write logged features to Offline Store (Go - Python integration) ([#2621](#2621)) ([ccad832](ccad832))

### Reverts

* Revert "chore: Deprecate value type (#2611)" (#2643) ([4fbdfb1](4fbdfb1)), closes [#2611](#2611) [#2643](#2643)
Development

Successfully merging this pull request may close these issues.

Duplicate data source in feature registry
4 participants