fix: Remove unwanted excessive splitting of gcs path, so expected gcs parquet paths are returned from BigQueryRetrievalJob.to_remote_storage() #3730

crispin-ki
Contributor

What this PR does / why we need it:
See #3712

Which issue(s) this PR fixes:

Fixes #3712
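
The diff itself is not reproduced in this conversation, so the following is only a minimal sketch of the kind of problem the title describes, not the actual Feast change. It assumes the job holds a `gs://bucket/prefix` export location and needs exactly two parts, a bucket and a blob prefix. The helper names `split_gcs_path` and `blob_uri` are hypothetical and exist only for illustration.

```python
from typing import Tuple

GCS_SCHEME = "gs://"


def split_gcs_path(gcs_path: str) -> Tuple[str, str]:
    """Split 'gs://bucket/some/prefix' into ('bucket', 'some/prefix').

    Hypothetical helper: only the first '/' after the scheme separates the
    bucket from the prefix, so slashes inside the prefix are preserved.
    """
    if not gcs_path.startswith(GCS_SCHEME):
        raise ValueError(f"Expected a path starting with {GCS_SCHEME!r}, got {gcs_path!r}")
    bucket, _, prefix = gcs_path[len(GCS_SCHEME):].partition("/")
    return bucket, prefix


def blob_uri(bucket: str, blob_name: str) -> str:
    """Rebuild a full 'gs://' URI from a bucket name and a blob name."""
    return f"{GCS_SCHEME}{bucket}/{blob_name}"


# Splitting on every '/' yields more than two parts and loses the prefix structure:
over_split = "my-bucket/exports/run-1".split("/")  # ['my-bucket', 'exports', 'run-1']

# Splitting once keeps the prefix whole, so full parquet URIs can be rebuilt from it:
bucket, prefix = split_gcs_path("gs://my-bucket/exports/run-1")
example_uri = blob_uri(bucket, prefix + "/000000000000.parquet")
```

Splitting only once keeps a nested prefix intact, which is what a caller needs in order to list the exported blobs under that prefix and return their full `gs://` paths.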

@crispin-ki crispin-ki changed the title Remove unwanted excessive splitting of gcs path fix: Remove unwanted excessive splitting of gcs path, so expected gcs parquet paths are returned from BigQueryRetrievalJob.to_remote_storage() Aug 15, 2023
@crispin-ki
Contributor Author

Tagging @achals - as I see you did the initial PR to implement this in #2918

@crispin-ki crispin-ki force-pushed the fix-to-remote-storage-return-value-for-big-query-retrieval-job branch from ffc19b4 to 38b2cba Compare August 15, 2023 12:20
@achals
Member

achals commented Aug 15, 2023

/lgtm
/ok-to-test

@crispin-ki
Contributor Author

crispin-ki commented Aug 15, 2023

@achals thanks for kicking off the Python unit tests. Some are failing, but not because of the changes here. I've opened a new issue with a potential fix for the failing tests: #3731. Let me know whether I should include that fix in this PR, or whether you'd rather have a separate PR for it.

Update: I've gone ahead and opened a PR that fixes the tests: #3734
Once that PR is approved and merged, I'll rebase this branch onto master, and the tests should then pass in this PR too.

@crispin-ki crispin-ki force-pushed the fix-to-remote-storage-return-value-for-big-query-retrieval-job branch 3 times, most recently from 6598497 to 435bfdb Compare August 16, 2023 20:58
@crispin-ki
Contributor Author

crispin-ki commented Aug 16, 2023

Update: the PR fixing the tests (#3734) has been merged and I've rebased this branch onto master.

The unit tests now pass for Python 3.8 on ubuntu latest, but not for 3.9 or 3.10 on ubuntu latest, or for 3.8 on macOS. I'm not sure why they now fail at the Install Dependencies stage, as they were passing before. Note that the 3.10 ubuntu latest unit tests did pass in the workflow runs of a previous commit. I force-pushed the same change under a different commit hash a couple of times to retrigger the workflows and check whether the failures are consistent, and they appear not to be.

@adchia
Collaborator

adchia commented Sep 5, 2023

Can you rebase to master and try this again? Fixed the CI issues (but I can't push back to your branch)

@crispin-ki crispin-ki force-pushed the fix-to-remote-storage-return-value-for-big-query-retrieval-job branch from 435bfdb to d82e07c Compare September 5, 2023 11:14
@adchia adchia merged commit f2c5988 into feast-dev:master Sep 5, 2023
14 of 15 checks passed
adchia pushed a commit that referenced this pull request Sep 7, 2023
# [0.34.0](v0.33.0...v0.34.0) (2023-09-07)

### Bug Fixes

* Add NUMERIC to bq_to_feast type map ([#3719](#3719)) ([6474b4b](6474b4b))
* Fix python unit tests ([#3734](#3734)) ([e81684d](e81684d))
* Handle unknown postgres source types gracefully ([#3634](#3634)) ([d7041f4](d7041f4))
* Pin protobuf version to avoid seg fault on some machines ([028cc20](028cc20))
* Remove unwanted excessive splitting of gcs path, so expected gcs parquet paths are returned from BigQueryRetrievalJob.to_remote_storage() ([#3730](#3730)) ([f2c5988](f2c5988))
* Run store.plan() only when need it. ([#3708](#3708)) ([7bc7c47](7bc7c47))
* Saved datasets no longer break CLI registry-dump command ([#3717](#3717)) ([f28ccc2](f28ccc2))
* Update py3.8 ci requirements for cython 3.0 release ([#3735](#3735)) ([1695c13](1695c13))

### Features

* Enhance customization of Trino connections when using Trino-based Offline Stores ([#3699](#3699)) ([ed7535e](ed7535e))
* Implement gRPC server to ingest streaming features ([#3687](#3687)) ([a3fcd1f](a3fcd1f))
james-crabtree-sp pushed a commit to sailpoint/feast that referenced this pull request Sep 14, 2023
… parquet paths are returned from BigQueryRetrievalJob.to_remote_storage() (feast-dev#3730)

Remove unwanted excessive splitting of gcs path

Signed-off-by: Crispin Logan <[email protected]>
zseta pushed a commit to zseta/feast that referenced this pull request Feb 7, 2024
… parquet paths are returned from BigQueryRetrievalJob.to_remote_storage() (feast-dev#3730)

Remove unwanted excessive splitting of gcs path

Signed-off-by: Crispin Logan <[email protected]>
Signed-off-by: Attila Toth <[email protected]>