feat: Use string as a substitute for unregistered types during schema inference #3646
… inference Signed-off-by: phil.park <[email protected]>
/assign @zhilingc
/lgtm
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: felixwang9817, phil-park
The full list of commands accepted by this bot can be found here.
The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
# [0.32.0](v0.31.0...v0.32.0) (2023-07-17)

### Bug Fixes

* Added generic Feature store Creation for CLI ([#3618](#3618)) ([bf740d2](bf740d2))
* Broken non-root path with projects-list.json ([#3665](#3665)) ([4861af0](4861af0))
* Clean up snowflake to_spark_df() ([#3607](#3607)) ([e8e643e](e8e643e))
* Entityless fv breaks with `KeyError: __dummy` applying feature_store.plan() on python ([#3640](#3640)) ([ef4ef32](ef4ef32))
* Fix scan datasize to 0 for inference schema ([#3628](#3628)) ([c3dd74e](c3dd74e))
* Fix timestamp consistency in push api ([#3614](#3614)) ([9b227d7](9b227d7))
* For SQL registry, increase max data_source_name length to 255 ([#3630](#3630)) ([478caec](478caec))
* Implements connection pool for postgres online store ([#3633](#3633)) ([059509a](059509a))
* Manage redis pipe's context ([#3655](#3655)) ([48e0971](48e0971))
* Missing Catalog argument in athena connector ([#3661](#3661)) ([f6d3caf](f6d3caf))
* Optimize bytes processed when retrieving entity df schema to 0 ([#3680](#3680)) ([1c01035](1c01035))

### Features

* Add gunicorn for serve with multiprocess ([#3636](#3636)) ([4de7faf](4de7faf))
* Use string as a substitute for unregistered types during schema inference ([#3646](#3646)) ([c474ccd](c474ccd))
… inference (feast-dev#3646) Signed-off-by: phil.park <[email protected]> Signed-off-by: zerafachris PERSONAL <[email protected]>
What this PR does / why we need it:
Some column types exist in BigQuery but are not yet in Feast's type map (e.g. DATE). Most of these can be used as strings.
Currently, running schema inference on such a column results in an error saying that the type does not exist.
To make schema inference usable in more cases, it seems reasonable to infer these columns as string type.
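
The fallback described above can be sketched roughly as follows. This is a minimal, self-contained illustration and not Feast's actual code: the `ValueType` enum, the `BQ_TYPE_MAP` dictionary, and the functions `strict_lookup`, `lookup_with_string_fallback`, and `infer_schema` are all hypothetical names invented for this example, and the `KeyError` raised by the strict variant is only an assumed stand-in for the "no type" error mentioned above.

```python
from enum import Enum
from typing import Dict


class ValueType(Enum):
    """Hypothetical stand-in for a feature store's value type enum."""

    STRING = 1
    INT64 = 2
    DOUBLE = 3
    BOOL = 4
    UNIX_TIMESTAMP = 5


# Hypothetical, deliberately incomplete type map: DATE is not registered.
BQ_TYPE_MAP: Dict[str, ValueType] = {
    "STRING": ValueType.STRING,
    "INT64": ValueType.INT64,
    "FLOAT64": ValueType.DOUBLE,
    "BOOL": ValueType.BOOL,
    "TIMESTAMP": ValueType.UNIX_TIMESTAMP,
}


def strict_lookup(bq_type: str) -> ValueType:
    """Behaviour before the change (assumed): unregistered types raise an error."""
    return BQ_TYPE_MAP[bq_type.upper()]


def lookup_with_string_fallback(bq_type: str) -> ValueType:
    """Behaviour after the change: unregistered types are treated as strings."""
    return BQ_TYPE_MAP.get(bq_type.upper(), ValueType.STRING)


def infer_schema(columns: Dict[str, str]) -> Dict[str, ValueType]:
    """Map each column name to a value type, substituting STRING when unknown."""
    return {
        name: lookup_with_string_fallback(bq_type)
        for name, bq_type in columns.items()
    }


if __name__ == "__main__":
    # Example table with one registered and one unregistered column type.
    table_columns = {"user_id": "INT64", "event_date": "DATE"}
    for column, value_type in infer_schema(table_columns).items():
        print(column, value_type.name)
```

With the fallback, a table containing a `DATE` column no longer blocks inference for its other columns. The trade-off is that such a column is treated as plain text, so any date-specific handling would have to happen downstream.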
Which issue(s) this PR fixes:
Fixes #