Registry store plugin #1812

Merged: 11 commits, Sep 14, 2021

DvirDukhan
Collaborator

What this PR does / why we need it:
Main feature: this PR adds support for third-party registry stores, so the registry can be saved in custom storage services other than S3, GCS, or local files.
Additional changes:

  1. Moved the aws/gcp/local RegistryStore implementations to their respective files (together with their providers).
  2. The Registry class is now initialized with RegistryConfig.

Which issue(s) this PR fixes:

None

Does this PR introduce a user-facing change?:

This PR adds a new optional field under the registry configuration, called `registry_store_provider`. This allows loading a third-party class that implements `RegistryStore`. The third-party class name must end with `RegistryStore`, for example `foo.registry_store.FooRegistryStore`.
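
As a rough illustration of the naming rule described above, a dotted provider path can be split into its module and class parts and checked for the required suffix. This is only a sketch: `split_provider_path` and `REQUIRED_SUFFIX` are hypothetical names introduced here, and the validation Feast actually performs may differ.

```python
from typing import Tuple

# Suffix that, per the PR description, a third-party class name must end with.
REQUIRED_SUFFIX = "RegistryStore"


def split_provider_path(path: str) -> Tuple[str, str]:
    """Split 'pkg.module.ClassName' into (module, class) and check the suffix.

    Hypothetical helper for illustration only; not part of Feast.
    """
    module_name, _, class_name = path.rpartition(".")
    if not module_name or not class_name:
        raise ValueError(
            f"Expected a dotted path like 'pkg.module.Class', got {path!r}"
        )
    if not class_name.endswith(REQUIRED_SUFFIX):
        raise ValueError(
            f"Class name {class_name!r} must end with {REQUIRED_SUFFIX!r}"
        )
    return module_name, class_name
```

With the example from the description, `split_provider_path("foo.registry_store.FooRegistryStore")` yields the module `foo.registry_store` and the class `FooRegistryStore`; a path whose class name lacks the suffix raises `ValueError`.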

@feast-ci-bot
Collaborator

Hi @DvirDukhan. Thanks for your PR.

I'm waiting for a feast-dev member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@codecov-commenter

codecov-commenter commented Aug 29, 2021

Codecov Report

Merging #1812 (b795399) into master (24d21ec) will increase coverage by 0.08%.
The diff coverage is 86.03%.

@@            Coverage Diff             @@
##           master    #1812      +/-   ##
==========================================
+ Coverage   84.87%   84.96%   +0.08%     
==========================================
  Files          91       92       +1     
  Lines        6961     7029      +68     
==========================================
+ Hits         5908     5972      +64     
- Misses       1053     1057       +4     
Flag Coverage Δ
integrationtests 84.89% <86.03%> (+0.09%) ⬆️
unittests 63.46% <55.85%> (+0.27%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
sdk/python/feast/repo_operations.py 44.21% <0.00%> (-0.37%) ⬇️
sdk/python/feast/registry_store.py 75.00% <75.00%> (ø)
sdk/python/feast/infra/aws.py 88.88% <80.35%> (-11.12%) ⬇️
sdk/python/feast/infra/gcp.py 92.78% <86.53%> (-7.22%) ⬇️
sdk/python/feast/registry.py 81.06% <88.00%> (-0.16%) ⬇️
sdk/python/feast/infra/local.py 92.50% <93.33%> (+0.34%) ⬆️
sdk/python/feast/feature_store.py 93.76% <100.00%> (ø)
sdk/python/feast/repo_config.py 93.02% <100.00%> (+0.05%) ⬆️
.../python/tests/integration/registration/test_cli.py 100.00% <100.00%> (ø)
...on/tests/integration/registration/test_registry.py 100.00% <100.00%> (ø)
... and 1 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 24d21ec...b795399. Read the comment docs.

@woop
Member

woop commented Aug 30, 2021

/ok-to-test

@achals
Member

achals commented Sep 7, 2021

I'm going to review this today.

@achals achals added the kind/feature New feature or request label Sep 7, 2021
f"Supported schemes are file and gs."
)
self.cached_registry_proto_ttl = cache_ttl
registry_store_provider = str(registry_store_provider)
Member

I wonder if it makes sense to pull out the RegistryStore abstract class into its own file for simplicity

@@ -52,6 +52,9 @@ class Config:
class RegistryConfig(FeastBaseModel):
""" Metadata Store Configuration. Configuration that relates to reading from and writing to the Feast registry."""

registry_store_provider: Optional[StrictStr]
Member

@woop woop Sep 7, 2021

Reusing the name provider here is quite confusing since we have a provider concept already. Is it possible for us to infer the backend based on the URI, or use something like type: to specify the class location? That would be consistent with OnlineStore, OfflineStore, and Provider.

@DvirDukhan
Collaborator Author

Rebased and addressed comments.

Comment on lines 88 to 102
if "." not in registry_store_type:
    if uri.scheme == "gs":
        from feast.infra.gcp import GCSRegistryStore

        self._registry_store = GCSRegistryStore(registry_path)
    elif uri.scheme == "s3":
        from feast.infra.aws import S3RegistryStore

        self._registry_store = S3RegistryStore(registry_path)
    elif uri.scheme == "file" or uri.scheme == "":
        from feast.infra.local import LocalRegistryStore

        self._registry_store = LocalRegistryStore(
            repo_path=repo_path, registry_path_string=registry_path
        )
Member

When would we hit this case? When users specified registry_store_type: S3RegistryStore, something like that?

I think that in that case, we probably want to have a dict of the "shortcut" types mapping to the fully qualified type (like in https://github.com/feast-dev/feast/blob/master/sdk/python/feast/repo_config.py#L22), and use the dynamic importer.get_class_from_type path.
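
The suggestion above (a dict of "shortcut" types plus a dynamic import) might look roughly like the following sketch. The three fully qualified paths are taken from the imports in the reviewed snippet; `SHORTCUT_TYPES`, `qualify`, and `load_class` are hypothetical names, and plain `importlib` stands in for Feast's own importer helper, whose exact signature is not shown in this thread.

```python
import importlib

# Short names mapped to the module paths that appear in the reviewed snippet.
SHORTCUT_TYPES = {
    "GCSRegistryStore": "feast.infra.gcp.GCSRegistryStore",
    "S3RegistryStore": "feast.infra.aws.S3RegistryStore",
    "LocalRegistryStore": "feast.infra.local.LocalRegistryStore",
}


def qualify(type_name: str) -> str:
    """Expand a short name to a dotted path; dotted names pass through unchanged."""
    if "." in type_name:
        return type_name
    try:
        return SHORTCUT_TYPES[type_name]
    except KeyError:
        raise ValueError(f"Unknown registry store type: {type_name!r}") from None


def load_class(type_name: str) -> type:
    """Import the module lazily and return the class object it defines."""
    module_name, _, class_name = qualify(type_name).rpartition(".")
    module = importlib.import_module(module_name)
    return getattr(module, class_name)
```

Because the import happens only inside `load_class`, optional dependencies such as cloud SDKs are pulled in only when the corresponding store is actually requested, which mirrors the function-level imports in the snippet under review.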

Collaborator Author

@DvirDukhan DvirDukhan Sep 12, 2021

@achals thanks for the review.
I made the modifications. However, because there is a single constructor call, self._registry_store = cls(registry_config, repo_path), the S3RegistryStore and GCSRegistryStore constructors are forced to have the signature:
def __init__(self, registry_config: RegistryConfig, repo_path: Path):
This is a bit awkward IMO, since repo_path is not used in those classes; it is only needed by LocalRegistryStore.
Let me know if you think there is a better approach to consolidate those constructors.

Member

I think it's better to have a single interface even if it means passing in an unused parameter in some of the implementations, so I'd be okay with that change.
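
A minimal sketch of the trade-off discussed in this thread: every store accepts the same `(registry_config, repo_path)` pair so that one call site can construct any of them, even though only the local store uses `repo_path`. The `RegistryConfig` dataclass and the two store classes below are simplified, hypothetical stand-ins, not the Feast implementations.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass
class RegistryConfig:
    """Hypothetical stand-in for Feast's RegistryConfig; only a path field."""
    path: str


class RemoteLikeStore:
    """Stand-in for an S3/GCS-style store: repo_path is accepted but unused."""

    def __init__(self, registry_config: RegistryConfig, repo_path: Path):
        self.uri = registry_config.path


class LocalLikeStore:
    """Stand-in for a local store: relative paths are resolved against repo_path."""

    def __init__(self, registry_config: RegistryConfig, repo_path: Path):
        registry_path = Path(registry_config.path)
        if registry_path.is_absolute():
            self.file_path = registry_path
        else:
            self.file_path = repo_path / registry_path


def build_store(cls, registry_config: RegistryConfig, repo_path: Path):
    """Single construction path, as in `cls(registry_config, repo_path)`."""
    return cls(registry_config, repo_path)
```

The unused parameter is the cost of the uniform signature; the benefit is that the caller needs no per-backend branching when instantiating a dynamically loaded class.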

Collaborator Author

thanks, @achals. The change is already in the latest modification on the PR.

Member

@achals achals left a comment

Sorry for the second back and forth @DvirDukhan , but I think that's the only other major concern I have. If we address that then i'm happy to lgtm

@abstractmethod
def teardown(self):
"""
Tear down all resources.
Member

All resources or the registry?

"""

@abstractmethod
def get_registry_proto(self):
Member

Should the return type be a part of this method?


class RegistryStore(ABC):
"""
RegistryStore: abstract base class implemented by specific backends (local file system, GCS)
Member

@woop woop Sep 11, 2021

This comment can be made tighter. We already know the class is RegistryStore, and an abstract base class, so that doesn't need to be repeated.

What about

A registry store is a storage backend for the Feast registry.

or if we prefer classes.

RegistryStore is a storage backend for the Registry class.
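
Taken together, the review comments on the abstract class (a tighter docstring, an explicit return type, and a more precise teardown description) could be reflected along these lines. This is a sketch under assumptions: `RegistryProto` here is a hypothetical stand-in for the registry protobuf message, `InMemoryRegistryStore` is an invented example backend, and the teardown wording assumes the answer to the reviewer's question is "the registry only".

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass, field
from typing import List


@dataclass
class RegistryProto:
    """Hypothetical stand-in for the registry protobuf message."""
    entries: List[str] = field(default_factory=list)


class RegistryStore(ABC):
    """A registry store is a storage backend for the Feast registry."""

    @abstractmethod
    def get_registry_proto(self) -> RegistryProto:
        """Return the registry held by this store."""

    @abstractmethod
    def teardown(self) -> None:
        """Tear down the registry."""


class InMemoryRegistryStore(RegistryStore):
    """Invented example backend that keeps the registry in process memory."""

    def __init__(self) -> None:
        self._proto = RegistryProto()

    def get_registry_proto(self) -> RegistryProto:
        return self._proto

    def teardown(self) -> None:
        # Discard the stored registry by replacing it with an empty one.
        self._proto = RegistryProto()
```

Annotating the return type on the abstract method lets type checkers flag implementations that return something else, and instantiating `RegistryStore` directly fails with `TypeError` because its abstract methods are not implemented.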

Member

@achals achals left a comment

/lgtm

@feast-ci-bot
Collaborator

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: achals, DvirDukhan

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@achals
Member

achals commented Sep 14, 2021

Thanks @DvirDukhan ! The change looks good to me.

@feast-ci-bot feast-ci-bot merged commit 77cdc0e into feast-dev:master Sep 14, 2021
@woop woop mentioned this pull request Jan 4, 2022

5 participants