Use dataset factories to register default datasets #2668
Comments
Cc'ing @marrrcin for comments, as I think overwriting default datasets is done across the collection of plugins you created.
I've already had a call with @merelcht, maybe she can post some notes here.
Current behaviour
Updating this behaviour to use dataset factories instead for
Issues with
How does this affect the plugins? Our major plugins (AzureML, VertexAI and SageMaker) rely on the fact that we can swap the default dataset implementation transparently to the user (and the project). This makes the plugins' UX really good for users, because they still don't have to think about datasets that are not explicitly mentioned in the catalog - they have the same experience as with
@marrrcin This does not affect the user experience at all afaik. This is just an internal implementation detail: we currently set the default dataset in the runner, but we want to leverage dataset factories to set it.
This does not override the default one that the user has set. How do the plugins set the default dataset at the moment? Do they use the
Yes, we rely on
@marrrcin In that case, it would be replaced by something like
Moving it to We need to make sure
When / where would the
Completed in #3332
Description
After the introduction of dataset factories in #2423, the creation of default datasets, which is currently handled by runners, should be handled in the DataCatalog. Once the feature is merged, users can overwrite default datasets using the following pattern -
This ticket is to change the behaviour of the existing runners so that they use dataset factories to register default datasets.
Instead of adding the catalog entry for default datasets in the catalog -
kedro/kedro/runner/runner.py
Lines 85 to 86 in 64b7960
The runner should call a method from the catalog to register a default pattern
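
To illustrate the idea, here is a minimal sketch in plain Python. It is not Kedro's implementation: `ToyCatalog`, `add_pattern`, `register_default_pattern` and `resolve` are hypothetical names invented for this example, and the dataset types are plain strings. It assumes that explicit catalog entries win over patterns, that more specific patterns win over a catch-all, and that a default registered by the runner must not override a catch-all pattern the user has already set.

```python
import re
from typing import Any, Dict, Optional

# Matches "{placeholder}" segments inside a dataset name pattern.
_PLACEHOLDER = re.compile(r"\{(\w+)\}")


def _to_regex(pattern: str) -> "re.Pattern[str]":
    """Turn 'data_{name}' into a regex with one named group per placeholder."""
    parts = []
    pos = 0
    for m in _PLACEHOLDER.finditer(pattern):
        parts.append(re.escape(pattern[pos:m.start()]))
        parts.append(f"(?P<{m.group(1)}>.+)")
        pos = m.end()
    parts.append(re.escape(pattern[pos:]))
    return re.compile("".join(parts))


def _specificity(pattern: str) -> int:
    """Number of literal characters; a pure catch-all such as '{default}' scores 0."""
    return len(_PLACEHOLDER.sub("", pattern))


class ToyCatalog:
    """Hypothetical stand-in for a data catalog (not Kedro's DataCatalog API)."""

    def __init__(self, datasets: Optional[Dict[str, Dict[str, Any]]] = None) -> None:
        self._datasets = dict(datasets or {})
        self._patterns: Dict[str, Dict[str, Any]] = {}

    def add_pattern(self, pattern: str, config: Dict[str, Any]) -> None:
        """Register a user-defined dataset factory pattern."""
        self._patterns[pattern] = config

    def register_default_pattern(self, pattern: str, config: Dict[str, Any]) -> bool:
        """Called by the runner: add a catch-all unless the user already has one."""
        if any(_specificity(p) == 0 for p in self._patterns):
            return False
        self._patterns[pattern] = config
        return True

    def resolve(self, name: str) -> Dict[str, Any]:
        """Return the config for a dataset name: explicit entry first, then patterns."""
        if name in self._datasets:
            return self._datasets[name]
        # Try the most specific patterns first, so the catch-all is the last resort.
        for pattern in sorted(self._patterns, key=_specificity, reverse=True):
            match = _to_regex(pattern).fullmatch(name)
            if match:
                groups = match.groupdict()
                return {
                    key: value.format(**groups) if isinstance(value, str) else value
                    for key, value in self._patterns[pattern].items()
                }
        raise KeyError(f"No dataset or pattern matches '{name}'")


# Usage: the runner registers the default instead of adding catalog entries itself.
catalog = ToyCatalog({"cars": {"type": "CSVDataset"}})
catalog.add_pattern("{name}_csv", {"type": "CSVDataset", "filepath": "data/{name}.csv"})
catalog.register_default_pattern("{default}", {"type": "MemoryDataset"})
intermediate_config = catalog.resolve("intermediate")  # falls back to the catch-all
```

With this shape, a plugin that wants a different default could register its own catch-all pattern before the runner does, and the runner's call would then leave it untouched. Whether the real catalog should behave this way is exactly what the comments on this issue discuss.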
To dos -