Make datasets work with chunk wise data processing #1980

Closed
merelcht opened this issue Oct 26, 2022 · 1 comment
Labels
Component: IO Issue/PR addresses data loading/saving/versioning and validation, the DataCatalog and DataSets Issue: Feature Request New feature or improvement to existing feature

Comments

Member

merelcht commented Oct 26, 2022

Description

Kedro nodes currently accept only data that has already been loaded, not lazily loaded data (or at least that is how the standard approach works). We should investigate whether any changes are needed in the design of the datasets so that Kedro nodes can accept iterators as arguments and yield data, allowing it to be loaded, processed, and saved chunk-wise.
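To make the idea concrete, here is a minimal sketch of the load-process-save pattern in plain Python, using only the standard library. It is not Kedro's API: `load_chunks`, `double_values`, `save_chunks`, and `run_example` are hypothetical names made up for illustration, and the CSV layout (`id`, `value` columns) is an assumption. The point is only that the "node" receives an iterator and yields results, so no more than one chunk is held in memory at a time.

```python
import csv
import io
from itertools import islice
from typing import Dict, Iterable, Iterator, List, TextIO, Tuple

Chunk = List[Dict[str, str]]


def load_chunks(source: TextIO, chunk_size: int) -> Iterator[Chunk]:
    """Hypothetical lazy loader: yield lists of at most `chunk_size` rows."""
    reader = csv.DictReader(source)
    while True:
        chunk = list(islice(reader, chunk_size))
        if not chunk:
            return
        yield chunk


def double_values(chunks: Iterable[Chunk]) -> Iterator[Chunk]:
    """Hypothetical generator node: takes an iterator of chunks, yields processed chunks."""
    for chunk in chunks:
        yield [{"id": row["id"], "value": str(int(row["value"]) * 2)} for row in chunk]


def save_chunks(chunks: Iterable[Chunk], sink: TextIO) -> int:
    """Hypothetical appending saver: write each chunk as it arrives, return the chunk count."""
    writer = csv.DictWriter(sink, fieldnames=["id", "value"], lineterminator="\n")
    writer.writeheader()
    count = 0
    for chunk in chunks:
        writer.writerows(chunk)
        count += 1
    return count


def run_example() -> Tuple[int, str]:
    """Wire loader, node, and saver together over in-memory streams."""
    source = io.StringIO("id,value\n1,10\n2,20\n3,30\n")
    sink = io.StringIO()
    count = save_chunks(double_values(load_chunks(source, chunk_size=2)), sink)
    return count, sink.getvalue()
```

Because every stage is a generator or consumes one, nothing is read from the source until the saver starts pulling chunks. The design question raised in this issue is roughly whether datasets can play the roles of `load_chunks` and `save_chunks` here.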

@merelcht merelcht added Issue: Feature Request New feature or improvement to existing feature Component: IO Issue/PR addresses data loading/saving/versioning and validation, the DataCatalog and DataSets labels Oct 26, 2022
Member Author

merelcht commented Jul 5, 2023

Completed in #2161

@merelcht merelcht closed this as completed Jul 5, 2023
@merelcht merelcht removed this from the Redesign the API for IO (catalog) milestone Feb 2, 2024