feat: add matlab dataset into kedro-datasets #435
Thanks a lot for your pull request @samuel-lee-sj ! We'll need a couple of iterations for this to be ready, but this is a great start 💪🏽
You can also run `make lint plugin=kedro-datasets` (from the `kedro-plugins` directory) to automatically lint the code.
When I try, I get an error saying `No hook with id 'ruff-' in stage 'manual'`. Below is the full error message:
@samuel-lee-sj it's because of a typo: you wrote Lines 16 to 17 in f59e930
Hi @samuel-lee-sj, would you like to continue this PR, or do you need some help finishing it?
Hello @merelcht. I was under the impression that @ankatiyar was going to take over. Let me know what you think!
@samuel-lee-sj Could you fix the sign-off on the commits by following the instructions here? Then I can help with the rest! 😄
Hello @ankatiyar. I think I did it. Let me know if there are any more problems.
@samuel-lee-sj, it seems the DCO test is still not passing. Would you mind trying the rebase again?
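For context on what a sign-off check looks for, here is a rough sketch, not the actual DCO bot's implementation. It assumes the rule is simply that each commit message carries a `Signed-off-by:` trailer whose email matches the commit author's email. The function name `has_matching_sign_off` and the example addresses are hypothetical.

```python
import re

# Matches a trailer line such as "Signed-off-by: Jane Doe <jane@example.com>".
SIGN_OFF = re.compile(
    r"^Signed-off-by:\s*(?P<name>.+?)\s*<(?P<email>[^<>]+)>\s*$",
    re.MULTILINE,
)


def has_matching_sign_off(message: str, author_email: str) -> bool:
    """Return True if any Signed-off-by trailer uses the author's email.

    Assumption: comparing emails case-insensitively is enough for this sketch;
    a real check may also compare names or accept the committer's identity.
    """
    wanted = author_email.lower()
    return any(
        match.group("email").lower() == wanted
        for match in SIGN_OFF.finditer(message)
    )


if __name__ == "__main__":
    msg = "Fix typo\n\nSigned-off-by: Jane Doe <jane@example.com>"
    print(has_matching_sign_off(msg, "jane@example.com"))
```

Under this assumption, a rebase that rewrites other people's commits can fail the check for them if the sign-off added to the rewritten commit does not match its author.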
…` from main repository to kedro-datasets (#253) Signed-off-by: Peter Bludau <[email protected]> Co-authored-by: Merel Theisen <[email protected]> Signed-off-by: samuelleeshemen <[email protected]>
Signed-off-by: Simon Brugman <[email protected]> Signed-off-by: samuelleeshemen <[email protected]>
#341) Signed-off-by: Alistair McKelvie <[email protected]> Signed-off-by: samuelleeshemen <[email protected]>
fix RTD Signed-off-by: Nok <[email protected]> Signed-off-by: samuelleeshemen <[email protected]>
* Pin pip version temporarily Signed-off-by: Ankita Katiyar <[email protected]> * Hive support failures Signed-off-by: Ankita Katiyar <[email protected]> * Also pin pip on lint Signed-off-by: Ankita Katiyar <[email protected]> * Temporary ignore databricks spark tests Signed-off-by: Ankita Katiyar <[email protected]> --------- Signed-off-by: Ankita Katiyar <[email protected]> Signed-off-by: samuelleeshemen <[email protected]>
* perf(datasets): delay `Engine` creation until need Signed-off-by: Deepyaman Datta <[email protected]> * chore: don't check coverage in TYPE_CHECKING block Signed-off-by: Deepyaman Datta <[email protected]> * perf(datasets): don't connect in `__init__` method Signed-off-by: Deepyaman Datta <[email protected]> * test(datasets): fix tests to touch `create_engine` Signed-off-by: Deepyaman Datta <[email protected]> * perf(datasets): don't connect in `__init__` method Signed-off-by: Deepyaman Datta <[email protected]> * style(datasets): exec Ruff on sql_dataset.py 🐶 Signed-off-by: Deepyaman Datta <[email protected]> * Undo changes to `engines` values type (for Sphinx) Signed-off-by: Deepyaman Datta <[email protected]> * Patch Sphinx build by removing `Engine` references * perf(datasets): don't connect in `__init__` method Signed-off-by: Deepyaman Datta <[email protected]> * chore(datasets): don't require coverage for import * chore(datasets): del unused `TYPE_CHECKING` import * docs(datasets): document lazy connection in README * perf(datasets): remove create in `SQLQueryDataset` Signed-off-by: Deepyaman Datta <[email protected]> * refactor(datasets): do not return the created conn Signed-off-by: Deepyaman Datta <[email protected]> --------- Signed-off-by: Deepyaman Datta <[email protected]> Signed-off-by: samuelleeshemen <[email protected]>
* Remove references to Python 3.7 Signed-off-by: lrcouto <[email protected]> * Revert kedro-dataset changes Signed-off-by: lrcouto <[email protected]> * Revert kedro-dataset changes Signed-off-by: lrcouto <[email protected]> * Add information to release docs Signed-off-by: lrcouto <[email protected]> --------- Signed-off-by: lrcouto <[email protected]> Signed-off-by: samuelleeshemen <[email protected]>
* feat(datasets) add PolarsDataset to support Polars's Lazy API Signed-off-by: Matthias Roels <[email protected]> * Fix(datasets): rename PolarsDataSet to PolarsDataSet Add PolarsDataSet as an alias for PolarsDataset with deprecation warning. Signed-off-by: Matthias Roels <[email protected]> * Fix(datasets): apply ruff linting rules Signed-off-by: Matthias Roels <[email protected]> * Fix(datasets): Correct pattern matching when Raising exceptions Corrected PolarsDataSet to PolarsDataset in the pattern to match in test_load_missing_file Signed-off-by: Matthias Roels <[email protected]> * fix(datasets): clean up PolarsDataset related code Remove reference to PolarsDataSet as this is not required for new dataset implementations. Signed-off-by: Matthias Roels <[email protected]> * feat(datasets): Rename Polars Datasets to better describe their intent Signed-off-by: Matthias Roels <[email protected]> * feat(datasets): clean up LazyPolarsDataset Signed-off-by: Matthias Roels <[email protected]> * fix(datasets): increase test coverage for PolarsDataset classes Signed-off-by: Matthias Roels <[email protected]> * docs(datasets): add renamed Polars datasets to docs Signed-off-by: Matthias Roels <[email protected]> * docs(datasets): Add new polars datasets to release notes Signed-off-by: Matthias Roels <[email protected]> * fix(datasets): load_args not properly passed to LazyPolarsDataset.load Signed-off-by: Matthias Roels <[email protected]> * docs(datasets): fix spelling error in release notes Co-authored-by: Merel Theisen <[email protected]> Signed-off-by: Matthias Roels <[email protected]> --------- Signed-off-by: Matthias Roels <[email protected]> Signed-off-by: Matthias Roels <[email protected]> Signed-off-by: Merel Theisen <[email protected]> Co-authored-by: Matthias Roels <[email protected]> Co-authored-by: Merel Theisen <[email protected]> Signed-off-by: samuelleeshemen <[email protected]>
Signed-off-by: Merel Theisen <[email protected]> Signed-off-by: samuelleeshemen <[email protected]>
* bump version Signed-off-by: Ankita Katiyar <[email protected]> * Update release notes Signed-off-by: Ankita Katiyar <[email protected]> --------- Signed-off-by: Ankita Katiyar <[email protected]> Signed-off-by: samuelleeshemen <[email protected]>
Bump version Signed-off-by: Ankita Katiyar <[email protected]> Signed-off-by: Ankita Katiyar <[email protected]> Signed-off-by: samuelleeshemen <[email protected]>
Bump version Signed-off-by: Ankita Katiyar <[email protected]> Signed-off-by: Ankita Katiyar <[email protected]> Signed-off-by: samuelleeshemen <[email protected]>
Signed-off-by: Deepyaman Datta <[email protected]> Signed-off-by: samuelleeshemen <[email protected]>
Fix missing jQuery Signed-off-by: Juan Luis Cano Rodríguez <[email protected]> Signed-off-by: samuelleeshemen <[email protected]>
#413) * Fix Lazy Polars dataset to use the new-style base class Fix gh-412 Signed-off-by: Juan Luis Cano Rodríguez <[email protected]> * Update release notes Signed-off-by: Ankita Katiyar <[email protected]> * Revert "Update release notes" This reverts commit 92ceea6. --------- Signed-off-by: Juan Luis Cano Rodríguez <[email protected]> Signed-off-by: Sajid Alam <[email protected]> Signed-off-by: Ankita Katiyar <[email protected]> Co-authored-by: Sajid Alam <[email protected]> Signed-off-by: samuelleeshemen <[email protected]>
Signed-off-by: Deepyaman Datta <[email protected]> Signed-off-by: samuelleeshemen <[email protected]>
* chore(datasets): lazily load `partitions` classes Signed-off-by: Deepyaman Datta <[email protected]> * test(datasets): run doctests to check examples run Signed-off-by: Deepyaman Datta <[email protected]> * test(datasets): keep running tests amidst failures Signed-off-by: Deepyaman Datta <[email protected]> * docs(datasets): format ManagedTableDataset example Signed-off-by: Deepyaman Datta <[email protected]> * chore(datasets): ignore breaking mods for doctests Signed-off-by: Deepyaman Datta <[email protected]> * style(airflow): black code in Kedro-Airflow README Signed-off-by: Deepyaman Datta <[email protected]> * docs(datasets): fix example syntax, and autoformat Signed-off-by: Deepyaman Datta <[email protected]> * docs(datasets): remove `kedro.extras.datasets` ref Signed-off-by: Deepyaman Datta <[email protected]> * docs(datasets): remove `>>> ` prefix for YAML code Signed-off-by: Deepyaman Datta <[email protected]> * docs(datasets): remove `kedro.extras.datasets` ref Signed-off-by: Deepyaman Datta <[email protected]> * docs(datasets): replace `data_set`s with `dataset`s Signed-off-by: Deepyaman Datta <[email protected]> * chore(datasets): undo changes for running doctests Signed-off-by: Deepyaman Datta <[email protected]> * revert(datasets): undo lazily load `partitions` classes Refs: 3fdc5a8 Signed-off-by: Deepyaman Datta <[email protected]> * revert(airflow): undo black code in Kedro-Airflow README Refs: dc3476e Signed-off-by: Deepyaman Datta <[email protected]> --------- Signed-off-by: Deepyaman Datta <[email protected]> Signed-off-by: samuelleeshemen <[email protected]>
Signed-off-by: samuelleeshemen <[email protected]>
* Add python version support policy to plugin readmes Signed-off-by: Merel Theisen <[email protected]> * Temporarily pin connexion Signed-off-by: Merel Theisen <[email protected]> --------- Signed-off-by: Merel Theisen <[email protected]>
Signed-off-by: samuelleeshemen <[email protected]>
Signed-off-by: samuelleeshemen <[email protected]>
Signed-off-by: samuelleeshemen <[email protected]>
Signed-off-by: samuelleeshemen <[email protected]>
* Add shared CSS and meganav Signed-off-by: Jo Stichbury <[email protected]> * Add end of file Signed-off-by: Jo Stichbury <[email protected]> * Add new heap data source Signed-off-by: Jo Stichbury <[email protected]> * adjust heap parameter Signed-off-by: Jo Stichbury <[email protected]> * Remove nav_version next to Kedro logo in top left; add Kedro logo * Revise project name and author name Signed-off-by: Jo Stichbury <[email protected]> * Use full kedro icon and type for logo * Add close btn to mobile nav Signed-off-by: vladimir-mck <[email protected]> * Add css for mobile nav logo image Signed-off-by: vladimir-mck <[email protected]> * Update close button for mobile nav Signed-off-by: vladimir-mck <[email protected]> * Add open button to mobile nav Signed-off-by: vladimir-mck <[email protected]> * Delete kedro-datasets/docs/source/kedro-horizontal-color-on-light.svg Signed-off-by: vladimir-mck <[email protected]> * Update conf.py Signed-off-by: vladimir-mck <[email protected]> * Update layout.html Add links to subprojects Signed-off-by: Jo Stichbury <[email protected]> * Remove svg from docs -- not needed?? Signed-off-by: Jo Stichbury <[email protected]> * linter error fix Signed-off-by: Jo Stichbury <[email protected]> --------- Signed-off-by: Jo Stichbury <[email protected]> Signed-off-by: vladimir-mck <[email protected]> Co-authored-by: Tynan DeBold <[email protected]> Co-authored-by: vladimir-mck <[email protected]> Signed-off-by: samuelleeshemen <[email protected]>
* Add HuggingFace datasets Co-authored-by: Danny Farah <[email protected]> Co-authored-by: Kevin Koga <[email protected]> Co-authored-by: Mate Scharnitzky <[email protected]> Co-authored-by: Tomer Shor <[email protected]> Co-authored-by: Pierre-Yves Mousset <[email protected]> Co-authored-by: Bela Chupal <[email protected]> Co-authored-by: Khangjrakpam Arjun <[email protected]> Co-authored-by: Juan Luis Cano Rodríguez <[email protected]> Signed-off-by: Juan Luis Cano Rodríguez <[email protected]> * Apply suggestions from code review Signed-off-by: Juan Luis Cano Rodríguez <[email protected]> Co-authored-by: Joel <[email protected]> Co-authored-by: Nok Lam Chan <[email protected]> * Typo Signed-off-by: Juan Luis Cano Rodríguez <[email protected]> * Fix docstring Signed-off-by: Juan Luis Cano Rodríguez <[email protected]> * Add docstring for HFTransformerPipelineDataset Signed-off-by: Juan Luis Cano Rodríguez <[email protected]> * Use intersphinx for cross references in Hugging Face docstrings Signed-off-by: Juan Luis Cano Rodríguez <[email protected]> * Add docstring for HFDataset Signed-off-by: Juan Luis Cano Rodríguez <[email protected]> * Add missing test dependencies Signed-off-by: Juan Luis Cano Rodríguez <[email protected]> * Add tests for huggingface datasets Signed-off-by: Juan Luis Cano Rodríguez <[email protected]> * Fix HFDataset.save Signed-off-by: Juan Luis Cano Rodríguez <[email protected]> * Add test for HFDataset.list_datasets Signed-off-by: Juan Luis Cano Rodríguez <[email protected]> * Use new name Signed-off-by: Juan Luis Cano Rodríguez <[email protected]> * Consolidate imports Signed-off-by: Juan Luis Cano Rodríguez <[email protected]> --------- Signed-off-by: Juan Luis Cano Rodríguez <[email protected]> Co-authored-by: Danny Farah <[email protected]> Co-authored-by: Kevin Koga <[email protected]> Co-authored-by: Mate Scharnitzky <[email protected]> Co-authored-by: Tomer Shor <[email protected]> Co-authored-by: Pierre-Yves Mousset <[email 
protected]> Co-authored-by: Bela Chupal <[email protected]> Co-authored-by: Khangjrakpam Arjun <[email protected]> Co-authored-by: Joel <[email protected]> Co-authored-by: Nok Lam Chan <[email protected]>
Signed-off-by: samuelleeshemen <[email protected]>
Signed-off-by: samuelleeshemen <[email protected]>
Signed-off-by: samuelleeshemen <[email protected]>
Signed-off-by: samuelleeshemen <[email protected]>
* test(datasets): fix `dask.ParquetDataset` doctests Signed-off-by: Deepyaman Datta <[email protected]> * test(datasets): use `tmp_path` fixture in doctests Signed-off-by: Deepyaman Datta <[email protected]> * test(datasets): simplify by not passing the schema Signed-off-by: Deepyaman Datta <[email protected]> * test(datasets): ignore conftest for doctests cover Signed-off-by: Deepyaman Datta <[email protected]> * Create MANIFEST.in Signed-off-by: Deepyaman Datta <[email protected]> --------- Signed-off-by: Deepyaman Datta <[email protected]>
Signed-off-by: Merel Theisen <[email protected]>
Signed-off-by: Deepyaman Datta <[email protected]> Signed-off-by: samuelleeshemen <[email protected]>
* test(datasets): add outputs to matplotlib doctests Signed-off-by: Deepyaman Datta <[email protected]> * Update Makefile Signed-off-by: Deepyaman Datta <[email protected]> * Reformat code example, line length is short enough * Update kedro-datasets/kedro_datasets/matplotlib/matplotlib_writer.py Signed-off-by: Deepyaman Datta <[email protected]> --------- Signed-off-by: Deepyaman Datta <[email protected]> Signed-off-by: samuelleeshemen <[email protected]>
Signed-off-by: Deepyaman Datta <[email protected]> Signed-off-by: samuelleeshemen <[email protected]>
* Add add-on data to heap event Signed-off-by: lrcouto <[email protected]> * Move addons logic to _get_project_property Signed-off-by: Ankita Katiyar <[email protected]> * Add condition for pyproject.toml Signed-off-by: Ankita Katiyar <[email protected]> * Fix tests Signed-off-by: Ankita Katiyar <[email protected]> * Fix tests Signed-off-by: Ankita Katiyar <[email protected]> * add tools to mock Signed-off-by: lrcouto <[email protected]> * lint Signed-off-by: lrcouto <[email protected]> * Update tools test Signed-off-by: Ankita Katiyar <[email protected]> * Add after_context_created tools test Signed-off-by: lrcouto <[email protected]> * Update rename to tools Signed-off-by: Ankita Katiyar <[email protected]> * Update kedro-telemetry/tests/test_plugin.py Co-authored-by: Sajid Alam <[email protected]> Signed-off-by: Ankita Katiyar <[email protected]> --------- Signed-off-by: lrcouto <[email protected]> Signed-off-by: Ankita Katiyar <[email protected]> Signed-off-by: Ankita Katiyar <[email protected]> Co-authored-by: Ankita Katiyar <[email protected]> Co-authored-by: Ankita Katiyar <[email protected]> Co-authored-by: Sajid Alam <[email protected]> Signed-off-by: samuelleeshemen <[email protected]>
* update pandas-gbq dependency declaration Signed-off-by: Onur Kuru <[email protected]> * fix fmt Signed-off-by: Onur Kuru <[email protected]> --------- Signed-off-by: Onur Kuru <[email protected]> Co-authored-by: Ahdra Merali <[email protected]> Signed-off-by: samuelleeshemen <[email protected]>
Signed-off-by: Merel Theisen <[email protected]> Signed-off-by: samuelleeshemen <[email protected]>
Update scikit-learn version Signed-off-by: Nok Lam Chan <[email protected]> Signed-off-by: samuelleeshemen <[email protected]>
Signed-off-by: samuelleeshemen <[email protected]>
Fix broken links in README Signed-off-by: Juan Luis Cano Rodríguez <[email protected]> Signed-off-by: samuelleeshemen <[email protected]>
Signed-off-by: Deepyaman Datta <[email protected]> Co-authored-by: Juan Luis Cano Rodríguez <[email protected]> Signed-off-by: samuelleeshemen <[email protected]>
Signed-off-by: samuel-lee-sj <[email protected]>
Hello @ankatiyar. I think I messed up the rebase. Could you give me some help? The DCO is saying that: I don't think I edited this commit. If I did I am sorry.
Description
This is an implementation of a `.mat` (MATLAB) dataset for Kedro.
Development notes
Added two files: `matlab_dataset.py` and `test_matlab_dataset.py`. Tested with pytest.
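To illustrate the general idea, here is a minimal sketch of reading and writing a `.mat` file. This is not the code in this PR: the class name `MatFileDataset`, the `key` parameter, and the use of `scipy.io` are assumptions made for illustration, and the class deliberately does not subclass Kedro's dataset base class so it stays self-contained.

```python
from pathlib import Path

import numpy as np
from scipy.io import loadmat, savemat


class MatFileDataset:
    """Hypothetical, simplified stand-in for a dataset backed by a .mat file."""

    def __init__(self, filepath: str, key: str = "data") -> None:
        self._filepath = Path(filepath)
        # Name of the variable stored inside the .mat file (an assumption).
        self._key = key

    def load(self) -> np.ndarray:
        # loadmat returns a dict mapping variable names to arrays.
        contents = loadmat(str(self._filepath))
        return contents[self._key]

    def save(self, data: np.ndarray) -> None:
        # Create the parent directory if needed, then write one variable.
        self._filepath.parent.mkdir(parents=True, exist_ok=True)
        savemat(str(self._filepath), {self._key: data})

    def exists(self) -> bool:
        return self._filepath.is_file()
```

Note that `scipy.io.loadmat` returns arrays that are at least two-dimensional, so a 1-D array saved this way comes back as a 2-D array.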
Checklist
RELEASE.md file