feat: Add ManagedTableDataset for managed Delta Lake tables in Databricks #206

Merged 45 commits on May 22, 2023

Commits on May 12, 2023

  1. committing first version of UnityTableCatalog with unit tests. This dataset allows users to interface with Unity catalog tables in Databricks to both read and write.

    Signed-off-by: Danny Farah <[email protected]>
    Signed-off-by: Jannic Holzer <[email protected]>
    dannyrfar authored and jmholzer committed May 12, 2023
    d1bc9ab
  2. renaming dataset

    Signed-off-by: Danny Farah <[email protected]>
    Signed-off-by: Jannic Holzer <[email protected]>
    dannyrfar authored and jmholzer committed May 12, 2023
    798055e
  3. adding mlflow connectors

    Signed-off-by: Danny Farah <[email protected]>
    Signed-off-by: Jannic Holzer <[email protected]>
    dannyrfar authored and jmholzer committed May 12, 2023
    f2ea255
  4. fixing mlflow imports

    Signed-off-by: Danny Farah <[email protected]>
    Signed-off-by: Jannic Holzer <[email protected]>
    dannyrfar authored and jmholzer committed May 12, 2023
    9bb88c2
  5. cleaned up mlflow for initial release

    Signed-off-by: Danny Farah <[email protected]>
    Signed-off-by: Jannic Holzer <[email protected]>
    dannyrfar authored and jmholzer committed May 12, 2023
    20d20b5
  6. cleaned up mlflow references from setup.py for initial release

    Signed-off-by: Danny Farah <[email protected]>
    Signed-off-by: Jannic Holzer <[email protected]>
    dannyrfar authored and jmholzer committed May 12, 2023
    d6bc149
  7. fixed deps in setup.py

    Signed-off-by: Danny Farah <[email protected]>
    Signed-off-by: Jannic Holzer <[email protected]>
    dannyrfar authored and jmholzer committed May 12, 2023
    aee12a2
  8. adding comments before initial PR

    Signed-off-by: Danny Farah <[email protected]>
    Signed-off-by: Jannic Holzer <[email protected]>
    dannyrfar authored and jmholzer committed May 12, 2023
    911e53f
  9. moved validation to dataclass

    Signed-off-by: Danny Farah <[email protected]>
    Signed-off-by: Jannic Holzer <[email protected]>
    dannyrfar authored and jmholzer committed May 12, 2023
    10932fb
  10. bug fix in type of partition column and cleanup

    Signed-off-by: Danny Farah <[email protected]>
    Signed-off-by: Jannic Holzer <[email protected]>
    dannyrfar authored and jmholzer committed May 12, 2023
    74471a8
  11. updated docstring for ManagedTableDataSet

    Signed-off-by: Danny Farah <[email protected]>
    Signed-off-by: Jannic Holzer <[email protected]>
    dannyrfar authored and jmholzer committed May 12, 2023
    4022f0d
  12. added backticks to catalog

    Signed-off-by: Danny Farah <[email protected]>
    Signed-off-by: Jannic Holzer <[email protected]>
    dannyrfar authored and jmholzer committed May 12, 2023
    f6531e1
  13. fixing regex to allow hyphens

    Signed-off-by: Danny Farah <[email protected]>
    Signed-off-by: Jannic Holzer <[email protected]>
    dannyrfar authored and jmholzer committed May 12, 2023
    3ed18a1
  14. Update kedro-datasets/kedro_datasets/databricks/managed_table_dataset.py

    Co-authored-by: Jannic <[email protected]>
    Signed-off-by: Jannic Holzer <[email protected]>
    dannyrfar and jmholzer committed May 12, 2023
    a149b4d
  15. Update kedro-datasets/kedro_datasets/databricks/managed_table_dataset.py

    Co-authored-by: Jannic <[email protected]>
    Signed-off-by: Danny Farah <[email protected]>
    Signed-off-by: Jannic Holzer <[email protected]>
    dannyrfar and jmholzer committed May 12, 2023
    c854a64
  16. Update kedro-datasets/kedro_datasets/databricks/managed_table_dataset.py

    Co-authored-by: Jannic <[email protected]>
    Signed-off-by: Danny Farah <[email protected]>
    Signed-off-by: Jannic Holzer <[email protected]>
    dannyrfar and jmholzer committed May 12, 2023
    c994c3f
  17. Update kedro-datasets/kedro_datasets/databricks/managed_table_dataset.py

    Co-authored-by: Jannic <[email protected]>
    Signed-off-by: Danny Farah <[email protected]>
    Signed-off-by: Jannic Holzer <[email protected]>
    dannyrfar and jmholzer committed May 12, 2023
    09bf847
  18. Update kedro-datasets/kedro_datasets/databricks/managed_table_dataset.py

    Co-authored-by: Jannic <[email protected]>
    Signed-off-by: Danny Farah <[email protected]>
    Signed-off-by: Jannic Holzer <[email protected]>
    dannyrfar and jmholzer committed May 12, 2023
    b7e8cff
  19. Update kedro-datasets/kedro_datasets/databricks/managed_table_dataset.py

    Co-authored-by: Jannic <[email protected]>
    Signed-off-by: Danny Farah <[email protected]>
    Signed-off-by: Jannic Holzer <[email protected]>
    dannyrfar and jmholzer committed May 12, 2023
    e7b8e40
  20. Update kedro-datasets/kedro_datasets/databricks/managed_table_dataset.py

    Co-authored-by: Jannic <[email protected]>
    Signed-off-by: Danny Farah <[email protected]>
    Signed-off-by: Jannic Holzer <[email protected]>
    dannyrfar and jmholzer committed May 12, 2023
    31a0c73
  21. Update kedro-datasets/test_requirements.txt

    Co-authored-by: Jannic <[email protected]>
    Signed-off-by: Danny Farah <[email protected]>
    Signed-off-by: Jannic Holzer <[email protected]>
    dannyrfar and jmholzer committed May 12, 2023
    83704b4
  22. Update kedro-datasets/kedro_datasets/databricks/managed_table_dataset.py

    Co-authored-by: Jannic <[email protected]>
    Signed-off-by: Danny Farah <[email protected]>
    Signed-off-by: Jannic Holzer <[email protected]>
    dannyrfar and jmholzer committed May 12, 2023
    1bf1e29
  23. Update kedro-datasets/kedro_datasets/databricks/managed_table_dataset.py

    Co-authored-by: Jannic <[email protected]>
    Signed-off-by: Danny Farah <[email protected]>
    Signed-off-by: Jannic Holzer <[email protected]>
    dannyrfar and jmholzer committed May 12, 2023
    9e391ee
  24. Update kedro-datasets/kedro_datasets/databricks/managed_table_dataset.py

    Co-authored-by: Jannic <[email protected]>
    Signed-off-by: Danny Farah <[email protected]>
    Signed-off-by: Jannic Holzer <[email protected]>
    dannyrfar and jmholzer committed May 12, 2023
    651e379
  25. Update kedro-datasets/kedro_datasets/databricks/managed_table_dataset.py

    Co-authored-by: Jannic <[email protected]>
    Signed-off-by: Danny Farah <[email protected]>
    Signed-off-by: Jannic Holzer <[email protected]>
    dannyrfar and jmholzer committed May 12, 2023
    b267616
  26. adding backticks to catalog

    Signed-off-by: Danny Farah <[email protected]>
    Signed-off-by: Jannic Holzer <[email protected]>
    dannyrfar authored and jmholzer committed May 12, 2023
    f8f9786
  27. Require pandas < 2.0 for compatibility with spark < 3.4

    Signed-off-by: Jannic Holzer <[email protected]>
    jmholzer committed May 12, 2023
    57248ea
  28. Replace use of walrus operator

    Signed-off-by: Jannic Holzer <[email protected]>
    jmholzer committed May 12, 2023
    944009a
  29. Add test coverage for validation methods

    Signed-off-by: Jannic Holzer <[email protected]>
    jmholzer committed May 12, 2023
    25a293e
  30. Remove unused versioning functions

    Signed-off-by: Jannic Holzer <[email protected]>
    jmholzer committed May 12, 2023
    3d6b682
  31. b37a198
  32. Add pylint ignore

    Signed-off-by: Jannic Holzer <[email protected]>
    jmholzer committed May 12, 2023
    952cf3d
  33. Add tests/databricks to ignore for no-spark tests

    Signed-off-by: Jannic Holzer <[email protected]>
    jmholzer committed May 12, 2023
    743816e

Commits on May 17, 2023

  1. 0a160a5
  2. daf5411

Commits on May 18, 2023

  1. Remove spurious mlflow test dependency

    Signed-off-by: Jannic Holzer <[email protected]>
    jmholzer committed May 18, 2023
    c1e78cd

Commits on May 19, 2023

  1. 5fe9fec

Commits on May 22, 2023

  1. 4c830bf
  2. Add explicit check for database existence

    Signed-off-by: Jannic Holzer <[email protected]>
    jmholzer committed May 22, 2023
    dbfd641
  3. Remove character limit for table names

    Signed-off-by: Jannic Holzer <[email protected]>
    jmholzer committed May 22, 2023
    5ea0d66
  4. Refactor validation steps in ManagedTable

    Signed-off-by: Jannic Holzer <[email protected]>
    jmholzer committed May 22, 2023
    c2fd478
  5. Merge branch 'feat/add-managed-delta-table-dataset' of github.com:kedro-org/kedro-plugins into feat/add-managed-delta-table-dataset

    Signed-off-by: Jannic Holzer <[email protected]>
    jmholzer committed May 22, 2023
    e228164
  6. 47e2a18
  7. Remove spurious checks for table and schema name existence

    Signed-off-by: Jannic Holzer <[email protected]>
    jmholzer committed May 22, 2023
    7e52e9c
  8. Merge branch 'feat/add-managed-delta-table-dataset' of github.com:kedro-org/kedro-plugins into feat/add-managed-delta-table-dataset

    Signed-off-by: Jannic Holzer <[email protected]>
    jmholzer committed May 22, 2023
    f03bbe3