Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for AtomicCreateTableAsSelect with Delta Lake [databricks] #9425

Merged
merged 3 commits into from
Oct 17, 2023

Conversation

jlowe
Copy link
Member

@jlowe jlowe commented Oct 12, 2023

Fixes #7276. Extended ExternalSource to handle potential replacements for AtomicCreateTableAsSelect and added unit tests. A significant amount of the code change comes from the various GpuCreateDeltaTableCommand files which are small changes from the original versions of CreateDeltaTableCommand sources needed to perform the operation on the GPU. Rather than refactor these for commonality between the Delta Lake versions and make it hard to see the GPU diffs needed for each version, I added each separately to make diffing easier both for the current review and for auditing for changes in Delta Lake patch versions in the future.

@jlowe jlowe self-assigned this Oct 12, 2023
@jlowe jlowe marked this pull request as ready for review October 12, 2023 21:11
@jlowe
Copy link
Member Author

jlowe commented Oct 12, 2023

build

@sameerz sameerz added the feature request New feature or request label Oct 16, 2023
@jlowe jlowe marked this pull request as draft October 16, 2023 14:54
@jlowe jlowe marked this pull request as ready for review October 16, 2023 16:05
@jlowe
Copy link
Member Author

jlowe commented Oct 16, 2023

build

@jlowe
Copy link
Member Author

jlowe commented Oct 16, 2023

build

@@ -92,6 +101,43 @@ object DeltaProviderImpl extends DeltaProviderImplBase {
val cpuFormat = format.asInstanceOf[DeltaParquetFileFormat]
GpuDeltaParquetFileFormat.convertToGpu(cpuFormat)
}

override def isSupportedCatalog(catalogClass: Class[_ <: StagingTableCatalog]): Boolean = {
catalogClass == classOf[DeltaCatalog] || catalogClass == classOf[UnityCatalogV2Proxy]
Copy link
Collaborator

@gerashegalov gerashegalov Oct 16, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

question: are we guaranteed in this comparison that both catalogClass and classOf[DeltaCatalog] are loaded via the same ClassLoader instance, might get a false negative otherwise.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe so. The technique essentially matches the technique used for almost all overrides, as GpuOverrides is building a large hashmap with classes as keys and doing a lookup on that map based on what we find in the CPU plan. So I would think that if we have an issue with this code, we'd have a similar issue with how all overrides work in the plugin.

Copy link
Collaborator

@gerashegalov gerashegalov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@jlowe jlowe merged commit f2dbb23 into NVIDIA:branch-23.12 Oct 17, 2023
29 checks passed
@jlowe jlowe deleted the delta-actas branch October 17, 2023 17:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature request New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[FEA] Support AtomicCreateTableAsSelect for Delta Lake
4 participants