MotherDuck changes on top of duckdblabs/main
This is a squash of 6 commits:
- remove some tests that check that iceberg commands do not work before loading the extension
  (in MD we now preload iceberg, so they do work; but this should be removed and replaced by lazy overloading of table functions, triggered by extension load callbacks)

- add generated iceberg dataset that is not that big and uses pyspark (I guess we do not want a dependency on pyspark in our CI)

trying to solve a linux compilation issue, with only a MacOS test environment :-(

some more std::move() wrapping to appease the linux compiler

- added a new test that just exists in the MD version of this repo
- removed one statement from an original test that will never produce the expected output in MD

added test that confirms hive partitioning works in MD

- renamed our tests as md_
- added a testcase that tests predicate pushdown in iceberg_scan (for case 1, where it resolves to a single parquet_scan, because for the anti-join case 2, DDB local does not achieve this whereas MD does)

Reintroduced removed tests but commented them out

DuckDB Labs made some changes to the test data you can optionally generate (make data)
we actually committed this data in the data/ directory so CI can run it (generating the data requires pyspark)

this commit removes the old data and replaces it with the new (these are iceberg tables, so they consist of many files)
and consequently this fixes our build
Peter authored and Flogex committed Apr 17, 2024
1 parent d89423c commit 5886d2c
Showing 157 changed files with 5,848 additions and 19 deletions.
Binary file not shown.
@@ -0,0 +1,217 @@
{
"format-version" : 1,
"table-uuid" : "f424ef42-5477-4841-b127-2f18e7dad530",
"location" : "data/iceberg/generated_spec1_0_01/pyspark_iceberg_table",
"last-updated-ms" : 1710862058098,
"last-column-id" : 15,
"schema" : {
"type" : "struct",
"schema-id" : 0,
"fields" : [ {
"id" : 1,
"name" : "l_orderkey_bool",
"required" : false,
"type" : "boolean"
}, {
"id" : 2,
"name" : "l_partkey_int",
"required" : false,
"type" : "int"
}, {
"id" : 3,
"name" : "l_suppkey_long",
"required" : false,
"type" : "long"
}, {
"id" : 4,
"name" : "l_extendedprice_float",
"required" : false,
"type" : "float"
}, {
"id" : 5,
"name" : "l_extendedprice_double",
"required" : false,
"type" : "double"
}, {
"id" : 6,
"name" : "l_extendedprice_dec9_2",
"required" : false,
"type" : "decimal(9, 2)"
}, {
"id" : 7,
"name" : "l_extendedprice_dec18_6",
"required" : false,
"type" : "decimal(18, 6)"
}, {
"id" : 8,
"name" : "l_extendedprice_dec38_10",
"required" : false,
"type" : "decimal(38, 10)"
}, {
"id" : 9,
"name" : "l_shipdate_date",
"required" : false,
"type" : "date"
}, {
"id" : 10,
"name" : "l_partkey_time",
"required" : false,
"type" : "int"
}, {
"id" : 11,
"name" : "l_commitdate_timestamp",
"required" : false,
"type" : "timestamp"
}, {
"id" : 12,
"name" : "l_commitdate_timestamp_tz",
"required" : false,
"type" : "timestamptz"
}, {
"id" : 13,
"name" : "l_comment_string",
"required" : false,
"type" : "string"
}, {
"id" : 14,
"name" : "uuid",
"required" : false,
"type" : "string"
}, {
"id" : 15,
"name" : "l_comment_blob",
"required" : false,
"type" : "binary"
} ]
},
"current-schema-id" : 0,
"schemas" : [ {
"type" : "struct",
"schema-id" : 0,
"fields" : [ {
"id" : 1,
"name" : "l_orderkey_bool",
"required" : false,
"type" : "boolean"
}, {
"id" : 2,
"name" : "l_partkey_int",
"required" : false,
"type" : "int"
}, {
"id" : 3,
"name" : "l_suppkey_long",
"required" : false,
"type" : "long"
}, {
"id" : 4,
"name" : "l_extendedprice_float",
"required" : false,
"type" : "float"
}, {
"id" : 5,
"name" : "l_extendedprice_double",
"required" : false,
"type" : "double"
}, {
"id" : 6,
"name" : "l_extendedprice_dec9_2",
"required" : false,
"type" : "decimal(9, 2)"
}, {
"id" : 7,
"name" : "l_extendedprice_dec18_6",
"required" : false,
"type" : "decimal(18, 6)"
}, {
"id" : 8,
"name" : "l_extendedprice_dec38_10",
"required" : false,
"type" : "decimal(38, 10)"
}, {
"id" : 9,
"name" : "l_shipdate_date",
"required" : false,
"type" : "date"
}, {
"id" : 10,
"name" : "l_partkey_time",
"required" : false,
"type" : "int"
}, {
"id" : 11,
"name" : "l_commitdate_timestamp",
"required" : false,
"type" : "timestamp"
}, {
"id" : 12,
"name" : "l_commitdate_timestamp_tz",
"required" : false,
"type" : "timestamptz"
}, {
"id" : 13,
"name" : "l_comment_string",
"required" : false,
"type" : "string"
}, {
"id" : 14,
"name" : "uuid",
"required" : false,
"type" : "string"
}, {
"id" : 15,
"name" : "l_comment_blob",
"required" : false,
"type" : "binary"
} ]
} ],
"partition-spec" : [ ],
"default-spec-id" : 0,
"partition-specs" : [ {
"spec-id" : 0,
"fields" : [ ]
} ],
"last-partition-id" : 999,
"default-sort-order-id" : 0,
"sort-orders" : [ {
"order-id" : 0,
"fields" : [ ]
} ],
"properties" : {
"owner" : "peter",
"write.parquet.compression-codec" : "zstd"
},
"current-snapshot-id" : 1427538264954246454,
"refs" : {
"main" : {
"snapshot-id" : 1427538264954246454,
"type" : "branch"
}
},
"snapshots" : [ {
"snapshot-id" : 1427538264954246454,
"timestamp-ms" : 1710862058098,
"summary" : {
"operation" : "append",
"spark.app.id" : "local-1710862055070",
"added-data-files" : "1",
"added-records" : "60175",
"added-files-size" : "4115457",
"changed-partition-count" : "1",
"total-records" : "60175",
"total-files-size" : "4115457",
"total-data-files" : "1",
"total-delete-files" : "0",
"total-position-deletes" : "0",
"total-equality-deletes" : "0"
},
"manifest-list" : "data/iceberg/generated_spec1_0_01/pyspark_iceberg_table/metadata/snap-1427538264954246454-1-bbab5b8f-dd0a-40e7-8ae6-af228cd96267.avro",
"schema-id" : 0
} ],
"statistics" : [ ],
"snapshot-log" : [ {
"timestamp-ms" : 1710862058098,
"snapshot-id" : 1427538264954246454
} ],
"metadata-log" : [ ]
}
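
The metadata file above ties the table together: `current-schema-id` selects one entry in `schemas`, and `current-snapshot-id` selects one entry in `snapshots`, which in turn names the manifest list. A minimal Python sketch of resolving both is shown below. It is not how the iceberg extension reads the file; the helper names `current_schema_types` and `current_snapshot` are hypothetical, and the embedded JSON is a trimmed copy of the file above with only two of the fifteen fields kept.

```python
import json

# Trimmed copy of the metadata file above (assumption: only the keys used
# below are kept, and only two of the fifteen schema fields).
METADATA_JSON = """
{
  "format-version": 1,
  "current-schema-id": 0,
  "schemas": [{
    "type": "struct",
    "schema-id": 0,
    "fields": [
      {"id": 1, "name": "l_orderkey_bool", "required": false, "type": "boolean"},
      {"id": 6, "name": "l_extendedprice_dec9_2", "required": false, "type": "decimal(9, 2)"}
    ]
  }],
  "current-snapshot-id": 1427538264954246454,
  "snapshots": [{
    "snapshot-id": 1427538264954246454,
    "timestamp-ms": 1710862058098,
    "summary": {"operation": "append", "added-records": "60175", "total-data-files": "1"},
    "manifest-list": "data/iceberg/generated_spec1_0_01/pyspark_iceberg_table/metadata/snap-1427538264954246454-1-bbab5b8f-dd0a-40e7-8ae6-af228cd96267.avro",
    "schema-id": 0
  }]
}
"""


def current_schema_types(metadata: dict) -> dict:
    """Map column name -> Iceberg type string for the current schema (hypothetical helper)."""
    schema_id = metadata["current-schema-id"]
    for schema in metadata["schemas"]:
        if schema["schema-id"] == schema_id:
            return {field["name"]: field["type"] for field in schema["fields"]}
    raise KeyError(f"schema-id {schema_id} not found in 'schemas'")


def current_snapshot(metadata: dict) -> dict:
    """Return the snapshot entry that 'current-snapshot-id' points at (hypothetical helper)."""
    snapshot_id = metadata["current-snapshot-id"]
    for snapshot in metadata["snapshots"]:
        if snapshot["snapshot-id"] == snapshot_id:
            return snapshot
    raise KeyError(f"snapshot-id {snapshot_id} not found in 'snapshots'")


if __name__ == "__main__":
    meta = json.loads(METADATA_JSON)
    snap = current_snapshot(meta)
    print(current_schema_types(meta))
    print(snap["manifest-list"])
    # Summary values are stored as strings in the metadata, so convert explicitly.
    print(int(snap["summary"]["added-records"]))
```

Note that the counters in `summary` are JSON strings rather than numbers, so they need an explicit `int()` before any arithmetic.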