ALTER TABLE ... DROP COLUMN allows dropping a column used by old PartitionSpecs #4563

Open
alexjo2144 opened this issue Apr 14, 2022 · 14 comments
Labels
bug Something isn't working

Comments

@alexjo2144
Contributor

Dropping a column used by the most recent PartitionSpec fails cleanly; however, dropping a column used by an older PartitionSpec corrupts the table entirely. For example, in SparkSQL:

CREATE TABLE default.test_evolution (col0 BIGINT, col1 BIGINT, col2 BIGINT) USING ICEBERG TBLPROPERTIES ('format-version' = 2, 'write.delete.mode' = 'merge-on-read');
INSERT INTO default.test_evolution VALUES (1, 11, 21);
ALTER TABLE default.test_evolution ADD PARTITION FIELD col2;
INSERT INTO default.test_evolution VALUES (2, 12, 22);
ALTER TABLE default.test_evolution DROP PARTITION FIELD col2;
INSERT INTO default.test_evolution VALUES (3, 13, 23);

ALTER TABLE default.test_evolution DROP COLUMN col2;
-- Query fails with a null pointer exception, but still has side effects

SELECT * FROM default.test_evolution;
-- Query fails, table is unreadable
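
For anyone hitting this through the Iceberg Java API rather than SparkSQL, a minimal sketch of the same sequence (inserts between the steps omitted) looks roughly like this; the helper name and the Catalog handle are illustrative, not part of the report, while the table identifier matches the one above:

import org.apache.iceberg.Table;
import org.apache.iceberg.catalog.Catalog;
import org.apache.iceberg.catalog.TableIdentifier;

static void reproduce(Catalog catalog) {
    Table table = catalog.loadTable(TableIdentifier.of("default", "test_evolution"));

    // Evolve the partition spec: add col2, then remove it again,
    // leaving an older spec version that still references col2.
    table.updateSpec().addField("col2").commit();
    table.updateSpec().removeField("col2").commit();

    // The column drop itself commits, but the older spec still points at col2's
    // field id, so the next metadata refresh fails with the NPE below.
    table.updateSchema().deleteColumn("col2").commit();
}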

NPE Stack Trace:

22/04/14 18:36:02 ERROR SparkSQLDriver: Failed in [SELECT * FROM default.test_evolution]
java.lang.NullPointerException: Cannot find source column: 3
	at org.apache.iceberg.relocated.com.google.common.base.Preconditions.checkNotNull(Preconditions.java:953)
	at org.apache.iceberg.PartitionSpec$Builder.add(PartitionSpec.java:503)
	at org.apache.iceberg.PartitionSpecParser.buildFromJsonFields(PartitionSpecParser.java:155)
	at org.apache.iceberg.PartitionSpecParser.fromJson(PartitionSpecParser.java:78)
	at org.apache.iceberg.TableMetadataParser.fromJson(TableMetadataParser.java:357)
	at org.apache.iceberg.TableMetadataParser.fromJson(TableMetadataParser.java:288)
	at org.apache.iceberg.TableMetadataParser.read(TableMetadataParser.java:251)
	at org.apache.iceberg.TableMetadataParser.read(TableMetadataParser.java:245)
	at org.apache.iceberg.BaseMetastoreTableOperations.lambda$refreshFromMetadataLocation$0(BaseMetastoreTableOperations.java:171)
	at org.apache.iceberg.BaseMetastoreTableOperations.lambda$refreshFromMetadataLocation$1(BaseMetastoreTableOperations.java:185)
	at org.apache.iceberg.util.Tasks$Builder.runTaskWithRetry(Tasks.java:404)
	at org.apache.iceberg.util.Tasks$Builder.runSingleThreaded(Tasks.java:214)
	at org.apache.iceberg.util.Tasks$Builder.run(Tasks.java:198)
	at org.apache.iceberg.util.Tasks$Builder.run(Tasks.java:190)
	at org.apache.iceberg.BaseMetastoreTableOperations.refreshFromMetadataLocation(BaseMetastoreTableOperations.java:185)
	at org.apache.iceberg.BaseMetastoreTableOperations.refreshFromMetadataLocation(BaseMetastoreTableOperations.java:170)
	at org.apache.iceberg.BaseMetastoreTableOperations.refreshFromMetadataLocation(BaseMetastoreTableOperations.java:165)
	at org.apache.iceberg.hive.HiveTableOperations.doRefresh(HiveTableOperations.java:207)
	at org.apache.iceberg.BaseMetastoreTableOperations.refresh(BaseMetastoreTableOperations.java:95)
	at org.apache.iceberg.BaseMetastoreTableOperations.current(BaseMetastoreTableOperations.java:78)
	at org.apache.iceberg.BaseMetastoreCatalog.loadTable(BaseMetastoreCatalog.java:42)
	at org.apache.iceberg.spark.SparkCatalog.load(SparkCatalog.java:488)
	at org.apache.iceberg.spark.SparkCatalog.loadTable(SparkCatalog.java:135)
	at org.apache.iceberg.spark.SparkCatalog.loadTable(SparkCatalog.java:92)
	at org.apache.spark.sql.connector.catalog.CatalogV2Util$.loadTable(CatalogV2Util.scala:281)
	at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveTables$.org$apache$spark$sql$catalyst$analysis$Analyzer$ResolveTables$$lookupV2Relation(Analyzer.scala:1123)
	at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveTables$$anonfun$apply$14.applyOrElse(Analyzer.scala:1073)
	at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveTables$$anonfun$apply$14.applyOrElse(Analyzer.scala:1071)
	at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.$anonfun$resolveOperatorsUpWithPruning$3(AnalysisHelper.scala:138)
	at org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(TreeNode.scala:82)
	at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.$anonfun$resolveOperatorsUpWithPruning$1(AnalysisHelper.scala:138)
	at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper$.allowInvokingTransformsInAnalyzer(AnalysisHelper.scala:323)
	at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.resolveOperatorsUpWithPruning(AnalysisHelper.scala:134)
	at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.resolveOperatorsUpWithPruning$(AnalysisHelper.scala:130)
	at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.resolveOperatorsUpWithPruning(LogicalPlan.scala:30)
	at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.$anonfun$resolveOperatorsUpWithPruning$2(AnalysisHelper.scala:135)
	at org.apache.spark.sql.catalyst.trees.UnaryLike.mapChildren(TreeNode.scala:1122)
	at org.apache.spark.sql.catalyst.trees.UnaryLike.mapChildren$(TreeNode.scala:1121)
	at org.apache.spark.sql.catalyst.plans.logical.OrderPreservingUnaryNode.mapChildren(LogicalPlan.scala:206)
	at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.$anonfun$resolveOperatorsUpWithPruning$1(AnalysisHelper.scala:135)
	at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper$.allowInvokingTransformsInAnalyzer(AnalysisHelper.scala:323)
	at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.resolveOperatorsUpWithPruning(AnalysisHelper.scala:134)
	at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.resolveOperatorsUpWithPruning$(AnalysisHelper.scala:130)
	at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.resolveOperatorsUpWithPruning(LogicalPlan.scala:30)
	at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveTables$.apply(Analyzer.scala:1071)
	at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveTables$.apply(Analyzer.scala:1069)
	at org.apache.spark.sql.catalyst.rules.RuleExecutor.$anonfun$execute$2(RuleExecutor.scala:211)
	at scala.collection.LinearSeqOptimized.foldLeft(LinearSeqOptimized.scala:126)
	at scala.collection.LinearSeqOptimized.foldLeft$(LinearSeqOptimized.scala:122)
	at scala.collection.immutable.List.foldLeft(List.scala:91)
	at org.apache.spark.sql.catalyst.rules.RuleExecutor.$anonfun$execute$1(RuleExecutor.scala:208)
	at org.apache.spark.sql.catalyst.rules.RuleExecutor.$anonfun$execute$1$adapted(RuleExecutor.scala:200)
	at scala.collection.immutable.List.foreach(List.scala:431)
	at org.apache.spark.sql.catalyst.rules.RuleExecutor.execute(RuleExecutor.scala:200)
	at org.apache.spark.sql.catalyst.analysis.Analyzer.org$apache$spark$sql$catalyst$analysis$Analyzer$$executeSameContext(Analyzer.scala:222)
	at org.apache.spark.sql.catalyst.analysis.Analyzer.$anonfun$execute$1(Analyzer.scala:218)
	at org.apache.spark.sql.catalyst.analysis.AnalysisContext$.withNewAnalysisContext(Analyzer.scala:167)
	at org.apache.spark.sql.catalyst.analysis.Analyzer.execute(Analyzer.scala:218)
	at org.apache.spark.sql.catalyst.analysis.Analyzer.execute(Analyzer.scala:182)
	at org.apache.spark.sql.catalyst.rules.RuleExecutor.$anonfun$executeAndTrack$1(RuleExecutor.scala:179)
	at org.apache.spark.sql.catalyst.QueryPlanningTracker$.withTracker(QueryPlanningTracker.scala:88)
	at org.apache.spark.sql.catalyst.rules.RuleExecutor.executeAndTrack(RuleExecutor.scala:179)
	at org.apache.spark.sql.catalyst.analysis.Analyzer.$anonfun$executeAndCheck$1(Analyzer.scala:203)
	at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper$.markInAnalyzer(AnalysisHelper.scala:330)
	at org.apache.spark.sql.catalyst.analysis.Analyzer.executeAndCheck(Analyzer.scala:202)
	at org.apache.spark.sql.execution.QueryExecution.$anonfun$analyzed$1(QueryExecution.scala:88)
	at org.apache.spark.sql.catalyst.QueryPlanningTracker.measurePhase(QueryPlanningTracker.scala:111)
	at org.apache.spark.sql.execution.QueryExecution.$anonfun$executePhase$1(QueryExecution.scala:196)
	at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:775)
	at org.apache.spark.sql.execution.QueryExecution.executePhase(QueryExecution.scala:196)
	at org.apache.spark.sql.execution.QueryExecution.analyzed$lzycompute(QueryExecution.scala:88)
	at org.apache.spark.sql.execution.QueryExecution.analyzed(QueryExecution.scala:86)
	at org.apache.spark.sql.execution.QueryExecution.assertAnalyzed(QueryExecution.scala:78)
	at org.apache.spark.sql.Dataset$.$anonfun$ofRows$2(Dataset.scala:98)
	at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:775)
	at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:96)
	at org.apache.spark.sql.SparkSession.$anonfun$sql$1(SparkSession.scala:618)
	at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:775)
	at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:613)
	at org.apache.spark.sql.SQLContext.sql(SQLContext.scala:651)
	at org.apache.spark.sql.hive.thriftserver.SparkSQLDriver.run(SparkSQLDriver.scala:67)
	at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver.processCmd(SparkSQLCLIDriver.scala:384)
	at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver.$anonfun$processLine$1(SparkSQLCLIDriver.scala:504)
	at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver.$anonfun$processLine$1$adapted(SparkSQLCLIDriver.scala:498)
	at scala.collection.Iterator.foreach(Iterator.scala:943)
	at scala.collection.Iterator.foreach$(Iterator.scala:943)
	at scala.collection.AbstractIterator.foreach(Iterator.scala:1431)
	at scala.collection.IterableLike.foreach(IterableLike.scala:74)
	at scala.collection.IterableLike.foreach$(IterableLike.scala:73)
	at scala.collection.AbstractIterable.foreach(Iterable.scala:56)
	at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver.processLine(SparkSQLCLIDriver.scala:498)
	at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver$.main(SparkSQLCLIDriver.scala:287)
	at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver.main(SparkSQLCLIDriver.scala)
	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.base/java.lang.reflect.Method.invoke(Method.java:566)
	at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
	at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:955)
	at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:180)
	at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:203)
	at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:90)
	at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1043)
	at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1052)
	at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)

@rdblue
Contributor

rdblue commented Apr 14, 2022

Hm. I think that we want to allow the user to drop columns that are used by old partition specs. But that means we will need to handle specs that can't be bound to the current schema.

We recently introduced an UnboundPartitionSpec class that can be used in places before there is a schema to use with the spec. I think we should probably use that in other places instead of a PartitionSpec that has a concrete schema.
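
The binding failure is easy to see even with the plain builder API: once the source column is gone from the current schema, a spec that references it can no longer be bound, which is what PartitionSpecParser runs into when re-reading old specs from metadata. A rough sketch (the schema and field ids here are illustrative, not taken from the table above):

import org.apache.iceberg.PartitionSpec;
import org.apache.iceberg.Schema;
import org.apache.iceberg.types.Types;

static void bindAgainstEvolvedSchema() {
    Schema original = new Schema(
        Types.NestedField.optional(1, "col0", Types.LongType.get()),
        Types.NestedField.optional(2, "col1", Types.LongType.get()),
        Types.NestedField.optional(3, "col2", Types.LongType.get()));

    // Binding a spec on col2 against the original schema works fine.
    PartitionSpec spec = PartitionSpec.builderFor(original).identity("col2").build();

    // After col2 is dropped, the same partition field can no longer be bound;
    // this throws, just like the parser does when it replays old specs.
    Schema afterDrop = new Schema(
        Types.NestedField.optional(1, "col0", Types.LongType.get()),
        Types.NestedField.optional(2, "col1", Types.LongType.get()));
    PartitionSpec rebound = PartitionSpec.builderFor(afterDrop).identity("col2").build();
}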

@findepi findepi added the bug Something isn't working label Apr 15, 2022
@felixYyu
Contributor

I want to try to fix it. @rdblue

@swapz-z

swapz-z commented Apr 17, 2022

Hey @felixYyu
I am new to the open-source contribution process.
However, to get my hands dirty, I feel this would be a great starting point: set up the environment, reproduce the issue, and then analyse it.
Do let me know if it's okay with you if I tag along on this first issue and be a part of the process.

Thanks in advance

@felixYyu
Contributor

felixYyu commented May 12, 2022

Could you test this case with the referenced PR if you have time? @swapz-z @alexjo2144

@github-actions

github-actions bot commented Nov 9, 2022

This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in the next 14 days if no further activity occurs. To permanently prevent this issue from being considered stale, add the label 'not-stale', but commenting on the issue is preferred when possible.

@github-actions github-actions bot added the stale label Nov 9, 2022
@github-actions

This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale'.

@alexjo2144
Contributor Author

This should be reopened

@hashhar

hashhar commented Mar 6, 2024

Iceberg 1.1.0 had 3b65cca which seems to address something. Can we now allow this on Trino?

@alexjo2144 / @krvikash ?

@hashhar

hashhar commented Mar 6, 2024

Hmm, it looks like BEFORE dropping a partition column I need to optimize the table, otherwise subsequent reads fail (both in Spark and Trino).
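
(The optimize step here would be Trino's ALTER TABLE test EXECUTE optimize. A rough Java-side equivalent is a data-file rewrite via SparkActions, sketched below; the helper and the rewrite-all option are illustrative and assume a recent Iceberg version, they are not part of the report.)

import org.apache.iceberg.Table;
import org.apache.iceberg.actions.RewriteDataFiles;
import org.apache.iceberg.spark.actions.SparkActions;

static void rewriteBeforeDroppingPartitionColumn(Table table) {
    // Rewrite existing data files so they are laid out under the current partition
    // spec before the old partition column is dropped from the schema.
    RewriteDataFiles.Result result = SparkActions.get()   // uses the active SparkSession
        .rewriteDataFiles(table)
        .option("rewrite-all", "true")                    // force rewriting even well-sized files
        .execute();
    System.out.println("Rewritten data files: " + result.rewrittenDataFilesCount());
}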

Here's a sample sequence of steps:

trino:default> create table test (part_col timestamp, data_col integer) with (partitioning = ARRAY['month(part_col)']);
CREATE TABLE
trino:default> insert into test values (now(), 1), (now() + interval '1' month, 2), (now() + interval '2' month, 3);
INSERT: 3 rows
trino:default> alter table test add column part_col_new timestamp(6) with time zone;
ADD COLUMN
trino:default> update test set part_col_new = part_col at time zone 'Etc/UTC';
UPDATE: 3 rows
trino:default> alter table test set properties partitioning = ARRAY['month(part_col)', 'month(part_col_new)'];
SET PROPERTIES
trino:default> insert into test (data_col, part_col_new) values (1, now()), (2, now() + interval '1' month), (3, now() + interval '2' month), (4, now() + interval '1' year);
INSERT: 4 rows
trino:default> select partition from "test$partitions";
                   partition
-----------------------------------------------
 {part_col_month=NULL, part_col_new_month=662}
 {part_col_month=651, part_col_new_month=NULL}
 {part_col_month=NULL, part_col_new_month=652}
 {part_col_month=652, part_col_new_month=NULL}
 {part_col_month=NULL, part_col_new_month=650}
 {part_col_month=NULL, part_col_new_month=651}
 {part_col_month=650, part_col_new_month=NULL}
(7 rows)

-- now let's drop the old one - needs to be done via Spark until https://github.com/apache/iceberg/issues/4563 is addressed in Trino (the Iceberg-side bug has since been fixed)
spark-sql (default)> alter table test drop column part_col;
-- now reads will fail both in Spark and Trino
spark-sql (default)> select * from test;
Error reading file(s): hdfs://hadoop-master:9000/user/hive/warehouse/test-3c13f3fc2560432baa0ea6c9bbbd6874/data/part_col_month=2024-03/20240306_063551_00068_gex23-d5c19feb-7ac9-430e-a913-1d91f4a3b932.parquet
org.apache.iceberg.exceptions.ValidationException: Cannot find source column for partition field: 1000: part_col_month: month(1)
	at org.apache.iceberg.exceptions.ValidationException.check(ValidationException.java:49) ~[iceberg-spark-runtime-3.4_2.12-1.4.2.jar:?]
	at org.apache.iceberg.PartitionSpec.checkCompatibility(PartitionSpec.java:562) ~[iceberg-spark-runtime-3.4_2.12-1.4.2.jar:?]
	at org.apache.iceberg.PartitionSpec$Builder.build(PartitionSpec.java:543) ~[iceberg-spark-runtime-3.4_2.12-1.4.2.jar:?]
	at org.apache.iceberg.UnboundPartitionSpec.bind(UnboundPartitionSpec.java:46) ~[iceberg-spark-runtime-3.4_2.12-1.4.2.jar:?]
	at org.apache.iceberg.PartitionSpecParser.fromJson(PartitionSpecParser.java:71) ~[iceberg-spark-runtime-3.4_2.12-1.4.2.jar:?]
	at org.apache.iceberg.PartitionSpecParser.lambda$fromJson$1(PartitionSpecParser.java:88) ~[iceberg-spark-runtime-3.4_2.12-1.4.2.jar:?]
	at org.apache.iceberg.util.JsonUtil.parse(JsonUtil.java:98) ~[iceberg-spark-runtime-3.4_2.12-1.4.2.jar:?]
	at org.apache.iceberg.PartitionSpecParser.lambda$fromJson$2(PartitionSpecParser.java:88) ~[iceberg-spark-runtime-3.4_2.12-1.4.2.jar:?]
	at org.apache.iceberg.shaded.com.github.benmanes.caffeine.cache.BoundedLocalCache.lambda$doComputeIfAbsent$14(BoundedLocalCache.java:2406) ~[iceberg-spark-runtime-3.4_2.12-1.4.2.jar:?]
	at java.util.concurrent.ConcurrentHashMap.compute(ConcurrentHashMap.java:1908) ~[?:?]
	at org.apache.iceberg.shaded.com.github.benmanes.caffeine.cache.BoundedLocalCache.doComputeIfAbsent(BoundedLocalCache.java:2404) ~[iceberg-spark-runtime-3.4_2.12-1.4.2.jar:?]
	at org.apache.iceberg.shaded.com.github.benmanes.caffeine.cache.BoundedLocalCache.computeIfAbsent(BoundedLocalCache.java:2387) ~[iceberg-spark-runtime-3.4_2.12-1.4.2.jar:?]
	at org.apache.iceberg.shaded.com.github.benmanes.caffeine.cache.LocalCache.computeIfAbsent(LocalCache.java:108) ~[iceberg-spark-runtime-3.4_2.12-1.4.2.jar:?]
	at org.apache.iceberg.shaded.com.github.benmanes.caffeine.cache.LocalManualCache.get(LocalManualCache.java:62) ~[iceberg-spark-runtime-3.4_2.12-1.4.2.jar:?]
	at org.apache.iceberg.PartitionSpecParser.fromJson(PartitionSpecParser.java:86) ~[iceberg-spark-runtime-3.4_2.12-1.4.2.jar:?]
	at org.apache.iceberg.BaseContentScanTask.spec(BaseContentScanTask.java:71) ~[iceberg-spark-runtime-3.4_2.12-1.4.2.jar:?]
	at org.apache.iceberg.BaseFileScanTask.spec(BaseFileScanTask.java:27) ~[iceberg-spark-runtime-3.4_2.12-1.4.2.jar:?]
	at org.apache.iceberg.BaseFileScanTask$SplitScanTask.spec(BaseFileScanTask.java:127) ~[iceberg-spark-runtime-3.4_2.12-1.4.2.jar:?]
	at org.apache.iceberg.util.PartitionUtil.constantsMap(PartitionUtil.java:49) ~[iceberg-spark-runtime-3.4_2.12-1.4.2.jar:?]
	at org.apache.iceberg.util.PartitionUtil.constantsMap(PartitionUtil.java:42) ~[iceberg-spark-runtime-3.4_2.12-1.4.2.jar:?]
	at org.apache.iceberg.spark.source.BaseReader.constantsMap(BaseReader.java:206) ~[iceberg-spark-runtime-3.4_2.12-1.4.2.jar:?]
	at org.apache.iceberg.spark.source.BatchDataReader.open(BatchDataReader.java:91) ~[iceberg-spark-runtime-3.4_2.12-1.4.2.jar:?]
	at org.apache.iceberg.spark.source.BatchDataReader.open(BatchDataReader.java:41) ~[iceberg-spark-runtime-3.4_2.12-1.4.2.jar:?]
	at org.apache.iceberg.spark.source.BaseReader.next(BaseReader.java:141) ~[iceberg-spark-runtime-3.4_2.12-1.4.2.jar:?]
	at org.apache.spark.sql.execution.datasources.v2.PartitionIterator.hasNext(DataSourceRDD.scala:120) ~[spark-sql_2.12-3.4.2.jar:?]
	at org.apache.spark.sql.execution.datasources.v2.MetricsIterator.hasNext(DataSourceRDD.scala:158) ~[spark-sql_2.12-3.4.2.jar:?]
	at org.apache.spark.sql.execution.datasources.v2.DataSourceRDD$$anon$1.$anonfun$hasNext$1(DataSourceRDD.scala:63) ~[spark-sql_2.12-3.4.2.jar:?]
	at org.apache.spark.sql.execution.datasources.v2.DataSourceRDD$$anon$1.$anonfun$hasNext$1$adapted(DataSourceRDD.scala:63) ~[spark-sql_2.12-3.4.2.jar:?]
	at scala.Option.exists(Option.scala:376) ~[scala-library-2.12.17.jar:?]
	at org.apache.spark.sql.execution.datasources.v2.DataSourceRDD$$anon$1.hasNext(DataSourceRDD.scala:63) ~[spark-sql_2.12-3.4.2.jar:?]
	at org.apache.spark.sql.execution.datasources.v2.DataSourceRDD$$anon$1.advanceToNextIter(DataSourceRDD.scala:97) ~[spark-sql_2.12-3.4.2.jar:?]
	at org.apache.spark.sql.execution.datasources.v2.DataSourceRDD$$anon$1.hasNext(DataSourceRDD.scala:63) ~[spark-sql_2.12-3.4.2.jar:?]
	at org.apache.spark.InterruptibleIterator.hasNext(InterruptibleIterator.scala:37) ~[spark-core_2.12-3.4.2.jar:3.4.2]
	at scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:460) ~[scala-library-2.12.17.jar:?]
	at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage1.columnartorow_nextBatch_0$(Unknown Source) ~[?:?]
	at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage1.processNext(Unknown Source) ~[?:?]
	at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43) ~[spark-sql_2.12-3.4.2.jar:3.4.2]
	at org.apache.spark.sql.execution.WholeStageCodegenExec$$anon$1.hasNext(WholeStageCodegenExec.scala:760) ~[spark-sql_2.12-3.4.2.jar:3.4.2]
	at org.apache.spark.sql.execution.SparkPlan.$anonfun$getByteArrayRdd$1(SparkPlan.scala:388) ~[spark-sql_2.12-3.4.2.jar:3.4.2]
	at org.apache.spark.rdd.RDD.$anonfun$mapPartitionsInternal$2(RDD.scala:888) ~[spark-core_2.12-3.4.2.jar:3.4.2]
	at org.apache.spark.rdd.RDD.$anonfun$mapPartitionsInternal$2$adapted(RDD.scala:888) ~[spark-core_2.12-3.4.2.jar:3.4.2]
	at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) ~[spark-core_2.12-3.4.2.jar:3.4.2]
	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:364) ~[spark-core_2.12-3.4.2.jar:3.4.2]
	at org.apache.spark.rdd.RDD.iterator(RDD.scala:328) ~[spark-core_2.12-3.4.2.jar:3.4.2]
	at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:92) ~[spark-core_2.12-3.4.2.jar:3.4.2]
	at org.apache.spark.TaskContext.runTaskWithListeners(TaskContext.scala:161) ~[spark-core_2.12-3.4.2.jar:3.4.2]
	at org.apache.spark.scheduler.Task.run(Task.scala:139) ~[spark-core_2.12-3.4.2.jar:3.4.2]
	at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:554) ~[spark-core_2.12-3.4.2.jar:3.4.2]
	at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1529) ~[spark-core_2.12-3.4.2.jar:3.4.2]
	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:557) ~[spark-core_2.12-3.4.2.jar:3.4.2]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) ~[?:?]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) ~[?:?]
	at java.lang.Thread.run(Thread.java:829) ~[?:?]
Error reading file(s): hdfs://hadoop-master:9000/user/hive/warehouse/test-3c13f3fc2560432baa0ea6c9bbbd6874/data/part_col_month=2024-05/20240306_063541_00066_gex23-91d14ef1-a76b-45e5-bb5f-3455e75f7f66.parquet, hdfs://hadoop-master:9000/user/hive/warehouse/test-3c13f3fc2560432baa0ea6c9bbbd6874/data/part_col_month=2024-05/20240306_063551_00068_gex23-53b8585f-3a41-4c7c-9b23-1a29809bb194.parquet
org.apache.iceberg.exceptions.ValidationException: Cannot find source column for partition field: 1000: part_col_month: month(1)
Error reading file(s): hdfs://hadoop-master:9000/user/hive/warehouse/test-3c13f3fc2560432baa0ea6c9bbbd6874/data/part_col_month=2024-03/20240306_063541_00066_gex23-88b93510-1371-4911-ab41-1aa816a20cb4.parquet, hdfs://hadoop-master:9000/user/hive/warehouse/test-3c13f3fc2560432baa0ea6c9bbbd6874/data/part_col_month=2024-03/20240306_063551_00068_gex23-5350f95e-56ec-47cb-8e3a-985b8603c1ff.parquet
org.apache.iceberg.exceptions.ValidationException: Cannot find source column for partition field: 1000: part_col_month: month(1)
Exception in task 1.0 in stage 0.0 (TID 1)
org.apache.iceberg.exceptions.ValidationException: Cannot find source column for partition field: 1000: part_col_month: month(1)
Exception in task 3.0 in stage 0.0 (TID 3)
org.apache.iceberg.exceptions.ValidationException: Cannot find source column for partition field: 1000: part_col_month: month(1)
Exception in task 2.0 in stage 0.0 (TID 2)
org.apache.iceberg.exceptions.ValidationException: Cannot find source column for partition field: 1000: part_col_month: month(1)
Task 3 in stage 0.0 failed 1 times; aborting job
Job aborted due to stage failure: Task 3 in stage 0.0 failed 1 times, most recent failure: Lost task 3.0 in stage 0.0 (TID 3) (spark executor driver): org.apache.iceberg.exceptions.ValidationException: Cannot find source column for partition field: 1000: part_col_month: month(1)
	at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:92)
	at org.apache.spark.TaskContext.runTaskWithListeners(TaskContext.scala:161)
	at org.apache.spark.scheduler.Task.run(Task.scala:139)
	at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:554)
	at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1529)
	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:557)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
	at java.base/java.lang.Thread.run(Thread.java:829)

Driver stacktrace:
org.apache.spark.SparkException: Job aborted due to stage failure: Task 3 in stage 0.0 failed 1 times, most recent failure: Lost task 3.0 in stage 0.0 (TID 3) (spark executor driver): org.apache.iceberg.exceptions.ValidationException: Cannot find source column for partition field: 1000: part_col_month: month(1)
	at org.apache.iceberg.exceptions.ValidationException.check(ValidationException.java:49)
	at org.apache.iceberg.PartitionSpec.checkCompatibility(PartitionSpec.java:562)
	at org.apache.iceberg.PartitionSpec$Builder.build(PartitionSpec.java:543)
	at org.apache.iceberg.UnboundPartitionSpec.bind(UnboundPartitionSpec.java:46)
	at org.apache.iceberg.PartitionSpecParser.fromJson(PartitionSpecParser.java:71)
	at org.apache.iceberg.PartitionSpecParser.lambda$fromJson$1(PartitionSpecParser.java:88)
	at org.apache.iceberg.util.JsonUtil.parse(JsonUtil.java:98)
	at org.apache.iceberg.PartitionSpecParser.lambda$fromJson$2(PartitionSpecParser.java:88)
	at org.apache.iceberg.shaded.com.github.benmanes.caffeine.cache.BoundedLocalCache.lambda$doComputeIfAbsent$14(BoundedLocalCache.java:2406)
	at java.base/java.util.concurrent.ConcurrentHashMap.compute(ConcurrentHashMap.java:1908)
	at org.apache.iceberg.shaded.com.github.benmanes.caffeine.cache.BoundedLocalCache.doComputeIfAbsent(BoundedLocalCache.java:2404)
	at org.apache.iceberg.shaded.com.github.benmanes.caffeine.cache.BoundedLocalCache.computeIfAbsent(BoundedLocalCache.java:2387)
	at org.apache.iceberg.shaded.com.github.benmanes.caffeine.cache.LocalCache.computeIfAbsent(LocalCache.java:108)
	at org.apache.iceberg.shaded.com.github.benmanes.caffeine.cache.LocalManualCache.get(LocalManualCache.java:62)
	at org.apache.iceberg.PartitionSpecParser.fromJson(PartitionSpecParser.java:86)
	at org.apache.iceberg.BaseContentScanTask.spec(BaseContentScanTask.java:71)
	at org.apache.iceberg.BaseFileScanTask.spec(BaseFileScanTask.java:27)
	at org.apache.iceberg.BaseFileScanTask$SplitScanTask.spec(BaseFileScanTask.java:127)
	at org.apache.iceberg.util.PartitionUtil.constantsMap(PartitionUtil.java:49)
	at org.apache.iceberg.util.PartitionUtil.constantsMap(PartitionUtil.java:42)
	at org.apache.iceberg.spark.source.BaseReader.constantsMap(BaseReader.java:206)
	at org.apache.iceberg.spark.source.BatchDataReader.open(BatchDataReader.java:91)
	at org.apache.iceberg.spark.source.BatchDataReader.open(BatchDataReader.java:41)
	at org.apache.iceberg.spark.source.BaseReader.next(BaseReader.java:141)
	at org.apache.spark.sql.execution.datasources.v2.PartitionIterator.hasNext(DataSourceRDD.scala:120)
	at org.apache.spark.sql.execution.datasources.v2.MetricsIterator.hasNext(DataSourceRDD.scala:158)
	at org.apache.spark.sql.execution.datasources.v2.DataSourceRDD$$anon$1.$anonfun$hasNext$1(DataSourceRDD.scala:63)
	at org.apache.spark.sql.execution.datasources.v2.DataSourceRDD$$anon$1.$anonfun$hasNext$1$adapted(DataSourceRDD.scala:63)
	at scala.Option.exists(Option.scala:376)
	at org.apache.spark.sql.execution.datasources.v2.DataSourceRDD$$anon$1.hasNext(DataSourceRDD.scala:63)
	at org.apache.spark.sql.execution.datasources.v2.DataSourceRDD$$anon$1.advanceToNextIter(DataSourceRDD.scala:97)
	at org.apache.spark.sql.execution.datasources.v2.DataSourceRDD$$anon$1.hasNext(DataSourceRDD.scala:63)
	at org.apache.spark.InterruptibleIterator.hasNext(InterruptibleIterator.scala:37)
	at scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:460)
	at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage1.columnartorow_nextBatch_0$(Unknown Source)
	at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage1.processNext(Unknown Source)
	at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
	at org.apache.spark.sql.execution.WholeStageCodegenExec$$anon$1.hasNext(WholeStageCodegenExec.scala:760)
	at org.apache.spark.sql.execution.SparkPlan.$anonfun$getByteArrayRdd$1(SparkPlan.scala:388)
	at org.apache.spark.rdd.RDD.$anonfun$mapPartitionsInternal$2(RDD.scala:888)
	at org.apache.spark.rdd.RDD.$anonfun$mapPartitionsInternal$2$adapted(RDD.scala:888)
	at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:364)
	at org.apache.spark.rdd.RDD.iterator(RDD.scala:328)
	at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:92)
	at org.apache.spark.TaskContext.runTaskWithListeners(TaskContext.scala:161)
	at org.apache.spark.scheduler.Task.run(Task.scala:139)
	at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:554)
	at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1529)
	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:557)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
	at java.base/java.lang.Thread.run(Thread.java:829)

Driver stacktrace:
	at org.apache.spark.scheduler.DAGScheduler.failJobAndIndependentStages(DAGScheduler.scala:2785)
	at org.apache.spark.scheduler.DAGScheduler.$anonfun$abortStage$2(DAGScheduler.scala:2721)
	at org.apache.spark.scheduler.DAGScheduler.$anonfun$abortStage$2$adapted(DAGScheduler.scala:2720)
	at scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:62)
	at scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:55)
	at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:49)
	at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:2720)
	at org.apache.spark.scheduler.DAGScheduler.$anonfun$handleTaskSetFailed$1(DAGScheduler.scala:1206)
	at org.apache.spark.scheduler.DAGScheduler.$anonfun$handleTaskSetFailed$1$adapted(DAGScheduler.scala:1206)
	at scala.Option.foreach(Option.scala:407)
	at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:1206)
	at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:2984)
	at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:2923)
	at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:2912)
	at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:49)
	at org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:971)
	at org.apache.spark.SparkContext.runJob(SparkContext.scala:2263)
	at org.apache.spark.SparkContext.runJob(SparkContext.scala:2284)
	at org.apache.spark.SparkContext.runJob(SparkContext.scala:2303)
	at org.apache.spark.SparkContext.runJob(SparkContext.scala:2328)
	at org.apache.spark.rdd.RDD.$anonfun$collect$1(RDD.scala:1019)
	at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
	at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112)
	at org.apache.spark.rdd.RDD.withScope(RDD.scala:405)
	at org.apache.spark.rdd.RDD.collect(RDD.scala:1018)
	at org.apache.spark.sql.execution.SparkPlan.executeCollect(SparkPlan.scala:448)
	at org.apache.spark.sql.execution.SparkPlan.executeCollectPublic(SparkPlan.scala:475)
	at org.apache.spark.sql.execution.HiveResult$.hiveResultString(HiveResult.scala:76)
	at org.apache.spark.sql.hive.thriftserver.SparkSQLDriver.$anonfun$run$2(SparkSQLDriver.scala:69)
	at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$6(SQLExecution.scala:118)
	at org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:195)
	at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$1(SQLExecution.scala:103)
	at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:827)
	at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:65)
	at org.apache.spark.sql.hive.thriftserver.SparkSQLDriver.run(SparkSQLDriver.scala:69)
	at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver.processCmd(SparkSQLCLIDriver.scala:415)
	at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver.$anonfun$processLine$1(SparkSQLCLIDriver.scala:533)
	at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver.$anonfun$processLine$1$adapted(SparkSQLCLIDriver.scala:527)
	at scala.collection.Iterator.foreach(Iterator.scala:943)
	at scala.collection.Iterator.foreach$(Iterator.scala:943)
	at scala.collection.AbstractIterator.foreach(Iterator.scala:1431)
	at scala.collection.IterableLike.foreach(IterableLike.scala:74)
	at scala.collection.IterableLike.foreach$(IterableLike.scala:73)
	at scala.collection.AbstractIterable.foreach(Iterable.scala:56)
	at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver.processLine(SparkSQLCLIDriver.scala:527)
	at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver$.main(SparkSQLCLIDriver.scala:307)
	at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver.main(SparkSQLCLIDriver.scala)
	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.base/java.lang.reflect.Method.invoke(Method.java:566)
	at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
	at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:1020)
	at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:192)
	at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:215)
	at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:91)
	at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1111)
	at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1120)
	at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: org.apache.iceberg.exceptions.ValidationException: Cannot find source column for partition field: 1000: part_col_month: month(1)
	at org.apache.iceberg.exceptions.ValidationException.check(ValidationException.java:49)
	at org.apache.iceberg.PartitionSpec.checkCompatibility(PartitionSpec.java:562)
	at org.apache.iceberg.PartitionSpec$Builder.build(PartitionSpec.java:543)
	at org.apache.iceberg.UnboundPartitionSpec.bind(UnboundPartitionSpec.java:46)
	at org.apache.iceberg.PartitionSpecParser.fromJson(PartitionSpecParser.java:71)
	at org.apache.iceberg.PartitionSpecParser.lambda$fromJson$1(PartitionSpecParser.java:88)
	at org.apache.iceberg.util.JsonUtil.parse(JsonUtil.java:98)
	at org.apache.iceberg.PartitionSpecParser.lambda$fromJson$2(PartitionSpecParser.java:88)
	at org.apache.iceberg.shaded.com.github.benmanes.caffeine.cache.BoundedLocalCache.lambda$doComputeIfAbsent$14(BoundedLocalCache.java:2406)
	at java.base/java.util.concurrent.ConcurrentHashMap.compute(ConcurrentHashMap.java:1908)
	at org.apache.iceberg.shaded.com.github.benmanes.caffeine.cache.BoundedLocalCache.doComputeIfAbsent(BoundedLocalCache.java:2404)
	at org.apache.iceberg.shaded.com.github.benmanes.caffeine.cache.BoundedLocalCache.computeIfAbsent(BoundedLocalCache.java:2387)
	at org.apache.iceberg.shaded.com.github.benmanes.caffeine.cache.LocalCache.computeIfAbsent(LocalCache.java:108)
	at org.apache.iceberg.shaded.com.github.benmanes.caffeine.cache.LocalManualCache.get(LocalManualCache.java:62)
	at org.apache.iceberg.PartitionSpecParser.fromJson(PartitionSpecParser.java:86)
	at org.apache.iceberg.BaseContentScanTask.spec(BaseContentScanTask.java:71)
	at org.apache.iceberg.BaseFileScanTask.spec(BaseFileScanTask.java:27)
	at org.apache.iceberg.BaseFileScanTask$SplitScanTask.spec(BaseFileScanTask.java:127)
	at org.apache.iceberg.util.PartitionUtil.constantsMap(PartitionUtil.java:49)
	at org.apache.iceberg.util.PartitionUtil.constantsMap(PartitionUtil.java:42)
	at org.apache.iceberg.spark.source.BaseReader.constantsMap(BaseReader.java:206)
	at org.apache.iceberg.spark.source.BatchDataReader.open(BatchDataReader.java:91)
	at org.apache.iceberg.spark.source.BatchDataReader.open(BatchDataReader.java:41)
	at org.apache.iceberg.spark.source.BaseReader.next(BaseReader.java:141)
	at org.apache.spark.sql.execution.datasources.v2.PartitionIterator.hasNext(DataSourceRDD.scala:120)
	at org.apache.spark.sql.execution.datasources.v2.MetricsIterator.hasNext(DataSourceRDD.scala:158)
	at org.apache.spark.sql.execution.datasources.v2.DataSourceRDD$$anon$1.$anonfun$hasNext$1(DataSourceRDD.scala:63)
	at org.apache.spark.sql.execution.datasources.v2.DataSourceRDD$$anon$1.$anonfun$hasNext$1$adapted(DataSourceRDD.scala:63)
	at scala.Option.exists(Option.scala:376)
	at org.apache.spark.sql.execution.datasources.v2.DataSourceRDD$$anon$1.hasNext(DataSourceRDD.scala:63)
	at org.apache.spark.sql.execution.datasources.v2.DataSourceRDD$$anon$1.advanceToNextIter(DataSourceRDD.scala:97)
	at org.apache.spark.sql.execution.datasources.v2.DataSourceRDD$$anon$1.hasNext(DataSourceRDD.scala:63)
	at org.apache.spark.InterruptibleIterator.hasNext(InterruptibleIterator.scala:37)
	at scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:460)
	at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage1.columnartorow_nextBatch_0$(Unknown Source)
	at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage1.processNext(Unknown Source)
	at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
	at org.apache.spark.sql.execution.WholeStageCodegenExec$$anon$1.hasNext(WholeStageCodegenExec.scala:760)
	at org.apache.spark.sql.execution.SparkPlan.$anonfun$getByteArrayRdd$1(SparkPlan.scala:388)
	at org.apache.spark.rdd.RDD.$anonfun$mapPartitionsInternal$2(RDD.scala:888)
	at org.apache.spark.rdd.RDD.$anonfun$mapPartitionsInternal$2$adapted(RDD.scala:888)
	at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:364)
	at org.apache.spark.rdd.RDD.iterator(RDD.scala:328)
	at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:92)
	at org.apache.spark.TaskContext.runTaskWithListeners(TaskContext.scala:161)
	at org.apache.spark.scheduler.Task.run(Task.scala:139)
	at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:554)
	at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1529)
	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:557)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
	at java.base/java.lang.Thread.run(Thread.java:829)

and in Trino:

trino:default> select * from test;
Query 20240306_063901_00076_gex23 failed: Cannot find source column for partition field: 1000: part_col_month: month(1)
org.apache.iceberg.exceptions.ValidationException: Cannot find source column for partition field: 1000: part_col_month: month(1)
	at org.apache.iceberg.exceptions.ValidationException.check(ValidationException.java:49)
	at org.apache.iceberg.PartitionSpec.checkCompatibility(PartitionSpec.java:562)
	at org.apache.iceberg.PartitionSpec$Builder.build(PartitionSpec.java:543)
	at org.apache.iceberg.UnboundPartitionSpec.bind(UnboundPartitionSpec.java:46)
	at org.apache.iceberg.PartitionSpecParser.fromJson(PartitionSpecParser.java:71)
	at org.apache.iceberg.PartitionSpecParser.lambda$fromJson$1(PartitionSpecParser.java:88)
	at org.apache.iceberg.util.JsonUtil.parse(JsonUtil.java:98)
	at org.apache.iceberg.PartitionSpecParser.lambda$fromJson$2(PartitionSpecParser.java:88)
	at com.github.benmanes.caffeine.cache.BoundedLocalCache.lambda$doComputeIfAbsent$14(BoundedLocalCache.java:2688)
	at java.base/java.util.concurrent.ConcurrentHashMap.compute(ConcurrentHashMap.java:1916)
	at com.github.benmanes.caffeine.cache.BoundedLocalCache.doComputeIfAbsent(BoundedLocalCache.java:2686)
	at com.github.benmanes.caffeine.cache.BoundedLocalCache.computeIfAbsent(BoundedLocalCache.java:2669)
	at com.github.benmanes.caffeine.cache.LocalCache.computeIfAbsent(LocalCache.java:112)
	at com.github.benmanes.caffeine.cache.LocalManualCache.get(LocalManualCache.java:62)
	at org.apache.iceberg.PartitionSpecParser.fromJson(PartitionSpecParser.java:86)
	at org.apache.iceberg.BaseContentScanTask.spec(BaseContentScanTask.java:71)
	at org.apache.iceberg.BaseFileScanTask.spec(BaseFileScanTask.java:27)
	at io.trino.plugin.iceberg.IcebergSplitSource.pruneFileScanTask(IcebergSplitSource.java:300)
	at io.trino.plugin.iceberg.IcebergSplitSource.getNextBatch(IcebergSplitSource.java:252)
	at io.trino.plugin.base.classloader.ClassLoaderSafeConnectorSplitSource.getNextBatch(ClassLoaderSafeConnectorSplitSource.java:43)
	at io.trino.split.ConnectorAwareSplitSource.getNextBatch(ConnectorAwareSplitSource.java:73)
	at io.trino.split.TracingSplitSource.getNextBatch(TracingSplitSource.java:64)
	at io.trino.split.BufferingSplitSource$GetNextBatch.fetchSplits(BufferingSplitSource.java:130)
	at io.trino.split.BufferingSplitSource$GetNextBatch.fetchNextBatchAsync(BufferingSplitSource.java:112)
	at io.trino.split.BufferingSplitSource.getNextBatch(BufferingSplitSource.java:61)
	at io.trino.split.TracingSplitSource.getNextBatch(TracingSplitSource.java:64)
	at io.trino.execution.scheduler.SourcePartitionedScheduler.schedule(SourcePartitionedScheduler.java:247)
	at io.trino.execution.scheduler.SourcePartitionedScheduler$1.schedule(SourcePartitionedScheduler.java:172)
	at io.trino.execution.scheduler.PipelinedQueryScheduler$DistributedStagesScheduler.schedule(PipelinedQueryScheduler.java:1275)
	at io.trino.$gen.Trino_439_90_g9c58d39_dirty____20240306_055930_2.run(Unknown Source)
	at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:572)
	at com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:131)
	at com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:76)
	at com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:82)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
	at java.base/java.lang.Thread.run(Thread.java:1583)

But if I run ALTER TABLE test EXECUTE optimize BEFORE dropping the column from Spark, everything works.
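For reference, a minimal sketch of that workaround sequence, assuming a table named test and a source column named part_col (both names are illustrative, not taken from the failing table); rewriting the live files first means they are written under the current partition spec, which appears to be why reads keep working after the drop:

-- Rewrite live data files onto the current partition spec first (Trino syntax):
ALTER TABLE test EXECUTE optimize;

-- Only then drop the column that older specs referenced (Spark or Trino):
ALTER TABLE test DROP COLUMN part_col;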

@hashhar

hashhar commented Mar 6, 2024

cc: @rdblue / @RussellSpitzer Spark should probably also disallow dropping a partition column if it is still referenced by live table files.

Here's the files metadata table AFTER dropping the column without an OPTIMIZE.

spark-sql (default)> select content, partition, spec_id, file_path from iceberg_test.default.test.files;
0	{"part_col_month":null,"part_col_new_month":650}	1	hdfs://hadoop-master:9000/user/hive/warehouse/test-3c13f3fc2560432baa0ea6c9bbbd6874/data/part_col_new_month=2024-03/20240306_063559_00070_gex23-8e7c17a2-2e37-44a6-99ed-982aa21702ce.parquet
0	{"part_col_month":null,"part_col_new_month":651}	1	hdfs://hadoop-master:9000/user/hive/warehouse/test-3c13f3fc2560432baa0ea6c9bbbd6874/data/part_col_new_month=2024-04/20240306_063559_00070_gex23-efceda24-c595-4702-9aa4-bf38262031d3.parquet
0	{"part_col_month":null,"part_col_new_month":652}	1	hdfs://hadoop-master:9000/user/hive/warehouse/test-3c13f3fc2560432baa0ea6c9bbbd6874/data/part_col_new_month=2024-05/20240306_063559_00070_gex23-63df6ec0-060f-4339-b297-846717fbcf6c.parquet
0	{"part_col_month":null,"part_col_new_month":662}	1	hdfs://hadoop-master:9000/user/hive/warehouse/test-3c13f3fc2560432baa0ea6c9bbbd6874/data/part_col_new_month=2025-03/20240306_063559_00070_gex23-c07c6315-67c1-4853-af28-a3aabaa57be2.parquet
0	{"part_col_month":650,"part_col_new_month":null}	0	hdfs://hadoop-master:9000/user/hive/warehouse/test-3c13f3fc2560432baa0ea6c9bbbd6874/data/part_col_month=2024-03/20240306_063551_00068_gex23-d5c19feb-7ac9-430e-a913-1d91f4a3b932.parquet
0	{"part_col_month":651,"part_col_new_month":null}	0	hdfs://hadoop-master:9000/user/hive/warehouse/test-3c13f3fc2560432baa0ea6c9bbbd6874/data/part_col_month=2024-04/20240306_063551_00068_gex23-0636d316-8dda-4a60-879e-798881d26f8e.parquet
0	{"part_col_month":652,"part_col_new_month":null}	0	hdfs://hadoop-master:9000/user/hive/warehouse/test-3c13f3fc2560432baa0ea6c9bbbd6874/data/part_col_month=2024-05/20240306_063551_00068_gex23-53de9668-3179-4df8-8216-fed09540c81b.parquet
0	{"part_col_month":650,"part_col_new_month":null}	0	hdfs://hadoop-master:9000/user/hive/warehouse/test-3c13f3fc2560432baa0ea6c9bbbd6874/data/part_col_month=2024-03/20240306_063541_00066_gex23-88b93510-1371-4911-ab41-1aa816a20cb4.parquet
0	{"part_col_month":651,"part_col_new_month":null}	0	hdfs://hadoop-master:9000/user/hive/warehouse/test-3c13f3fc2560432baa0ea6c9bbbd6874/data/part_col_month=2024-04/20240306_063541_00066_gex23-c9b37629-0e0e-4924-a988-1970e33494ba.parquet
0	{"part_col_month":652,"part_col_new_month":null}	0	hdfs://hadoop-master:9000/user/hive/warehouse/test-3c13f3fc2560432baa0ea6c9bbbd6874/data/part_col_month=2024-05/20240306_063541_00066_gex23-91d14ef1-a76b-45e5-bb5f-3455e75f7f66.parquet
1	{"part_col_month":650,"part_col_new_month":null}	0	hdfs://hadoop-master:9000/user/hive/warehouse/test-3c13f3fc2560432baa0ea6c9bbbd6874/data/part_col_month=2024-03/20240306_063551_00068_gex23-5350f95e-56ec-47cb-8e3a-985b8603c1ff.parquet
1	{"part_col_month":651,"part_col_new_month":null}	0	hdfs://hadoop-master:9000/user/hive/warehouse/test-3c13f3fc2560432baa0ea6c9bbbd6874/data/part_col_month=2024-04/20240306_063551_00068_gex23-55094782-8bac-4723-80d4-f6a96579e367.parquet
1	{"part_col_month":652,"part_col_new_month":null}	0	hdfs://hadoop-master:9000/user/hive/warehouse/test-3c13f3fc2560432baa0ea6c9bbbd6874/data/part_col_month=2024-05/20240306_063551_00068_gex23-53b8585f-3a41-4c7c-9b23-1a29809bb194.parquet
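The rows with spec_id 0 above are still-live data and delete files whose partition spec references the dropped source column, which is why both engines fail when binding that spec. As a rough illustration of the check proposed above, here is a minimal, hedged sketch against the public Iceberg Table API; the class and method names are hypothetical, not existing Iceberg code, and it is stricter than strictly necessary because it rejects the drop if any registered spec uses the column, not only specs still referenced by live files:

import org.apache.iceberg.PartitionField;
import org.apache.iceberg.PartitionSpec;
import org.apache.iceberg.Table;
import org.apache.iceberg.exceptions.ValidationException;
import org.apache.iceberg.types.Types;

// Hypothetical guard: refuse to drop a column while any registered partition spec
// (current or historical) still uses it as a source column. An engine could run
// this check before committing the DROP COLUMN schema update.
public final class DropColumnGuard {
  private DropColumnGuard() {
  }

  public static void checkNotUsedByAnySpec(Table table, String columnName) {
    Types.NestedField field = table.schema().findField(columnName);
    ValidationException.check(field != null, "Column does not exist: %s", columnName);

    for (PartitionSpec spec : table.specs().values()) {
      for (PartitionField partitionField : spec.fields()) {
        ValidationException.check(
            partitionField.sourceId() != field.fieldId(),
            "Cannot drop column %s: referenced by partition field %s in spec %s",
            columnName, partitionField.name(), spec.specId());
      }
    }
  }
}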

@hashhar

hashhar commented Mar 6, 2024

🤦 I didn't realise this issue was under apache/iceberg and not trinodb/trino. Sorry for the Trino-specific discussion.

I now realise this issue is asking for exactly what I propose - disallow dropping a column if it is used by an older partition spec.


This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in the next 14 days if no further activity occurs. To permanently prevent this issue from being considered stale, add the label 'not-stale', but commenting on the issue is preferred when possible.

@github-actions github-actions bot added the stale label Sep 11, 2024
@hashhar

hashhar commented Sep 11, 2024

cc: @rdblue This seems important - see #5707 (comment) for why.

@github-actions github-actions bot removed the stale label Sep 12, 2024
@osscm

osscm commented Sep 25, 2024

@hashhar @rdblue any conclusion on this issue? We saw this with 421 and 438 as well.
