Errors when using dynamic hive catalogs after a worker crash #21093

Closed
sviscaino opened this issue Mar 14, 2024 · 2 comments

Comments

sviscaino commented Mar 14, 2024

Reopening the issue from 18040, as requested by @dain.

Trino version: 435-e
OS: RHEL
Deploy: 1 coordinator, 4 workers

Steps to reproduce:
Use dynamic catalog management with a Hive catalog. It seems that if one or more workers crash, then after they restart we get the following error when querying the catalog:

io.trino.spi.TrinoException: Unexpected response from http://<ip of a worker>:8080/v1/task/<query id>?summarize
	at io.trino.server.remotetask.SimpleHttpResponseHandler.onSuccess(SimpleHttpResponseHandler.java:70)
	at io.trino.server.remotetask.SimpleHttpResponseHandler.onSuccess(SimpleHttpResponseHandler.java:27)
	at com.google.common.util.concurrent.Futures$CallbackListener.run(Futures.java:1133)
	at io.airlift.concurrent.BoundedExecutor.drainQueue(BoundedExecutor.java:79)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
	at java.base/java.lang.Thread.run(Thread.java:1583)
Caused by: java.lang.IllegalArgumentException: Unable to create class io.trino.execution.TaskInfo from JSON response:
[io.airlift.jaxrs.JsonMapperParsingException: Invalid json for Java type io.trino.server.TaskUpdateRequest
	at io.airlift.jaxrs.AbstractJacksonMapper.readFrom(AbstractJacksonMapper.java:123)
	at io.airlift.jaxrs.JsonMapper.readFrom(JsonMapper.java:41)
	at org.glassfish.jersey.message.internal.ReaderInterceptorExecutor$TerminalReaderInterceptor.invokeReadFrom(ReaderInterceptorExecutor.java:233)
	at org.glassfish.jersey.message.internal.ReaderInterceptorExecutor$TerminalReaderInterceptor.aroundReadFrom(ReaderInterceptorExecutor.java:212)
...
Caused by: com.fasterxml.jackson.databind.JsonMappingException: Unknown handle id: hive:<catalog name>:<uuid>:io.trino.plugin.hive.HiveTableHandle (through reference chain: io.trino.server.TaskUpdateRequest["fragment"]->io.trino.sql.planner.PlanFragment["root"]->io.trino.sql.planner.plan.OutputNode["source"]->io.trino.sql.planner.plan.ProjectNode["source"]->io.trino.sql.planner.plan.TableScanNode["table"]->io.trino.metadata.TableHandle["connectorHandle"])
	at com.fasterxml.jackson.databind.JsonMappingException.wrapWithPath(JsonMappingException.java:402)
	at com.fasterxml.jackson.databind.JsonMappingException.wrapWithPath(JsonMappingException.java:361)
	at com.fasterxml.jackson.databind.deser.BeanDeserializerBase.wrapAndThrow(BeanDeserializerBase.java:1853)
	at com.fasterxml.jackson.databind.deser.BeanDeserializer._deserializeWithErrorWrapping(BeanDeserializer.java:572)
	at com.fasterxml.jackson.databind.deser.BeanDeserializer._deserializeUsingPropertyBased(BeanDeserializer.java:439)

Sometimes we also get the following:

java.lang.ClassCastException: class io.trino.plugin.hive.HiveInsertTableHandle cannot be cast to class io.trino.plugin.hive.HiveInsertTableHandle (io.trino.plugin.hive.HiveInsertTableHandle is in unnamed module of loader io.trino.server.PluginClassLoader @aeb0954; io.trino.plugin.hive.HiveInsertTableHandle is in unnamed module of loader io.trino.server.PluginClassLoader @6d718b5d)
	at io.trino.plugin.hive.HiveMetadata.finishInsert(HiveMetadata.java:2170)
	at io.trino.plugin.base.classloader.ClassLoaderSafeConnectorMetadata.finishInsert(ClassLoaderSafeConnectorMetadata.java:625)
	at io.trino.tracing.TracingConnectorMetadata.finishInsert(TracingConnectorMetadata.java:706)
	at io.trino.metadata.MetadataManager.finishInsert(MetadataManager.java:1140)
	at io.trino.tracing.TracingMetadata.finishInsert(TracingMetadata.java:694)
	at io.trino.sql.planner.LocalExecutionPlanner.lambda$createTableFinisher$4(LocalExecutionPlanner.java:4381)
	at io.trino.operator.TableFinishOperator.getOutput(TableFinishOperator.java:319)
...
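
For context, the message above names two different PluginClassLoader instances (@aeb0954 and @6d718b5d) for the same class name. The JVM treats a class as the pair of its name and its defining class loader, so two copies of a class with the same name are unrelated types and a cast between them fails. The sketch below is not Trino code; `ClassLoaderIdentityDemo`, `IsolatedLoader`, and `Handle` are hypothetical names used only to reproduce that general behavior in isolation.

```java
import java.io.IOException;
import java.io.InputStream;

public class ClassLoaderIdentityDemo {
    /** Hypothetical stand-in for a connector handle class; not a Trino type. */
    public static final class Handle {
        public Handle() {}
    }

    /** Defines one named class itself instead of delegating it to the parent loader. */
    static final class IsolatedLoader extends ClassLoader {
        private final String isolatedName;

        IsolatedLoader(ClassLoader parent, String isolatedName) {
            super(parent);
            this.isolatedName = isolatedName;
        }

        @Override
        protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
            if (!name.equals(isolatedName)) {
                return super.loadClass(name, resolve);
            }
            synchronized (getClassLoadingLock(name)) {
                Class<?> loaded = findLoadedClass(name);
                if (loaded == null) {
                    byte[] bytes = readClassBytes(name);
                    loaded = defineClass(name, bytes, 0, bytes.length);
                }
                if (resolve) {
                    resolveClass(loaded);
                }
                return loaded;
            }
        }

        private byte[] readClassBytes(String name) throws ClassNotFoundException {
            String resource = name.replace('.', '/') + ".class";
            try (InputStream in = getParent().getResourceAsStream(resource)) {
                if (in == null) {
                    throw new ClassNotFoundException(name);
                }
                return in.readAllBytes();
            } catch (IOException e) {
                throw new ClassNotFoundException(name, e);
            }
        }
    }

    /** Creates a Handle instance through a fresh IsolatedLoader. */
    static Object newHandle() throws ReflectiveOperationException {
        String name = Handle.class.getName();
        ClassLoader parent = Handle.class.getClassLoader();
        Class<?> type = new IsolatedLoader(parent, name).loadClass(name);
        return type.getDeclaredConstructor().newInstance();
    }

    /** Returns true when casting an instance from one loader to the class from another loader fails. */
    public static boolean castFailsAcrossLoaders() {
        try {
            Object first = newHandle();
            Object second = newHandle();
            // Same class name, but each instance's class was defined by a different loader.
            if (!first.getClass().getName().equals(second.getClass().getName())) {
                throw new IllegalStateException("class names unexpectedly differ");
            }
            try {
                first.getClass().cast(second);
                return false;
            } catch (ClassCastException e) {
                return true;
            }
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("cast fails across loaders: " + castFailsAcrossLoaders());
    }
}
```

Whether the two loaders in this report come from the catalog being created twice on the same node is an assumption, not something the trace confirms. The trace only shows that the handle passed to `finishInsert` was defined by a different loader than the one `HiveMetadata` was running under.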

After dropping and re-creating the catalog, the error persists.
I have managed to fix it manually by killing workers and dropping and re-creating the catalog several times, but I'm not sure of the exact steps.

Edit: I just saw 18053. Indeed, after simply running a DESCRIBE on the table, it starts working again, so it's probably the same issue.

@electrum
Member

I notice your Trino version is 435-e. Are you running the Starburst version of Trino? If so, please file a support ticket with Starburst, as the Starburst version is different than Trino OSS.

@sviscaino
Author

> I notice your Trino version is 435-e. Are you running the Starburst version of Trino? If so, please file a support ticket with Starburst, as the Starburst version is different than Trino OSS.

Will do, thanks. I had assumed 435-e was based on 435 OSS.

@electrum closed this as not planned on Jul 23, 2024