Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

3.0.1 database upgrade failure #5356

Closed
wolfchimneyrock opened this issue Oct 16, 2024 · 4 comments
Closed

3.0.1 database upgrade failure #5356

wolfchimneyrock opened this issue Oct 16, 2024 · 4 comments

Comments

@wolfchimneyrock
Copy link
Contributor

Description

Registry
Version
: 3.0.0.M5 -> 3.0.1 upgrade
Persistence type: postgresql

We perform database schema upgrades with a special 'bootstrap' instance of apicurio which has admin privileges to our postgres database (and can thus create tables, etc) whereas the client serving apicurio instances run with either read-only or read-write (but not db/table admin) access.

typically the bootstrap process starts an admin instance of apicurio, and then waits until /health/ready reports that the storage layer is ready, then stops the admin apicurio (assuming the database was successfully upgraded)

this worked for all of the 2.x upgrades we handled.

with 3.0.x upgrade from db version 100 to 101, it looks like the /health/ready endpoint reports the storage is available before the database schema change is finished - resulting in further failures to start apicurio 3.0.1. rolling back to 3.0.0M5 works fine, since the new tables are ignored.

Steps to Reproduce

start apicurio 3.0.1 with a db version 100
poll /health/ready until response code 200 returned
terminate apicurio using SIGTERM

Expected vs Actual Behaviour

expected: database is fully upgraded before /health/ready reports 200 code.
actual: database upgrade is still in progress when /health/ready reports 200.

Alternatively, the database upgrade could be idempotent and could resume a partial interrupted application.

Logs

WARN  [io.qua.run.con.DeprecatedRuntimePropertiesRecorder] (main) The 'quarkus.log.console.color' config property is deprecated and should not be used anymore
INFO  [io.api.reg.sto.imp.sql.SqlRegistryStorage] (main) SqlRegistryStorage constructed successfully.
INFO  [io.api.reg.sto.imp.sql.RegistryDatasourceProducer] (main) Using postgresql SQL storage.
INFO  [io.api.reg.sto.imp.sql.SqlRegistryStorage] (main) Checking to see if the DB is initialized.
INFO  [io.api.reg.sto.imp.sql.SqlRegistryStorage] (main) Database was already initialized, skipping.
INFO  [io.api.reg.sto.imp.sql.SqlRegistryStorage] (main) Checking to see if the DB is up-to-date.
INFO  [io.api.reg.sto.imp.sql.SqlRegistryStorage] (main) Build's DB version is 101
INFO  [io.api.reg.sto.imp.sql.SqlRegistryStorage] (main) Old database version detected, upgrading.
INFO  [io.api.reg.sto.imp.sql.SqlRegistryStorage] (main) Upgrading the Apicurio Hub API database.
INFO  [io.api.reg.sto.imp.sql.SqlRegistryStorage] (main)      Database type: postgresql
INFO  [io.api.reg.sto.imp.sql.SqlRegistryStorage] (main)      From Version:  100
INFO  [io.api.reg.sto.imp.sql.SqlRegistryStorage] (main)      To Version:    101
ERROR [io.qua.run.Application] (main) Failed to start application (with profile [prod]): java.lang.RuntimeException: Failed to start quarkus
        at io.quarkus.runner.ApplicationImpl.doStart(Unknown Source)
        at io.quarkus.runtime.Application.start(Application.java:101)
        at io.quarkus.runtime.ApplicationLifecycleManager.run(ApplicationLifecycleManager.java:111)
        at io.quarkus.runtime.Quarkus.run(Quarkus.java:71)
        at io.quarkus.runtime.Quarkus.run(Quarkus.java:44)
        at io.quarkus.runtime.Quarkus.run(Quarkus.java:124)
        at io.quarkus.runner.GeneratedMain.main(Unknown Source)
Caused by: io.apicurio.registry.storage.impl.sql.jdb.RuntimeSqlException: org.postgresql.util.PSQLException: ERROR: relation "outbox" already exists
        at io.apicurio.registry.storage.impl.sql.jdb.UpdateImpl.execute(UpdateImpl.java:28)
        at io.apicurio.registry.storage.impl.sql.AbstractSqlRegistryStorage.lambda$upgradeDatabaseRaw$3(AbstractSqlRegistryStorage.java:344)
        at java.base/java.util.ArrayList.forEach(ArrayList.java:1511)
        at io.apicurio.registry.storage.impl.sql.AbstractSqlRegistryStorage.upgradeDatabaseRaw(AbstractSqlRegistryStorage.java:337)
        at io.apicurio.registry.storage.impl.sql.AbstractSqlRegistryStorage.lambda$initialize$0(AbstractSqlRegistryStorage.java:234)
        at io.apicurio.registry.storage.impl.sql.AbstractHandleFactory.withHandle(AbstractHandleFactory.java:45)
        at io.apicurio.registry.storage.impl.sql.AbstractHandleFactory.withHandleNoException(AbstractHandleFactory.java:98)
        at io.apicurio.registry.storage.impl.sql.HandleFactoryProducer_ProducerMethod_produceHandleFactory_NU1A3ot6FkfWU5W0o_6TCeQ4KO4_ClientProxy.withHandleNoException(Unknown Source)
        at io.apicurio.registry.storage.impl.sql.AbstractSqlRegistryStorage.initialize(AbstractSqlRegistryStorage.java:223)
        at io.apicurio.registry.storage.impl.sql.SqlRegistryStorage_Subclass.initialize$$superforward(Unknown Source)
        at io.apicurio.registry.storage.impl.sql.SqlRegistryStorage_Subclass$$function$$110.apply(Unknown Source)
        at io.quarkus.arc.impl.AroundInvokeInvocationContext.proceed(AroundInvokeInvocationContext.java:73)
        at io.quarkus.arc.impl.AroundInvokeInvocationContext$NextAroundInvokeInvocationContext.proceed(AroundInvokeInvocationContext.java:97)
        at io.apicurio.common.apps.logging.LoggingInterceptor.logMethodEntry(LoggingInterceptor.java:53)
        at io.apicurio.common.apps.logging.LoggingInterceptor_Bean.intercept(Unknown Source)
        at io.quarkus.arc.impl.InterceptorInvocation.invoke(InterceptorInvocation.java:42)
        at io.quarkus.arc.impl.AroundInvokeInvocationContext.proceed(AroundInvokeInvocationContext.java:70)
        at io.quarkus.arc.impl.AroundInvokeInvocationContext$NextAroundInvokeInvocationContext.proceed(AroundInvokeInvocationContext.java:97)
        at io.apicurio.registry.metrics.health.readiness.PersistenceTimeoutReadinessInterceptor.intercept(PersistenceTimeoutReadinessInterceptor.java:29)
        at io.apicurio.registry.metrics.health.readiness.PersistenceTimeoutReadinessInterceptor_Bean.intercept(Unknown Source)
        at io.quarkus.arc.impl.InterceptorInvocation.invoke(InterceptorInvocation.java:42)
        at io.quarkus.arc.impl.AroundInvokeInvocationContext.proceed(AroundInvokeInvocationContext.java:70)
        at io.quarkus.arc.impl.AroundInvokeInvocationContext$NextAroundInvokeInvocationContext.proceed(AroundInvokeInvocationContext.java:97)
        at io.apicurio.registry.metrics.health.liveness.PersistenceExceptionLivenessInterceptor.intercept(PersistenceExceptionLivenessInterceptor.java:25)
        at io.apicurio.registry.metrics.health.liveness.PersistenceExceptionLivenessInterceptor_Bean.intercept(Unknown Source)
        at io.quarkus.arc.impl.InterceptorInvocation.invoke(InterceptorInvocation.java:42)
        at io.quarkus.arc.impl.AroundInvokeInvocationContext.proceed(AroundInvokeInvocationContext.java:70)
        at io.quarkus.arc.impl.AroundInvokeInvocationContext.proceed(AroundInvokeInvocationContext.java:62)
        at io.apicurio.registry.metrics.StorageMetricsInterceptor.intercept(StorageMetricsInterceptor.java:41)
        at io.apicurio.registry.metrics.StorageMetricsInterceptor_Bean.intercept(Unknown Source)
        at io.quarkus.arc.impl.InterceptorInvocation.invoke(InterceptorInvocation.java:42)
        at io.quarkus.arc.impl.AroundInvokeInvocationContext.perform(AroundInvokeInvocationContext.java:30)
        at io.quarkus.arc.impl.InvocationContexts.performAroundInvoke(InvocationContexts.java:27)
        at io.apicurio.registry.storage.impl.sql.SqlRegistryStorage_Subclass.initialize(Unknown Source)
        at io.apicurio.registry.storage.impl.sql.SqlRegistryStorage.initialize(SqlRegistryStorage.java:35)
        at io.apicurio.registry.storage.impl.sql.SqlRegistryStorage_Subclass.initialize$$superforward(Unknown Source)
        at io.apicurio.registry.storage.impl.sql.SqlRegistryStorage_Subclass$$function$$109.apply(Unknown Source)
        at io.quarkus.arc.impl.AroundInvokeInvocationContext.proceed(AroundInvokeInvocationContext.java:73)
        at io.quarkus.arc.impl.AroundInvokeInvocationContext$NextAroundInvokeInvocationContext.proceed(AroundInvokeInvocationContext.java:97)
        at io.apicurio.common.apps.logging.LoggingInterceptor.logMethodEntry(LoggingInterceptor.java:53)
        at io.apicurio.common.apps.logging.LoggingInterceptor_Bean.intercept(Unknown Source)
        at io.quarkus.arc.impl.InterceptorInvocation.invoke(InterceptorInvocation.java:42)
        at io.quarkus.arc.impl.AroundInvokeInvocationContext.proceed(AroundInvokeInvocationContext.java:70)
        at io.quarkus.arc.impl.AroundInvokeInvocationContext$NextAroundInvokeInvocationContext.proceed(AroundInvokeInvocationContext.java:97)
        at io.apicurio.registry.metrics.health.readiness.PersistenceTimeoutReadinessInterceptor.intercept(PersistenceTimeoutReadinessInterceptor.java:29)
        at io.apicurio.registry.metrics.health.readiness.PersistenceTimeoutReadinessInterceptor_Bean.intercept(Unknown Source)
        at io.quarkus.arc.impl.InterceptorInvocation.invoke(InterceptorInvocation.java:42)
        at io.quarkus.arc.impl.AroundInvokeInvocationContext.proceed(AroundInvokeInvocationContext.java:70)
        at io.quarkus.arc.impl.AroundInvokeInvocationContext$NextAroundInvokeInvocationContext.proceed(AroundInvokeInvocationContext.java:97)
        at io.apicurio.registry.metrics.health.liveness.PersistenceExceptionLivenessInterceptor.intercept(PersistenceExceptionLivenessInterceptor.java:25)
        at io.apicurio.registry.metrics.health.liveness.PersistenceExceptionLivenessInterceptor_Bean.intercept(Unknown Source)
        at io.quarkus.arc.impl.InterceptorInvocation.invoke(InterceptorInvocation.java:42)
        at io.quarkus.arc.impl.AroundInvokeInvocationContext.proceed(AroundInvokeInvocationContext.java:70)
        at io.quarkus.arc.impl.AroundInvokeInvocationContext.proceed(AroundInvokeInvocationContext.java:62)
        at io.apicurio.registry.metrics.StorageMetricsInterceptor.intercept(StorageMetricsInterceptor.java:41)
        at io.apicurio.registry.metrics.StorageMetricsInterceptor_Bean.intercept(Unknown Source)
        at io.quarkus.arc.impl.InterceptorInvocation.invoke(InterceptorInvocation.java:42)
        at io.quarkus.arc.impl.AroundInvokeInvocationContext.perform(AroundInvokeInvocationContext.java:30)
        at io.quarkus.arc.impl.InvocationContexts.performAroundInvoke(InvocationContexts.java:27)
        at io.apicurio.registry.storage.impl.sql.SqlRegistryStorage_Subclass.initialize(Unknown Source)
        at io.apicurio.registry.storage.impl.sql.SqlRegistryStorage_ClientProxy.initialize(Unknown Source)
        at io.apicurio.registry.storage.RegistryStorageProducer.raw(RegistryStorageProducer.java:94)
        at io.apicurio.registry.storage.RegistryStorageProducer.current(RegistryStorageProducer.java:51)
        at io.apicurio.registry.storage.RegistryStorageProducer_ProducerMethod_current_KUuvkPhY_4l3q6EULlF1GPXbKes_Bean.doCreate(Unknown Source)
        at io.apicurio.registry.storage.RegistryStorageProducer_ProducerMethod_current_KUuvkPhY_4l3q6EULlF1GPXbKes_Bean.create(Unknown Source)
        at io.apicurio.registry.storage.RegistryStorageProducer_ProducerMethod_current_KUuvkPhY_4l3q6EULlF1GPXbKes_Bean.create(Unknown Source)
        at io.quarkus.arc.impl.AbstractSharedContext.createInstanceHandle(AbstractSharedContext.java:119)
        at io.quarkus.arc.impl.AbstractSharedContext$1.get(AbstractSharedContext.java:38)
        at io.quarkus.arc.impl.AbstractSharedContext$1.get(AbstractSharedContext.java:35)
        at io.quarkus.arc.generator.Default_jakarta_enterprise_context_ApplicationScoped_ContextInstances.c120(Unknown Source)
        at io.quarkus.arc.generator.Default_jakarta_enterprise_context_ApplicationScoped_ContextInstances.computeIfAbsent(Unknown Source)
        at io.quarkus.arc.impl.AbstractSharedContext.get(AbstractSharedContext.java:35)
        at io.quarkus.arc.impl.ClientProxies.getApplicationScopedDelegate(ClientProxies.java:21)
        at io.apicurio.registry.storage.RegistryStorageProducer_ProducerMethod_current_KUuvkPhY_4l3q6EULlF1GPXbKes_ClientProxy.arc$delegate(Unknown Source)
                at io.apicurio.registry.storage.RegistryStorageProducer_ProducerMethod_current_KUuvkPhY_4l3q6EULlF1GPXbKes_ClientProxy.isReady(Unknown Source)
        at io.apicurio.common.apps.config.impl.DynamicConfigStartup.onStart(DynamicConfigStartup.java:46)
        at io.apicurio.common.apps.config.impl.DynamicConfigStartup_Bean.doCreate(Unknown Source)
        at io.apicurio.common.apps.config.impl.DynamicConfigStartup_Bean.create(Unknown Source)
        at io.apicurio.common.apps.config.impl.DynamicConfigStartup_Bean.create(Unknown Source)
        at io.quarkus.arc.impl.AbstractSharedContext.createInstanceHandle(AbstractSharedContext.java:119)
        at io.quarkus.arc.impl.AbstractSharedContext$1.get(AbstractSharedContext.java:38)
        at io.quarkus.arc.impl.AbstractSharedContext$1.get(AbstractSharedContext.java:35)
        at io.quarkus.arc.generator.Default_jakarta_enterprise_context_ApplicationScoped_ContextInstances.c55(Unknown Source)
        at io.quarkus.arc.generator.Default_jakarta_enterprise_context_ApplicationScoped_ContextInstances.computeIfAbsent(Unknown Source)
        at io.quarkus.arc.impl.AbstractSharedContext.get(AbstractSharedContext.java:35)
        at io.quarkus.arc.impl.ClientProxies.getApplicationScopedDelegate(ClientProxies.java:21)
        at io.apicurio.common.apps.config.impl.DynamicConfigStartup_ClientProxy.arc$delegate(Unknown Source)
        at io.apicurio.common.apps.config.impl.DynamicConfigStartup_ClientProxy.arc_contextualInstance(Unknown Source)
        at io.apicurio.common.apps.config.impl.DynamicConfigStartup_Observer_Synthetic_NzGHHybqmtc7Ct8_H1RkCbxYXQo.notify(Unknown Source)
        at io.quarkus.arc.impl.EventImpl$Notifier.notifyObservers(EventImpl.java:351)
        at io.quarkus.arc.impl.EventImpl$Notifier.notify(EventImpl.java:333)
        at io.quarkus.arc.impl.EventImpl.fire(EventImpl.java:80)
        at io.quarkus.arc.runtime.ArcRecorder.fireLifecycleEvent(ArcRecorder.java:155)
        at io.quarkus.arc.runtime.ArcRecorder.handleLifecycleEvents(ArcRecorder.java:106)
        at io.quarkus.deployment.steps.LifecycleEventsBuildStep$startupEvent1144526294.deploy_0(Unknown Source)
        at io.quarkus.deployment.steps.LifecycleEventsBuildStep$startupEvent1144526294.deploy(Unknown Source)
        ... 7 more
Caused by: org.postgresql.util.PSQLException: ERROR: relation "outbox" already exists
        at org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:2725)
        at org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:2412)
        at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:371)
        at org.postgresql.jdbc.PgStatement.executeInternal(PgStatement.java:502)
        at org.postgresql.jdbc.PgStatement.execute(PgStatement.java:419)
        at org.postgresql.jdbc.PgPreparedStatement.executeWithFlags(PgPreparedStatement.java:194)
        at org.postgresql.jdbc.PgPreparedStatement.executeUpdate(PgPreparedStatement.java:155)
        at io.agroal.pool.wrapper.PreparedStatementWrapper.executeUpdate(PreparedStatementWrapper.java:90)
        at io.apicurio.registry.storage.impl.sql.jdb.UpdateImpl.execute(UpdateImpl.java:26)
        ... 102 more
@apicurio-bot
Copy link

apicurio-bot bot commented Oct 16, 2024

Thank you for reporting an issue!

Pinging @jsenko to respond or triage.

@wolfchimneyrock
Copy link
Contributor Author

additionally, SIGTERM is meant to be a "soft" shutdown - maybe if a database upgrade is in process and it receives SIGTERM, it will wait until the upgrade is complete? or maybe a different signal SIGUSR1 etc... could be used?

@carlesarnal
Copy link
Member

We've had a bug in our upgrade scripts that is now fixed on main. We will do a 3.0.2 release today that fixes the issue.

@EricWittmann
Copy link
Member

Apologies to @wolfchimneyrock and others. As @carlesarnal says we had a bug in the upgrade scripts that is now resolved. Version 3.0.2 is released and should fix the problem.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Status: Done
Development

No branches or pull requests

3 participants