Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Don't fail if schema_version and schema_update_history tables already exist #697

Merged
merged 1 commit into from
Aug 29, 2020

Conversation

markmark206
Copy link

What changed?

Don't fail if cassandra tables already exist. This allows us to retry and roll forward in case of mid-stream failures during installations.

Why?

Our test pipelines occasionally experienced failures, especially as we started testing with larger, replicated cassandra cluster

How did you test it?
I built a docker image with this change and used it to deploy to kubernetes cluster a few times.

Potential risks

If this doesn't work for some reason, we can rollback or fix.

@underrun
Copy link
Contributor

this looks okay to me but are there any scenarios where failing here is a safety feature for users? if not cool - if so then maybe we can verify cassandra is fully ready before creating these at all?

@markmark206
Copy link
Author

markmark206 commented Aug 28, 2020

this looks okay to me but are there any scenarios where failing here is a safety feature for users? if not cool - if so then maybe we can verify cassandra is fully ready before creating these at all?

Great point, thank you! I am not aware of any, but I am curious if anyone has any additional insights here.

(The intention here is to address the failure scenario outlined in the side note in temporalio/helm-charts#65

I am somewhat reluctant to gate the creation logic on an external "all #{replication_factor} replicas are online" check, since things could change between that check and executing this logic, in which case we end up in the same current non-recoverable failure state, just like we did before.)

@markmark206 markmark206 merged commit 3f3dcce into temporalio:master Aug 29, 2020
@markmark206 markmark206 deleted the ifexists branch August 29, 2020 00:03
@shawnhathaway
Copy link
Contributor

this looks okay to me but are there any scenarios where failing here is a safety feature for users? if not cool - if so then maybe we can verify cassandra is fully ready before creating these at all?

Ideally all persistence queries are idempotent and we craft non-idempotent queries when we need them

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants