Fix scale down timing bug #90

spilchen · 2021-11-01T12:55:53Z

This fixes a timing issue during scale down that can corrupt admintools.conf.

This can occur if, while ‘admintools -t db_remove_node’ is still running, another change is made to the VerticaDB CR to scale down again. For instance, suppose we patch the CR to scale a subcluster from 3 to 2. The operator invokes ‘admintools -t db_remove_node’. We hit the issue if, before admintools returns, the VerticaDB is patched again to scale the subcluster from 2 to 1.

If you hit this problem, you will see a message like this in the operator log when it tries admintools commands:

Error in /opt/vertica/config/admintools.conf?: No option 'v_verticadb_node0005' in section: 'Nodes'

The fix is to requeue the uninstall if we detect that another scale down has occurred. This forces us to call ‘admintools -t db_remove_node’ for the second scale down before we drive the uninstall logic.
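
The requeue decision can be sketched roughly as follows. This is a minimal illustration in Python and not the operator's actual code; `ReconcileResult`, `reconcile_uninstall`, and the three counters it takes are hypothetical names introduced only for this example. The assumption is that uninstall must wait whenever the database still contains more nodes than the CR currently asks for.

```python
from dataclasses import dataclass


@dataclass
class ReconcileResult:
    # Hypothetical result type: whether to requeue, and what the caller should do.
    requeue: bool
    action: str


def reconcile_uninstall(desired_size: int, nodes_in_db: int, installed_pods: int) -> ReconcileResult:
    """Decide whether the uninstall step may run for one subcluster.

    desired_size:   size currently requested in the CR
    nodes_in_db:    nodes still part of the database
    installed_pods: pods that still have an installation
    """
    if nodes_in_db > desired_size:
        # Another scale down arrived and db_remove_node has not run for it yet.
        # Requeue so the node removal happens before any uninstall.
        return ReconcileResult(requeue=True, action="wait_for_db_remove_node")
    if installed_pods > desired_size:
        # The database already matches the CR, so uninstall is safe.
        return ReconcileResult(requeue=False, action="uninstall")
    # Nothing left to do.
    return ReconcileResult(requeue=False, action="none")
```

In the scenario described above, the first scale down (3 to 2) has removed one node, and the CR is then patched to 1. The sketch sees 2 nodes in the database against a desired size of 1, so it requeues instead of uninstalling.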

Matt Spilchen and others added 3 commits October 29, 2021 13:56

Fix scale down timing scenario

fb1c1a1

Add changie entry

58e2dde

Merge branch 'vertica:main' into scale-down-fix

d96b0bf

spilchen requested a review from roypaulin November 1, 2021 13:01

spilchen merged commit 0336847 into vertica:main Nov 1, 2021

spilchen deleted the scale-down-fix branch November 1, 2021 17:07
