After migration failure, Rocket.Chat complains "control is locked" #5542

Closed
kentonv opened this issue Jan 13, 2017 · 26 comments

@kentonv
Contributor

kentonv commented Jan 13, 2017

Your Rocket.Chat version: 0.49.0

In #5541 I documented a migration failure. Another issue surfaces after the failure: If I restart the grain, Rocket.Chat seems to get stuck claiming that "control is locked". There is only one instance of Rocket.Chat running, but it seems the database is somehow marked "locked" in a way that doesn't resolve automatically. Is this intentional? I would expect Rocket.Chat to retry the failed migration.

Here's the log from the point of clicking "restart" (this is the same grain from #5541):

** SANDSTORM SUPERVISOR: Starting up grain. Sandbox type: userns
+ set -euvo pipefail

export METEOR_SETTINGS='{"public": {"sandstorm": true}}'
+ export 'METEOR_SETTINGS={"public": {"sandstorm": true}}'
+ METEOR_SETTINGS='{"public": {"sandstorm": true}}'
export NODE_ENV=production
+ export NODE_ENV=production
+ NODE_ENV=production
exec node /start.js -p 8000
+ exec node /start.js -p 8000
** Starting Mongo...
2017-01-13T00:25:38.831+0000 I STORAGE  Engine custom option: log=(prealloc=false,file_max=200KB)
about to fork child process, waiting until server is ready for connections.
forked process: 9
child process started successfully, parent exiting
** Starting Meteor...
Will load cache for users
2 records load from users
Will load cache for rocketchat_room
1 records load from rocketchat_room
Will load cache for rocketchat_subscription
1 records load from rocketchat_subscription
Will load cache for rocketchat_settings
424 records load from rocketchat_settings
Updating process.env.MAIL_URL
Will load cache for rocketchat_permissions
60 records load from rocketchat_permissions
ufs: store created at 
ufs: store created at 
sandstorm/sandstorm-http-bridge.c++:2250: warning: App isn't listening for TCP connections after 30 seconds. Continuing to attempt to connect; address->toString() = 127.0.0.1:8000
Not migrating, control is locked. Attempt 1/30. Trying again in 10 seconds.
Not migrating, control is locked. Attempt 2/30. Trying again in 10 seconds.
Not migrating, control is locked. Attempt 3/30. Trying again in 10 seconds.
Not migrating, control is locked. Attempt 4/30. Trying again in 10 seconds.
Not migrating, control is locked. Attempt 5/30. Trying again in 10 seconds.
Not migrating, control is locked. Attempt 6/30. Trying again in 10 seconds.
** SANDSTORM SUPERVISOR: Grain still in use; staying up for now.
Not migrating, control is locked. Attempt 7/30. Trying again in 10 seconds.
Not migrating, control is locked. Attempt 8/30. Trying again in 10 seconds.
Not migrating, control is locked. Attempt 9/30. Trying again in 10 seconds.
@johnlund

johnlund commented Jan 13, 2017

I'm having this same issue (Not migrating, control is locked.) with a snap on Ubuntu 16 starting about 1-2 hours ago. I'm getting a 502 Bad Gateway on the home page.

@forkless

Having the same issue as well. After unlocking the db in mongo, I get the following error in the journal:

cannot read property 'username' of undefined

and db locked at version 76, target version 79

@engelgabriel engelgabriel added this to the 0.49.1 milestone Jan 13, 2017
@negrusti

Having the same issue

@engelgabriel
Member

@kentonv can you give us access to an instance currently having this problem? It would help to speed up the fix.

@rodrigok
Member

I guess the error was here https://github.com/RocketChat/Rocket.Chat/blob/develop/server/startup/migrations/v077.js#L16

Trying to use some user that was not found.

Can someone find the exception from when the migration was first locked?

@rodrigok
Member

@kentonv @johnlund @forkless @negrusti are you guys using the livechat feature?

@negrusti

negrusti commented Jan 13, 2017

Yes, we are using livechat. Unfortunately can't test anything anymore - had to start with clean database.
And yes, the error was:
Cannot read property 'username' of undefined
Database locked at version: 76
Database target version: 79

@engelgabriel
Member

@sampaiodiego can/should we test for the existence of username on that migration?

@johnlund

@rodrigok Yes, we're also using Livechat.

@rodrigok
Member

Sorry about that, guys.

I have pushed the fix to the develop branch; we will ship it in release 0.49.1 ASAP.

You can unlock your database if you have access to it using:

use yourDatabaseName
db.migrations.update({_id: 'control'}, {$set: {locked: false}})

@kentonv
Contributor Author

kentonv commented Jan 14, 2017

@rodrigok It looks like you fixed #5541 -- the migration failing. Thanks! (EDIT: Actually it's a different migration that's failing there.)

But this issue, #5542, is about what happens after a migration fails. On Sandstorm, users can't easily log into a Mongo shell to unlock their database -- and as Sandstorm is designed for non-technical users, we'd prefer that they not have to. Why does Rocket.Chat "lock" the database like this at all? Could it instead retry migration the next time it starts up (possibly failing again, unless there has been an update)?
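
One way to picture the retry-on-restart behaviour asked about here is a lock that expires. The sketch below is purely illustrative and is not how Rocket.Chat's migration code works: the function name `shouldAttemptMigration` and the `staleAfterMs` threshold are assumptions invented for this example. It only reuses the `locked` and `lockedAt` fields that appear in the `control` document quoted later in this thread.

```javascript
// Hypothetical helper: decide whether a freshly started server should
// attempt the migration, given the `control` document from `migrations`.
// `staleAfterMs` is an assumed threshold, not a real Rocket.Chat setting.
function shouldAttemptMigration(control, now, staleAfterMs) {
  if (!control || !control.locked) {
    return true; // nobody holds the lock
  }
  if (!(control.lockedAt instanceof Date)) {
    return false; // locked, but no timestamp to judge staleness by
  }
  // A lock older than the threshold is presumed to belong to a crashed run.
  return now.getTime() - control.lockedAt.getTime() > staleAfterMs;
}

const control = {
  _id: 'control',
  version: 76,
  locked: true,
  lockedAt: new Date('2017-01-13T00:00:00Z'),
};
const tenMinutes = 10 * 60 * 1000;

console.log(shouldAttemptMigration(control, new Date('2017-01-13T00:05:00Z'), tenMinutes)); // false
console.log(shouldAttemptMigration(control, new Date('2017-01-13T00:25:00Z'), tenMinutes)); // true
```

With a rule like this, a restart after a crash would retry the failed migration once the old lock has aged out, and a second live instance would still be kept out while the lock is fresh.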

@johnlund

Agreed. Ubuntu Snaps are not easy to get into the backend of. My snap isn't refreshing to 49.1 either. Not sure if it has been pushed to the Snap server or not. Thanks for the quick response!

@Sing-Li
Member

Sing-Li commented Jan 15, 2017

@engelgabriel it seems this issue was closed without addressing the concerns about database locking brought up by both the Sandstorm and Snap users' community.

In practice, can we say that the 'fixes' for migration problems (in later versions) will always unlock the db and update the schema correctly, as long as all versions are updated in sequence? Thanks.

@engelgabriel
Member

@lightweight

lightweight commented Jan 16, 2017

I got a Rocket.Chat instance upgraded as follows. In my log (which I found by running docker inspect [id] | grep "log", then watched with less +F [logfile]) I saw that my database was stuck on version 66 (and was meant to be updated to 79). I used rodrigok's suggestion, with an addition found here: https://rocket.chat/docs/administrator-guides/database-migration/. I ran:
use rocketchat
db.migrations.update({_id: 'control'}, {$set: {locked: false,version:67}})
which forces the database to the next version from where I was stuck (66), and from there everything upgraded automatically. I'm now on version 79. Phew.

@thepowerprocess

Just had the same issue today when the server upgraded. The fix mentioned by @lightweight worked, except that the database is now named 'parties' instead of 'rocketchat'.

@JoeatMJ

JoeatMJ commented Jul 5, 2018

We have this issue with our snap version migrating from 0.65.x to 0.66.1. Does anyone have pointers on how to get into Mongo in the snap to apply this fix?

@thepowerprocess

@JoeatMJ I ran the commands below, but you may need to change '125' to whatever version the log says the database is locked at, plus one. You can usually find the locked db version by checking the rocketchat log with 'sudo journalctl -u snap.rocketchat-server.rocketchat-server'

sudo /snap/rocketchat-server/current/bin/mongo
use parties
db.migrations.update({_id: 'control'}, {$set: {locked: false,version:125}}) 

@JoeatMJ

JoeatMJ commented Jul 5, 2018

@cloudsandladders - YOU ARE AWESOME! The server is up again. The server showed that it was at 125, but needed 129. Any issues with such a far leap in updates?

@thepowerprocess

thepowerprocess commented Jul 5, 2018

@JoeatMJ That is what mine read too. I don't know why it happened, but unlocking 125 fixed it.

@neohitokiri

Diagnostics:
sudo journalctl -u snap.rocketchat-server.rocketchat-server

Error:

+-----------------------------------------------------------------------------------------------------------------+
|                                                                                                                 |
|                                              ERROR! SERVER STOPPED                                              |
|                                                                                                                 |
|                                         Your database migration failed:                                         |
|  cannot use the part (settings of settings.preferences.groupByType) to traverse the element ({settings: null})  |
|                                                                                                                 |
|                        Please make sure you are running the latest version and try again.                       |
|                                 If the problem persists, please contact support.                                |
|                                                                                                                 |
|                                         This Rocket.Chat version: 0.66.1                                        |
|                                         Database locked at version: 124                                         |
|                                           Database target version: 129                                          |
|                                                                                                                 |
|                                 Commit: fb5257f618b22638c6b2ac4c678f76809f5a7d7e                                |
|                                       Date: Wed Jul 4 15:05:11 2018 -0300                                       |
|                                                   Branch: HEAD                                                  |
|                                                   Tag: 0.66.1                                                   |
|                                                                                                                 |
+-----------------------------------------------------------------------------------------------------------------+
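
The two version lines in a banner like the one above can be pulled out of the journal output with a short script. This is a minimal sketch, assuming the banner wording stays exactly as shown; the function name `parseMigrationBanner` and the returned field names are made up for this example and are not part of Rocket.Chat.

```javascript
// Hypothetical helper: extract the locked and target versions from the
// "ERROR! SERVER STOPPED" banner text. Returns null if either line is missing.
function parseMigrationBanner(logText) {
  const locked = /Database locked at version:\s*(\d+)/.exec(logText);
  const target = /Database target version:\s*(\d+)/.exec(logText);
  if (!locked || !target) {
    return null; // banner not present in this log text
  }
  return {
    lockedVersion: Number(locked[1]),
    targetVersion: Number(target[1]),
  };
}

// Two lines copied from the banner, with the padding shortened.
const sample = [
  '|   Database locked at version: 124   |',
  '|    Database target version: 129     |',
].join('\n');

console.log(parseMigrationBanner(sample)); // { lockedVersion: 124, targetVersion: 129 }
```

The text passed in could be the saved output of the journalctl command under "Diagnostics".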

Solution:

sudo /snap/rocketchat-server/current/bin/mongo
use parties
db.migrations.update({_id: 'control'}, {$set: {locked: false,version:125}}) 

@geekgonecrazy
Contributor

By doing this you skip migrations. Anyone that follows, please do not repeat this. You are asking to have random bugs going forward.

Migrations are in place to migrate you from one schema to another. If you do not complete the migration your data will not be in the correct schema and you can pretty much count on increased bugginess going forward.
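
To make the warning concrete, the sketch below lists which migrations would be skipped and which would still run after forcing the stored version. It is an illustration only, under two stated assumptions: migrations are numbered consecutively, and a stored version N means migrations up to and including N are treated as applied. The function name `planAfterForcingVersion` is hypothetical.

```javascript
// Hypothetical helper: given the version the database is locked at, the
// version written by hand into the control document, and the target version,
// report which migrations never run (skipped) and which still run (willRun).
function planAfterForcingVersion(lockedVersion, forcedVersion, targetVersion) {
  const range = (from, to) => {
    const out = [];
    for (let v = from; v <= to; v++) out.push(v);
    return out;
  };
  return {
    skipped: range(lockedVersion + 1, forcedVersion),
    willRun: range(forcedVersion + 1, targetVersion),
  };
}

// Numbers from the 0.66.1 banner in this thread: locked at 124, forced to 125, target 129.
console.log(planAfterForcingVersion(124, 125, 129));
// { skipped: [ 125 ], willRun: [ 126, 127, 128, 129 ] }
```

Under these assumptions, unlocking without changing the version leaves `skipped` empty, and forcing the version forward moves the failed migration into `skipped`.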

@geekgonecrazy
Contributor

If you are having this issue, please see #11353.

@thepowerprocess

@geekgonecrazy thank you for your warning. Unfortunately, I have already done this. What is the best way to undo it so the db can migrate correctly?

@Rimander

Rimander commented Jul 6, 2018

I tried to fix it by modifying the database, but when I start the server, "locked" goes back to "true".
db.migrations.update({_id: 'control'}, {$set: {locked: false}})

{
"_id" : "control",
"version" : 124,
"locked" : false,
"lockedAt" : ISODate("2018-07-06T08:07:10.882Z"),
"buildAt" : "2018-07-04T18:22:24.063Z"
}

@sampaiodiego
Member

@Rimander you need to either update the server to 0.66.2 (which includes a migration fix) after unlocking the migration, or skip migration 125 (by unlocking and setting the version to 125) and run the migration manually as described in #11364
