Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Sporadic Message Errors #1055

Open
mortbauer opened this issue Mar 8, 2024 · 9 comments
Open

Sporadic Message Errors #1055

mortbauer opened this issue Mar 8, 2024 · 9 comments
Assignees
Labels
Milestone

Comments

@mortbauer
Copy link
Contributor

Since I upgraded our foodsoft to 4.8.1 users are reporting problems sending messages, I could also reproduce it once, after restarting the server the problem was gone in my case.
I was also able to isolate the logs for one such problem (I removed the message body because I think it is unrelated):

fcat_foodsoft.1.qfrln2a3tbb6@v22018056292066830    | I, [2024-03-03T14:15:17.121226 #184]  INFO -- : [54dd8b7e-6d8e-48d7-a5ee-6db73cdb8847] Started POST "/ruebezahl17/messages" for 213.142.96.93 at 2024-03-03 14:15:17 +0000
fcat_foodsoft.1.qfrln2a3tbb6@v22018056292066830    | I, [2024-03-03T14:15:17.124339 #184]  INFO -- : [54dd8b7e-6d8e-48d7-a5ee-6db73cdb8847] Processing by MessagesController#create as HTML
fcat_foodsoft.1.qfrln2a3tbb6@v22018056292066830    | I, [2024-03-03T14:15:17.124664 #184]  INFO -- : [54dd8b7e-6d8e-48d7-a5ee-6db73cdb8847]   Parameters: {"authenticity_token"=>"[FILTERED]", "message"=>{"reply_to"=>"", "send_method"=>"recipients", "workgroup_id"=>"45", "ordergroup_id"=>"264", "messagegroup_id"=>"170", "order_id"=>"2231", "recipient_tokens"=>"[FILTERED]", "private"=>"0", "subject"=>"Dein Kontoblatt", "body"=>""}, "commit"=>"Nachricht verschicken", "foodcoop"=>"ruebezahl17"}
fcat_foodsoft.1.qfrln2a3tbb6@v22018056292066830    | I, [2024-03-03T14:15:17.206397 #184]  INFO -- : [54dd8b7e-6d8e-48d7-a5ee-6db73cdb8847] Set locale to de
fcat_foodsoft.1.qfrln2a3tbb6@v22018056292066830    | I, [2024-03-03T14:15:17.279499 #184]  INFO -- : [54dd8b7e-6d8e-48d7-a5ee-6db73cdb8847] Completed 500 Internal Server Error in 155ms (ActiveRecord: 103.2ms | Allocations: 8272)
fcat_foodsoft.1.qfrln2a3tbb6@v22018056292066830    | I, [2024-03-03T14:15:17.296609 #184]  INFO -- : [54dd8b7e-6d8e-48d7-a5ee-6db73cdb8847]   Rendered /usr/local/bundle/gems/exception_notification-4.5.0/lib/exception_notifier/views/exception_notifier/exception_notification.text.erb (Duration: 11.0ms | Allocations: 3335)
fcat_foodsoft.1.qfrln2a3tbb6@v22018056292066830    | I, [2024-03-03T14:15:17.665082 #184]  INFO -- : [54dd8b7e-6d8e-48d7-a5ee-6db73cdb8847] Delivered mail [email protected] (366.5ms)
fcat_foodsoft.1.qfrln2a3tbb6@v22018056292066830    | F, [2024-03-03T14:15:17.668100 #184] FATAL -- : [54dd8b7e-6d8e-48d7-a5ee-6db73cdb8847]   
fcat_foodsoft.1.qfrln2a3tbb6@v22018056292066830    | [54dd8b7e-6d8e-48d7-a5ee-6db73cdb8847] ActiveRecord::StatementInvalid (Mysql2::Error: Unknown column 'recipients_ids' in 'field list'):
fcat_foodsoft.1.qfrln2a3tbb6@v22018056292066830    | [54dd8b7e-6d8e-48d7-a5ee-6db73cdb8847]   
fcat_foodsoft.1.qfrln2a3tbb6@v22018056292066830    | [54dd8b7e-6d8e-48d7-a5ee-6db73cdb8847] app/controllers/application_controller.rb:97:in `set_currency'
fcat_foodsoft.1.qfrln2a3tbb6@v22018056292066830    | [54dd8b7e-6d8e-48d7-a5ee-6db73cdb8847] app/controllers/application_controller.rb:85:in `set_time_zone'
fcat_foodsoft.1.qfrln2a3tbb6@v22018056292066830    | I, [2024-03-03T14:15:17.669792 #184]  INFO -- : [54dd8b7e-6d8e-48d7-a5ee-6db73cdb8847] Processing by ErrorsController#show as HTML
fcat_foodsoft.1.qfrln2a3tbb6@v22018056292066830    | I, [2024-03-03T14:15:17.670146 #184]  INFO -- : [54dd8b7e-6d8e-48d7-a5ee-6db73cdb8847]   Parameters: {"authenticity_token"=>"[FILTERED]", "message"=>{"reply_to"=>"", "send_method"=>"recipients", "workgroup_id"=>"45", "ordergroup_id"=>"264", "messagegroup_id"=>"170", "order_id"=>"2231", "recipient_tokens"=>"[FILTERED]", "private"=>"0", "subject"=>"Dein Kontoblatt", "body"=>"}, "commit"=>"Nachricht verschicken", "foodcoop"=>"ruebezahl17"}
fcat_foodsoft.1.qfrln2a3tbb6@v22018056292066830    | I, [2024-03-03T14:15:17.680794 #184]  INFO -- : [54dd8b7e-6d8e-48d7-a5ee-6db73cdb8847] Set locale to de
fcat_foodsoft.1.qfrln2a3tbb6@v22018056292066830    | I, [2024-03-03T14:15:17.703175 #184]  INFO -- : [54dd8b7e-6d8e-48d7-a5ee-6db73cdb8847]   Rendered errors/internal_server_error.html.haml within layouts/application (Duration: 2.3ms | Allocations: 341)
fcat_foodsoft.1.qfrln2a3tbb6@v22018056292066830    | I, [2024-03-03T14:15:17.842519 #184]  INFO -- : [54dd8b7e-6d8e-48d7-a5ee-6db73cdb8847]   Rendered layout layouts/application.html.haml (Duration: 141.7ms | Allocations: 23687)
fcat_foodsoft.1.qfrln2a3tbb6@v22018056292066830    | I, [2024-03-03T14:15:17.843133 #184]  INFO -- : [54dd8b7e-6d8e-48d7-a5ee-6db73cdb8847] Completed 500 Internal Server Error in 173ms (Views: 79.6ms | ActiveRecord: 70.7ms | Allocations: 27075)

Another pattern seems to be that it only occurs for answers to messages or creating messages to ordergoups.

I do not really get it for now since recipient_ids obviously shouldn't hit the db, however it seems it does.

@mortbauer
Copy link
Contributor Author

Ok, gave it some more thinking, can it be that the message model is missing an: attribute :recipients_ids ?

mortbauer added a commit to foodcoopsat/foodsoft that referenced this issue Mar 9, 2024
@mortbauer
Copy link
Contributor Author

mortbauer commented Mar 9, 2024

I wrapped now the messages create controller in debug log level and and just look at the following lines which are the sql insert statements created for a short time window, first it produces the correct sql and suddenly it produces 4 times in a row the wrong:

image

It includes 2 additional fields, recipients_ids (which should actually by a method provided by ApplicationRecord) and email_state which is actually a field from MessageRecipient.

mixed_sql.log

@mortbauer
Copy link
Contributor Author

This are the logs from more or less the same message once it worked and the other time it didn't.
failed_transaction_debug.log
working_same_transaction.log

@mortbauer
Copy link
Contributor Author

I'm really stuck here, actually this seems to be really old code and suddenly it creates that strange behavior. And it is happening rather often so it is starting to get really troublesome.

@lentschi
Copy link
Contributor

lentschi commented Mar 9, 2024

Hm... to be honest, for now, I have no clue 😟 I also cannot reproduce this locally - maybe it's some issue only occurring with certain production environment settings.

@yksflip Have you ever experienced this error on your servers?

It seems like some kind of rails association caching bug to me, so maybe upgrading rails' (or some other component's) patch version might help. I couldn't find a rails issue for this on GitHub though.

What you could try, is add this temporary line at the top of MessagesController#create:

Rails.logger.info "---TMP DUMMY: #{Message.first.recipients.first.id}"

In the best case, this could even somehow force rails to reload the missing association config before trying to save the message. In the worst case it would throw a similar error three lines earlier.

@yksflip
Copy link
Member

yksflip commented Mar 11, 2024

this sounds really tricky. I haven't heard of this issue before.

I cannot make any sense why it tries to write recipients_ids into the message table. I mean it might used to be like that before the MessageRecipient was introduced, but that's long ago.

@lentschi I don't quite understand why this might be a association caching bug, could you maybe explain this? The association is called message_recipients but here it tries to save to a column called recipients_ids ...

@yksflip yksflip added this to the 4.8.2 milestone Mar 11, 2024
@yksflip yksflip added the bug label Mar 11, 2024
@lentschi
Copy link
Contributor

@yksflip

The association is called message_recipients but here it tries to save to a column called recipients_ids ...

Yes, but there's the recipients association too - see

has_many :recipients, through: :message_recipients, source: :user

So my only guess for now is that active records' associations get confused somehow. (Maybe depending on the order of importing some plugin or whatever... 🤷 )

I'm afraid, I don't have too much time to investigate right now. Should the bug still be there come Friday, I can try to debug.

@lentschi lentschi self-assigned this Mar 15, 2024
@lentschi
Copy link
Contributor

Debugged a little more but still couldn't find a lead. 😟

@mortbauer
When adding the suggested debug log, you could also add Rails.logger.info "---TMP DUMMY message columns: #{Message.column_names}".
That should result in ["id", "sender_id", "subject", "private", "created_at", "reply_to", "group_id", "salt", "received_email"], but I have an inkling, that in the error case it will also include recipients_ids.

What we could also add a debug log for is Message#create_message_recipients - maybe the before_validation handler isn't run for some reason. (But actually I commented it out locally and still don't see any errors related to db queries - the save just fails with a validation error then (see validates_presence_of :message_recipients).

@mortbauer
Copy link
Contributor Author

Thank you for looking into it, but I think I got rid of the problem with these two commits,
1fa917291452c03cc05e8788140650d701f1a308,92a59018b4da23cc369c41da35cc318d7feff715.
The foodsoft is now running with these since almost one week and the error did not show up anymore. So I'm now cleaning it from the debug prints and bring it our main branch.

mortbauer added a commit to foodcoopsat/foodsoft that referenced this issue Mar 15, 2024
@yksflip yksflip modified the milestones: 4.8.2, 4.10 Jul 10, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
Status: Prioritised
Development

No branches or pull requests

3 participants