Retry logic for ioredis response incorrect result #965
Comments
Any update on this issue? I am using ioredis and facing the same issue as well.
This issue has been automatically marked as stale because it has not had recent activity. It will be closed after 7 days if no further activity occurs, but feel free to re-open a closed issue if needed.
I'm encountering this same issue when testing an elasticache failover/upgrade. (Elasticache redis 5.0.0, ioredis 4.14.0) My client is running a mix of individual commands and My debugging so far has shown that during the connection reset, commands sent inside of a multi context are getting put individually on the offline queue, which on its own doesn't seem right. Just before a failover takes place, I send a multi block that looks like this:
All of these commands get queued up by ioredis, but then I observe the following:
It seems that if a connection reset occurs when a multi block is still in flight we might need to either flush or re-send any multi'd commands that might still be in the queue. I'll try to run this experiment in a fork and see if it resolves this issue.
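To make the "flush" option concrete, here is a minimal sketch of what dropping leftover transaction commands from a queue could look like. It is a stand-alone model, not ioredis code: the queue is a plain array, and the `inTransaction` flag, `makeCommand` helper, and error text are assumptions made for illustration.

```javascript
// Hypothetical model of an offline queue; this is NOT ioredis's internal code.
// Each entry is a plain object: { name, inTransaction, reject }.

function abortTransactionFragments(offlineQueue) {
  const kept = [];
  const aborted = [];
  for (const command of offlineQueue) {
    if (command.inTransaction) {
      // Fail the caller instead of replaying part of a MULTI/EXEC block
      // on the new connection.
      command.reject(new Error(`Aborted after connection reset: ${command.name}`));
      aborted.push(command.name);
    } else {
      kept.push(command);
    }
  }
  return { kept, aborted };
}

// Example: the connection dropped mid-transaction, so only the tail of the
// MULTI block ("set", "exec") is left in the queue next to unrelated commands.
const errors = [];
const makeCommand = (name, inTransaction) => ({
  name,
  inTransaction,
  reject: (err) => errors.push(err.message),
});

const offlineQueue = [
  makeCommand("get", false),
  makeCommand("set", true),
  makeCommand("exec", true),
  makeCommand("ping", false),
];

const result = abortTransactionFragments(offlineQueue);
console.log("kept:", result.kept.map((c) => c.name).join(", "));
console.log("aborted:", result.aborted.join(", "));
```

In this model the standalone commands survive the reconnect and the transaction's callers get an error they can retry as a whole, which avoids sending an EXEC without its MULTI.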
To start, I think perhaps transacted commands found in In the meantime @stanley115 are you able to reproduce the issue with
This just hit us in production: after recovering from a failover, our sites started showing content from different users mixed together. Any workarounds?
As far as workarounds go, we simply have to restart all connected services whenever we perform an elasticache upgrade. I'm not aware of any better workaround, but I do have a potential fix working locally that I've been meaning to PR. I've been sidetracked since then.
Elasticache severs the connection immediately after it returns a READONLY error. This can sometimes leave queued up pipelined commands in an inconsistent state when the connection is reestablished. For example, if a pipeline has 6 commands and the second one generates a READONLY error, Elasticache will only return results for the first two before severing the connection. Upon reconnect, the pipeline still thinks it has 6 commands to send but the commandQueue has only 4. This fix will detect any pipeline command sets that only had a partial response before connection loss, and abort them.

This Elasticache behavior also affects transactions. If reconnectOnError returns 2, some transaction fragments may end up in the offlineQueue. This fix will check the offlineQueue for any such transaction fragments and abort them, so that we don't send mismatched multi/exec to redis upon reconnection.

- Introduced pipelineIndex property on pipelined commands to allow for later cleanup
- Added a routine to event_handler that aborts any pipelined commands inside commandQueue and offlineQueue that were interrupted in the middle of the pipeline
- Added a routine to event_handler that removes any transaction fragments from the offline queue
- Introduced inTransaction property on commands to simplify pipeline logic
- Added a flags param to mock_server to allow the Elasticache disconnect behavior to be simulated
- Added a reconnect_on_error test case for transactions
- Added some test cases testing for correct handling of this unique elasticache behavior
- Added unit tests to validate inTransaction and pipelineIndex setting

Fixes #965
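The 6-command example in the commit message can be modelled roughly as follows. This is a simplified stand-alone sketch under my own assumptions, not the code from the PR: the queue is a plain array, each entry carries a `pipelineIndex` (its position within its pipeline, or `undefined` for a standalone command), and the function and helper names are hypothetical.

```javascript
// Simplified model, NOT the ioredis source. Entries: { name, pipelineIndex, reject }.

function abortIncompletePipelines(commandQueue) {
  const kept = [];
  let expectedIndex = 0;
  for (const command of commandQueue) {
    const index = command.pipelineIndex;
    if (index === undefined || index === 0) {
      // A standalone command or the start of a fresh pipeline resets the count.
      expectedIndex = 0;
    }
    if (index !== undefined && index !== expectedIndex) {
      // This command is queued without its predecessors, so its pipeline was
      // interrupted part-way: abort it rather than resend a partial pipeline.
      command.reject(new Error(`Pipeline interrupted before index ${index}: ${command.name}`));
      continue;
    }
    if (index !== undefined) expectedIndex++;
    kept.push(command);
  }
  return kept;
}

// Example: a 6-command pipeline got replies for indices 0 and 1 before the
// connection dropped, so indices 2..5 are still queued. A standalone command
// and a complete 2-command pipeline are queued behind them.
const rejected = [];
const entry = (name, pipelineIndex) => ({
  name,
  pipelineIndex,
  reject: (err) => rejected.push(err.message),
});

const commandQueue = [
  entry("cmd2", 2),
  entry("cmd3", 3),
  entry("cmd4", 4),
  entry("cmd5", 5),
  entry("ping", undefined),
  entry("a0", 0),
  entry("a1", 1),
];

const survivors = abortIncompletePipelines(commandQueue);
console.log("kept:", survivors.map((c) => c.name).join(", "));
console.log("rejected:", rejected.length);
```

The four leftover commands are rejected as a group, while the standalone command and the intact pipeline stay queued for the new connection.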
🎉 This issue has been resolved in version 4.16.1 🎉 The release is available on: Your semantic-release bot 📦🚀
## [4.16.1](redis/ioredis@v4.16.0...v4.16.1) (2020-03-28)

### Bug Fixes

* abort incomplete pipelines upon reconnect ([#1084](redis/ioredis#1084)) ([0013991](redis/ioredis@0013991)), closes [#965](redis/ioredis#965)
Hi,
I am using ioredis version 4.2.0, and I found that ioredis may return unexpected results when "failover primary" is triggered on Amazon ElastiCache.
To verify this issue, I wrote a program to do the following steps:
The source code is as follows:
When the program is executed, it runs normally without any error at the beginning:
However, when I trigger "failover primary" on Amazon ElastiCache, it responds with the following error:
and the program terminates with the following error:
PS: I found that this issue cannot be reproduced when maxRetriesPerRequest is set to 0, and I strongly suspect that some state gets corrupted when a retry happens.
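For reference, a configuration along these lines is what setting `maxRetriesPerRequest` to 0 would look like. `maxRetriesPerRequest` is a documented ioredis option; the host name is a placeholder, and the constructor call is left commented out so the fragment stands alone. My understanding of the docs is that a value of 0 makes pending commands fail as soon as the connection drops instead of being held for retry, but treat that as an assumption to check against the version in use.

```javascript
// Config sketch only. The endpoint below is a hypothetical placeholder.
const redisOptions = {
  host: "my-cluster.example.com",
  port: 6379,
  // Assumed behaviour: with 0, pending commands are flushed with an error on
  // connection loss rather than being retried after reconnect.
  maxRetriesPerRequest: 0,
};

// With ioredis installed, the options would be passed to the client like so:
// const Redis = require("ioredis");
// const redis = new Redis(redisOptions);

console.log(redisOptions.maxRetriesPerRequest);
```

The trade-off is that callers see errors during a failover and have to retry at the application level.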
Here are the ElastiCache cluster details: