Consistent hash routing and resizable pools, potential bug #673
Comments
I thought about suggesting we do this the way Cassandra does: keep some state to track which hash ranges are owned by each routee, and change those ranges when routees join or leave. But that still wouldn't really solve the problem, because we don't have Cassandra's replication mechanism to migrate state from the previous owner of a hash range to the new one.
I don't think we'll have a perfect solution to this problem without Akka.Persistence, but if we use virtual nodes we can make changes in the hash distribution easier to handle. Going to take this on.
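For illustration, here is a minimal sketch of a consistent hash ring with virtual nodes. The type and member names are hypothetical, not the actual Akka.NET implementation, and the stand-in hash function substitutes for Murmur3. Each routee is inserted at several points on the ring, so when a routee joins or leaves, only the hash ranges adjacent to its virtual nodes are remapped rather than the whole key space:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class ConsistentHashRing
{
    // Ring positions mapped to routee names, kept in key order.
    private readonly SortedDictionary<int, string> _ring = new SortedDictionary<int, string>();
    private readonly int _virtualNodesFactor;

    public ConsistentHashRing(int virtualNodesFactor = 10)
    {
        _virtualNodesFactor = virtualNodesFactor;
    }

    public void AddRoutee(string routee)
    {
        // One routee owns several points on the ring.
        for (var i = 0; i < _virtualNodesFactor; i++)
            _ring[Hash(routee + ":" + i)] = routee;
    }

    public void RemoveRoutee(string routee)
    {
        foreach (var key in _ring.Where(kv => kv.Value == routee)
                                 .Select(kv => kv.Key).ToList())
            _ring.Remove(key);
    }

    public string RouteeFor(object message)
    {
        // Walk clockwise to the first virtual node at or after the message's hash.
        var h = Hash(message.ToString());
        foreach (var kv in _ring)
            if (kv.Key >= h) return kv.Value;
        return _ring.First().Value; // wrap around to the start of the ring
    }

    // Stand-in hash for the sketch; the real router uses Murmur3.
    private static int Hash(string s) => Math.Abs(s.GetHashCode() % 1000);
}
```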
/// NOTE: Using <see cref="Resizer"/> with <see cref="ConsistentHashingPool"/> is potentially harmful, as hash ranges
/// might change radically during live message processing. This router works best with fixed-sized pools or a fixed
/// number of routees per node in the event of clustered deployments.
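As a sketch of that recommendation, a fixed-size ConsistentHashingPool with no Resizer attached keeps the hash ranges stable. Worker and SomeCustomerCommand are hypothetical types for illustration, and CustomerId is assumed to be the hash key as in the scenario described in this issue:

```csharp
using Akka.Actor;
using Akka.Routing;

var system = ActorSystem.Create("example");

// Fixed pool size, no Resizer, so the hash distribution stays stable.
var router = system.ActorOf(
    Props.Create<Worker>()
        .WithRouter(new ConsistentHashingPool(5)
            // Route on CustomerId so every command for a given customer
            // lands on the same routee.
            .WithHashMapping(msg => ((SomeCustomerCommand)msg).CustomerId)),
    "customer-router");

// Hypothetical message and worker types for illustration.
public class SomeCustomerCommand
{
    public int CustomerId { get; set; }
}

public class Worker : ReceiveActor
{
    public Worker()
    {
        Receive<SomeCustomerCommand>(cmd => { /* per-customer work */ });
    }
}
```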
Helps resolve akkadotnet#673:
- Reimplemented Murmur3 and ConsistentHashRouting to use virtual nodes
- Added deploy and routerconfig fallback support
- Rewrote ActorOf method in LocalActorRefProvider to match Akka
- Rewrote ActorOf method in RemoteActorRefProvider to match Akka
- Breaking change: renamed ConsistentHashable interface to IConsistentHashable (per akkadotnet#633)
- Added MultiNodeTests for ClusterConsistentHashRouting
- Implemented Pool routers for Akka.Cluster
I don't think we handle resizable pools and consistent hash routing correctly.
Here is my reasoning.
Let's say the pool consists of two workers.
We send a message, say a SomeCustomerCommand with CustomerId (the hash) = 1.
This message ends up on, say, worker 2.
Worker 2 starts processing the message.
Now suppose the pool resizes at this point, and another message arrives with the same CustomerId (again, the hash in this case).
That second message may now end up on another worker, since the number of workers has changed and the affinity between hash and worker no longer holds.
So this time, the message might end up on worker 3.
Potentially, worker 2 and worker 3 are now each processing a message at the same time that really should have been handled by the same worker.
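A tiny sketch of the failure mode, assuming the simplest possible hash-to-routee mapping (hash modulo pool size) rather than the router's actual ring logic:

```csharp
using System;

// With 2 workers, a hash of 5 maps to routee index 1; after the pool
// resizes to 3 workers, the same hash maps to routee index 2, so two
// different workers can end up handling the same customer concurrently.
int hash = 5;                // e.g. a CustomerId used as the hash
Console.WriteLine(hash % 2); // pool of 2 workers -> routee index 1
Console.WriteLine(hash % 3); // after resize to 3 -> routee index 2
```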
Thoughts?