Consistent hash routing and resizable pools, potential bug #673
Comments
I thought about suggesting we do this the way Cassandra does: keep some state to track which hash ranges are owned by each routee, and change those ranges when routees join or leave. But that still wouldn't really solve the problem, because we don't have Cassandra's replication mechanism to migrate state from the previous owner of a hash range to the new one.
I don't think we'll have a perfect solution to this problem without Akka.Persistence, but if we use virtual nodes we can make changes in the hash distribution easier to handle. Going to take this on.
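For illustration, here is a minimal sketch of a consistent hash ring with virtual nodes. The type and member names are hypothetical, not the actual Akka.NET implementation, and the stand-in hash function substitutes for Murmur3. Each routee is inserted at several points on the ring, so when a routee joins or leaves, only the hash ranges adjacent to its virtual nodes are remapped rather than the whole key space:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class ConsistentHashRing
{
    // Ring positions mapped to routee names, kept in key order.
    private readonly SortedDictionary<int, string> _ring = new SortedDictionary<int, string>();
    private readonly int _virtualNodesFactor;

    public ConsistentHashRing(int virtualNodesFactor = 10)
    {
        _virtualNodesFactor = virtualNodesFactor;
    }

    public void AddRoutee(string routee)
    {
        // One routee owns several points on the ring.
        for (var i = 0; i < _virtualNodesFactor; i++)
            _ring[Hash(routee + ":" + i)] = routee;
    }

    public void RemoveRoutee(string routee)
    {
        foreach (var key in _ring.Where(kv => kv.Value == routee)
                                 .Select(kv => kv.Key).ToList())
            _ring.Remove(key);
    }

    public string RouteeFor(object message)
    {
        // Walk clockwise to the first virtual node at or after the message's hash.
        var h = Hash(message.ToString());
        foreach (var kv in _ring)
            if (kv.Key >= h) return kv.Value;
        return _ring.First().Value; // wrap around to the start of the ring
    }

    // Stand-in hash for the sketch; the real router uses Murmur3.
    private static int Hash(string s) => Math.Abs(s.GetHashCode() % 1000);
}
```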
/// NOTE: Using <see cref="Resizer"/> with <see cref="ConsistentHashingPool"/> is potentially harmful, as hash ranges
/// might change radically during live message processing. This router works best with fixed-sized pools or a fixed
/// number of routees per node in the event of clustered deployments.
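As a sketch of that recommendation, a fixed-size ConsistentHashingPool with no Resizer attached keeps the hash ranges stable. Worker and SomeCustomerCommand are hypothetical types for illustration, and CustomerId is assumed to be the hash key as in the scenario described in this issue:

```csharp
using Akka.Actor;
using Akka.Routing;

var system = ActorSystem.Create("example");

// Fixed pool size, no Resizer, so the hash distribution stays stable.
var router = system.ActorOf(
    Props.Create<Worker>()
        .WithRouter(new ConsistentHashingPool(5)
            // Route on CustomerId so every command for a given customer
            // lands on the same routee.
            .WithHashMapping(msg => ((SomeCustomerCommand)msg).CustomerId)),
    "customer-router");

// Hypothetical message and worker types for illustration.
public class SomeCustomerCommand
{
    public int CustomerId { get; set; }
}

public class Worker : ReceiveActor
{
    public Worker()
    {
        Receive<SomeCustomerCommand>(cmd => { /* per-customer work */ });
    }
}
```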
Helps resolve akkadotnet#673:
- Reimplemented Murmur3 and ConsistentHashRouting to use virtual nodes
- Added deploy and routerconfig fallback support
- Rewrote ActorOf method in LocalActorRefProvider to match Akka
- Rewrote ActorOf method in RemoteActorRefProvider to match Akka
- Breaking change: renamed ConsistentHashable interface to IConsistentHashable (per akkadotnet#633)
- Added MultiNodeTests for ClusterConsistentHashRouting
- Implemented Pool routers for Akka.Cluster
I don't think we handle resizable pools and consistent hash routing correctly.
Here is my reasoning.
Let's say the pool consists of two workers.
We send a message, say a SomeCustomerCommand with CustomerId (the hash) = 1.
This message ends up on, say, worker 2.
Worker 2 starts processing the message.
Now suppose the pool resizes at this point, and another message arrives with the same CustomerId (again, the hash in this case).
That second message may now end up on another worker, since the number of workers has changed and the affinity between hash and worker no longer holds.
So this time, the message might end up on worker 3.
Potentially, worker 2 and worker 3 are now each processing a message at the same time that really should have been handled by the same worker.
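A tiny sketch of the failure mode, assuming the simplest possible hash-to-routee mapping (hash modulo pool size) rather than the router's actual ring logic:

```csharp
using System;

// With 2 workers, a hash of 5 maps to routee index 1; after the pool
// resizes to 3 workers, the same hash maps to routee index 2, so two
// different workers can end up handling the same customer concurrently.
int hash = 5;                // e.g. a CustomerId used as the hash
Console.WriteLine(hash % 2); // pool of 2 workers -> routee index 1
Console.WriteLine(hash % 3); // after resize to 3 -> routee index 2
```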
Thoughts?