
Extending InMemoryConnectionStore to add DTLS connection sharing between instances #2118

Closed
cyril2maq opened this issue Mar 10, 2023 · 18 comments


@cyril2maq

We face the same needs as described in #1484.

Some of our clients do not enforce a new handshake when their devices wake up, even though their IP address may have changed - our LoadBalancer is IP-based.

We analyzed the DTLS CID LoadBalancer solution, but it does not fit our needs: we do not use Kubernetes (for the auto-discovery), the persistence mechanism is sensitive to non-graceful shutdowns, even with the BLUE/GREEN mechanism there will be some 'misses', and it is not compatible with auto-scaling (when decreasing the number of instances).

As a consequence we are implementing a Redis-based connection store, which addresses the above limitations.

The idea is to extend the current InMemoryConnectionStore:

  • storing the session in Redis in the put/update methods after calling super.put/update
  • removing the session from Redis in the remove method after calling super.remove
  • in the get(ConnectionId cid) method: if super.get returns nothing (i.e. not in the local cache), checking Redis and restoring the connection if it has been stored there

(In order to avoid several instances keeping a specific CID's connection info in their local caches - and hence becoming inconsistent - we also handle a 'clear in-memory cache for CID=xxx' event, consumed by the other instances whenever a connection is restored from Redis on one instance.)

This implementation requires only a few lines of code.
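A rough sketch of that extension (assuming a Californium 3.x InMemoryConnectionStore; the exact constructor and override signatures may differ between versions). The RedisClient and ConnectionCodec interfaces below are placeholders for the actual Redis client and Connection (de)serialization, not real Californium or Redis APIs:

```java
import java.net.InetSocketAddress;

import org.eclipse.californium.scandium.dtls.Connection;
import org.eclipse.californium.scandium.dtls.ConnectionId;
import org.eclipse.californium.scandium.dtls.InMemoryConnectionStore;

public class RedisBackedConnectionStore extends InMemoryConnectionStore {

    /** Placeholder abstraction over whatever Redis client is used. */
    public interface RedisClient {
        void set(byte[] key, byte[] value);
        byte[] get(byte[] key);
        void del(byte[] key);
    }

    /** Placeholder for Connection (de)serialization. */
    public interface ConnectionCodec {
        byte[] toBytes(Connection connection);
        Connection fromBytes(byte[] data);
    }

    private final RedisClient redis;
    private final ConnectionCodec codec;

    public RedisBackedConnectionStore(int capacity, long staleThresholdInSeconds,
            RedisClient redis, ConnectionCodec codec) {
        // the super constructor arguments may differ between Californium versions
        super(capacity, staleThresholdInSeconds);
        this.redis = redis;
        this.codec = codec;
    }

    @Override
    public boolean put(Connection connection) {
        boolean added = super.put(connection);
        if (added && connection.getConnectionId() != null) {
            redis.set(connection.getConnectionId().getBytes(), codec.toBytes(connection));
        }
        return added;
    }

    @Override
    public boolean update(Connection connection, InetSocketAddress newPeerAddress) {
        boolean updated = super.update(connection, newPeerAddress);
        if (updated && connection.getConnectionId() != null) {
            redis.set(connection.getConnectionId().getBytes(), codec.toBytes(connection));
        }
        return updated;
    }

    @Override
    public boolean remove(Connection connection, boolean removeFromEstablishedSessions) {
        boolean removed = super.remove(connection, removeFromEstablishedSessions);
        if (removed && connection.getConnectionId() != null) {
            redis.del(connection.getConnectionId().getBytes());
        }
        return removed;
    }

    @Override
    public Connection get(ConnectionId cid) {
        Connection connection = super.get(cid);
        if (connection == null) {
            // local miss: another instance may own the connection, try Redis
            byte[] data = redis.get(cid.getBytes());
            if (data != null) {
                connection = codec.fromBytes(data);
                super.put(connection);
                // here a 'clear in-memory cache for this CID' event would be
                // published to the other instances (omitted in this sketch)
            }
        }
        return connection;
    }
}
```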

We are aware it will use Redis intensively and decrease DTLS communication performance.

This solution relies heavily on the current InMemoryConnectionStore implementation; nevertheless it only uses 'super.xxx' methods that are defined in the ResumptionSupportingConnectionStore interface.

We are trying to estimate whether this approach is future proof - it is very sensitive to the ResumptionSupportingConnectionStore interface / InMemoryConnectionStore implementation - and whether there are corner cases where it could not work.

Thanks a lot for any feedbacks 🙏

@boaks
Contributor

boaks commented Mar 10, 2023

Some of our clients do not enforce a new handshake when their devices wake up, even though their IP address may have changed - our LoadBalancer is IP-based.

Do these devices use DTLS CID? Otherwise I don't see how that works.

We analyzed the DTLS CID LoadBalancer solution, but it does not fit our needs: we do not use Kubernetes (for the auto-discovery), the persistence mechanism is sensitive to non-graceful shutdowns, even with the BLUE/GREEN mechanism there will be some 'misses', and it is not compatible with auto-scaling (when decreasing the number of instances).

Let me say:
The cid-load-balancer approach was chosen because the alternative approach of syncing the nodes' connection state is much more complicated. Topics such as "auto-scaling" are nice in theory, and great if they work. But once your boss becomes aware of the bill, I don't think this solution will find too many friends. Considering auto-scaling indicates that the solution is cost sensitive. I would guess that the slowdown + auto-scaling will cost much more in operation than a "slim fixed cid load balancer solution".

We are aware it will use Redis intensively and decrease DTLS communication performance.

Do you already have numbers for that slowdown?
My concern with that solution is still that it will easily become the main door for (D)DoS attacks. A node never knows whether a record with an "unknown cid" must be fetched; that causes a lot of traffic and could easily be used for an attack. In the cid-load-balancer case this is also not for free, but it only costs the forwarding of a UDP message on the application layer, which is much faster than your plans.

This solution relies heavily on the current InMemoryConnectionStore implementation;

Would it be possible for you to consider using InMemoryReadWriteLockConnectionStore? It's very similar, but with redesigned synchronization and the also new LeastRecentlyUpdatedCache. The old implementation pair, InMemoryConnectionStore and LeastRecentlyUsedCache, used "time updates" (write operations) on "reads", which makes clean read/write locking impossible.
Just to mention, some previous similar approaches created a "sync lock" including a remote access, which is not a "performance decrease", it's a disaster. Every school kid could kill your cluster (OK, maybe every student with a security background).

We are trying to estimate whether this approach is future proof

Well, once it's clear that InMemoryReadWriteLockConnectionStore is used, it will at least require a major version to change it in an incompatible way. I would propose that you implement good test coverage on your side in order to verify "minor" changes and to report if something gets broken in your case.
For now, I don't see plans that would alter that API/implementation. Not sure when work on DTLS 1.3 will start and whether that would require a change. Overall, I don't expect this to change ... but the future is unwritten.

@boaks
Contributor

boaks commented Mar 10, 2023

  • storing the session in Redis in the put/update methods after calling super.put/update
  • removing the session from Redis in the remove method after calling super.remove
  • in the get(ConnectionId cid) method: if super.get returns nothing (i.e. not in the local cache), checking Redis and restoring the connection if it has been stored there

Please note, there is a very fundamental difference between the DTLSSession and the DTLSContext associated with the Connection.
In the far past, the idea was to share the DTLSSession. This is only accessed during the handshake and so the usage is very limited. It requires that the devices do frequent resumption handshakes.
Sharing the Connection or DTLSContext is far more inefficient and sophisticated than sharing the DTLSSession. Any access may change the "record sequence number" state and requires that to be synced. Otherwise the vulnerability to replay attacks increases.
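To make the replay concern concrete, here is a simplified, Californium-independent sketch of the DTLS sliding anti-replay window (RFC 6347, section 4.1.2.6) that this receive-side "record sequence number" state drives; an instance restoring a stale copy of this state would accept records that another instance has already processed:

```java
/**
 * Simplified DTLS anti-replay window, for illustration only.
 * Bit i of 'window' records whether sequence number (highestSeen - i)
 * has already been received.
 */
public class ReplayWindow {

    private static final int WINDOW_SIZE = 64;

    private long highestSeen = -1; // highest record sequence number seen so far
    private long window = 0;       // bitmask covering the WINDOW_SIZE numbers below it

    /** Returns true if the record is new, false if it is a replay or too old. */
    public synchronized boolean markIfNew(long sequenceNumber) {
        if (sequenceNumber > highestSeen) {
            long shift = sequenceNumber - highestSeen;
            window = (shift >= WINDOW_SIZE) ? 0 : window << shift;
            window |= 1; // mark the new highest sequence number as received
            highestSeen = sequenceNumber;
            return true;
        }
        long offset = highestSeen - sequenceNumber;
        if (offset >= WINDOW_SIZE) {
            return false; // too old to verify, reject
        }
        long bit = 1L << offset;
        if ((window & bit) != 0) {
            return false; // already received: replay
        }
        window |= bit;
        return true;
    }
}
```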

@cyril2maq
Author

Thank you very much for taking the time to answer.

Do these devices use DTLS CID?

Yes.

The cid-load-balancer approach was chosen

I fully understand and can't disagree. We just have some specific use cases, constraints and environment which lead us to evaluate alternative solutions.

My concern with that solution is still, that this will be easily the main door for (D)DoS attacks

I agree. We also manage other types of connectivity and we face similar challenges. We use firewall protection and in-memory rate limiting to mitigate the risks. Regarding DTLS call flows, don't we face the same (D)DoS risks with the abbreviated handshake and a shared DTLSSession (discussed in #1345)?

Do you already have numbers for that slowdown?

Not for now, but when we have some, I will share them here.

Would it be possible for you to consider using InMemoryReadWriteLockConnectionStore?

Thanks, we missed this class! We will use it.

Overall, I don't expect this to change

That is a valuable information for us.

... but the future is unwritten

Otherwise, our job would be much more boring 😉

@boaks
Contributor

boaks commented Mar 13, 2023

Thank you very much for taking the time to answer.

You're welcome.

We use firewall protection and in-memory rate limiting to mitigate the risks.

I may be wrong, but this smells to me more like adding availability than performance. If so, that's a different, maybe less complicated, challenge.

Regarding DTLS call flows, don't we face the same (D)DoS risks with the abbreviated handshake and a shared DTLSSession (discussed in #1345)?

In my opinion the risk is very different.

The DTLS Session is created once and then reused "unchanged". Therefore a DTLS Session may be shared ahead of time, which then doesn't require a "remote request for a received handshake message". Even if that "weak consistency" (it's not guaranteed that the DTLS Session is really shared at that time) fails, the server will just fall back to a full handshake. If that happens only very rarely, it does no harm.

The DTLS Context is split into two parts: the read/write keys (they don't change without a (resumption-) handshake), and the "record sequence number" mechanism, which changes on every received or sent record. Additionally, the DTLS Context keeps the "record sequence number" of a received "close_notify" message in order to process "reordered" last application messages.

Exactly that "record sequence number" is the major trouble point. I assume for many this is not "that serious", and for a good reason ("the ultimately available and scaling server") they convince themselves they know how to deal with it.
The DTLSContext already offers functions to read/write all of its data and functions to only read/write the "record sequence number" state.
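For illustration, a conceptual split of that per-connection state into its stable and volatile parts, using hypothetical names rather than Californium's DTLSContext API:

```java
/**
 * Conceptual split of the per-connection DTLS state described above
 * (hypothetical types, not Californium classes).
 */
public class SharedDtlsState {

    /** Stable part: changes only with a (resumption) handshake, could be shared once. */
    public static final class KeyMaterial {
        final byte[] readKey;
        final byte[] writeKey;

        KeyMaterial(byte[] readKey, byte[] writeKey) {
            this.readKey = readKey;
            this.writeKey = writeKey;
        }
    }

    /** Volatile part: changes on every sent or received record and must be kept in sync. */
    public static final class SequenceNumberState {
        long nextWriteSequenceNumber;       // next record sequence number to send
        long highestReadSequenceNumber;     // highest received so far
        long readReplayWindowBits;          // anti-replay bitmask (see the ReplayWindow sketch)
        Long closeNotifySequenceNumber;     // of a received close_notify, if any
    }
}
```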

Let's see, what happens ;-).

@boaks
Contributor

boaks commented May 11, 2023

Any progress?
Any interface or API extension required?
Or can we close it?

@cyril2maq
Author

We did a Redis-based implementation and our functional tests are OK with the current API.
We still need to measure the performance impact, but this is beyond this specific issue.
So yes, we can close it; thanks again for your support!

@boaks
Contributor

boaks commented May 12, 2023

Thanks!

@boaks
Contributor

boaks commented May 17, 2023

Just to mention, maybe it helps:

Californium has a simple user-space NAT. I use it frequently to test network issues caused by address changes, also under load.

And Californium has a "benchmark-client" in cf-extplugtest-client. I use this to test "handovers" or "failovers" under load.

The critical points in your approach will be when the real communication changes faster than you align the states, or, on a breakdown, when the real communication is ahead of the state written to the cache. I guess both tools may be useful for tests.

@sbernard31
Contributor

@cyril2maq

We still need to measure the performance impact, but this is beyond this specific issue.

Do you have any news about that ☝️

@cyril2maq
Author

Regarding performance, we did two kinds of tests:

  • 'overall' performance tests with this Redis ConnectionStore implementation, i.e. test cases including other middleware (database and event bus), for instance the 'register + send' use case. As the other middleware calls take much more time than Redis to handle the flow triggered by any LwM2M message (like storing data, sending a status update event...), the impact of this InMemory-to-Redis implementation is almost non-existent (about 1%).

  • a FOTA use case, which heavily uses the ConnectionStore with no other time-consuming processing. In this case the difference is very noticeable: we got a factor of 10x in our test environment.

@boaks
Contributor

boaks commented Dec 8, 2023

Thanks for the update!

including other middleware ... As the other middleware calls take much more time

Depending on the other middleware, that's also my experience.

factor 10x

OK. AFAIK (but I may be wrong) Redis is TCP based?
Do you forward the "sequence number changes" directly or "delayed in bulk"?
If your approach is the direct one, I guess using something UDP-based would also be faster.

@cyril2maq
Author

cyril2maq commented Dec 8, 2023

AFAIK (but I may be wrong) Redis is TCP based?

Yes, Redis is TCP based.

Do you forward the "sequence number changes" directly or "delayed in bulk"?

For now we use a simple 'direct' implementation. Indeed, the delayed bulk mechanism would increase performance in this context; we'll keep that in mind if we encounter performance issues in the future!

@boaks
Contributor

boaks commented Dec 8, 2023

Thanks a lot for the very interesting and useful update about the achieved performance!

@sbernard31
Contributor

sbernard31 commented Dec 8, 2023

@cyril2maq do you consider sharing the code?

(not necessarily as a Scandium contribution, as I think this may be out of scope)

@cyril2maq
Author

No problem to share, but what kind of sharing?
The Redis implementation we wrote is linked to our software implementation. Several elements are very specific; e.g.

  • it is Spring-ified
  • we implemented metrics for cache hit/miss
  • we use a 'notificationService' to send an event on the bus so that the other instances are aware that the 'owner' instance of a specific cid has changed (and thus remove this cid from their local caches); a sketch of such an invalidation channel is shown below
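For illustration only, a minimal sketch of such an invalidation channel, assuming Redis pub/sub is used as the bus (the actual 'notificationService' may use a different event bus entirely); the class, channel, and callback names are hypothetical:

```java
import redis.clients.jedis.Jedis;
import redis.clients.jedis.JedisPubSub;

/**
 * Hypothetical CID-ownership invalidation channel based on Redis pub/sub:
 * the instance that restores a connection from Redis publishes the CID,
 * and every other instance evicts that CID from its local in-memory store.
 */
public class CidInvalidationChannel {

    private static final String CHANNEL = "dtls-cid-owner-changed";

    /** Placeholder for the local cache eviction hook. */
    public interface LocalEvictor {
        void evictLocally(String cidAsHex);
    }

    /** Called by the new owner right after restoring the connection from Redis. */
    public static void publishOwnershipChange(Jedis jedis, String cidAsHex) {
        jedis.publish(CHANNEL, cidAsHex);
    }

    /** Runs on every instance; Jedis' subscribe() blocks, so use a dedicated thread. */
    public static void listen(Jedis jedis, LocalEvictor evictor) {
        jedis.subscribe(new JedisPubSub() {
            @Override
            public void onMessage(String channel, String cidAsHex) {
                evictor.evictLocally(cidAsHex);
            }
        }, CHANNEL);
    }
}
```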

@sbernard31
Contributor

I was thinking about sharing the code as a copy/paste of the class, or a link to a public repository which contains it, but I had in mind that this was "just" a ConnectionStore based on Redis.

@cyril2maq
Author

You are right, this is not just a "RedisConnectionStore"; it is more an "InMemoryWithRedisFallbackConnectionStore":

  • it extends InMemoryReadWriteLockConnectionStore (thx @boaks for the hint)
  • when we need to retrieve a cid, we just retrieve it from memory, BUT if it is not present in memory (meaning another instance got it) then we retrieve it from Redis (and tell the other instances to remove it from their local caches)
  • when updating a connection, we still update it in memory (by calling super.update(...)) and then we also update it in Redis

It is this updating part which is costly, and 90% useless (if you use a sticky proxy), as we will not read the result back from Redis because it will already be in the memory of the instance. Hence @boaks' suggestion to store it in Redis with a bulk/delayed mechanism (and only write once even if a cid had several updates during this delayed period).
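A minimal sketch of that bulk/delayed (write-behind) idea, with hypothetical names and a placeholder RedisWriter instead of a real Redis client; note the trade-off that any state still pending when an instance crashes is lost, which widens the replay window discussed earlier:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

/**
 * Write-behind buffer for connection state: the update path only records the
 * latest serialized state per CID, and a periodic flush writes each dirty CID
 * once, no matter how many updates happened within the interval.
 */
public class DelayedRedisWriter {

    /** Placeholder for the actual Redis access. */
    public interface RedisWriter {
        void write(String cidAsHex, byte[] serializedConnection);
    }

    private final Map<String, byte[]> pending = new ConcurrentHashMap<>();
    private final ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
    private final RedisWriter redis;

    public DelayedRedisWriter(RedisWriter redis, long flushIntervalMillis) {
        this.redis = redis;
        scheduler.scheduleAtFixedRate(this::flush, flushIntervalMillis, flushIntervalMillis,
                TimeUnit.MILLISECONDS);
    }

    /** Called from the connection store's update path; overwrites any older pending state. */
    public void enqueue(String cidAsHex, byte[] serializedConnection) {
        pending.put(cidAsHex, serializedConnection);
    }

    private void flush() {
        // remove-and-write: each CID is written once with its latest state
        for (String cid : pending.keySet()) {
            byte[] latest = pending.remove(cid);
            if (latest != null) {
                redis.write(cid, latest);
            }
        }
    }

    /** Stop the scheduler and push out whatever is still pending. */
    public void stop() {
        scheduler.shutdown();
        flush();
    }
}
```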

@boaks
Contributor

boaks commented Dec 8, 2023

BUT if it is not present in memory (meaning another instance got it) then we retrieve it from Redis

;-(.

Maybe you write me an e-mail and I write you my opinion back.
