
feat: Implementation of heartbeat mechanism as part of KLIP-12 #4173

Merged

merged 2 commits into confluentinc:master on Jan 16, 2020

Conversation

vpapavas
Member

@vpapavas vpapavas commented Dec 19, 2019

Description

Initial implementation of heartbeat mechanism as part of #3959

Added two new endpoints, /heartbeat used for registering received heartbeats and /clusterStatus used for reporting current cluster status. All the logic is in HeartbeatHandler.

Determining whether a server is UP or DOWN
Currently, the policy for deciding whether a node is UP or DOWN is simplistic and will need to evolve: if a server misses more than KSQL_HEARTBEAT_MISSED_THRESHOLD_CONFIG consecutive heartbeats, it is marked as DOWN. However, if it receives even one heartbeat before the window closes, it is marked as UP. So a server may miss all heartbeats except the last and still be marked as UP. We want to improve this by also counting the received heartbeats and applying a threshold to them. We will update the policy after performing real-world tests.
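The policy described above can be sketched as follows. This is a simplified illustration under stated assumptions: the class and method names are invented, not KSQL's actual HeartbeatHandler API, and the per-gap reset behavior mirrors the description (a late heartbeat clears earlier misses):

```java
import java.util.NavigableSet;

// Hypothetical sketch of the "missed heartbeats" policy described above.
public class HeartbeatPolicySketch {
    // A server is DOWN if it missed more than `missedThreshold` consecutive
    // heartbeat intervals inside the window; one heartbeat received near the
    // window end is enough to be marked UP again (the known weakness above).
    public static boolean isHostUp(NavigableSet<Long> heartbeats,
                                   long windowStart, long windowEnd,
                                   long intervalMs, int missedThreshold) {
        long prev = windowStart;
        int missed;
        for (long ts : heartbeats.subSet(windowStart, true, windowEnd, true)) {
            // Gap between consecutive heartbeats, in whole intervals.
            missed = (int) ((ts - prev - 1) / intervalMs);
            if (missed > missedThreshold) {
                return false;              // too many consecutive misses
            }
            prev = ts;
        }
        // Check the frame from the last received heartbeat to the window end.
        missed = (int) ((windowEnd - prev - 1) / intervalMs);
        return missed <= missedThreshold;
    }
}
```

A host that heartbeats every interval stays UP; a host that is silent for more than `missedThreshold` intervals is reported DOWN.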

Cluster membership
Currently, cluster membership is determined via the KStreams metadata of the persistent queries.

Testing done

Unit tests for the resources and a functional test for the HeartbeatHandler.
The functional test runs two servers that send heartbeats and checks the cluster status response.

Manual testing: Three KSQL servers (A, B and C) on EC2.
Correctness:

  1. Kill server A and check that the other servers (B, C) report it as down via curl -sX GET "http://localhost:8088/clusterStatus". Bring server A back up and check the curl result again to verify that the other servers now see it as up.
  2. Use iptables to block outgoing traffic from A to B. Use curl to check that B marks A as down and that C still marks A as up (since A still sends traffic to C). Restore traffic to B and check that A is marked as up again.

Time to detect change in status:
300 ms to detect that A is down; 150 ms to detect that it is up again.

Reviewer checklist

  • Ensure docs are updated if necessary. (eg. if a user visible feature is being added or changed).
  • Ensure relevant issues are linked (description should include text like "Fixes #")

@vpapavas vpapavas requested a review from a team as a code owner December 19, 2019 01:15
@vinothchandar vinothchandar self-assigned this Dec 19, 2019
Contributor

@agavra agavra left a comment


Thanks @vpapavas! I think the bones are there, but the complexity and synchronization models can be improved IMO. Let me know if you have any questions :)

HostInfo hostInfo = new HostInfo(hostInfoEntity.getHost(), hostInfoEntity.getPort());
long timestamp = request.getTimestamp();

heartbeatHandler.getReceivedHeartbeats().putIfAbsent(hostInfo, new TreeMap<>());
Contributor

it might make sense to split the producer/consumer data structures of heartbeats. I would have the heartbeat handler maintain a circular buffer of heartbeats that can have many writers and then whenever you runOneIteration (see comment about AbstractScheduledService) you would flush that buffer into the tree map locally. This would probably result in fewer tree rebalances and a much simpler concurrency model. It also makes this method non-blocking (in the case that heartbeats is locked)
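One way to read this suggestion is sketched below. This is a hedged illustration, not the PR's code: the names are invented, and a ConcurrentLinkedQueue stands in for the "circular buffer" of heartbeats:

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.Map.Entry;
import java.util.TreeMap;
import java.util.concurrent.ConcurrentLinkedQueue;

// Sketch: many request threads append heartbeats lock-free; a single
// scheduled thread drains them into an ordered TreeMap each iteration.
public class HeartbeatBufferSketch {
    private final ConcurrentLinkedQueue<Entry<String, Long>> buffer =
        new ConcurrentLinkedQueue<>();
    private final TreeMap<Long, String> ordered = new TreeMap<>();

    // Called from /heartbeat request threads: non-blocking append.
    public void registerHeartbeat(String host, long timestampMs) {
        buffer.add(new SimpleEntry<>(host, timestampMs));
    }

    // Called from the single processing thread (e.g. runOneIteration):
    // flush the buffer into the local, ordered view.
    public int flush() {
        int drained = 0;
        Entry<String, Long> e;
        while ((e = buffer.poll()) != null) {
            ordered.put(e.getValue(), e.getKey());
            drained++;
        }
        return drained;
    }

    public TreeMap<Long, String> orderedView() {
        return ordered;
    }
}
```

Because only one thread ever touches the TreeMap, no synchronization is needed on it, and the request path never blocks.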

Member Author

Yes, I had privately suggested all three design choices and we decided to go with the simplest and measure runtime/throughput and then go from there. The requirements to keep in mind are:

  1. The heartbeats per host need to be ordered for processing.
  2. We have multiple producers with out-of-order writes.
  3. We have one consumer.

The choices in order of (reasoning) complexity are:

  1. Synchronize with ConcurrentHashMap + TreeMap
  2. ConcurrentHashMap + ConcurrentNavigableMap
  3. Use read/write locks and unordered concurrent list implementation. It will get ordered and drained by processHeartbeat thread.

Contributor

@agavra agavra Dec 20, 2019

As discussed offline - I agree that the criteria for choosing a solution should be simplicity, I just think that option 3 is the simplest. You don't have to worry about concurrency on any complicated (map) data structure. The critical section is really stupid, just append at end and poll from start. It's the same idea as Kafka 😂

EDIT: also my suggestion is not to bother with R/W locks - appending to a list is fast enough we can just hard lock

Contributor

I will leave this to @vpapavas .

The critical section is really stupid, just append at end and poll from start.

No comments. :)

Member Author

Tracked in issue #4277.

I feel the first solution might be the best because:

With TreeMap:
registerHeartbeat: lock, put into TreeMap, unlock (critical section is O(log N))
processHeartbeat: lock, copy subMap, clear the initial map, unlock (critical section is O(N))

With List:
registerHeartbeat: lock, add to List, unlock (critical section is O(1))
processHeartbeat: lock, sort, copy to a new list, clear the initial list, unlock (critical section is O(N log N))

Contributor

@agavra agavra Jan 11, 2020

@vpapavas three points:

  • With the second option, if you're okay with a double copy, the critical section becomes O(N): you just drain the list inside the critical section and sort it outside of it.
  • The Ns aren't necessarily the same: the N in the list approach is bounded by the number of heartbeats that arrive in one interval, while the N in the big TreeMap approach is bounded by all of the heartbeats you've seen.
  • There's also the frequency argument: hypothetically you're receiving heartbeats far more often than you're processing them (it's probably a write-heavy system).

But I digress, I'm not sure perf should be the driving factor here but rather ease of implementation. If you feel that the current implementation is simpler, then we should stick to that. It would be a fun exercise to write them both and compare.
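The "drain in the critical section, sort outside it" variant discussed above might look like the following. This is a sketch with invented names, not the code under review:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Sketch: writers append under a plain lock (O(1) critical section); the
// single consumer swaps the list out (O(N) at most) and sorts the snapshot
// outside the lock (O(N log N), but uncontended).
public class DrainAndSortSketch {
    private final Object lock = new Object();
    private List<Long> heartbeats = new ArrayList<>();

    public void registerHeartbeat(long timestampMs) {
        synchronized (lock) {
            heartbeats.add(timestampMs);   // O(1) under the lock
        }
    }

    public List<Long> drainSorted() {
        List<Long> snapshot;
        synchronized (lock) {
            snapshot = heartbeats;         // swap the reference, no element copy
            heartbeats = new ArrayList<>();
        }
        Collections.sort(snapshot);        // sort outside the critical section
        return snapshot;
    }
}
```

Swapping the list reference instead of copying element-by-element keeps the consumer's critical section minimal even when many heartbeats accumulated.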

@agavra agavra self-assigned this Dec 19, 2019
Contributor

@vinothchandar vinothchandar left a comment

Left some comments. Maybe one more spin and we are home!

There are good comments here from @agavra. Let's file issues for these so we can track them for later, even if we don't do this right away?

Contributor

@vinothchandar vinothchandar left a comment

One real question and a bunch of cleanup suggestions. Great job @vpapavas! Otherwise LGTM.

final URI remoteUri = buildLocation(localURL, status.getHostInfoEntity().getHost(),
status.getHostInfoEntity().getPort());
serviceContext.getKsqlClient().makeHeartbeatRequest(remoteUri, localHostInfo,
System.currentTimeMillis());
Contributor

so the heartbeat timestamp is sent in the system's default time zone.. so the UTC issue above would be a real problem..

Member Author

Aaa good catch!

prev = ts;
}
// Check frame from last received heartbeat to window end
if (windowEnd - prev - 1 > 0) {
Contributor

Okay, fair.. I still have some gaps in understanding. The if-check here seems to check whether the difference between the last heartbeat (which can be windowStart or the real last heartbeat) and windowEnd is non-zero. I think this will mostly pass in the normal mode of operation, right? There will usually be some non-zero gap between the last heartbeat and windowEnd, as far as I can imagine.

The assignment below resets missedCount (presumably since we are interested in consecutive misses only). I am wondering if that's the right thing to do: could it be that we already accumulated a missedCount in the for loop above and then reset it to 0? Say heartbeatSendIntervalMs is 10 ms and heartbeats arrive perfectly; then windowEnd - prev is, say, 10 ms and we set missedCount = (10 - 1) / 10 = 0, even though we detected enough consecutive misses above?
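The concern above can be made concrete with a small sketch. This is a hypothetical reconstruction from the snippet, not the actual HeartbeatAgent code: if missedCount is reassigned rather than accumulated, misses detected inside the loop are discarded by the final window-end check.

```java
import java.util.List;

public class MissedCountSketch {
    // Reconstruction of the questioned logic: missedCount is overwritten
    // at each step, so only the *last* gap decides the final value.
    public static long lastGapOnly(List<Long> heartbeats, long windowStart,
                                   long windowEnd, long intervalMs) {
        long prev = windowStart;
        long missedCount = 0;
        for (long ts : heartbeats) {
            missedCount = (ts - prev - 1) / intervalMs;  // reset, not +=
            prev = ts;
        }
        // Check the frame from the last received heartbeat to the window end.
        if (windowEnd - prev - 1 > 0) {
            missedCount = (windowEnd - prev - 1) / intervalMs;
        }
        return missedCount;
    }
}
```

With intervalMs = 10, a window [0, 70], and heartbeats only at 50 and 60, the loop first computes 4 missed intervals, but the final assignment overwrites that with 0, so the earlier misses are lost.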

@vpapavas vpapavas force-pushed the heartbeat branch 2 times, most recently from c5ce119 to 86fe4c8 Compare January 14, 2020 04:01
Contributor

@agavra agavra left a comment

Mostly LGTM - thanks @vpapavas! This is a big step forward in HA :) my comments are all about the code (nothing structural) and a few bugs (I think) that I identified.

@@ -0,0 +1,496 @@
/*
* Copyright 2019 Confluent Inc.
Contributor

nit: 2020 🎉 (all of the new files)

Member Author

The files were created in 2019 :)

}

@Test(timeout = 60000)
public void shouldMarkRemoteServerAsUpThenDownThenUp() {
Contributor

We can do this in a follow-up PR, but I'm worried that these tests will add a non-trivial amount of time to our testing suite without necessarily adding much coverage. That is, if I'm understanding correctly, this is trying to cover the algorithm that computes missing intervals, which HeartbeatAgentTest already does. So while this test is great during development to make sure that things work, I don't know if we need it in our testing suite.

also any test that involves timing is prone to flakiness - can you please run this test at least 50 times and make sure it passes 100% of the time?

Member Author

These tests don't actually check the algorithm for missing heartbeats; rather, they exercise the whole heartbeating process. They use the /clusterStatus endpoint to check whether servers are correctly reported as alive or dead, testing the interplay of all three services.

when(query1.getAllMetadata()).thenReturn(allMetadata1);
when(streamsMetadata1.hostInfo()).thenReturn(remoteHostInfo);

DiscoverClusterService discoverService = heartbeatAgent.new DiscoverClusterService();
Contributor

woah I've never seen this syntax before 😂 that's kinda cool.

Comment on lines 216 to 218
if (shouldRetry(readTimeoutMs, e)) {
postAsync(path, jsonEntity, calcReadTimeout(readTimeoutMs));
}
Contributor

It's usually better to retry in a loop than in the exception block. Can you use a framework like Retryer or our RetryUtil to handle this?

Also, even if it eventually succeeds, won't it still throw because we fall through...?
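The loop-based retry shape suggested here might look like the following. This is a generic sketch with invented names; it does not show the Retryer/RetryUtil APIs or the actual postAsync code:

```java
import java.util.function.Supplier;

public class RetrySketch {
    // Retry in a loop instead of recursing from the catch block, so a
    // success on a later attempt returns normally rather than still
    // falling through to the failure path.
    public static <T> T withRetries(Supplier<T> attempt, int maxAttempts) {
        RuntimeException last = null;
        for (int i = 0; i < maxAttempts; i++) {
            try {
                return attempt.get();   // success exits immediately
            } catch (RuntimeException e) {
                last = e;               // remember the failure and retry
            }
        }
        throw last;                     // only fail after all attempts
    }
}
```

The key difference from retrying inside the catch block is that control flow after a successful retry is unambiguous: the method returns, and nothing "falls through" to throw.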

Member Author

I followed what the prior code did here

fixed tests

use application_server config to determine local host address

fixed compile issues

added extra tests

test

fixed failing test

added debug logging, made critical section smaller

addressed almogs comments

added return
@vpapavas vpapavas changed the title feat: Initial implementation of heartbeat mechanism as part of KLIP-12 feat: Implementation of heartbeat mechanism as part of KLIP-12 Jan 16, 2020
@vpapavas vpapavas merged commit 37c1eaa into confluentinc:master Jan 16, 2020