Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rewrite ping #41

Merged
merged 41 commits into from
Nov 24, 2023
Merged

Rewrite ping #41

merged 41 commits into from
Nov 24, 2023

Conversation

arcusfelis
Copy link
Contributor

Changes:

  • Introduce cets_ping with better node pinging strategy. It interrupts ping, if pang is received. It does not use net_adm:ping/1 to avoid going though net_kernel and potentially blocking it for a couple of seconds if the remote node is not available (couple of seconds adds up fast, if we have a lot of nodes). The current strategy is try to call EPMD first to check if the remote address is resolvable.
  • Improve logging. Report nodeup/nodedown with downtime duration and time since the node start. It is especially useful when tweaking probes and upgrading k8s.
  • Added node_down_history field to CETS. It looks like currently unused, but the purpose is to be able to analyse it manually and check after netsplits. It is added, because before we had conflict_nodes/conflict_tables persistently (i.e. after a netsplit, nodes see different lists). Still, I cannot reproduce conflict bug anymore, after tens of runs - so works as expected.

Copy link

codecov bot commented Nov 20, 2023

Codecov Report

All modified and coverable lines are covered by tests ✅

Comparison is base (2ca3105) 98.00% compared to head (4b1d86d) 98.19%.

Additional details and impacted files
@@            Coverage Diff             @@
##             main      #41      +/-   ##
==========================================
+ Coverage   98.00%   98.19%   +0.18%     
==========================================
  Files           9       10       +1     
  Lines         652      719      +67     
==========================================
+ Hits          639      706      +67     
  Misses         13       13              

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@arcusfelis arcusfelis mentioned this pull request Nov 20, 2023
Copy link
Member

@chrzaszcz chrzaszcz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good in general. The test suite is becoming lengthy, and I was getting lost in the test cases...

src/cets_ping.erl Outdated Show resolved Hide resolved
src/cets_ping.erl Outdated Show resolved Hide resolved
src/cets_ping.erl Outdated Show resolved Hide resolved
test/cets_SUITE.erl Outdated Show resolved Hide resolved
test/cets_SUITE.erl Outdated Show resolved Hide resolved
test/cets_SUITE.erl Outdated Show resolved Hide resolved
test/cets_SUITE.erl Outdated Show resolved Hide resolved
test/cets_SUITE.erl Outdated Show resolved Hide resolved
test/cets_SUITE.erl Outdated Show resolved Hide resolved
Add setup_two_nodes_and_discovery helper function
Add disco_nodeup_triggers_check_and_get_nodes testcase
There is a very high chance that there could be something new in the DB
on nodeup - so trigger the DB read. Useful if we wanna know IPs after nodeup
and IPs are stored in DB
Copy link
Member

@chrzaszcz chrzaszcz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👌

@chrzaszcz chrzaszcz merged commit 0202120 into main Nov 24, 2023
8 checks passed
@chrzaszcz chrzaszcz deleted the rewrite-ping branch November 24, 2023 13:48
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants