[STALE - DO NOT REVIEW][P2P] Improve RainTree Testing Framework (Issue#85) #179

Olshansk · 2022-08-27T22:33:48Z

Description

Add documentation and improve the P2P RainTree testing framework.

Issue

Incremental step to fixing issue/85

Type of change

Please mark the options that are relevant.

List of changes

Primary Changes

Add a telemetry event for RainTree send which can be used for:
- Telemetry of how RainTree is performing
- More deterministic testing (i.e. matching on exact # messages received and sent over the network)
Add a bit of documentation on how the test framework works in the README
Add inline documentation related to details of the test framework
Reorganize helpers for the testing framework so its easier to follow
Single source of truth for the serviceURL between the shared code and the p2p testing framework

Flyby changes

Add a few TODOs throughout the code
Remove p2pConfig in the p2p module struct
Group p2p tests together in Makefile
Remove some legacy pre2p code

Testing

make test_p2p
make test_all
LocalNet w/ all of the steps outlined in the README

Checklist

I have performed a self-review of my own code
I have commented my code, particularly in hard-to-understand areas
I have tested my changes using the available tooling
If applicable, I have made corresponding changes to related local or global README
If applicable, I have added new diagrams using mermaid.js
If applicable, I have added tests that prove my fix is effective or that my feature works

Olshansk · 2022-08-27T22:49:34Z

@andrewnguyen22 Per your request to add documentation around how the test framework works, I realized there are ways to make the framework:

Easier to understand
More robust and deterministic
More extensible
More visibility by adding telemetry, which can also be used for testing

I think that this PR will require a fair amount of thought and effort to understand, so please spend some time to look over it when you have a chance. I think it makes our testing much more robust and will give us the power to track concrete visibility into how messages are being sent.

Once this is reviewed and merged into main, we can continue iterating on the redundancy layer in a follow up PR. You might need to update TestRainTreeCommConfig to account for redundancy logic or simply update the # network reads/writes appropriately.

An example (from the tests):

func TestRainTreeNetworkCompleteTwentySevenNodes(t *testing.T) {
	// 	                                                                                                                    val_1
	// 	                                     ┌────────────────────────────────────────────────────────────────────────────────┴───────────────────────────────────────┬───────────────────────────────────────────────────────────────────────────────────────────────────────────┐
	//                                    val_10                                                                                                                   val_1                                                                                                       val_19
	//            ┌──────────────────────────┴──────────────┬──────────────────────────────────────┐                                         ┌────────────────────────┴────────────┬──────────────────────────────────┐                                  ┌────────────────────────┴──────────────┬────────────────────────────────────┐
	//          val_16                                    val_10                                 val_22                                     val_7                                 val_1                             val_13                             val_25                                  val_19                                val_4
	//   ┌────────┴─────┬───────────┐              ┌────────┴─────┬───────────┐           ┌────────┴─────┬───────────┐              ┌────────┴────┬──────────┐             ┌───────┴────┬─────────┐          ┌────────┴─────┬──────────┐         ┌───────┴─────┬──────────┐             ┌────────┴─────┬───────────┐          ┌───────┴────┬──────────┐
	// val_20         val_16      val_24         val_14         val_10      val_18      val_26         val_22      val_12         val_11        val_7      val_15        val_5        val_1     val_9      val_17         val_13     val_3     val_2         val_25     val_6         val_23         val_19      val_27     val_8        val_4      val_21
	originatorNode := validatorId(1)
	var expectedCalls = TestRainTreeCommConfig{
		originatorNode:  {0, 6}, // {num_messages_received_over_network, num_messages_sent_over network}
		validatorId(2):  {1, 0},
		validatorId(3):  {1, 0},
		validatorId(4):  {1, 2},
		validatorId(5):  {1, 0},
		validatorId(6):  {1, 0},
		validatorId(7):  {1, 2},
		validatorId(8):  {1, 0},
		validatorId(9):  {1, 0},
		validatorId(10): {1, 4},
		validatorId(11): {1, 0},
		validatorId(12): {1, 0},
		validatorId(13): {1, 2},
		validatorId(14): {1, 0},
		validatorId(15): {1, 0},
		validatorId(16): {1, 2},
		validatorId(17): {1, 0},
		validatorId(18): {1, 0},
		validatorId(19): {1, 4},
		validatorId(20): {1, 0},
		validatorId(21): {1, 0},
		validatorId(22): {1, 2},
		validatorId(23): {1, 0},
		validatorId(24): {1, 0},
		validatorId(25): {1, 2},
		validatorId(26): {1, 0},
		validatorId(27): {1, 0},
	}
	testRainTreeCalls(t, originatorNode, expectedCalls)
}

andrewnguyen22 · 2022-08-29T20:35:16Z

I see the documentation and this is a step in the right direction, the mocking is difficult for me to follow the path.

Can we discuss offline sometime in order to move this forward?

Olshansk · 2022-08-30T11:50:59Z

Can we discuss offline sometime in order to move this forward?

Yea, definitely. I might not be able to do so this week but we'll make time to go through it.

the mocking is difficult for me to follow the path.

Since I know you're blocked by my reviews, and we can't sync offline on it yet, I was hoping to make a request that you try to dive deeper into the library in the meantime.

For example, pull the branch and:

Modify unclear variable names so they're more semantic and easy to understand
Modify/refactor function names that are unclear
Add comments / TODOs that don't make sense

I realize this ask is a bit pedantic and very non-fun, but would really appreciate if you do it. The main reasoning is that I tried to make it easily interpretable and understandable, but if you cannot understand it, then I don't think anyone else will. Given that the consensus module level testing follows a similar approach, this is an extremely big problem for the long-term health of the codebase.

andrewnguyen22 · 2022-09-01T18:20:48Z

Review when you can

Olshansk · 2022-09-03T04:17:18Z

@andrewnguyen22 Thanks for the questions/comments/modifications. Made me rethink the readability after you asked, so I think we're heading in the right direction here.

Aside from

// TODO (Olshansk) why did these values change in between commits?

I made a modification to the business logic of TestRainTreeConfig. In particular, the way I was computing numNetworkWrites was incorrect. Rather than counting the number of writes that node makes to other nodes, it was computing the number of writes other nodes due to it, making it always equal to num reads.

// TODO (Olshansk) explain how these expected values are derived

The way we derive the number of reads & writes is a function of how RainTree works. E.g. if we have N nodes, with some names, and we determine who the originator is, we have to manually craft how many reads and writes each node does.   Hence, the little tree diagram above each test.

I used my python simulator to help me craft the tests. Do you think we should productionize it?

andrewnguyen22

Discussed offline recently in protocol hour - this PR is stale right now.

Need to see changes and conflict free before review

…279 Won't fix / outdated TODOes. Tracked in #179

codecov-commenter · 2022-10-07T21:56:21Z

Codecov Report

Base: 0.00% // Head: 43.91% // Increases project coverage by +43.91% 🎉

Coverage data is based on head (bd88f47) compared to base (983f480).
Patch coverage: 0.00% of modified lines in pull request are covered.

Additional details and impacted files

@@            Coverage Diff            @@
##           main     #179       +/-   ##
=========================================
+ Coverage      0   43.91%   +43.91%     
=========================================
  Files         0       24       +24     
  Lines         0     1749     +1749     
=========================================
+ Hits          0      768      +768     
- Misses        0      941      +941     
- Partials      0       40       +40

Impacted Files	Coverage Δ
p2p/raintree/network.go	`20.40% <0.00%> (ø)`
utility/types/vote.go	`100.00% <0.00%> (ø)`
persistence/types/base_actor.go	`0.00% <0.00%> (ø)`
persistence/types/util.go	`0.00% <0.00%> (ø)`
utility/types/message.go	`72.00% <0.00%> (ø)`
utility/types/mempool.go	`0.00% <0.00%> (ø)`
shared/indexer/indexer.go	`59.40% <0.00%> (ø)`
utility/types/transaction.go	`60.91% <0.00%> (ø)`
p2p/raintree/addrbook_utils.go	`86.66% <0.00%> (ø)`
p2p/raintree/peers_manager.go	`89.62% <0.00%> (ø)`
... and 15 more

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

☔ View full report at Codecov.
📢 Do you have feedback about the report comment? Let us know in this issue.

Olshansk · 2022-10-17T16:18:11Z

This will be descoped into several different issues:

Adding the Jupyter notebook to generate tests
Improving the counters & adding documentation (in [P2P] Improve RainTree counter mechanism in testing framework #304)
Improving / adding tests after (1) and (2) are done

Olshansk · 2022-10-17T16:23:39Z

cc @jessicadaugherty to help with create tickets for (1) and (3) above

Olshansk added 8 commits August 26, 2022 11:17

Cleanup Makefile

eed0dc3

Remove extra p2pConfig in the p2p module

3f6ab8a

renamed the raintree network unit tests

6569dcf

Remove pre2p artifacts

6701959

Generalize ServiceURL / GenericParam

a86208b

Reorder unit test helpers

c0e6926

Remove t from validator_id helper

16feb56

Updated how the test suite works

5ccdb46

Olshansk added documentation Improvements or additions to documentation core Core infrastructure - protocol related p2p P2P specific changes priority:medium labels Aug 27, 2022

Olshansk added this to the M1: Pocket PoS (Proof of Stake) milestone Aug 27, 2022

Olshansk requested a review from andrewnguyen22 August 27, 2022 22:33

Olshansk self-assigned this Aug 27, 2022

Olshansk mentioned this pull request Aug 27, 2022

[P2P] RainTree Cleanup & Redundancy Layers (Issue#85) #137

Closed

14 tasks

Olshansk marked this pull request as ready for review August 27, 2022 22:50

Olshansk requested a review from a team as a code owner August 27, 2022 22:50

Minor updates

deb81ce

clarity review - round 1

1cac77e

andrewnguyen22 force-pushed the issue85_test_documentation branch from 586ceb4 to 1cac77e Compare September 1, 2022 19:10

Olshansk added 5 commits September 2, 2022 16:08

Added a few comments explaining the module RainTree tests

312ce54

Rename TestRainTreeConfig to TestNetworkCommsConfig

272e289

Remove prepareTestP2PModule for more clarity

4d10187

Renamed variables and added clarity to helper functions

2168348

Added comments on magic numbers

9da61af

Olshansk linked an issue Sep 13, 2022 that may be closed by this pull request

[P2P] RainTree Redundancy Layer Implementation #85

Open

8 tasks

Interim commit from meeting

99935dc

andrewnguyen22 suggested changes Sep 18, 2022

View reviewed changes

Olshansk changed the title ~~[P2P] Improve RainTree Testing Framework (Issue#85)~~ [STALE - DO NOT REVIEW][P2P] Improve RainTree Testing Framework (Issue#85) Sep 29, 2022

deblasis added a commit that referenced this pull request Oct 5, 2022

[TECHDEBT] [P2P] Raintree test suite improvements: edge cases - Issue #…

ba657c9

…279 Won't fix / outdated TODOes. Tracked in #179

Olshansk added 2 commits October 7, 2022 12:52

Merge with master - still doesn't work

80d1c53

Merge with master - still doesn't work

bd88f47

Olshansk closed this Oct 17, 2022

Olshansk mentioned this pull request Oct 17, 2022

[P2P] Improve RainTree counter mechanism in testing framework #304

Merged

17 tasks

Olshansk deleted the issue85_test_documentation branch June 2, 2023 21:56

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[STALE - DO NOT REVIEW][P2P] Improve RainTree Testing Framework (Issue#85) #179

[STALE - DO NOT REVIEW][P2P] Improve RainTree Testing Framework (Issue#85) #179

Olshansk commented Aug 27, 2022 •

edited

Loading

Olshansk commented Aug 27, 2022

andrewnguyen22 commented Aug 29, 2022

Olshansk commented Aug 30, 2022

andrewnguyen22 commented Sep 1, 2022

Olshansk commented Sep 3, 2022

andrewnguyen22 left a comment •

edited

Loading

codecov-commenter commented Oct 7, 2022

Olshansk commented Oct 17, 2022 •

edited

Loading

Olshansk commented Oct 17, 2022

[STALE - DO NOT REVIEW][P2P] Improve RainTree Testing Framework (Issue#85) #179

[STALE - DO NOT REVIEW][P2P] Improve RainTree Testing Framework (Issue#85) #179

Conversation

Olshansk commented Aug 27, 2022 • edited Loading

Description

Issue

Type of change

List of changes

Primary Changes

Flyby changes

Testing

Checklist

Olshansk commented Aug 27, 2022

andrewnguyen22 commented Aug 29, 2022

Olshansk commented Aug 30, 2022

andrewnguyen22 commented Sep 1, 2022

Olshansk commented Sep 3, 2022

andrewnguyen22 left a comment • edited Loading

Choose a reason for hiding this comment

codecov-commenter commented Oct 7, 2022

Codecov Report

Olshansk commented Oct 17, 2022 • edited Loading

Olshansk commented Oct 17, 2022

Olshansk commented Aug 27, 2022 •

edited

Loading

andrewnguyen22 left a comment •

edited

Loading

Olshansk commented Oct 17, 2022 •

edited

Loading