Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

very dumb leader selection #425

Conversation

aeyakovenko
Copy link
Member

@aeyakovenko aeyakovenko commented Jun 23, 2018

@aeyakovenko aeyakovenko changed the title generic array fail case very dumb leader selection, and generic array fail case Jun 23, 2018
@aeyakovenko aeyakovenko changed the title very dumb leader selection, and generic array fail case very dumb leader selection Jun 23, 2018
"t",
"",
"testnet; connec to the network at this gossip entry point",
"host:port",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

recommend HOST:PORT

opts.optopt(
"t",
"",
"testnet; connec to the network at this gossip entry point",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/connec to/connect to at gossip entry point HOST:PORT

Copy link
Contributor

@rob-solana rob-solana left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nits, otherwise LGTM

Copy link
Contributor

@garious garious left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"When in Rome" please. Using full words and no one letter prefixes would be much appreciated.

opts.optopt(
"t",
"",
"testnet; connec to the network at this gossip entry point",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Connect

src/crdt.rs Outdated
@@ -551,6 +551,33 @@ impl Crdt {
blob_sender.send(q)?;
Ok(())
}
/// FIXME: This is obviously the wrong way to do this. Need to implement leader selection
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's consistently used TODO or FIXME. Codebase currently uses TODO. If you prefer FIXME, can you change the others?

Copy link
Member Author

@aeyakovenko aeyakovenko Jun 23, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought FIXME was a higher priority TODO :)

let file = File::open(path.clone()).expect(&format!("file not found: {}", path));
let leader = serde_json::from_reader(file).expect("parse");
let taddr = testnet.parse().unwrap();
let entry = ReplicatedData::new_entry_point(taddr);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The term entry is already actively used throughout the codebase. Can you find another?

);
let file = File::open(path.clone()).expect(&format!("file not found: {}", path));
let leader = serde_json::from_reader(file).expect("parse");
let taddr = testnet.parse().unwrap();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

testnet_addr?

src/crdt.rs Outdated
}

/// FIXME: This is obviously the wrong way to do this. Need to implement leader selection
/// A t-shirt for the first person to actually use this bad behavior to attack the alpha testnet
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems fair :)

src/crdt.rs Outdated
/// FIXME: This is obviously the wrong way to do this. Need to implement leader selection
/// A t-shirt for the first person to actually use this bad behavior to attack the alpha testnet
fn update_leader(&mut self) {
if let Some(lid) = self.top_leader() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

id would be a better name since lid is a word. If not specific enough, use leader_id

src/crdt.rs Outdated
fn test_update_leader() {
logger::setup();
let me = ReplicatedData::new_leader(&"127.0.0.1:1234".parse().unwrap());
let lead = ReplicatedData::new_leader(&"127.0.0.1:1234".parse().unwrap());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

leader0 and leader1 would be consistent with naming elsewhere

src/crdt.rs Outdated
let lead = ReplicatedData::new_leader(&"127.0.0.1:1234".parse().unwrap());
let lead2 = ReplicatedData::new_leader(&"127.0.0.1:1234".parse().unwrap());
let mut crdt = Crdt::new(me.clone());
assert_matches!(crdt.top_leader(), None);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

assert_eq

src/crdt.rs Outdated
let _ = obj.write().unwrap().update_leader();
let elapsed = timestamp() - start;
if GOSSIP_SLEEP_MILLIS > elapsed {
let left = GOSSIP_SLEEP_MILLIS - elapsed;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

left is unnecessarily ambiguous. time_left or remaining would be better.

src/crdt.rs Outdated
}
let mut sorted: Vec<_> = table.iter().collect();
sorted.sort_by_key(|a| a.1);
sorted.last().map(|a| *(*(*a).0))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can that be simplified?

Copy link
Member Author

@aeyakovenko aeyakovenko Jun 24, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hehe, its borrow checker all the way down. I am not sure it can be without copying the generic array key. and there is no way to sort an iterator.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was wrong

@aeyakovenko
Copy link
Member Author

@garious still blocked?

@garious
Copy link
Contributor

garious commented Jun 24, 2018

@aeyakovenko, if you're happy with #427, can you merge it, rebase this PR on it, and remove that #[ignore] on the drone test?

aeyakovenko referenced this pull request in garious/solana Jun 24, 2018
@aeyakovenko, this one failed on Rust stable.
@garious garious mentioned this pull request Jun 24, 2018
@garious
Copy link
Contributor

garious commented Jun 24, 2018

Merged in #430

@garious garious closed this Jun 24, 2018
vkomenda pushed a commit to vkomenda/solana that referenced this pull request Aug 29, 2021
Bumps [eslint](https://github.com/eslint/eslint) from 7.7.0 to 7.8.1.
- [Release notes](https://github.com/eslint/eslint/releases)
- [Changelog](https://github.com/eslint/eslint/blob/master/CHANGELOG.md)
- [Commits](eslint/eslint@v7.7.0...v7.8.1)

Signed-off-by: dependabot[bot] <[email protected]>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
willhickey pushed a commit that referenced this pull request Mar 29, 2024
… (#425)

implements weighted shuffle using binary tree (#185)

This is partial port of firedancer's implementation of weighted shuffle:
https://github.com/firedancer-io/firedancer/blob/3401bfc26/src/ballet/wsample/fd_wsample.c

Though Fenwick trees use less space, inverse queries require an
additional O(log n) factor for binary search resulting an overall
O(n log n log n) performance for weighted shuffle.

This commit instead uses a binary tree where each node contains the sum
of all weights in its left sub-tree. The weights themselves are
implicitly stored at the leaves. Inverse queries and updates to the tree
all can be done O(log n) resulting an overall O(n log n) weighted
shuffle implementation.

Based on benchmarks, this results in 24% improvement in
WeightedShuffle::shuffle:

Fenwick tree:
    test bench_weighted_shuffle_new     ... bench:      36,686 ns/iter (+/- 191)
    test bench_weighted_shuffle_shuffle ... bench:     342,625 ns/iter (+/- 4,067)

Binary tree:
    test bench_weighted_shuffle_new     ... bench:      59,131 ns/iter (+/- 362)
    test bench_weighted_shuffle_shuffle ... bench:     260,194 ns/iter (+/- 11,195)

Though WeightedShuffle::new is now slower, it generally can be cached
and reused as in Turbine:
https://github.com/anza-xyz/agave/blob/b3fd87fe8/turbine/src/cluster_nodes.rs#L68

Additionally the new code has better asymptotic performance. For
example with 20_000 weights WeightedShuffle::shuffle is 31% faster:

Fenwick tree:
    test bench_weighted_shuffle_new     ... bench:     255,071 ns/iter (+/- 9,591)
    test bench_weighted_shuffle_shuffle ... bench:   2,466,058 ns/iter (+/- 9,873)

Binary tree:
    test bench_weighted_shuffle_new     ... bench:     830,727 ns/iter (+/- 10,210)
    test bench_weighted_shuffle_shuffle ... bench:   1,696,160 ns/iter (+/- 75,271)

(cherry picked from commit b6d2237)

Co-authored-by: behzad nouri <[email protected]>
lijunwangs pushed a commit to lijunwangs/solana that referenced this pull request Apr 7, 2024
…ana-labs#185) (solana-labs#425)

implements weighted shuffle using binary tree (solana-labs#185)

This is partial port of firedancer's implementation of weighted shuffle:
https://github.com/firedancer-io/firedancer/blob/3401bfc26/src/ballet/wsample/fd_wsample.c

Though Fenwick trees use less space, inverse queries require an
additional O(log n) factor for binary search resulting an overall
O(n log n log n) performance for weighted shuffle.

This commit instead uses a binary tree where each node contains the sum
of all weights in its left sub-tree. The weights themselves are
implicitly stored at the leaves. Inverse queries and updates to the tree
all can be done O(log n) resulting an overall O(n log n) weighted
shuffle implementation.

Based on benchmarks, this results in 24% improvement in
WeightedShuffle::shuffle:

Fenwick tree:
    test bench_weighted_shuffle_new     ... bench:      36,686 ns/iter (+/- 191)
    test bench_weighted_shuffle_shuffle ... bench:     342,625 ns/iter (+/- 4,067)

Binary tree:
    test bench_weighted_shuffle_new     ... bench:      59,131 ns/iter (+/- 362)
    test bench_weighted_shuffle_shuffle ... bench:     260,194 ns/iter (+/- 11,195)

Though WeightedShuffle::new is now slower, it generally can be cached
and reused as in Turbine:
https://github.com/anza-xyz/agave/blob/b3fd87fe8/turbine/src/cluster_nodes.rs#L68

Additionally the new code has better asymptotic performance. For
example with 20_000 weights WeightedShuffle::shuffle is 31% faster:

Fenwick tree:
    test bench_weighted_shuffle_new     ... bench:     255,071 ns/iter (+/- 9,591)
    test bench_weighted_shuffle_shuffle ... bench:   2,466,058 ns/iter (+/- 9,873)

Binary tree:
    test bench_weighted_shuffle_new     ... bench:     830,727 ns/iter (+/- 10,210)
    test bench_weighted_shuffle_shuffle ... bench:   1,696,160 ns/iter (+/- 75,271)

(cherry picked from commit b6d2237)

Co-authored-by: behzad nouri <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants