[Examples/Move] BigVector #16721

amnn · 2024-03-18T11:25:22Z

Description

An implementation of a B+ Tree based data structure in Move.

Test Plan

New unit tests:

big_vector$ sui move test

amnn

Note that this will currently fail tests because it requires indexing support for vector to land in Stdlib, which hasn't happened yet (coming in #16466). I'm putting this up as a PR so that we have a place to discuss the code, and so that @0xaslan and @leecchh can play with it as a possible replacement for the CritbitTree + LinkedTable combo we currently use for Deepbook.

amnn · 2024-03-18T11:28:51Z

examples/move/big_vector/sources/big_vector.move

+        _keys: vector<u128>,
+        _vals: vector<E>,
+    ) {
+        abort 0


Left unimplemented for now because it's a little fiddly, and an implementation without it should still be an improvement on what we have today. @gdanezis pointed out that the ability to insert and remove in batches would be a helpful primitive for efficiently handling workloads from market makers who push in orders in bulk.

Once we get an idea of whether or not this implementation is useful, we can add these extra bells and whistles.

Note: to be useful, it would have to implement batch_op (where you can insert, update, and delete in one go), return an idea of the mutations, and not abort (so you can mix operations from many different parties). So yes, let's wait and see where it might be useful.

Got it. I was imagining this being used by a market maker who was deleting their old orders and putting in a number of new orders, but it sounds like you have something else in mind: perhaps gathering updates to the table across multiple transactions into a journal and then flushing them in a batch. Is that right?

(I have gone slightly off my idea of using a $B^\epsilon$ tree for this because it would really complicate iteration; otherwise it would work well. Maybe we can get the best of both worlds by literally storing a journal at a single level?)
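To make the batch_op idea in this thread concrete, here is a minimal sketch in Python (not Move, and not code from this PR). It applies a mixed list of insert, update, and delete operations to a sorted key-value store, records an outcome per operation, and skips operations that cannot be applied instead of aborting. The function signature, the operation names, and the outcome tuples are all assumptions made for illustration, and two parallel sorted lists stand in for the B+ tree.

```python
from bisect import bisect_left

def batch_op(keys, vals, ops):
    """Apply ops in order to parallel sorted lists `keys` and `vals`.

    Each op is a (kind, key, val) tuple, where kind is "insert",
    "update", or "delete" (hypothetical names). An op that cannot be
    applied is recorded as "skipped" rather than raising, so one
    party's bad op does not undo everyone else's.
    """
    results = []
    for kind, key, val in ops:
        i = bisect_left(keys, key)
        found = i < len(keys) and keys[i] == key
        if kind == "insert" and not found:
            keys.insert(i, key)
            vals.insert(i, val)
            results.append(("inserted", key, None))
        elif kind == "update" and found:
            old, vals[i] = vals[i], val
            results.append(("updated", key, old))
        elif kind == "delete" and found:
            keys.pop(i)
            results.append(("deleted", key, vals.pop(i)))
        else:
            # Duplicate insert, missing key, or unknown kind.
            results.append(("skipped", key, None))
    return results

keys, vals = [1, 3], ["a", "c"]
outcomes = batch_op(keys, vals, [
    ("insert", 2, "b"),
    ("insert", 3, "x"),   # key already present: skipped
    ("update", 1, "A"),
    ("delete", 5, None),  # key absent: skipped
    ("delete", 3, None),
])
```

The returned list is one possible reading of "an idea of the mutations": the caller can see which operations took effect and what values they displaced.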

amnn · 2024-03-18T11:30:01Z

examples/move/big_vector/sources/big_vector.move

+    /// hitting object size limits.
+    const MAX_SLICE_SIZE: u64 = 256 * 1024;
+
+    // === Removal fix-up strategies ===


These constants could be an enum (although ideally they would be a private enum).

amnn · 2024-03-18T11:30:33Z

examples/move/big_vector/sources/big_vector.move

+        _lo: u128,
+        _hi: u128,
+    ) {
+        abort 0


Ditto, out of scope for prototype (useful for market maker workloads).

See above, I do not think separate batch insert / delete with aborts is as useful.

amnn · 2024-03-18T11:32:06Z

examples/move/big_vector/sources/big_vector.move

+        _self: &mut BigVector<E>,
+        _keys: vector<u128>,
+    ): vector<E> {
+        abort 0


Ditto, out of scope for prototype (useful for deleting filled orders).

amnn · 2024-03-18T11:32:44Z

examples/move/big_vector/sources/big_vector.move

+    // === Receiver function aliases ===
+
+    public use fun slice_is_null as SliceRef.is_null;
+    public use fun slice_is_leaf as Slice.is_leaf;


This is a reminder that we need to update GitHub's highlighting for Move, right?

amnn · 2024-03-18T11:37:06Z

examples/move/big_vector/sources/utils.move

Admitting defeat by adding this utils module... Initial temptation was to suggest adding these to Stdlib but I think the right move is to have a more feature rich "standard library" separate from the framework (but depending on it), and functions like this might go in that.

amnn · 2024-03-18T11:38:10Z

examples/move/big_vector/tests/big_vector_tests.move

+        *(&mut bv[7]) = 8;
+        *(&mut bv[9]) = 10;
+        *(&mut bv[15]) = 16;
+        *(&mut bv[19]) = 20;
+        *(&mut bv[22]) = 23;


...a bit funky -- will we have lvalue index syntax before the feature goes out?

cgswords · 2024-03-18T16:57:20Z

examples/move/big_vector/sources/big_vector.move

+            last_id: _,
+        } = self;
+
+        assert!(length == 0, ENotEmpty);


Nit: should we do the assert before destructuring it?

I tend not to worry about the ordering of these things because the assertion failure is game over -- nothing that happened before or after it will have any effect.

cgswords · 2024-03-18T16:58:07Z

examples/move/big_vector/sources/big_vector.move

+        let BigVector {
+            id,
+
+            depth: _,
+            length,
+            max_slice_size: _,
+            max_fan_out: _,
+
+            root_id: _,
+            last_id: _,
+        } = self;


This should work after the enum syntax diff lands:

Suggested change

-        let BigVector {
-            id,
-
-            depth: _,
-            length,
-            max_slice_size: _,
-            max_fan_out: _,
-
-            root_id: _,
-            last_id: _,
-        } = self;
+        let BigVector { id, length, .. } = self;

0xaslan · 2024-03-18T19:04:46Z

examples/move/big_vector/sources/big_vector.move

+
+        let (ix, leaf, off) = self.find_leaf(key);
+        if (off >= leaf.keys.length()) {
+            (leaf.next(), 0)


Is there a case where leaf.next() doesn't exist? For example, if the key is greater than the highest key present in the tree.

Indeed, this can return (SliceRef { ix: NO_SLICE }, 0). This is the case mentioned in the doc comment here (additional emphasis added):

Returns the reference to the slice and the local offset within the slice if it exists, or (NO_SLICE, 0), if there is no matching key-value pair.
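The behaviour discussed here can be illustrated with a small Python sketch (not the PR's Move code). Leaves are sorted key lists chained by a `next` id, and a flat list of pivot keys stands in for the interior nodes. Apart from `find_leaf`, `next`, and `NO_SLICE`, which appear in the snippet and doc comment above, every name is hypothetical, and so is the sentinel value 0.

```python
from bisect import bisect_left, bisect_right
from dataclasses import dataclass

NO_SLICE = 0  # assumed sentinel value; real leaf ids start at 1 here

@dataclass
class Leaf:
    keys: list
    next: int = NO_SLICE  # id of the following leaf, or NO_SLICE

def find_leaf(leaves, order, pivots, key):
    """Return (leaf id, leaf, offset of the first key >= `key`).

    `order` lists leaf ids in key order; `pivots[i]` is the smallest
    key stored in leaf `order[i + 1]`.
    """
    ix = order[bisect_right(pivots, key)]
    leaf = leaves[ix]
    return ix, leaf, bisect_left(leaf.keys, key)

def lower_bound(leaves, order, pivots, key):
    """Position of the first entry with key >= `key`, or (NO_SLICE, 0)."""
    ix, leaf, off = find_leaf(leaves, order, pivots, key)
    if off >= len(leaf.keys):
        # Ran off the end of this leaf: the answer is the start of the
        # next leaf, which is NO_SLICE when this was the last one.
        return leaf.next, 0
    return ix, off

leaves = {1: Leaf([1, 3, 5], next=2), 2: Leaf([7, 9])}
order, pivots = [1, 2], [7]
```

With this data, looking up key 6 runs off the end of leaf 1 and lands at the start of leaf 2, while looking up key 10 runs off the end of the last leaf and yields `(NO_SLICE, 0)`.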

leecchh

LGTM

gdanezis · 2024-03-20T11:14:19Z

Overall question: is BigVector a good name for this? The defining properties of a B-Tree are that (a) it is a key-value data structure and (b) its keys are stored in order (and sequential ops are cheap). The term "vector" does not really capture these.

gdanezis

Nice one!

gdanezis · 2024-03-20T11:19:45Z

examples/move/big_vector/sources/big_vector.move

+        /// including gaps in the vector.
+        length: u64,
+
+        /// Max size of leaf nodes (counted in number of elements, `E`).


Maybe add some guidance here in the doc comment about how to set max_slice_size and max_fan_out? I think they affect the efficiency of different operations, but I have to reach for my data structures book to remember exactly how.
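As a rough aid for that kind of guidance, the Python sketch below (not part of the PR) estimates the shape of a B+ tree from the two parameters. It assumes every node is packed full, which is a best case: real B+ tree nodes are typically only guaranteed to be at least half full, so actual trees can be taller. The function name `tree_shape` is hypothetical, and "levels" here counts interior levels above the leaves, which may not match how the PR's `depth` field is defined.

```python
def tree_shape(n, max_slice_size, max_fan_out):
    """Return (leaf count, interior levels) for n elements, all nodes full.

    max_slice_size: elements per leaf; max_fan_out: children per
    interior node.
    """
    assert max_slice_size >= 1 and max_fan_out >= 2
    leaves = max(1, -(-n // max_slice_size))  # ceiling division
    levels, width = 0, leaves
    while width > 1:
        # Each interior level groups up to max_fan_out nodes under one parent.
        width = -(-width // max_fan_out)
        levels += 1
    return leaves, levels

narrow = tree_shape(1_000_000, max_slice_size=256, max_fan_out=16)
wide = tree_shape(1_000_000, max_slice_size=256, max_fan_out=64)
```

For one million elements and 256-element leaves, a fan-out of 16 needs three interior levels and a fan-out of 64 needs two. A larger fan-out means fewer nodes to visit per lookup but larger nodes to read and rewrite; a larger slice size means fewer leaves but more elements to shift on each insert or remove within a leaf.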

gdanezis · 2024-03-20T11:26:54Z

examples/move/big_vector/sources/big_vector.move

+        _keys: vector<u128>,
+        _vals: vector<E>,
+    ) {
+        abort 0


Note: to be useful, it would have to implement batch_op (where you can insert, update, and delete in one go), return an idea of the mutations, and not abort (so you can mix operations from many different parties). So yes, let's wait and see where it might be useful.

gdanezis · 2024-03-20T11:27:38Z

examples/move/big_vector/sources/big_vector.move

+        _lo: u128,
+        _hi: u128,
+    ) {
+        abort 0


See above, I do not think separate batch insert / delete with aborts is as useful.

amnn · 2024-03-20T12:01:06Z

Overall question: is BigVector a good name for this?

I don't have a strong opinion on this -- happy to take suggestions for names. The original reasons it was called this are that

it was derived from the needs/interface outlined in SIP13 ("SIP-13: BigVector Implementation", sui-foundation/sips#13 (comment)),
it's similar to Android's SparseIntArray, which plays a similar trick but with a sorted array instead of a B+ tree, and
I felt bad calling it just a B+ Tree, because I couldn't parameterise the key, so its API looked more like a vector where the index type was a u128, so a very big vector.

But yes, the performance characteristics for insert and remove are off.
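For comparison, the sorted-array trick mentioned in the list above can be sketched in a few lines of Python. This is an illustration of the general technique (parallel sorted arrays plus binary search), not Android's actual code and not code from this PR; the class and method names are hypothetical.

```python
from bisect import bisect_left

class SparseArray:
    """Sparse integer-keyed map backed by two parallel sorted lists."""

    def __init__(self):
        self.keys = []
        self.vals = []

    def put(self, key, val):
        i = bisect_left(self.keys, key)  # O(log n) search
        if i < len(self.keys) and self.keys[i] == key:
            self.vals[i] = val
        else:
            # Inserting in the middle shifts every later entry: O(n).
            self.keys.insert(i, key)
            self.vals.insert(i, val)

    def get(self, key, default=None):
        i = bisect_left(self.keys, key)
        if i < len(self.keys) and self.keys[i] == key:
            return self.vals[i]
        return default

sparse = SparseArray()
sparse.put(19, 20)
sparse.put(7, 8)
sparse.put(2**100, 1)  # keys can be very large without allocating gaps
```

Lookups are logarithmic in both designs. The difference is on insert and remove, where the sorted array shifts entries linearly while a B+ tree only touches the nodes on one root-to-leaf path.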

gdanezis · 2024-03-20T12:56:59Z

I don't have a strong opinion on this -- happy to take suggestions for names. The original reasons it was called this are that

Yeah, BigSortedVector is not as catchy :)

amnn · 2024-03-21T11:08:37Z

... BTreeVector?

jericlong · 2024-04-25T14:37:45Z

Overall question: is BigVector a good name for this?

I don't have a strong opinion on this -- happy to take suggestions for names. The original reasons it was called this are that

it was derived from the needs/interface outlined in SIP13 ("Typus Finance BigVector Implementation", sui-foundation/sips#13 (comment)),

it's similar to Android's SparseIntArray, which plays a similar trick but with a sorted array instead of a B+ tree, and

I felt bad calling it just a B+ Tree, because I couldn't parameterise the key, so its API looked more like a vector where the index type was a u128, so a very big vector.

But yes, the performance characteristics for insert and remove are off.

The original purpose of creating BigVector was the need to store and mutate a large amount of data in a single transaction. For instance, when we settle an option vault, we need to calculate user shares one by one, since there are active shares for settlement and warmup shares for the next round. User A may have 10 active + 5 warmup = 15 total for the next round, and User B may have 20 active + 3 warmup = 23 total for the next round. Since the ratio between active and warmup shares is not the same for each user, the per-user calculation is necessary. From this point of view, I think the B+ Tree based data structure implementation is quite different from sips#13.

amnn · 2024-04-25T15:13:39Z

@jericlong, thanks for sharing! Could you share a bit more on how the data is stored in BigVector and how these user interactions translate to operations on that data structure?

amnn · 2024-10-30T12:42:07Z

Seeing as this has now gone out with Deepbook v3, I'll close this PR!

amnn requested review from gdanezis, damirka, leecchh, manolisliolios, dariorussi, 0xaslan and a team March 18, 2024 11:25

amnn self-assigned this Mar 18, 2024

amnn force-pushed the amnn/btree-example branch from 60e8bb1 to f0ac0a7 Compare March 18, 2024 11:27

amnn marked this pull request as ready for review March 18, 2024 11:45

cgswords reviewed Mar 18, 2024

0xaslan reviewed Mar 18, 2024

leecchh approved these changes Mar 18, 2024

gdanezis reviewed Mar 20, 2024

amnn added 3 commits April 17, 2024 16:05

[Examples/Move] BigVector

65d0c61

An implementation of a B+ Tree based data structure in Move. Test Plan: New unit tests: ``` big_vector$ sui move test ```

fixup: public(package) instead of public(friend)

c1538ab

fixup: contains

95a73c3

amnn force-pushed the amnn/btree-example branch from f3a9d82 to 95a73c3 Compare April 17, 2024 15:32

fixup: licenses

5272302

amnn closed this Oct 30, 2024

amnn deleted the amnn/btree-example branch November 11, 2024 13:31
