Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

indexer: index tx_affected_addresses #19355

Merged
merged 3 commits into from
Sep 13, 2024
Merged

indexer: index tx_affected_addresses #19355

merged 3 commits into from
Sep 13, 2024

Conversation

amnn
Copy link
Member

@amnn amnn commented Sep 13, 2024

Description

Introduce tx_affected_addresses table -- a combination of tx_recipients and tx_senders, to power the sender-or-recipient query in GraphQL.

Test plan

cargo build -p sui-indexer

Tests based on reading this field will be included with the PR introducing changes to GraphQL.


Release notes

Check each box that your changes affect. If none of the boxes relate to your changes, release notes aren't required.

For each box you select, include information after the relevant heading that describes the impact of your changes that a user might notice and any actions they must take to implement updates.

  • Protocol:
  • Nodes (Validators and Full nodes):
  • Indexer: Index the addresses affected by a transaction (either because they are a sender or a recipient of the transaction).
  • JSON-RPC:
  • GraphQL:
  • CLI:
  • Rust SDK:
  • REST API:

## Description
Introduce `tx_affected` table -- a combination of `tx_recipients` and
`tx_senders`, to power the sender-or-recipient query in GraphQL.

## Test plan

```
cargo build -p sui-indexer
```

Tests based on reading this field will be included with the PR
introducing changes to GraphQL.
Copy link

vercel bot commented Sep 13, 2024

The latest updates on your projects. Learn more about Vercel for Git ↗︎

Name Status Preview Comments Updated (UTC)
sui-docs ✅ Ready (Inspect) Visit Preview 💬 Add feedback Sep 13, 2024 2:40pm
3 Skipped Deployments
Name Status Preview Comments Updated (UTC)
multisig-toolkit ⬜️ Ignored (Inspect) Visit Preview Sep 13, 2024 2:40pm
sui-kiosk ⬜️ Ignored (Inspect) Visit Preview Sep 13, 2024 2:40pm
sui-typescript-docs ⬜️ Ignored (Inspect) Visit Preview Sep 13, 2024 2:40pm

Later on, we will also track affected objects, instead of input and
changed objects.
@amnn amnn changed the title indexer: index tx_affected indexer: index tx_affected_addresses Sep 13, 2024
Comment on lines 125 to 135
.map(|r| StoredTxAffected {
tx_sequence_number,
affected: r.to_vec(),
sender: self.sender.to_vec(),
})
.chain(std::iter::once(StoredTxAffected {
tx_sequence_number,
affected: self.sender.to_vec(),
sender: self.sender.to_vec(),
}))
.collect();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Whats the rationale for having the sender duplicated for each row?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's to support filtering by sender and affected address. Transaction filters in GraphQL all follow the same rough shape:

  • They get a table that includes the filter value mapped to transaction sequence number, and their primary key is the combination of the filter value and transaction sequence number -- this allows us to phrase a filter on just this field as a single index scan.
  • They get a sender column, and a secondary index from the sender and the filter value to transaction sequence number to support filtering by sender and the filter value. The alternative is to join against tx_senders and postgres is not able to perform this join efficiently, reliably. This is why we we require that if the user sends a query that involves any other combination of filters (not sender + something) they need to provide a scan limit.
  • They get a secondary index on transaction sequence number to support pruning.

By following this pattern, we are able to give people a simple rule to follow: You can apply any single filter without worrying, and you can optionally include a sender filter without worrying, but anything else requires using scan limits. Otherwise it is very difficult for users to figure out what they can do, based on the schema we advertise, very much like things are today in JSON-RPC, where the schema implies we support any number of queries that we don't, for a variety of reasons.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome thanks for the detailed information. Do we happen to have this knowledge/best practice written down anywhere in the repo?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good question -- the principle around using scan limits is documented in the GraphQL schema:

/// The transaction blocks that exist in the network.
///
/// `scanLimit` restricts the number of candidate transactions scanned when gathering a page of
/// results. It is required for queries that apply more than two complex filters (on function,
/// kind, sender, recipient, input object, changed object, or ids), and can be at most
/// `serviceConfig.maxScanLimit`.
///
/// When the scan limit is reached the page will be returned even if it has fewer than `first`
/// results when paginating forward (`last` when paginating backwards). If there are more
/// transactions to scan, `pageInfo.hasNextPage` (or `pageInfo.hasPreviousPage`) will be set to
/// `true`, and `PageInfo.endCursor` (or `PageInfo.startCursor`) will be set to the last
/// transaction that was scanned as opposed to the last (or first) transaction in the page.
///
/// Requesting the next (or previous) page after this cursor will resume the search, scanning
/// the next `scanLimit` many transactions in the direction of pagination, and so on until all
/// transactions in the scanning range have been visited.
///
/// By default, the scanning range includes all transactions known to GraphQL, but it can be
/// restricted by the `after` and `before` cursors, and the `beforeCheckpoint`,
/// `afterCheckpoint` and `atCheckpoint` filters.

But the insight around the DB schema and transaction patterns is only captured in the graphql-benchmark repo:

https://github.com/MystenLabs/graphql-benchmark/blob/a7bdfad4c0d419cffa8acde24b12b373fcf7e1fa/migration_scripts/bulk_load/src/hybrid_tx.clj#L8-L49

I can add a version of this comment here:

https://github.com/MystenLabs/sui/blob/77b18b45c195103e25d62fb6cd35ab812b9e7f12/crates/sui-graphql-rpc/src/types/transaction_block/tx_lookups.rs

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ended up splitting this out into its own PR because there were some other docs I wanted to add! #19363

let tx_affected_addresses = self
.recipients
.iter()
.chain(self.payers.iter())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wish there are more comments in the TxIndex data structure but my understanding is that recipients should naturally include payers?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure this is true, for the same reason that a sender (in the non-sponsorship case) is not necessarily in the recipients list: If the transaction transfers the gas coin to some other address.

@amnn amnn merged commit aaa906f into main Sep 13, 2024
48 checks passed
@amnn amnn deleted the amnn/tx-affected-idx branch September 13, 2024 18:00
suiwombat pushed a commit that referenced this pull request Sep 16, 2024
## Description
Introduce `tx_affected_addresses` table -- a combination of
`tx_recipients` and `tx_senders`, to power the sender-or-recipient query
in GraphQL.

## Test plan

```
cargo build -p sui-indexer
```

Tests based on reading this field will be included with the PR
introducing changes to GraphQL.

---

## Release notes

Check each box that your changes affect. If none of the boxes relate to
your changes, release notes aren't required.

For each box you select, include information after the relevant heading
that describes the impact of your changes that a user might notice and
any actions they must take to implement updates.

- [ ] Protocol: 
- [ ] Nodes (Validators and Full nodes): 
- [x] Indexer: Index the addresses affected by a transaction (either
because they are a sender or a recipient of the transaction).
- [ ] JSON-RPC: 
- [ ] GraphQL: 
- [ ] CLI: 
- [ ] Rust SDK:
- [ ] REST API:
amnn added a commit that referenced this pull request Sep 19, 2024
## Description

Similar to #19355, introduce `tx_affected_objects` table -- a
combination of `tx_input_objects` and `tx_changed_objects` which will
both eventually be removed in favour of the new table.

## Test plan

```
sui$ cargo build -p sui-indexer
```

Tests based on reading this field will be included with the PR
introducing changes to GraphQL.
amnn added a commit that referenced this pull request Sep 19, 2024
## Description

Similar to #19355, introduce `tx_affected_objects` table -- a
combination of `tx_input_objects` and `tx_changed_objects` which will
both eventually be removed in favour of the new table.

## Test plan

```
sui$ cargo build -p sui-indexer
```

Tests based on reading this field will be included with the PR
introducing changes to GraphQL.

---

## Release notes

Check each box that your changes affect. If none of the boxes relate to
your changes, release notes aren't required.

For each box you select, include information after the relevant heading
that describes the impact of your changes that a user might notice and
any actions they must take to implement updates.

- [ ] Protocol: 
- [ ] Nodes (Validators and Full nodes): 
- [x] Indexer: Index the objects affected by a transaction (either
because they are an input object or are changed by the transaction).
- [ ] JSON-RPC: 
- [ ] GraphQL: 
- [ ] CLI: 
- [ ] Rust SDK:
- [ ] REST API:
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants