Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[indexer] Add full_objects_history #18994

Merged
merged 1 commit into from
Aug 23, 2024

Conversation

lxfind
Copy link
Contributor

@lxfind lxfind commented Aug 15, 2024

Description

We add a separate table that stores full object history that we do not plan to prune.

Test plan

How did you test the new or updated feature?


Release notes

Check each box that your changes affect. If none of the boxes relate to your changes, release notes aren't required.

For each box you select, include information after the relevant heading that describes the impact of your changes that a user might notice and any actions they must take to implement updates.

  • Protocol:
  • Nodes (Validators and Full nodes):
  • Indexer:
  • JSON-RPC:
  • GraphQL:
  • CLI:
  • Rust SDK:
  • REST API:

Copy link

vercel bot commented Aug 15, 2024

The latest updates on your projects. Learn more about Vercel for Git ↗︎

Name Status Preview Comments Updated (UTC)
sui-docs ✅ Ready (Inspect) Visit Preview 💬 Add feedback Aug 23, 2024 6:11pm
3 Skipped Deployments
Name Status Preview Comments Updated (UTC)
multisig-toolkit ⬜️ Ignored (Inspect) Visit Preview Aug 23, 2024 6:11pm
sui-kiosk ⬜️ Ignored (Inspect) Visit Preview Aug 23, 2024 6:11pm
sui-typescript-docs ⬜️ Ignored (Inspect) Visit Preview Aug 23, 2024 6:11pm

@gegaowp
Copy link
Contributor

gegaowp commented Aug 15, 2024

@lxfind the change looks good but let's not merge to main yet, b/c this is breaking change and will be in production in the next release if merged, we usually merge it to a breaking change park and then merge it to main when backfill / DB ops are complete.

I am merging the last batch of breaking changes into main and reset that branch to latest main too, fixing a couple of minor issues regarding ci.

@lxfind
Copy link
Contributor Author

lxfind commented Aug 15, 2024

Got it. Is there a doc that describes the process? For instance, does the oncall usually manually trigger a backfill/migration from the park branch from time to time? How does it identify gaps if the indexer isn't paused?


INSERT INTO full_objects_history (object_id, object_version, object_status, serialized_object)
SELECT object_id, object_version, object_status, serialized_object
FROM objects_history;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sorry that I actually missed this, I thought this pr only defined the schema. this will not work as INSERT INTO will only run when we set up schema, and objects_history is empty then.

I think instead we will want to

  1. remove this INSERT INTO
  2. add separate ingestion codes to directly write to full_objects_history besides objects_history

and then when we deploy this pr + the pruning policy pr, we

  1. run both CREATE TABLE full_objects_history and INSERT INTO via psql
  2. update the writer pod with this pr + pruning policy pr and resume writer
  3. there will be a small gap after the last cp done byINSERT INTO and the first checkpoint done by new ingestion codes, and we can run a INSERT INTO again to patch that.

@lxfind
Copy link
Contributor Author

lxfind commented Aug 22, 2024

Added ingestion change.
Removed SQL insert.
Also removed the object_status column since that can be derived from the serialized_object field.

@@ -572,16 +572,22 @@ impl<T: R2D2Connection + 'static> PgIndexerStore<T> {
.start_timer();
let mut mutated_objects: Vec<StoredHistoryObject> = vec![];
let mut object_versions: Vec<StoredObjectVersion> = vec![];
let mut deleted_object_ids: Vec<StoredDeletedHistoryObject> = vec![];
let mut deleted_objects: Vec<StoredDeletedHistoryObject> = vec![];
let mut full_history_objects: Vec<StoredFullHistoryObject> = vec![];
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

objects_history is one of the bottlenecks today, instead of adding more workload here, let's start a new parallel task for this here?
https://github.com/MystenLabs/sui/blob/main/crates/sui-indexer/src/handlers/committer.rs#L147

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@gegaowp
Copy link
Contributor

gegaowp commented Aug 22, 2024

removed the object_status column since that can be derived from the serialized_object field.

does this require deserialization of the bcs contents ?

@lxfind
Copy link
Contributor Author

lxfind commented Aug 23, 2024

does this require deserialization of the bcs contents ?

No, it only need to check if that column is null.

@lxfind lxfind changed the base branch from main to idx-breaking-change-park August 23, 2024 17:11
@lxfind lxfind requested review from a team and wlmyng as code owners August 23, 2024 17:11
@lxfind lxfind requested review from williamrobertson13 and mystieanwaya and removed request for a team August 23, 2024 17:11
@lxfind lxfind changed the base branch from idx-breaking-change-park to main August 23, 2024 17:13
@lxfind lxfind force-pushed the indexer-add-full-object-history branch from f899699 to a49146b Compare August 23, 2024 17:59
@lxfind lxfind changed the base branch from main to idx-breaking-change-park August 23, 2024 18:00
@lxfind lxfind merged commit 53e3f8c into idx-breaking-change-park Aug 23, 2024
45 of 48 checks passed
@lxfind lxfind deleted the indexer-add-full-object-history branch August 23, 2024 19:03
lxfind added a commit that referenced this pull request Sep 6, 2024
## Description 

This is a redo of #18994, but
instead of merging to parking branch, merge it to main.

## Test plan 

CI

---

## Release notes

Check each box that your changes affect. If none of the boxes relate to
your changes, release notes aren't required.

For each box you select, include information after the relevant heading
that describes the impact of your changes that a user might notice and
any actions they must take to implement updates.

- [ ] Protocol: 
- [ ] Nodes (Validators and Full nodes): 
- [ ] Indexer: 
- [ ] JSON-RPC: 
- [ ] GraphQL: 
- [ ] CLI: 
- [ ] Rust SDK:
- [ ] REST API:
lxfind added a commit that referenced this pull request Sep 11, 2024
## Description 

This is a redo of #18994, but
instead of merging to parking branch, merge it to main.

## Test plan 

CI

---

## Release notes

Check each box that your changes affect. If none of the boxes relate to
your changes, release notes aren't required.

For each box you select, include information after the relevant heading
that describes the impact of your changes that a user might notice and
any actions they must take to implement updates.

- [ ] Protocol: 
- [ ] Nodes (Validators and Full nodes): 
- [ ] Indexer: 
- [ ] JSON-RPC: 
- [ ] GraphQL: 
- [ ] CLI: 
- [ ] Rust SDK:
- [ ] REST API:
suiwombat pushed a commit that referenced this pull request Sep 16, 2024
## Description 

This is a redo of #18994, but
instead of merging to parking branch, merge it to main.

## Test plan 

CI

---

## Release notes

Check each box that your changes affect. If none of the boxes relate to
your changes, release notes aren't required.

For each box you select, include information after the relevant heading
that describes the impact of your changes that a user might notice and
any actions they must take to implement updates.

- [ ] Protocol: 
- [ ] Nodes (Validators and Full nodes): 
- [ ] Indexer: 
- [ ] JSON-RPC: 
- [ ] GraphQL: 
- [ ] CLI: 
- [ ] Rust SDK:
- [ ] REST API:
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants