[DOCS] Update Indexing page with all index types and file layout page #9346
3cafc45 to 2a4c5c6
2a4c5c6 to 5b57bbe
89e23ec to 3bba237
3bba237 to afe7cd5
| hoodie.simple.index.update.partition.path | true (Optional) | Similar to `hoodie.bloom.index.update.partition.path`. Only applies if the index type is GLOBAL_SIMPLE. When set to true, an update that changes the partition path of an existing record inserts the incoming record into the new partition and deletes the original record from the old partition. When set to false, the original record is only updated in the old partition.<br /><br />`Config Param: SIMPLE_INDEX_UPDATE_PARTITION_PATH_ENABLE` |
| hoodie.hbase.index.update.partition.path | false (Optional) | Only applies if the index type is HBASE. When set to true, upserting an existing record into a partition different from the one in storage deletes the old record from the old partition and inserts it as a new record in the new partition.<br /><br />`Config Param: UPDATE_PARTITION_PATH_ENABLE` |
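
The behavior these two `update.partition.path` rows describe can be illustrated with a small plain-Python sketch. This is not Hudi code: `GlobalIndexedTable`, its fields, and the `upsert` method are hypothetical names invented for illustration, and the model assumes only what the descriptions above state (a global index maps each record key to the single partition that holds it).

```python
from dataclasses import dataclass, field


@dataclass
class GlobalIndexedTable:
    """Hypothetical in-memory stand-in for a table with a global index."""

    # Mirrors the *.update.partition.path flag described above.
    update_partition_path: bool
    # partition path -> {record key -> payload}
    partitions: dict = field(default_factory=dict)
    # global index: record key -> partition path currently holding the record
    index: dict = field(default_factory=dict)

    def upsert(self, key, partition, payload):
        old = self.index.get(key)
        if old is not None and old != partition:
            if self.update_partition_path:
                # Flag on: delete from the old partition, insert into the new one.
                del self.partitions[old][key]
            else:
                # Flag off: the record stays in its old partition and is updated there.
                partition = old
        self.partitions.setdefault(partition, {})[key] = payload
        self.index[key] = partition


# Same two writes, once with the flag on and once with it off.
moving = GlobalIndexedTable(update_partition_path=True)
moving.upsert("k1", "2024/01/01", "v1")
moving.upsert("k1", "2024/01/02", "v2")

pinned = GlobalIndexedTable(update_partition_path=False)
pinned.upsert("k1", "2024/01/01", "v1")
pinned.upsert("k1", "2024/01/02", "v2")
```

With the flag on, `k1` ends up only in `2024/01/02`; with it off, `k1` stays in `2024/01/01` with the new payload. Either way the key exists in exactly one partition, which is the guarantee a global index provides.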
#### Flink based configs
@danny0405 can you help review this part?
This is based on our experience, and you should carefully evaluate whether the same strategies are best for your workloads.

## Indexing Strategies
I do not want to expand the scope here, so I will let you make the call. But we should add a streaming-writes workload type and call out that RLI will be the best fit.
For example, if the table size is 1 TB but incremental ingestion brings in only about 1% of the data (roughly 10 GB) or less, RLI will give the best performance of all the global index options.
Also, for update-heavy workloads that need a global index, RLI will outperform the other indexes.
@nsivabalan Agreed. We should consider a second doc PR for this part, since it needs more substantiation. I can collaborate with you next week to add it.
sure
afe7cd5 to 49fa7a1
Summary
- Update page to reflect all index types
- Updated page to add configs and links
49fa7a1 to dfbdb63
Change Logs
Update indexing page and file layout page
Impact
Docs changes
Risk level (write none, low, medium or high below)
Low
Documentation Update
Describe any necessary documentation update if there is any new feature, config, or user-facing change
ticket number here and follow the instruction to make
changes to the website.
Contributor's checklist