Add support of bulk load for the string like HBase bulkload #1301
Comments
Thanks for proposing this. +1 for this feature. |
I'm willing to submit a PR! |
@ColinChamber Assigned. |
@git-hulk @ColinChamber Thanks for this PR. Is there any progress? Looking forward to this bulk load feature. |
I haven't had enough time recently. I hope someone else can pick this up. Unassigned. @liucyao1990 |
Thanks @ColinChamber for your update. |
@git-hulk For this feature, do we need to provide a command to load data, or a tool? In my opinion, there are two steps here.
The second step requires stopping the world. Do we need to support online bulk load? Would stopping the world cause problems? |
@jihuayu Yes, you're right. And I think it's good to support only the string type first.
My intuition is yes for online bulk load, even though it will block write operations while ingesting SSTs.
From my side, I would like to support loading local SSTs via a command and also provide a tool to generate SST files. For the tool's input file, we can require users to put their data in a specified format such as CSV. |
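A minimal sketch of the input-preparation half of such a tool, under stated assumptions: the two-column `key,value` CSV layout and the function name `read_sorted_pairs` are hypothetical and not something the thread has agreed on. It relies on one RocksDB property: an SST file writer expects keys to be added in ascending order under the comparator, so the rows are de-duplicated and sorted bytewise first. Encoding keys and values into Kvrocks's on-disk format and writing the SST file itself are not attempted here.

```python
import csv
import io


def read_sorted_pairs(csv_text):
    """Parse "key,value" rows (an assumed format) into SST-ready order."""
    latest = {}
    for row in csv.reader(io.StringIO(csv_text)):
        if not row:
            continue  # skip blank lines
        if len(row) != 2:
            raise ValueError(f"expected 2 columns, got {len(row)}: {row!r}")
        key, value = row
        # A later row for the same key overwrites an earlier one, so the
        # output contains each key exactly once.
        latest[key.encode("utf-8")] = value.encode("utf-8")
    # Sorting bytes objects compares them byte by byte, which matches
    # the ordering of RocksDB's default bytewise comparator.
    return sorted(latest.items())


pairs = read_sorted_pairs("user:2,bob\nuser:1,alice\nuser:2,carol\n")
for key, value in pairs:
    print(key, value)
```

The sorted pairs could then be fed, in order, to whatever component writes the SST file.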
@git-hulk Ok, I'm willing to submit a PR! |
Thanks @jihuayu, assigned. @zuston @liucyao1990 You are also welcome to share more input about how you would use bulk load. |
@git-hulk @jihuayu Hi, here is the bulk load ingestion implementation of Pegasus. https://github.com/apache/incubator-pegasus/pulls?q=label%3Acomponent%2Fbulk_load+. FYI |
Cool, thanks for your input. |
I will first create the SST generation tool. |
Yes, that's right. It's fine NOT to support replication for now. |
Are there any updates here? |
@JackyYangPassion No. Do you want to have a try? |
OK, I read the discussion in #1628 carefully. Will this feature initially support only the String type? |
@JackyYangPassion Yes, we would like to support strings first since they are the simplest type. It would definitely be great if it could cover other data types too. |
@JackyYangPassion Thank you! |
Motivation
Many scenarios need to bulk-load large amounts of data regularly, and loading it through the regular command API can cause heavy load and latency spikes. It would be better to offer a way to mitigate this.
Solution
We can use RocksDB's SST ingestion to bulk load the data, supporting simple strings only.
See more discussion in #1628.