executor: LOAD DATA use lightning CSV parser #40852
Signed-off-by: lance6716 <[email protected]>
/cc @gozssky @buchuitoudegou
Do we need an integration test or something similar?
executor/load_data.go
Outdated
if err != nil {
return prevData, err
if err = loadDataInfo.enqOneTask(ctx); err != nil {
logutil.Logger(ctx).Error("load data process stream error", zap.Error(err))
Maybe make the error message more specific to the function (i.e., enqOneTask) so that the error is easier to locate from the log.
}
// rowCount will be used in fillRow(), last insert ID will be assigned according to the rowCount = 1.
// So should add first here.
e.rowCount++
e.rows = append(e.rows, e.colsToRow(ctx, cols))
e.rows = append(e.rows, e.colsToRow(ctx, parser.LastRow().Row))
e.curBatchCnt++
if e.maxRowsInBatch != 0 && e.rowCount%e.maxRowsInBatch == 0 {
Why not e.rowCount >= e.maxRowsInBatch?
rowCount is never reset; it is used as a counter to report progress.
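The exchange above can be illustrated with a small sketch. It is not code from this PR: the function names flushPointsModulo and flushPointsThreshold are hypothetical, and the loop only imitates a counter that is incremented per row and never reset. Under that assumption, the modulo check fires once per full batch, while a >= check would fire on every row after the first batch.

```go
package main

import "fmt"

// flushPointsModulo returns the row counts at which a batch boundary is hit
// when the check is rowCount%maxRowsInBatch == 0 and rowCount is never reset.
// Hypothetical helper for illustration only.
func flushPointsModulo(totalRows, maxRowsInBatch uint64) []uint64 {
	var points []uint64
	var rowCount uint64
	for i := uint64(0); i < totalRows; i++ {
		rowCount++ // cumulative counter, also usable for progress reporting
		if maxRowsInBatch != 0 && rowCount%maxRowsInBatch == 0 {
			points = append(points, rowCount)
		}
	}
	return points
}

// flushPointsThreshold does the same with rowCount >= maxRowsInBatch.
// Because rowCount is never reset, the condition stays true once reached.
func flushPointsThreshold(totalRows, maxRowsInBatch uint64) []uint64 {
	var points []uint64
	var rowCount uint64
	for i := uint64(0); i < totalRows; i++ {
		rowCount++
		if maxRowsInBatch != 0 && rowCount >= maxRowsInBatch {
			points = append(points, rowCount)
		}
	}
	return points
}

func main() {
	fmt.Println(flushPointsModulo(7, 3))    // prints [3 6]
	fmt.Println(flushPointsThreshold(7, 3)) // prints [3 4 5 6 7]
}
```

A >= comparison would only behave like the modulo check if it were applied to a counter that is reset after each batch, which, per the reply, rowCount is not.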
In fact the files under
ptal @gozssky @buchuitoudegou
/merge
This pull request has been accepted and is ready to merge. Commit hash: b9a37c6
What problem does this PR solve?
Issue Number: ref #40499