expression: speed up builtinRepeatSig by using MergeNulls #12674
builtinRepeatSig, builtinLeftSig, builtinRightSig, builtinInsertSig, builtinReplaceSig to speed up
Codecov Report
@@             Coverage Diff             @@
##             master    #12674    +/-   ##
===========================================
  Coverage   79.9727%  79.9727%
===========================================
  Files           465       465
  Lines        107319    107319
===========================================
  Hits          85826     85826
  Misses        15029     15029
  Partials       6464      6464
Thanks for your contribution.
It seems that MergeNulls is called incorrectly in this PR.
Please check the comment on MergeNulls.
It would be nice if you could add a check in MergeNulls for the rules that the caller needs to follow, so that it is not called incorrectly again.
Thanks for reviewing my code change.
It seems that there is an incorrect call at tidb/expression/builtin_string_vec.go, line 365 in 9dacf84.
result is an int type column, while buf and buf1 are string type columns. This seems to violate the requirements of MergeNulls.
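To make the requirement under discussion concrete, here is a minimal sketch of what merging null bitmaps amounts to. The `column` type, its fields, and the helper names are simplified stand-ins invented for illustration, not the real `chunk.Column`. The sketch assumes that a set bit means the row is not NULL and that all columns have the same number of rows.

```go
package main

// column is a hypothetical, simplified stand-in for a chunk column.
// Assumption of this sketch: a set bit in nullBitmap means "not NULL".
type column struct {
	length     int
	nullBitmap []byte
}

// newColumn creates a column of n rows with every row marked as not NULL.
func newColumn(n int) *column {
	bm := make([]byte, (n+7)/8)
	for i := range bm {
		bm[i] = 0xFF
	}
	return &column{length: n, nullBitmap: bm}
}

// setNull marks row i as NULL by clearing its bit.
func (c *column) setNull(i int) {
	c.nullBitmap[i/8] &^= byte(1) << uint(i%8)
}

// isNull reports whether row i is NULL.
func (c *column) isNull(i int) bool {
	return c.nullBitmap[i/8]&(byte(1)<<uint(i%8)) == 0
}

// mergeNulls makes row i of c NULL if it is NULL in c or in any of cols.
// It works byte by byte, so eight rows are merged per AND operation.
// The caller must pass columns with the same length as c.
func (c *column) mergeNulls(cols ...*column) {
	for _, col := range cols {
		for i := range c.nullBitmap {
			c.nullBitmap[i] &= col.nullBitmap[i]
		}
	}
}
```

After `res.mergeNulls(a, b)`, a row of `res` is NULL whenever it was NULL in `res`, `a`, or `b`, which is why the per-row `IsNull` checks on the argument buffers become redundant.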
@js00070
Thanks! I totally understand now!
Could you provide a benchmark result?
Try using benchstat: https://godoc.org/golang.org/x/perf/cmd/benchstat.
Besides, could you help remove the TODO you mentioned before?
util/chunk/column.go
@@ -648,6 +648,14 @@ func (c *Column) CopyReconstruct(sel []int, dst *Column) *Column {
// The caller should ensure that all these columns have the same
// length, and data stored in the result column is fixed-length type.
func (c *Column) MergeNulls(cols ...*Column) {
	if len(c.offsets) != 0 {
-	if len(c.offsets) != 0 {
+	if !c.isFixed() {
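A rough sketch of what such a rule check could look like is given below. Everything here is hypothetical: `guardedColumn`, its `fixedSize` field, the panic messages, and the `panics` helper are invented for the example, and the real `isFixed` may be implemented differently. It only illustrates rejecting a variable-length result column and mismatched lengths before the bitmaps are merged.

```go
package main

// guardedColumn is a hypothetical, simplified column used only in this sketch.
type guardedColumn struct {
	length     int
	fixedSize  int    // > 0 for fixed-length types, 0 for variable-length types (assumption)
	nullBitmap []byte // a set bit means "not NULL" (assumption)
}

// isFixed reports whether the column stores fixed-length values.
func (c *guardedColumn) isFixed() bool { return c.fixedSize > 0 }

// mergeNulls checks the caller's obligations first, then ANDs the bitmaps.
func (c *guardedColumn) mergeNulls(cols ...*guardedColumn) {
	if !c.isFixed() {
		panic("result column should be fixed-length type")
	}
	for _, col := range cols {
		if col.length != c.length {
			panic("all columns should have the same length")
		}
	}
	for _, col := range cols {
		for i := range c.nullBitmap {
			c.nullBitmap[i] &= col.nullBitmap[i]
		}
	}
}

// panics reports whether calling f panics; handy for exercising the checks.
func panics(f func()) (p bool) {
	defer func() { p = recover() != nil }()
	f()
	return false
}
```

Panicking rather than returning an error treats a wrong call as a programming mistake; that is a design choice of this sketch, not something the review prescribes.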
I ran the benchmark and used benchstat. Here is my benchmark command:
Here is my benchstat result:
Here is my benchmark code with the MergeNulls call commented out:
func (b *builtinArithmeticMinusRealSig) vecEvalReal(input *chunk.Chunk, result *chunk.Column) error {
	if err := b.args[0].VecEvalReal(b.ctx, input, result); err != nil {
		return err
	}
	n := input.NumRows()
	buf, err := b.bufAllocator.get(types.ETReal, n)
	if err != nil {
		return err
	}
	defer b.bufAllocator.put(buf)
	if err := b.args[1].VecEvalReal(b.ctx, input, buf); err != nil {
		return err
	}
	// result.MergeNulls(buf)
	x := result.Float64s()
	y := buf.Float64s()
	for i := 0; i < n; i++ {
		if result.IsNull(i) || buf.IsNull(i) {
			continue
		}
		if (x[i] > 0 && -y[i] > math.MaxFloat64-x[i]) || (x[i] < 0 && -y[i] < -math.MaxFloat64-x[i]) {
			return types.ErrOverflow.GenWithStackByArgs("DOUBLE", fmt.Sprintf("(%s - %s)", b.args[0].String(), b.args[1].String()))
		}
		x[i] = x[i] - y[i]
	}
	return nil
}
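For comparison, here is a minimal sketch of the other variant, where the null flags are merged once before the loop and the loop then tests a single flag per row. It is not the TiDB code: `minusReal`, `mergeNotNull`, and `errOverflow` are hypothetical names, and a plain `[]bool` stands in for the packed null bitmap to keep the example short.

```go
package main

import (
	"errors"
	"math"
)

// errOverflow is a hypothetical stand-in for the overflow error.
var errOverflow = errors.New("DOUBLE value is out of range")

// mergeNotNull folds src into dst: a row stays non-NULL only if it is
// non-NULL in both slices. This plays the role of the bitmap merge.
func mergeNotNull(dst, src []bool) {
	for i := range dst {
		dst[i] = dst[i] && src[i]
	}
}

// minusReal computes x[i] -= y[i] for every row whose merged flag says
// "not NULL". Only one flag is tested per row because the null state of
// both inputs was merged beforehand.
func minusReal(x, y []float64, notNull []bool) error {
	for i := range x {
		if !notNull[i] {
			continue
		}
		// Same overflow test as in the benchmark code above.
		if (x[i] > 0 && -y[i] > math.MaxFloat64-x[i]) || (x[i] < 0 && -y[i] < -math.MaxFloat64-x[i]) {
			return errOverflow
		}
		x[i] -= y[i]
	}
	return nil
}
```

Rows that are NULL in either input are skipped and keep whatever value was already stored in `x`, the same as in the loop above.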
LGTM.
/build
/run-all-tests
LGTM
/run-all-tests
What problem does this PR solve?
I found a TODO in the code; see #12014 (comment).
So I tried to introduce vectorized null-bitmap handling by using the function MergeNulls to speed it up.
What is changed and how it works?
I use MergeNulls in builtinInsertSig, and add a rule check in MergeNulls to prevent it from being called incorrectly.
Check List
Tests