aggfuncs: implement bit-or with new aggregation framework #6975

Merged 9 commits on Jul 5, 2018

Conversation

crazycs520
Contributor

What have you changed? (mandatory)

This PR implements BIT_OR in the new aggregation framework.

What is the type of the changes (mandatory)?

improvement

How has this PR been tested (mandatory)?

By the existing test cases.

Does this PR affect documentation (docs/docs-cn) update? (optional)

No

Refer to a related PR or issue link (optional)

#6952 #6852

@crazycs520
Contributor Author

PTAL @zz-jason @XuHuaiyu

baseAggFunc
}

type result4BitOrUint64 struct {
Member

s/result4BitOrUint64/partialResult4BitFunc/, so we can reuse it in other bit aggregate functions

base := baseAggFunc{
args: aggFuncDesc.Args,
ordinal: ordinal,
}
Member

We need to handle functions that have the distinct property.

Contributor Author

Why does bit_or need to care about the distinct property?

Member

Consider this query: select bit_or(distinct a) from t; in this kind of query, we only aggregate over the distinct values of column a.

Contributor Author

It doesn't matter, because bit_or(distinct a) = bit_or(a). The same holds for bit_and, but not for bit_xor.

Member

ok

Member

Please add a comment here stating that bit_or does not need to consider the distinct property.

Contributor Author

done
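
The claim in this thread can be checked with a small standalone sketch. The helper names below (bitOr, bitAnd, bitXor, distinct) are made up for illustration and are not part of the PR. OR and AND are idempotent, so duplicates do not change the result, while XOR cancels pairs of equal values.

```go
package main

import "fmt"

// bitOr folds vals with |, starting from 0 (the identity for OR).
func bitOr(vals []uint64) uint64 {
	var acc uint64
	for _, v := range vals {
		acc |= v
	}
	return acc
}

// bitAnd folds vals with &, starting from all ones (the identity for AND).
func bitAnd(vals []uint64) uint64 {
	acc := ^uint64(0)
	for _, v := range vals {
		acc &= v
	}
	return acc
}

// bitXor folds vals with ^, starting from 0 (the identity for XOR).
func bitXor(vals []uint64) uint64 {
	var acc uint64
	for _, v := range vals {
		acc ^= v
	}
	return acc
}

// distinct returns vals with duplicates removed, keeping first occurrences.
func distinct(vals []uint64) []uint64 {
	seen := make(map[uint64]struct{}, len(vals))
	out := make([]uint64, 0, len(vals))
	for _, v := range vals {
		if _, ok := seen[v]; !ok {
			seen[v] = struct{}{}
			out = append(out, v)
		}
	}
	return out
}

func main() {
	vals := []uint64{5, 3, 5}
	d := distinct(vals)
	fmt.Println(bitOr(vals) == bitOr(d))   // true
	fmt.Println(bitAnd(vals) == bitAnd(d)) // true
	fmt.Println(bitXor(vals) == bitXor(d)) // false
}
```

With the input 5, 3, 5 the XOR of all rows is 3, but the XOR of the distinct values is 6, which is why only bit_xor would have to honor the distinct property.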

baseAggFunc
}

type partialResult4BitFunc struct {
Contributor

type partialResult4BitFunc uint64

Contributor Author

done

}

func (e *bitOrUint64) AllocPartialResult() PartialResult {
return PartialResult(&partialResult4BitFunc{})
Contributor

return PartialResult(new(uint64))

if err != nil {
return errors.Trace(err)
}

Contributor

remove this line

Contributor Author

done

"github.com/pingcap/tidb/util/chunk"
)

type bitOrUint64 struct {
Contributor

type baseBitAggFunc struct {
    baseAggFunc
}

type bitOrUint64 struct {
    baseBitAggFunc
}

This way, baseBitAggFunc can be reused.

Contributor Author

done~

func (e *bitOrUint64) UpdatePartialResult(sctx sessionctx.Context, rowsInGroup []chunk.Row, pr PartialResult) error {
p := (*partialResult4BitFunc)(pr)
for _, row := range rowsInGroup {
inputValue, isNull, err := e.args[0].EvalInt(sctx, row)
Contributor

Should we wrap a cast to uint in typeInfer4BitFuncs? Otherwise bit_or(varchar) may fail.

Contributor Author

bit_or(varchar) will return 0.

Member

Actually, we do need to add a cast. Consider this case:

drop table if exists t;
create table t(a decimal(10, 4));
insert into t values(12.2);
select bit_or(a) from (select * from t union all select * from t) tmp;
TiDB(localhost:4000) > desc select bit_or(a) from (select * from t union all select * from t) tmp;
+--------------------------+------+----------------------------------------------+----------+
| id                       | task | operator info                                | count    |
+--------------------------+------+----------------------------------------------+----------+
| StreamAgg_13             | root | funcs:bit_or(tmp.a)                          | 1.00     |
| └─Union_21               | root |                                              | 20000.00 |
|   ├─TableReader_24       | root | data:TableScan_23                            | 10000.00 |
|   │ └─TableScan_23       | cop  | table:t, range:[-inf,+inf], keep order:false | 10000.00 |
|   └─TableReader_27       | root | data:TableScan_26                            | 10000.00 |
|     └─TableScan_26       | cop  | table:t, range:[-inf,+inf], keep order:false | 10000.00 |
+--------------------------+------+----------------------------------------------+----------+
6 rows in set (0.00 sec)

The StreamAgg_13 above handles the original data directly, rather than another aggregate operator's partial result (which is guaranteed to be uint64). This PR may fail on this query if we don't wrap a cast around its parameter.
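
As a rough illustration of why the cast matters, here is a toy sketch in which the bitwise step only ever sees integers. Everything in it is an assumption made for illustration: float64 stands in for the decimal column, toUint64 is a hypothetical helper rather than TiDB's cast, and round-to-nearest is assumed as the conversion rule.

```go
package main

import (
	"fmt"
	"math"
)

// toUint64 is a hypothetical stand-in for the cast discussed above. It rounds
// a non-negative value to the nearest integer (an assumed conversion rule).
func toUint64(v float64) uint64 {
	return uint64(math.Round(v))
}

// bitOrDecimals casts each input first, then ORs the casted values, so the
// accumulator never has to deal with a non-integer input.
func bitOrDecimals(vals []float64) uint64 {
	var acc uint64
	for _, v := range vals {
		acc |= toUint64(v)
	}
	return acc
}

func main() {
	// Two copies of 12.2, as produced by the UNION ALL in the query above.
	fmt.Println(bitOrDecimals([]float64{12.2, 12.2})) // 12
}
```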

Contributor Author

Yes, I'll fix it.


type partialResult4BitFunc = uint64

func (e *bitOrUint64) AllocPartialResult() PartialResult {
Contributor

This can be a member function of *baseBitAggFunc.

Contributor Author

great suggestion.

Contributor Author

...I take back my last comment. 😂
We should not make the result value a member of baseBitAggFunc, because that would make baseBitAggFunc stateful.
Consider a scenario with many groups to aggregate: if the AggFunc is stateful, we have to create one AggFunc per group.
But if the AggFunc is stateless, we can create a single AggFunc and many partialResult4BitFunc values, which reduces Go GC pressure.
(@zz-jason told me this. Thanks very much~)
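
The stateless design described in this comment can be sketched roughly as follows. This is a simplified illustration, not the PR's code: PartialResult is assumed to be an unsafe.Pointer (consistent with the pointer conversions in the snippets above), UpdatePartialResult takes plain values instead of chunk rows, and the Result method and aggregate helper are hypothetical.

```go
package main

import (
	"fmt"
	"unsafe"
)

// PartialResult is an opaque pointer to per-group state (assumed definition).
type PartialResult unsafe.Pointer

type partialResult4BitFunc = uint64

// bitOrUint64 holds no per-group state, so one instance can serve every group.
type bitOrUint64 struct{}

func (e *bitOrUint64) AllocPartialResult() PartialResult {
	return PartialResult(new(partialResult4BitFunc))
}

func (e *bitOrUint64) ResetPartialResult(pr PartialResult) {
	p := (*partialResult4BitFunc)(pr)
	*p = 0
}

// UpdatePartialResult is simplified: it takes plain values instead of rows.
func (e *bitOrUint64) UpdatePartialResult(vals []uint64, pr PartialResult) {
	p := (*partialResult4BitFunc)(pr)
	for _, v := range vals {
		*p |= v
	}
}

// Result is a hypothetical accessor that reads the accumulated value.
func (e *bitOrUint64) Result(pr PartialResult) uint64 {
	return *(*partialResult4BitFunc)(pr)
}

// aggregate shares one AggFunc across all groups and allocates only a small
// partial result per group.
func aggregate(groups map[string][]uint64) map[string]uint64 {
	agg := &bitOrUint64{}
	out := make(map[string]uint64, len(groups))
	for key, vals := range groups {
		pr := agg.AllocPartialResult()
		agg.UpdatePartialResult(vals, pr)
		out[key] = agg.Result(pr)
	}
	return out
}

func main() {
	res := aggregate(map[string][]uint64{"a": {1, 2}, "b": {4, 8}})
	fmt.Println(res["a"], res["b"]) // 3 12
}
```

Because the accumulator lives in the partial result rather than in bitOrUint64 itself, adding another group costs one uint64 allocation instead of a whole new aggregate function object.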

return PartialResult(new(partialResult4BitFunc))
}

func (e *bitOrUint64) ResetPartialResult(pr PartialResult) {
Contributor

ditto

*p = 0
}

func (e *bitOrUint64) AppendFinalResult2Chunk(sctx sessionctx.Context, pr PartialResult, chk *chunk.Chunk) error {
Contributor

ditto

@@ -0,0 +1,60 @@
// Copyright 2018 PingCAP, Inc.
Contributor

Rename the file to func_bitfuncs.go.


type baseBitAggFunc struct {
baseAggFunc
value uint64
Member

Do not store the partial result in an aggregate function. Aggregate functions should be stateless.

Member

@zz-jason zz-jason left a comment

LGTM

@zz-jason zz-jason added this to the 2.1 milestone Jul 5, 2018
@zz-jason zz-jason added type/enhancement The issue or PR belongs to an enhancement. status/LGT1 Indicates that a PR has LGTM 1. sig/execution SIG execution labels Jul 5, 2018
@@ -120,6 +120,15 @@ func buildGroupConcat(aggFuncDesc *aggregation.AggFuncDesc, ordinal int) AggFunc

// buildBitOr builds the AggFunc implementation for function "BIT_OR".
func buildBitOr(aggFuncDesc *aggregation.AggFuncDesc, ordinal int) AggFunc {
// BIT_OR doesn't need to handle the distinct property.
switch aggFuncDesc.Args[0].GetType().Tp {
Contributor

switch aggFuncDesc.Args[0].GetType().EvalType() {
case types.ETInt:
    xxxx
}

Contributor Author

Done. PTAL

@XuHuaiyu
Contributor

XuHuaiyu commented Jul 5, 2018

/run-all-tests

Contributor

@XuHuaiyu XuHuaiyu left a comment

LGTM

@XuHuaiyu XuHuaiyu added status/LGT2 Indicates that a PR has LGTM 2. and removed status/LGT1 Indicates that a PR has LGTM 1. labels Jul 5, 2018
@XuHuaiyu
Contributor

XuHuaiyu commented Jul 5, 2018

@crazycs520 This PR can be merged after the checks finish.

@XuHuaiyu XuHuaiyu merged commit 363cdc2 into pingcap:master Jul 5, 2018
Labels
sig/execution SIG execution status/LGT2 Indicates that a PR has LGTM 2. type/enhancement The issue or PR belongs to an enhancement.
3 participants