Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

planner: integrate hashEqual interface into LogicalPlan and expression.Expression. #55652

Merged
merged 20 commits into from
Aug 27, 2024

Conversation

AilinKid
Copy link
Contributor

@AilinKid AilinKid commented Aug 26, 2024

What problem does this PR solve?

Issue Number: ref #51664

Problem Summary:

What changed and how does it work?

in the #55570 we introduced a new kind of HashEqual interface, for simplest hash uint64 digest from some primitive types.

And this kind of interface should be implemented by logicalPlan and expression.Expression to quick hash themselves and to compare the same type structure if they are equivalent.

The first thing came into my mind is that, we should let comprehensive LogicalPlan interface and expression.Expression go include this small trait, why i find it will cause too much change for interface's original implementation. Because it's a generic interface, when it is integrated in: it will make this two to be generic

3eec3c1e-76bf-48f5-aab3-ce345c28bf9f

9b7cf491-f5c8-403c-88ef-594ffb6c2e56

which will cause all the place that referred to this two big interface to be redefined as xxx[any] as well. it's not that clean work. So we think about a way to define two new interface as:

// LogicalOperator is the interface for logical operators.
// It is used to represent the logical operators in the cascades framework.
// Since the LogicalPlan are the only implementation of normal interface{}
// with generic. Once we introduce a new generic HashEquals[T] inside it,
// like:
//
//	type LogicalPlan interface {
//			Plan
//	 	HashEquals[T]
//	 	...
//	}
//
// it will make the every implementor of LogicalPlan changed a lot. That's
// why this upper kind of LogicalOperator interface is introduced. And it
// will be used in the cascades framework.
//
// With this interface, every original LogicalPlan implementor don't need
// to change any old implementations, only have to add the two functions of
// Hash64 and Equals to become a LogicalOperator.
type LogicalOperator[T any] interface {
	base.LogicalPlan
	HashEquals[T]
}

// ScalarOperator is the interface for scalar operators.
// It is used to represent the scalar operators in the cascades framework.
// It is named for expression.Expression with HashEquals[T] interface. For
// the same reason as above LogicalOperator, we introduce this interface to
// avoid changing the original implementation of expression.Expression.
type ScalarOperator[T any] interface {
	expression.Expression
	HashEquals[T]
}

the logical expression that will be used in memo, we will referred them as LogicalOperator, which kept all the elements that LogicalExpression has while requiring HashEquals implementations on it. With that we don't need to do much change about current implementation function, only adding these two addition functional is enough.

Finally, after discussion, we derive a solution like below, getting rid of generic and use any instead.

// Hash64 is the interface for hashcode.
// It is used to calculate the lossy digest of an object to return uint64
// rather than compacted bytes from cascaded operators bottom-up.
type Hash64 interface {
	// Hash64 returns the uint64 digest of an object.
	Hash64(h Hasher)
}

// Equals is the interface for equality check.
// When we need to compare two objects when countering hash conflicts, we can
// use this interface to check whether they are equal.
type Equals interface {
	// Equals checks whether two base objects are equal.
	Equals(any) bool
}

// HashEquals is the interface for hash64 and equality check.
type HashEquals interface {
	Hash64
	Equals
}

And we did a benchmark for this, the any usage seem adequate: https://github.com/pingcap/tidb/pull/55652/files#diff-ef9b611a397bd5133fb3b00e5fca906beee78179821ee20a308e22110ead85aeR93

Check List

Tests

  • Unit test
  • Integration test
  • Manual test (add detailed scripts or steps below)
  • No need to test
    • I checked and no code files have been changed.

Side effects

  • Performance regression: Consumes more CPU
  • Performance regression: Consumes more Memory
  • Breaking backward compatibility

Documentation

  • Affects user behaviors
  • Contains syntax changes
  • Contains variable changes
  • Contains experimental features
  • Changes MySQL compatibility

Release note

Please refer to Release Notes Language Style Guide to write a quality release note.

None

.
Signed-off-by: arenatlx <[email protected]>
@ti-chi-bot ti-chi-bot bot added release-note-none Denotes a PR that doesn't merit a release note. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. sig/planner SIG: Planner labels Aug 26, 2024
Copy link

tiprow bot commented Aug 26, 2024

Hi @AilinKid. Thanks for your PR.

PRs from untrusted users cannot be marked as trusted with /ok-to-test in this repo meaning untrusted PR authors can never trigger tests themselves. Collaborators can still trigger tests on the PR using /test all.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

// *************************** start implementation of Plan interface ***************************
// *************************** start implementation of HashEquals interface ****************************

func (p *LogicalProjection) Hash64(h mutil.Hasher) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just a future example

.
Signed-off-by: arenatlx <[email protected]>
.
Signed-off-by: arenatlx <[email protected]>
.
Signed-off-by: arenatlx <[email protected]>
.
Signed-off-by: arenatlx <[email protected]>
.
Signed-off-by: arenatlx <[email protected]>
.
Signed-off-by: arenatlx <[email protected]>
.
Signed-off-by: arenatlx <[email protected]>
.
Signed-off-by: arenatlx <[email protected]>
.
Signed-off-by: arenatlx <[email protected]>
.
Signed-off-by: arenatlx <[email protected]>
.
Signed-off-by: arenatlx <[email protected]>
}
// todo: LogicalSchemaProducer should implement HashEquals interface, otherwise, its self elements
// like schema and names are lost.
if !p.LogicalSchemaProducer.Equals(&proj.BaseLogicalPlan) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have you ever implemented the equals function of LogicalSchemaProducer ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

currently not, this is is just a example and reminder for next few pull requests.

.
Signed-off-by: arenatlx <[email protected]>
.
Signed-off-by: AilinKid <[email protected]>
Copy link

codecov bot commented Aug 26, 2024

Codecov Report

Attention: Patch coverage is 0% with 43 lines in your changes missing coverage. Please review.

Project coverage is 74.5166%. Comparing base (ebbe53c) to head (2ac16fd).
Report is 11 commits behind head on master.

Additional details and impacted files
@@               Coverage Diff                @@
##             master     #55652        +/-   ##
================================================
+ Coverage   72.9612%   74.5166%   +1.5554%     
================================================
  Files          1581       1585         +4     
  Lines        442690     444031      +1341     
================================================
+ Hits         322992     330877      +7885     
+ Misses        99873      92859      -7014     
- Partials      19825      20295       +470     
Flag Coverage Δ
integration 48.9595% <0.0000%> (?)
unit 71.8505% <0.0000%> (-0.1974%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Components Coverage Δ
dumpling 52.9567% <ø> (ø)
parser ∅ <ø> (∅)
br 49.9369% <ø> (+3.9876%) ⬆️

.
Signed-off-by: arenatlx <[email protected]>
.
Signed-off-by: arenatlx <[email protected]>
if !p.LogicalSchemaProducer.Equals(&proj.BaseLogicalPlan) {
return false
}
//for i, one := range p.Exprs {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we remove it?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no, next pull request will introduce hashEqual to expression.Expression, which will enable the comment line here.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK

.
Signed-off-by: arenatlx <[email protected]>
.
Signed-off-by: arenatlx <[email protected]>
Makefile Outdated
@@ -372,6 +372,7 @@ bench-daily:
go test github.com/pingcap/tidb/pkg/util/codec -run TestBenchDaily -bench Ignore --outfile bench_daily.json
go test github.com/pingcap/tidb/pkg/distsql -run TestBenchDaily -bench Ignore --outfile bench_daily.json
go test github.com/pingcap/tidb/pkg/statistics -run TestBenchDaily -bench Ignore --outfile bench_daily.json
go test github.com/pingcap/tidb/pkg/planner/cascades/base -run TestBenchDaily -bench Ignore --outfile bench_daily.json
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we need to remove this bench test

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

make sense, done

.
Signed-off-by: arenatlx <[email protected]>
Copy link
Contributor

@elsa0520 elsa0520 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@ti-chi-bot ti-chi-bot bot added approved needs-1-more-lgtm Indicates a PR needs 1 more LGTM. labels Aug 27, 2024
.
Signed-off-by: arenatlx <[email protected]>
Copy link

ti-chi-bot bot commented Aug 27, 2024

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: elsa0520, hawkingrei

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@ti-chi-bot ti-chi-bot bot added lgtm and removed needs-1-more-lgtm Indicates a PR needs 1 more LGTM. labels Aug 27, 2024
Copy link

ti-chi-bot bot commented Aug 27, 2024

[LGTM Timeline notifier]

Timeline:

  • 2024-08-27 07:14:24.674067015 +0000 UTC m=+854459.808517135: ☑️ agreed by elsa0520.
  • 2024-08-27 07:49:43.023838378 +0000 UTC m=+856578.158288499: ☑️ agreed by hawkingrei.

@ti-chi-bot ti-chi-bot bot merged commit 4f85a35 into pingcap:master Aug 27, 2024
23 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved lgtm release-note-none Denotes a PR that doesn't merit a release note. sig/planner SIG: Planner size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants