Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sink/kafka(ticdc): remove useless partition flush logic #4598

Merged
merged 11 commits into from
Feb 25, 2022

Conversation

Rustin170506
Copy link
Member

@Rustin170506 Rustin170506 commented Feb 15, 2022

What problem does this PR solve?

Issue Number: ref #4423

What is changed and how it works?

  • remove useless partition flush logic. We just need to count the message number.
  • Put the flush operation into the flush worker.

Check List

Tests

  • Unit test
  • Integration test

Code changes

None

Side effects

None

Related changes

None

Release note


None

@ti-chi-bot
Copy link
Member

ti-chi-bot commented Feb 15, 2022

[REVIEW NOTIFICATION]

This pull request has been approved by:

  • amyangfei
  • overvenus

To complete the pull request process, please ask the reviewers in the list to review by filling /cc @reviewer in the comment.
After your PR has acquired the required number of LGTMs, you can assign this pull request to the committer in the list by filling /assign @committer in the comment to help you merge this pull request.

The full list of commands accepted by this bot can be found here.

Reviewer can indicate their review by submitting an approval review.
Reviewer can cancel approval by submitting a request changes review.

@ti-chi-bot ti-chi-bot added the release-note-none Denotes a PR that doesn't merit a release note. label Feb 15, 2022
@Rustin170506
Copy link
Member Author

/run-all-tests

@ti-chi-bot ti-chi-bot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Feb 15, 2022
@Rustin170506 Rustin170506 marked this pull request as draft February 15, 2022 16:00
@ti-chi-bot ti-chi-bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Feb 15, 2022
@Rustin170506
Copy link
Member Author

/run-all-tests

1 similar comment
@Rustin170506
Copy link
Member Author

/run-all-tests

@Rustin170506 Rustin170506 marked this pull request as ready for review February 16, 2022 03:24
@ti-chi-bot ti-chi-bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Feb 16, 2022
@Rustin170506 Rustin170506 added area/ticdc Issues or PRs related to TiCDC. component/sink Sink component. labels Feb 16, 2022
cdc/sink/producer/kafka/kafka.go Outdated Show resolved Hide resolved
cdc/sink/producer/kafka/kafka.go Outdated Show resolved Hide resolved
@codecov-commenter
Copy link

codecov-commenter commented Feb 16, 2022

Codecov Report

Merging #4598 (7c79988) into master (9607554) will decrease coverage by 0.4817%.
The diff coverage is 46.5902%.

Flag Coverage Δ
cdc 59.7051% <52.0319%> (-0.2171%) ⬇️
dm 51.3649% <30.1098%> (-0.6640%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

@@               Coverage Diff                @@
##             master      #4598        +/-   ##
================================================
- Coverage   55.6402%   55.1585%   -0.4818%     
================================================
  Files           494        516        +22     
  Lines         61283      63836      +2553     
================================================
+ Hits          34098      35211      +1113     
- Misses        23750      25134      +1384     
- Partials       3435       3491        +56     

@Rustin170506
Copy link
Member Author

/run-all-tests

@Rustin170506 Rustin170506 changed the title ticdc(sink): remove useless partition flush logic sink(ticdc): remove useless partition flush logic Feb 17, 2022
@ti-chi-bot ti-chi-bot added the status/LGT1 Indicates that a PR has LGTM 1. label Feb 17, 2022
@Rustin170506

This comment was marked as outdated.

@ti-chi-bot ti-chi-bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Feb 17, 2022
@Rustin170506

This comment was marked as outdated.

@ti-chi-bot ti-chi-bot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Feb 24, 2022
@Rustin170506
Copy link
Member Author

Rustin170506 commented Feb 25, 2022

I did some local testing and the duration of flush is basically around 10 ms under sysbench --db-driver=mysql --mysql-host=[127.0.0.1](http://127.0.0.1/) --mysql-port=4000 --mysql-user=root --threads=8 --mysql-db=test --tables=100 --table-size=100000 oltp_write_only prepare load. So this should be acceptable for now.

In addition, I also tried to do a micro benchmark test, where we tried to add buffer to msgChan when the flush latency is large, but it is not effective either.( This is even an order of magnitude slower.) This optimization has little effect in the case of large writes. The flush latency itself should be acceptable when the writes are small.

// buffer
msgChan:    make(chan mqEvent, 512),

// Simulate flush delay
func (m *mockProducer) Flush(ctx context.Context) error {
	if len(m.mqEvent) != 0 {
		time.Sleep(10 * time.Millisecond)
	}
	m.mqEvent = make(map[int32][]*codec.MQMessage)
	m.flushed = true
	return nil
}


func BenchmarkFlush(b *testing.B) {
	worker, _ := newTestWorker()
	ctx, cancel := context.WithCancel(context.Background())
	var wg sync.WaitGroup
	wg.Add(1)
	go func() {
		defer wg.Done()
		err := worker.run(ctx)
		if err != nil && errors.Cause(err) != context.Canceled {
			b.Error(err)
		}
	}()
	b.ResetTimer()

	for i := 0; i < b.N; i++ {
		for i := 0; i < 100000; i++ {
			b.StopTimer()
			if i%100 == 0 {
				time.Sleep(time.Duration(rand.Intn(30)) * time.Millisecond)
			}
			n := rand.Intn(10)
			b.StartTimer()

			if n%2 == 0 {
				worker.msgChan <- mqEvent{
					row: &model.RowChangedEvent{
						CommitTs: 1,
						Table:    &model.TableName{Schema: "a", Table: "b"},
						Columns:  []*model.Column{{Name: "col1", Type: 1, Value: "aa"}},
					},
					partition: int32(i % 10),
				}
			}

			if i%10 == 0 {
				worker.msgChan <- mqEvent{resolvedTs: model.Ts(i)}
			}
		}
	}

	b.StopTimer()
	cancel()
	wg.Wait()
}

no-buffer:

goos: darwin
goarch: amd64
pkg: [github.com/pingcap/tiflow/cdc/sink](http://github.com/pingcap/tiflow/cdc/sink)
cpu: Intel(R) Core(TM) i9-9980HK CPU @ 2.40GHz
BenchmarkFlush-16              1        107955002414 ns/op
PASS
ok      [github.com/pingcap/tiflow/cdc/sink](http://github.com/pingcap/tiflow/cdc/sink)      133.716s

buffered:

goos: darwin
goarch: amd64
pkg: [github.com/pingcap/tiflow/cdc/sink](http://github.com/pingcap/tiflow/cdc/sink)
cpu: Intel(R) Core(TM) i9-9980HK CPU @ 2.40GHz
BenchmarkFlush-16              1        86659554655 ns/op
PASS
ok      [github.com/pingcap/tiflow/cdc/sink](http://github.com/pingcap/tiflow/cdc/sink)      112.655s

@Rustin170506 Rustin170506 changed the title sink(ticdc): remove useless partition flush logic sink/kafka(ticdc): remove useless partition flush logic Feb 25, 2022
@Rustin170506
Copy link
Member Author

/run-all-tests

@Rustin170506
Copy link
Member Author

/run-leak-test

@@ -43,21 +43,23 @@ type mqEvent struct {

// flushWorker is responsible for sending messages to the Kafka producer on a batch basis.
type flushWorker struct {
msgChan chan mqEvent
msgChan chan mqEvent
ticker *time.Ticker
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When is it closed?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume it shuts down when an error occurs or the flushworker is stopped? Because when an error occurs during the run, it is reported to the processor, which then tries to shut down the sink.

@ti-chi-bot ti-chi-bot added status/LGT2 Indicates that a PR has LGTM 2. and removed status/LGT1 Indicates that a PR has LGTM 1. labels Feb 25, 2022
@overvenus overvenus merged commit 4ef1aed into pingcap:master Feb 25, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/ticdc Issues or PRs related to TiCDC. component/sink Sink component. release-note-none Denotes a PR that doesn't merit a release note. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. status/LGT2 Indicates that a PR has LGTM 2.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants