Init TSO service and start service loop #6008

Merged
merged 5 commits into from
Feb 27, 2023

Conversation

binshi-bing
Contributor

@binshi-bing binshi-bing commented Feb 17, 2023

What problem does this PR solve?

Issue Number: Ref #5836

What is changed and how does it work?

1. Add the basic scaffolding to initialize the TSO service.
2. Make the following changes:
    Add Participant, which is the non-embedded-etcd version of Member.
    Rename Member to EmbeddedEtcdMember.
    Add a Member interface, which defines the behavior of a member participating in an election loop regardless of the concrete mechanism behind it.
    Decouple embedded etcd from the TSO AllocatorManager, GlobalTSOAllocator and LocalTSOAllocator.
3. Refactor the gRPC and HTTP server start logic.
    It follows the approach used in Dgraph https://github.com/binshi-bing/dgraph/blob/cbec2309413a9b17d407b3417300f719381b36d7/dgraph/cmd/zero/run.go#L318-L319 and the cmux example https://github.com/soheilhy/cmux/blob/5ec6847320e53b5fee0ab9a4757b56625a946c85/example/example_test.go#L108
4. Add service loops and adjust config with default values, including:
    1. Add the TSO primary/leader election loop and the TSO service loops.
    2. Add tso.Config.Adjust() to adjust the config with default values if they are not set in the configuration file or passed on the command line.

Check List

Tests

  • Unit test
  • Integration test
  • Manual test (add detailed scripts or steps below)

Start:
[2023/02/22 22:46:02.566 -08:00] [INFO] [metricutil.go:83] ["disable Prometheus push client"]
[2023/02/22 22:46:02.566 -08:00] [INFO] [systimemon.go:28] ["start system time monitor"]
[2023/02/22 22:46:02.566 -08:00] [INFO] [etcdutil.go:232] ["create etcd v3 client"] [endpoints="[http://127.0.0.1:2379/]"]
[2023/02/22 22:46:02.574 -08:00] [INFO] [server.go:533] ["init cluster id"] [cluster-id=7203238955249386415]
[2023/02/22 22:46:02.590 -08:00] [INFO] [server.go:574] ["triggering the start callback functions"]
[2023/02/22 22:46:02.590 -08:00] [INFO] [server.go:198] ["start to campaign the primary/leader"] [campaign-tso-primary-name=TSO-Bins-MacBook-Pro.local]
[2023/02/22 22:46:02.596 -08:00] [INFO] [lease.go:65] ["lease granted"] [lease-id=2463898342467820709] [lease-timeout=3] [purpose="keyspace group primary election"]
[2023/02/22 22:46:02.601 -08:00] [INFO] [leadership.go:122] ["check campaign resp"] [resp="{"header":{"cluster_id":15674009460217235928,"member_id":3474484975246189105,"revision":1383,"raft_term":2},"succeeded":true,"responses":[{"Response":{"ResponsePut":{"header":{"revision":1383}}}}]}"]
[2023/02/22 22:46:02.601 -08:00] [INFO] [leadership.go:131] ["write leaderData to leaderPath ok"] [leaderPath=/pd/0/microservice/tso/keyspace-group-00000/primary] [purpose="keyspace group primary election"]
[2023/02/22 22:46:02.601 -08:00] [INFO] [server.go:224] ["campaign tso primary/leader ok"] [campaign-tso-primary-name=TSO-Bins-MacBook-Pro.local]
[2023/02/22 22:46:02.601 -08:00] [INFO] [server.go:231] ["initializing the global tso allocator"]
[2023/02/22 22:46:02.601 -08:00] [INFO] [lease.go:135] ["start lease keep alive worker"] [interval=1s] [purpose="keyspace group primary election"]
[2023/02/22 22:46:02.606 -08:00] [INFO] [tso.go:227] ["sync and save timestamp"] [last=0001/01/01 00:00:00.000 +00:00] [save=2023/02/22 22:46:05.601 -08:00] [next=2023/02/22 22:46:02.601 -08:00]
[2023/02/22 22:46:02.606 -08:00] [INFO] [server.go:240] ["triggering the primary callback functions"]
[2023/02/22 22:46:02.606 -08:00] [INFO] [server.go:248] ["tso cluster primary server is ready to serve"] [tso-primary-name=TSO-Bins-MacBook-Pro.local]

Exit:
[2023/02/22 23:00:17.642 -08:00] [INFO] [server.go:648] ["got signal to exit"] [signal=interrupt]
[2023/02/22 23:00:17.642 -08:00] [INFO] [lease.go:159] ["stop lease keep alive worker"] [purpose="keyspace group primary election"]
[2023/02/22 23:00:17.642 -08:00] [INFO] [server.go:262] ["server is closed"]
[2023/02/22 23:00:17.642 -08:00] [INFO] [server.go:275] ["closing tso server ..."]
[2023/02/22 23:00:17.642 -08:00] [INFO] [tso.go:416] ["reset the timestamp in memory"]
[2023/02/22 23:00:17.642 -08:00] [INFO] [server.go:522] ["mux stop serving"] [error="accept tcp 127.0.0.1:3379: use of closed network connection"]
[2023/02/22 23:00:17.642 -08:00] [INFO] [server.go:459] ["grpc server stopped serving"]
[2023/02/22 23:00:17.642 -08:00] [INFO] [server.go:493] ["http server stopped serving"]
[2023/02/22 23:00:17.642 -08:00] [INFO] [server.go:465] ["try to gracefully stop the server now"]
[2023/02/22 23:00:17.642 -08:00] [INFO] [server.go:477] ["grpc server stopped"]
[2023/02/22 23:00:17.642 -08:00] [INFO] [server.go:500] ["all http(s) requests finished"]
[2023/02/22 23:00:17.642 -08:00] [INFO] [server.go:503] ["http server stopped"]
[2023/02/22 23:00:17.647 -08:00] [INFO] [server.go:166] ["tso server is closed, exit allocator loop"]
[2023/02/22 23:00:17.656 -08:00] [WARN] [retry_interceptor.go:62] ["retrying of unary invoker failed"] [target=endpoint://client-62ef1518-1368-4d7a-b433-a76b35778818/127.0.0.1:2379] [attempt=0] [error="rpc error: code = NotFound desc = etcdserver: requested lease not found"]
[2023/02/22 23:00:17.656 -08:00] [INFO] [server.go:175] ["server is closed, exit tso primary election loop"]
[2023/02/22 23:00:17.657 -08:00] [INFO] [server.go:292] ["tso server is closed"]

Code changes

Release note

None.

@ti-chi-bot
Member

ti-chi-bot commented Feb 17, 2023

[REVIEW NOTIFICATION]

This pull request has been approved by:

  • lhy1024
  • rleungx

To complete the pull request process, please ask the reviewers in the list to review by filling /cc @reviewer in the comment.
After your PR has acquired the required number of LGTMs, you can assign this pull request to the committer in the list by filling /assign @committer in the comment to help you merge this pull request.

The full list of commands accepted by this bot can be found here.

Reviewer can indicate their review by submitting an approval review.
Reviewer can cancel approval by submitting a request changes review.

@ti-chi-bot ti-chi-bot added release-note-none Denotes a PR that doesn't merit a release note. needs-ok-to-test Indicates a PR created by contributors and need ORG member send '/ok-to-test' to start testing. labels Feb 17, 2023
@ti-chi-bot
Member

Hi @binshi-bing. Thanks for your PR.

I'm waiting for a tikv member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@codecov

codecov bot commented Feb 17, 2023

Codecov Report

Base: 74.79% // Head: 74.12% // Decreases project coverage by -0.67% ⚠️

Coverage data is based on head (1404156) compared to base (8d4438e).
Patch coverage: 54.60% of modified lines in pull request are covered.

❗ Current head 1404156 differs from pull request most recent head d140827. Consider uploading reports for the commit d140827 to get more accurate results

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #6008      +/-   ##
==========================================
- Coverage   74.79%   74.12%   -0.68%     
==========================================
  Files         368      373       +5     
  Lines       36661    37335     +674     
==========================================
+ Hits        27422    27676     +254     
- Misses       6833     7213     +380     
- Partials     2406     2446      +40     
Flag Coverage Δ
unittests 74.12% <54.60%> (-0.68%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files                               Coverage Δ
pkg/mcs/resource_manager/server/server.go    48.44% <ø> (-0.32%) ⬇️
pkg/mcs/tso/server/grpc_service.go           1.73% <ø> (ø)
pkg/tso/allocator_manager.go                 63.37% <ø> (+0.14%) ⬆️
pkg/tso/local_allocator.go                   64.86% <0.00%> (+2.70%) ⬆️
pkg/member/participant.go                    32.43% <32.43%> (ø)
pkg/mcs/tso/server/server.go                 43.75% <63.58%> (ø)
pkg/mcs/tso/server/config.go                 64.78% <64.78%> (ø)
server/server.go                             73.96% <66.66%> (-1.04%) ⬇️
pkg/mcs/tso/server/testutil.go               82.35% <82.35%> (ø)
pkg/member/member.go                         66.32% <100.00%> (+4.21%) ⬆️
... and 39 more


@lhy1024
Contributor

lhy1024 commented Feb 17, 2023

/ok-to-test

@ti-chi-bot ti-chi-bot added ok-to-test Indicates a PR is ready to be tested. and removed needs-ok-to-test Indicates a PR created by contributors and need ORG member send '/ok-to-test' to start testing. labels Feb 17, 2023
@binshi-bing binshi-bing marked this pull request as draft February 17, 2023 04:14
@ti-chi-bot ti-chi-bot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. labels Feb 17, 2023
@binshi-bing binshi-bing force-pushed the init-start-tso branch 2 times, most recently from 6804a62 to 6e12e28 Compare February 19, 2023 20:11
@ti-chi-bot ti-chi-bot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Feb 19, 2023
@binshi-bing binshi-bing force-pushed the init-start-tso branch 4 times, most recently from 6362b67 to 9d17974 Compare February 20, 2023 05:54
@binshi-bing binshi-bing marked this pull request as ready for review February 20, 2023 05:58
@ti-chi-bot ti-chi-bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Feb 20, 2023
@lhy1024 lhy1024 requested review from rleungx, JmPotato and lhy1024 and removed request for disksing and HunDunDM February 20, 2023 06:05
lhy1024 added a commit to lhy1024/pd that referenced this pull request Feb 20, 2023
pkg/mcs/tso/server/server.go Outdated Show resolved Hide resolved
pkg/mcs/tso/server/server.go Outdated Show resolved Hide resolved
pkg/mcs/tso/server/server.go Outdated Show resolved Hide resolved
pkg/mcs/tso/server/server.go Outdated Show resolved Hide resolved
pkg/mcs/tso/server/server.go Outdated Show resolved Hide resolved
pkg/tso/config.go Outdated Show resolved Hide resolved
@binshi-bing binshi-bing force-pushed the init-start-tso branch 2 times, most recently from 6fbe238 to f355f92 Compare February 27, 2023 07:20
@ti-chi-bot ti-chi-bot added the status/LGT1 Indicates that a PR has LGTM 1. label Feb 27, 2023
Contributor

@lhy1024 lhy1024 left a comment

LGTM, except for the comment I left.

@ti-chi-bot ti-chi-bot added status/LGT2 Indicates that a PR has LGTM 2. and removed status/LGT1 Indicates that a PR has LGTM 1. labels Feb 27, 2023
Signed-off-by: Bin Shi <[email protected]>

Add Participant, which is the non-embedded-etcd version of Member.
Rename Member to EmbeddedEtcdMember.
Add a Member interface, which defines the behavior of a member participating in an election regardless of the concrete mechanism behind it.
Decouple embedded etcd from the TSO AllocatorManager, GlobalTSOAllocator and LocalTSOAllocator.

Signed-off-by: Bin Shi <[email protected]>

Refactor the gRPC and HTTP server start logic.

It follows the approach used in Dgraph https://github.com/binshi-bing/dgraph/blob/cbec2309413a9b17d407b3417300f719381b36d7/dgraph/cmd/zero/run.go#L318-L319 and the cmux example https://github.com/soheilhy/cmux/blob/5ec6847320e53b5fee0ab9a4757b56625a946c85/example/example_test.go#L108.

Signed-off-by: Bin Shi <[email protected]>

Add TSO primary/leader election loop and adjust config with default values.

Changes:
1. Add the TSO primary/leader election loop.
2. Add tso.Config.Adjust() to adjust the config with default values if they are not set in the configuration file or passed on the command line.

Signed-off-by: Bin Shi <[email protected]>

Handle feedback

Signed-off-by: Bin Shi <[email protected]>

Update defaultListenAddr = 127.0.0.1:3379 and defaultHTTPGracefulShutdownTimeout = 5 * time.Second

Signed-off-by: Bin Shi <[email protected]>

rebase master

Signed-off-by: Bin Shi <[email protected]>

Handle feedback

Signed-off-by: Bin Shi <[email protected]>

Update the key for keyspace group primary election (e.g., /pd/microservice/tso/keyspace-group-00000/primary)

The entire key has the format "/pd/microservice/tso/keyspace-group-XXXXX/primary", in which XXXXX is a 5-digit integer with leading zeros.
For example, the key for the keyspace group 0 primary election is "/pd/microservice/tso/keyspace-group-00000/primary".

Signed-off-by: Bin Shi <[email protected]>

Update the format of the key for keyspace group primary election to "/pd/<cluster-id>/microservice/tso/keyspace-group-XXXXX/primary".
For now we use 0 as the default cluster id.
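
The key format above can be reproduced with a zero-padded format verb. This is a small sketch; `keyspaceGroupPrimaryPath` and its parameter types are hypothetical and not taken from the PR.

```go
package main

import "fmt"

// keyspaceGroupPrimaryPath builds the election key for a keyspace group.
// %05d pads the group ID with leading zeros to five digits.
func keyspaceGroupPrimaryPath(clusterID uint64, groupID uint32) string {
	return fmt.Sprintf("/pd/%d/microservice/tso/keyspace-group-%05d/primary", clusterID, groupID)
}

func main() {
	// With the default cluster ID 0 and keyspace group 0:
	fmt.Println(keyspaceGroupPrimaryPath(0, 0))
	// prints "/pd/0/microservice/tso/keyspace-group-00000/primary"
}
```

The result for cluster 0 and group 0 matches the leaderPath shown in the manual test log in the PR description.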

Signed-off-by: Bin Shi <[email protected]>

Refactor log when grpc/http server stops serving

Signed-off-by: Bin Shi <[email protected]>

Handle feedback -- move Member interface to pkg/tso to keep the just-right visibility

Signed-off-by: Bin Shi <[email protected]>

Add unique name/id for Campaign Election and fix a bug in Primary Election loop.

Passed HA/Failure test on local box.

Signed-off-by: Bin Shi <[email protected]>

Set cluster id to default value 0 for now to satisfy the cluster id verification between the client and the server side.

Signed-off-by: Bin Shi <[email protected]>

Add TSO Server Start/Stop unit test.

Signed-off-by: Bin Shi <[email protected]>
Signed-off-by: Bin Shi <[email protected]>
@lhy1024
Contributor

lhy1024 commented Feb 27, 2023

/merge

@ti-chi-bot
Member

@lhy1024: It seems you want to merge this PR, I will help you trigger all the tests:

/run-all-tests

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the ti-community-infra/tichi repository.

@ti-chi-bot
Member

This pull request has been accepted and is ready to merge.

Commit hash: e6cd0a8

@ti-chi-bot ti-chi-bot added the status/can-merge Indicates a PR has been approved by a committer. label Feb 27, 2023
@ti-chi-bot ti-chi-bot removed the status/can-merge Indicates a PR has been approved by a committer. label Feb 27, 2023
@lhy1024
Contributor

lhy1024 commented Feb 27, 2023

/merge

@ti-chi-bot
Member

@lhy1024: It seems you want to merge this PR, I will help you trigger all the tests:

/run-all-tests

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the ti-community-infra/tichi repository.

@ti-chi-bot
Member

This pull request has been accepted and is ready to merge.

Commit hash: d140827

@ti-chi-bot ti-chi-bot added the status/can-merge Indicates a PR has been approved by a committer. label Feb 27, 2023
@ti-chi-bot ti-chi-bot merged commit 8b82aa9 into tikv:master Feb 27, 2023
tsoRootPath = "/tso"
tsoClusterIDPath = "/tso/cluster_id"
// tsoKeyspaceGroupPrimaryElectionPrefix defines the key prefix for keyspace group primary election.
// The entire key is in the format of "/pd/<cluster-id>/microservice/tso/keyspace-group-XXXXX/primary" in which
Contributor

Maybe /pd/<cluster-id>/microservice/tso/<group-id>/primary would look better here.

Contributor Author

Yes, we can certainly remove "keyspace-group-" from the path to make it more compact.

@ti-chi-bot ti-chi-bot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Feb 27, 2023
@ti-chi-bot
Member

@binshi-bing: PR needs rebase.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

Labels
needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. ok-to-test Indicates a PR is ready to be tested. release-note-none Denotes a PR that doesn't merit a release note. status/can-merge Indicates a PR has been approved by a committer. status/LGT2 Indicates that a PR has LGTM 2.
8 participants