HDDS-9192. Update Ratis to 3.0.0. #5205
Conversation
Thanks @szetszwo for working on this. There are some unit test failures. |
@guohao-rosicky , you are right. The failures are due to RATIS-1677. I believe that RATIS-1871 has reduced the number of failures. Let me check the failing tests. |
@guohao-rosicky , it seems the |
Hi @szetszwo , I'll take a look and try to fix that part. |
If we are just testing, would this be better done on a fork or branch until Ratis 3.0 is released? We are planning the Ozone 1.4.0 release and cannot release that with a snapshot Ratis version. It is probably safer to wait until Ratis 3.0 is officially released rather than bring master into an unreleasable state blocked on the Ratis release. cc @symious |
Converting to draft. This looks like it will burn significant CI time and is better debugged on a fork until it is green. |
@errose28 Thanks for the information. I will try to test this snapshot version on Ozone first. |
@errose28 , sure, running the test in szetszwo#5
If we are not in a hurry, let's wait for Ratis 3.0.0 before releasing Ozone 1.4.0. We should start releasing Ratis 3.0.0 once it can pass the Ozone test. |
@symious , thanks for helping out. It can pass almost all the tests; see szetszwo#5. A few tests are failing with NullPointerException in GrpcLogAppender.resetClient. Will fix them in https://issues.apache.org/jira/browse/RATIS-1876 . |
Hi @szetszwo, the PR for RATIS-1876 (https://issues.apache.org/jira/browse/RATIS-1876) has been merged to Ratis master. After updating this PR, we still need to test whether it passes CI. Could we create a separate branch for HDDS-9192 and handle the work needed for the Ratis 3.0.0 upgrade there? |
Hi @szetszwo, if you find out which parts need to change for the Ratis 3.0.0 upgrade, let me know so that I can share some of the work. |
@guohao-rosicky , it seems all the tests are passing except for TestOMRatisSnapshots; see https://issues.apache.org/jira/browse/HDDS-8876 . Let me sync the branch with master and then run it again. |
Hi @szetszwo, I see that you have made the changes needed for compatibility with Ratis 3.0.0. Should we go ahead and open this PR? |
@guohao-rosicky , let me update my branch and then open this pr. |
@guohao-rosicky , updated the branch. On second thought, we should wait for the Ratis 3.0.0 release before reopening this. |
Hi @szetszwo, I saw that Ratis 3.0.0 will include the RATIS-1569 (https://issues.apache.org/jira/browse/RATIS-1569) patch, which might introduce incompatibility between clients and servers running different versions. May I know whether this only affects clients and servers that use the DataStream API (write pipeline v2), or will the Async API (write pipeline v1) also be affected? In other words, can an Async API client with a lower version (e.g. 2.4.1) communicate with an Async API server running Ratis 3.0.0? |
@ivandika3 , only the DataStream API (write pipeline v2) is affected. The Async API will not be affected. A 2.4.1 client can use the Async API to communicate with a server running 3.0.0. Thanks for checking it! |
@szetszwo Thanks for the clarification. |
@ChenSammi please check CI in fork before marking ready for review. No need to run the same tests for the PR if they are failing. |
Hi @szetszwo , thanks for working on this. It looks like it is still in draft state. Do we know how much work remains to make it ready? |
@ChenSammi , @adoroszlai , the tests were passing some time back. We were just waiting for the official Ratis 3.0.0 release. It seems that some tests are failing now. Let me take a look. |
@szetszwo We'd like to include this PR in the Ozone 1.4.0 release. We hope to hear good news from the Ratis side soon. |
@symious , sure, we should include this for Ozone 1.4.0. Let me fix the test failures and then start rolling a Ratis 3.0.0 release. |
This is the JIRA filed for the test failures: https://issues.apache.org/jira/browse/RATIS-1902 |
@szetszwo Thanks for the update. I have tested locally with the changes in RATIS-1902.
The 4 cases above pass, but the following one still has some issues. Could you help to confirm?
|
Thanks @symious for the test. Is this the part of the code used for the test? |
@guohao-rosicky , I tested with HDDS-9192c, and the results seem to be the same. |
@symious , thanks for checking. Let me sync the branch and check.
|
|
@adoroszlai , yes, it looks the same as HDDS-8876 -- it also timed out on my machine. |
Thanks @szetszwo for the patch, LGTM.
@guohao-rosicky would you like to take a look? |
@adoroszlai , @smengcl , thanks a lot for reviewing this! I will wait for another day before merging this, to see if anyone else would like to take a look. |
+1 LGTM
@guohao-rosicky , thanks for reviewing this! |
What changes were proposed in this pull request?
This is to update Ratis to the latest 3.0.0 version, although it is currently unreleased. We should start testing it with the 3.0.0 snapshots.
What is the link to the Apache JIRA
HDDS-9192
How was this patch tested?
Using existing tests.