-
Notifications
You must be signed in to change notification settings - Fork 499
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
HDDS-11593. Improve container scanner metrics #7335
base: master
Are you sure you want to change the base?
Conversation
@sodonnel @errose28 @sumitagrawl , could you help review this pr? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for taking this up @jianghuazhu. Can you modify testScannerMetrics
in the existing background data and metadata scanner tests to test that metrics are reset after each iteration? You can do this by adding a new container to be scanned by calling setupMockContainer
in between the first and second run, which should cause a difference in the metrics between runs.
@Metric("total time that the container scanned has been running, in milliseconds.") | ||
private MutableCounterLong totalRunTimes; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Metric("total time that the container scanned has been running, in milliseconds.") | |
private MutableCounterLong totalRunTimes; | |
@Metric("total time that the container scanner has been running, in milliseconds.") | |
private MutableCounterLong totalRunTime; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can rename the getters and setters too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see how totalRunTimes
might be used to imply that this is the sum of times of all the runs, but I think totalRunTime
still conveys this since it is the "total".
@@ -128,6 +128,7 @@ private static void performOnDemandScan(Container<?> container) { | |||
return; | |||
} | |||
|
|||
long startTime = System.nanoTime(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's use Instant.now().toEpochMillis()
to be consistent with the rest of the code. I think millis
will be precise enough.
scanner.runIteration(); | ||
assertTrue(scanner.getMetrics().getTotalRunTimes() > 0); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can make this a little more robust by modifying one of the container mocks to have a small sleep during its scan. The mocked containers are directly accessible from this test, so you could make it so that when one of the container's scanData
methods is called it will sleep for 500ms. Then this test would check that the total run time is > 500ms. We can then run the scan again and check that it goes up to > 1 second. These are just example numbers, we could probably use lower values.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same for the metadata test.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, can we add this to the existing metrics test instead? Currently this is the only metric tested outside of that method.
Thanks @errose28 for the comment and review. |
What changes were proposed in this pull request?
Update some metrics for container scanner.
What is the link to the Apache JIRA
https://issues.apache.org/jira/browse/HDDS-11593
How was this patch tested?
ci: https://github.com/jianghuazhu/ozone/actions/runs/11418787159
datanode jmx: