Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

HDDS-11593. Improve container scanner metrics #7335

Open
wants to merge 5 commits into
base: master
Choose a base branch
from

Conversation

jianghuazhu
Copy link
Contributor

What changes were proposed in this pull request?

Update some metrics for container scanner.

What is the link to the Apache JIRA

https://issues.apache.org/jira/browse/HDDS-11593

How was this patch tested?

ci: https://github.com/jianghuazhu/ozone/actions/runs/11418787159
datanode jmx:
image
image

@jianghuazhu
Copy link
Contributor Author

@sodonnel @errose28 @sumitagrawl , could you help review this pr?
Thank you.

Copy link
Contributor

@errose28 errose28 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for taking this up @jianghuazhu. Can you modify testScannerMetrics in the existing background data and metadata scanner tests to test that metrics are reset after each iteration? You can do this by adding a new container to be scanned by calling setupMockContainer in between the first and second run, which should cause a difference in the metrics between runs.

Comment on lines 46 to 47
@Metric("total time that the container scanned has been running, in milliseconds.")
private MutableCounterLong totalRunTimes;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
@Metric("total time that the container scanned has been running, in milliseconds.")
private MutableCounterLong totalRunTimes;
@Metric("total time that the container scanner has been running, in milliseconds.")
private MutableCounterLong totalRunTime;

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can rename the getters and setters too.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see how totalRunTimes might be used to imply that this is the sum of times of all the runs, but I think totalRunTime still conveys this since it is the "total".

@@ -128,6 +128,7 @@ private static void performOnDemandScan(Container<?> container) {
return;
}

long startTime = System.nanoTime();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's use Instant.now().toEpochMillis() to be consistent with the rest of the code. I think millis will be precise enough.

Comment on lines 109 to 110
scanner.runIteration();
assertTrue(scanner.getMetrics().getTotalRunTimes() > 0);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can make this a little more robust by modifying one of the container mocks to have a small sleep during its scan. The mocked containers are directly accessible from this test, so you could make it so that when one of the container's scanData methods is called it will sleep for 500ms. Then this test would check that the total run time is > 500ms. We can then run the scan again and check that it goes up to > 1 second. These are just example numbers, we could probably use lower values.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same for the metadata test.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, can we add this to the existing metrics test instead? Currently this is the only metric tested outside of that method.

@jianghuazhu
Copy link
Contributor Author

Thanks @errose28 for the comment and review.
I have updated it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants