Make single GPU benchmarking 5x more efficient #2390

mzweilin · 2024-10-22T19:14:39Z

📝 Description

This PR makes the benchmarking module about 5x more efficient by using SerialRunner() instead of ParallelRunner(n_jobs=1) when only one GPU device is available. With a single job the two runners are functionally equivalent, but SerialRunner() is far more efficient. What makes ParallelRunner() inefficient still needs to be investigated in a follow-up.
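The selection rule described above can be sketched as follows. This is a minimal illustration, not the code in this PR: the two classes are hypothetical stand-ins that only mimic the names of the real runners, and `select_runner` is an assumed helper name.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class SerialRunner:
    """Hypothetical stand-in: executes jobs one by one in the current process."""


@dataclass
class ParallelRunner:
    """Hypothetical stand-in: spreads jobs over a pool of n_jobs workers."""

    n_jobs: int


def select_runner(device_count: int) -> SerialRunner | ParallelRunner:
    """Use the serial runner unless there is more than one device to share work across."""
    if device_count <= 1:
        # A pool with a single worker adds overhead without adding parallelism.
        return SerialRunner()
    return ParallelRunner(n_jobs=device_count)


print(select_runner(1))  # serial runner for a single GPU
print(select_runner(4))  # parallel runner with one job per GPU
```

In a real pipeline the device count would presumably come from something like `torch.cuda.device_count()`; it is passed in as a plain integer here so the sketch runs without a GPU.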

Contents of `example_benchmark.yaml`:

# sample script to show grid search for two categories
accelerator:
  - cuda
benchmark:
  seed: 42
  model:
    class_path:
      grid: [Padim]
  data:
    class_path: MVTec
    init_args:
      category:
        grid:
          - bottle
          - capsule
      image_size: [256, 256]
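As a rough illustration of what the `grid` keys in this config imply, the sketch below expands them into one job per combination. It is an assumption about the general idea of a grid search, not the benchmark pipeline's actual implementation; `expand_grid` and the dotted key names are hypothetical.

```python
from itertools import product

# Grid entries copied from the config above, flattened to dotted keys for illustration.
grid = {
    "model.class_path": ["Padim"],
    "data.init_args.category": ["bottle", "capsule"],
}


def expand_grid(grid: dict[str, list[str]]) -> list[dict[str, str]]:
    """Return one settings dict per combination of grid values (Cartesian product)."""
    keys = list(grid)
    return [dict(zip(keys, values)) for values in product(*(grid[key] for key in keys))]


jobs = expand_grid(grid)
for job in jobs:
    print(job)
```

With one model and two categories this yields two benchmark jobs, which is why the single-GPU case amounts to running them back to back.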

Before

$ time anomalib benchmark --config example_benchmark.yaml
real    2m13.158s
user    12m39.772s
sys     1m0.259s

After

$ time anomalib benchmark --config example_benchmark.yaml
real    1m13.563s
user    4m29.377s
sys     0m25.782s

1.8x speedup in wall-clock time (real): 2m13s -> 1m13s
with 2.8x less CPU time (user + sys): 13m40s -> 4m55s
which together (1.8 x 2.8) mean the fix makes it about 5x more efficient.
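The figures above can be reproduced from the two `time` outputs. The sketch below takes "efficiency" to mean the wall-clock speedup multiplied by the CPU-time reduction, which is how the description combines them; `to_seconds` and `cpu_seconds` are helper names introduced only for this example.

```python
def to_seconds(stamp: str) -> float:
    """Convert a `time`-style stamp such as '2m13.158s' to seconds."""
    minutes, seconds = stamp.rstrip("s").split("m")
    return int(minutes) * 60 + float(seconds)


# Timings copied from the Before/After runs above.
before = {"real": "2m13.158s", "user": "12m39.772s", "sys": "1m0.259s"}
after = {"real": "1m13.563s", "user": "4m29.377s", "sys": "0m25.782s"}


def cpu_seconds(timing: dict[str, str]) -> float:
    """Total CPU time is user time plus sys time."""
    return to_seconds(timing["user"]) + to_seconds(timing["sys"])


speedup = to_seconds(before["real"]) / to_seconds(after["real"])
cpu_ratio = cpu_seconds(before) / cpu_seconds(after)
combined = speedup * cpu_ratio

print(f"speedup: {speedup:.1f}x, CPU time: {cpu_ratio:.1f}x less, combined: {combined:.1f}x")
```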

✨ Changes

Select what type of change your PR is:

🐞 Bug fix (non-breaking change which fixes an issue)
🔨 Refactor (non-breaking change which refactors the code base)
🚀 New feature (non-breaking change which adds functionality)
💥 Breaking change (fix or feature that would cause existing functionality to not work as expected)
📚 Documentation update
🔒 Security update

✅ Checklist

Before you submit your pull request, please make sure you have completed the following steps:

📋 I have summarized my changes in the CHANGELOG and followed the guidelines for my type of change (skip for minor changes, documentation updates, and test enhancements).
📚 I have made the necessary updates to the documentation (if applicable).
🧪 I have written tests that support my changes and prove that my fix is effective or my feature works (if applicable).

For more information about code review checklists, see the Code Review Checklist.

Signed-off-by: Weilin Xu <[email protected]>

mzweilin · 2024-10-22T19:47:25Z

Is there any reason to set the logging level to DEBUG in benchmarking? The screen output is too verbose in my opinion.

anomalib/src/anomalib/utils/logging.py

Line 78 in 3a403ae

root_logger.setLevel(logging.DEBUG)

We also get a little extra speedup if we leave the logging level alone.

diff --git a/src/anomalib/utils/logging.py b/src/anomalib/utils/logging.py
index 21f7994f..d73ef440 100644
--- a/src/anomalib/utils/logging.py
+++ b/src/anomalib/utils/logging.py
@@ -74,10 +74,8 @@ def redirect_logs(log_file: str) -> None:
     """
     Path(log_file).parent.mkdir(exist_ok=True, parents=True)
     logger_file_handler = logging.FileHandler(log_file)
-    root_logger = logging.getLogger()
-    root_logger.setLevel(logging.DEBUG)
     format_string = "%(asctime)s - %(name)s - %(levelname)s - %(message)s"
-    logging.basicConfig(format=format_string, level=logging.DEBUG, handlers=[logger_file_handler])
+    logging.basicConfig(format=format_string, handlers=[logger_file_handler])
     logging.captureWarnings(capture=True)
     # remove other handlers from all loggers
     loggers = [logging.getLogger(name) for name in logging.root.manager.loggerDict]

With the level set to logging.DEBUG

$ time anomalib benchmark --config example_benchmark.yaml
real    1m13.563s
user    4m29.377s
sys     0m25.782s

With the level left alone

$ time anomalib benchmark --config example_benchmark.yaml
real    1m9.581s
user    4m14.989s
sys     0m24.995s

@ashwinvaidya17 @samet-akcay Shall we include the change in this PR for the sake of efficiency?
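To illustrate the idea of redirecting logs to a file without changing the logging level, here is a minimal sketch. It attaches the handler with `addHandler` rather than `logging.basicConfig` as in the diff above, and `redirect_logs_keep_level` is a hypothetical name, not a function in anomalib.

```python
import logging
import tempfile
from pathlib import Path


def redirect_logs_keep_level(log_file: str) -> logging.FileHandler:
    """Send root-logger records to log_file without touching the root level."""
    Path(log_file).parent.mkdir(exist_ok=True, parents=True)
    handler = logging.FileHandler(log_file)
    handler.setFormatter(logging.Formatter("%(asctime)s - %(name)s - %(levelname)s - %(message)s"))
    logging.getLogger().addHandler(handler)
    return handler


root = logging.getLogger()
level_before = root.level

with tempfile.TemporaryDirectory() as tmp_dir:
    log_path = Path(tmp_dir) / "logs" / "benchmark.log"
    handler = redirect_logs_keep_level(str(log_path))
    level_after = root.level
    log_file_created = log_path.exists()
    # Detach and close the handler so the temporary directory can be removed cleanly.
    root.removeHandler(handler)
    handler.close()

print(level_before == level_after)
```

Because the level is never set, whatever verbosity the caller configured stays in effect, and DEBUG records are filtered out before any formatting or file I/O unless the caller asked for them.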

samet-akcay · 2024-10-23T04:46:59Z

@mzweilin, I agree, we log far too much information, which is confusing most of the time.

codecov · 2024-10-23T06:03:25Z

Codecov Report

Attention: Patch coverage is 83.33333% with 1 line in your changes missing coverage. Please review.

Project coverage is 81.68%. Comparing base (db4c285) to head (a496d2e).
Report is 3 commits behind head on main.

Files with missing lines	Patch %	Lines
src/anomalib/pipelines/benchmark/pipeline.py	80.00%	1 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #2390      +/-   ##
==========================================
+ Coverage   81.66%   81.68%   +0.02%     
==========================================
  Files         283      283              
  Lines       12682    12687       +5     
==========================================
+ Hits        10357    10364       +7     
+ Misses       2325     2323       -2


Signed-off-by: Weilin Xu <[email protected]>

mzweilin · 2024-10-23T17:21:11Z

> @mzweilin, I agree, we log far too much information, which is confusing most of the time.

OK. I made the change in aaffba8

mzweilin added 3 commits October 22, 2024 12:07

Use SerialRunner if only one CUDA device is available.

d7ce3a0

Signed-off-by: Weilin Xu <[email protected]>

Resolve PLR6201.

d77e972

Signed-off-by: Weilin Xu <[email protected]>

Update CHANGELOG.

1003a30

Signed-off-by: Weilin Xu <[email protected]>

mzweilin requested a review from samet-akcay as a code owner October 22, 2024 19:14

mzweilin force-pushed the gpu_serial_runner branch from 450479a to 1003a30 Compare October 22, 2024 19:33

Keep the same logging level in benchmarking.

aaffba8

Signed-off-by: Weilin Xu <[email protected]>

Merge branch 'main' into gpu_serial_runner

2e5a85f

samet-akcay enabled auto-merge (squash) October 24, 2024 10:35

mzweilin and others added 2 commits October 24, 2024 06:54

Merge branch 'main' into gpu_serial_runner

8a40946

Merge branch 'main' into gpu_serial_runner

a496d2e

samet-akcay approved these changes Oct 24, 2024

samet-akcay merged commit 31952db into openvinotoolkit:main Oct 24, 2024
7 checks passed
