Make single GPU benchmarking 5x more efficient #2390

Merged 7 commits on Oct 24, 2024

mzweilin
Contributor

@mzweilin mzweilin commented Oct 22, 2024

📝 Description

This PR makes the benchmarking module 5x more efficient by using SerialRunner() instead of ParallelRunner(n_jobs=1) when only one GPU device is available. The two runners are functionally equivalent in that case, but SerialRunner() is far more efficient. What makes ParallelRunner() inefficient still needs to be investigated in a follow-up.
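The selection logic described above can be sketched in a few lines. This is a minimal, self-contained illustration and not anomalib's implementation: `run_jobs`, `job_fn`, and `n_devices` are hypothetical names, and a standard-library thread pool stands in for the parallel runner.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, List, TypeVar

A = TypeVar("A")
R = TypeVar("R")


def run_jobs(job_fn: Callable[[A], R], job_args: Iterable[A], n_devices: int) -> List[R]:
    """Run jobs serially on a single device, otherwise in a worker pool.

    Hypothetical helper: it only illustrates the "skip the pool when there
    is one device" idea and does not reproduce anomalib's runner classes.
    """
    args = list(job_args)
    if n_devices <= 1:
        # One device: a plain loop avoids all pool and scheduling overhead.
        return [job_fn(arg) for arg in args]
    # Several devices: fan the jobs out, one worker per device.
    with ThreadPoolExecutor(max_workers=n_devices) as pool:
        return list(pool.map(job_fn, args))
```

Both branches return results in input order, so callers do not need to know which path was taken.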

`example_benchmark.yaml`:
# sample script to show grid search for two categories
accelerator:
  - cuda
benchmark:
  seed: 42
  model:
    class_path:
      grid: [Padim]
  data:
    class_path: MVTec
    init_args:
      category:
        grid:
          - bottle
          - capsule
      image_size: [256, 256]
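To make the size of this benchmark concrete: one model and two categories expand to two jobs. The sketch below shows a generic Cartesian-product expansion of `grid` entries. It is an assumption about how such a grid is typically expanded, not anomalib's config parser, and `expand_grid` is a hypothetical helper.

```python
from itertools import product
from typing import Any, Dict, List


def expand_grid(grid: Dict[str, List[Any]]) -> List[Dict[str, Any]]:
    """Return one dict per combination of the listed grid values."""
    keys = list(grid)
    return [dict(zip(keys, combo)) for combo in product(*(grid[k] for k in keys))]


# Values taken from the config above; the flat key names are illustrative.
jobs = expand_grid({"model": ["Padim"], "category": ["bottle", "capsule"]})
for job in jobs:
    print(job)
```

With the values from the config this yields two combinations, Padim on bottle and Padim on capsule.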

Before

$ time anomalib benchmark --config example_benchmark.yaml
real    2m13.158s
user    12m39.772s
sys     1m0.259s

After

$ time anomalib benchmark --config example_benchmark.yaml
real    1m13.563s
user    4m29.377s
sys     0m25.782s

1.8x speedup in wall-clock time: 2m13s -> 1m13s
with 2.8x less CPU time (user + sys): 13m40s -> 4m55s
which together (1.8 x 2.8 ≈ 5) means the fix makes benchmarking about 5x more efficient.
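The figures above can be re-derived from the two `time` outputs. The sketch below assumes the 5x figure is the product of the wall-clock speedup and the CPU-time reduction, which is how the summary reads; `to_seconds` is a helper written for this example.

```python
import re


def to_seconds(stamp: str) -> float:
    """Convert a `time`-style stamp such as '2m13.158s' to seconds."""
    match = re.fullmatch(r"(\d+)m([\d.]+)s", stamp)
    if match is None:
        raise ValueError(f"unrecognized time stamp: {stamp!r}")
    return int(match.group(1)) * 60 + float(match.group(2))


# Timings copied from the Before / After runs above.
before = {"real": "2m13.158s", "user": "12m39.772s", "sys": "1m0.259s"}
after = {"real": "1m13.563s", "user": "4m29.377s", "sys": "0m25.782s"}

speedup = to_seconds(before["real"]) / to_seconds(after["real"])
cpu_before = to_seconds(before["user"]) + to_seconds(before["sys"])
cpu_after = to_seconds(after["user"]) + to_seconds(after["sys"])
cpu_ratio = cpu_before / cpu_after
efficiency_gain = speedup * cpu_ratio  # assumed definition of "efficiency"

print(f"speedup: {speedup:.2f}x, CPU reduction: {cpu_ratio:.2f}x, combined: {efficiency_gain:.2f}x")
```

Rounded to one decimal place, the three ratios come out at 1.8, 2.8, and 5.0, matching the summary.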

✨ Changes

Select what type of change your PR is:

  • 🐞 Bug fix (non-breaking change which fixes an issue)
  • 🔨 Refactor (non-breaking change which refactors the code base)
  • 🚀 New feature (non-breaking change which adds functionality)
  • 💥 Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • 📚 Documentation update
  • 🔒 Security update

✅ Checklist

Before you submit your pull request, please make sure you have completed the following steps:

  • 📋 I have summarized my changes in the CHANGELOG and followed the guidelines for my type of change (skip for minor changes, documentation updates, and test enhancements).
  • 📚 I have made the necessary updates to the documentation (if applicable).
  • 🧪 I have written tests that support my changes and prove that my fix is effective or my feature works (if applicable).

For more information about code review checklists, see the Code Review Checklist.

@mzweilin
Contributor Author

mzweilin commented Oct 22, 2024

Is there any reason to set the logging level to DEBUG in benchmarking? The screen output is too verbose in my opinion.

root_logger.setLevel(logging.DEBUG)

We also get a little extra speedup if we leave the logging level alone.

diff --git a/src/anomalib/utils/logging.py b/src/anomalib/utils/logging.py
index 21f7994f..d73ef440 100644
--- a/src/anomalib/utils/logging.py
+++ b/src/anomalib/utils/logging.py
@@ -74,10 +74,8 @@ def redirect_logs(log_file: str) -> None:
     """
     Path(log_file).parent.mkdir(exist_ok=True, parents=True)
     logger_file_handler = logging.FileHandler(log_file)
-    root_logger = logging.getLogger()
-    root_logger.setLevel(logging.DEBUG)
     format_string = "%(asctime)s - %(name)s - %(levelname)s - %(message)s"
-    logging.basicConfig(format=format_string, level=logging.DEBUG, handlers=[logger_file_handler])
+    logging.basicConfig(format=format_string, handlers=[logger_file_handler])
     logging.captureWarnings(capture=True)
     # remove other handlers from all loggers
     loggers = [logging.getLogger(name) for name in logging.root.manager.loggerDict]
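The effect of the level on output volume can be shown in isolation. The sketch below uses a standalone logger rather than the real root logger so that it has no side effects. `emitted_lines` and the messages are made up for illustration, and `logging.WARNING` stands in for the root logger's default level, which is what applies when nothing else raises or lowers it.

```python
import io
import logging
from typing import List


def emitted_lines(level: int) -> List[str]:
    """Log one DEBUG, one INFO and one WARNING record; return what is written."""
    # A directly instantiated Logger is not attached to the root logger,
    # so this does not touch the process-wide logging configuration.
    logger = logging.Logger("benchmark_demo")
    logger.setLevel(level)
    stream = io.StringIO()
    handler = logging.StreamHandler(stream)
    handler.setFormatter(logging.Formatter("%(levelname)s - %(message)s"))
    logger.addHandler(handler)

    logger.debug("per-batch detail")
    logger.info("job started")
    logger.warning("slow dataloader")
    return stream.getvalue().splitlines()


print(emitted_lines(logging.DEBUG))    # all three records are written
print(emitted_lines(logging.WARNING))  # only the warning is written
```

Fewer records formatted and written is the likely source of the small extra speedup, although the timings in this thread do not isolate that cost.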

Setting to logging.DEBUG

$ time anomalib benchmark --config example_benchmark.yaml
real    1m13.563s
user    4m29.377s
sys     0m25.782s

Leaving the logging level alone

$ time anomalib benchmark --config example_benchmark.yaml
real    1m9.581s
user    4m14.989s
sys     0m24.995s

@ashwinvaidya17 @samet-akcay Shall we include the change in this PR for the sake of efficiency?

@samet-akcay
Contributor

@mzweilin, I agree, we log far too much information, which is confusing most of the time.


codecov bot commented Oct 23, 2024

Codecov Report

Attention: Patch coverage is 83.33333% with 1 line in your changes missing coverage. Please review.

Project coverage is 81.68%. Comparing base (db4c285) to head (a496d2e).
Report is 3 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| src/anomalib/pipelines/benchmark/pipeline.py | 80.00% | 1 Missing ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #2390      +/-   ##
==========================================
+ Coverage   81.66%   81.68%   +0.02%     
==========================================
  Files         283      283              
  Lines       12682    12687       +5     
==========================================
+ Hits        10357    10364       +7     
+ Misses       2325     2323       -2     


@mzweilin
Contributor Author

> @mzweilin, I agree, we log far too much information, which is confusing most of the time.

OK. I made the change in aaffba8

@samet-akcay samet-akcay enabled auto-merge (squash) October 24, 2024 10:35
@samet-akcay samet-akcay merged commit 31952db into openvinotoolkit:main Oct 24, 2024
7 checks passed