Add profiler doc #722

Merged
merged 1 commit into from
Mar 5, 2021
62 changes: 62 additions & 0 deletions docs/development/profiler.md
@@ -0,0 +1,62 @@
## Profiler (Experimental)

DJL currently provides experimental profilers for developers who want to
investigate operator execution performance and memory consumption.
The profilers come directly from the underlying engines; DJL only exposes them.
As a result, each engine has its own API and produces its own output format.
This feature is still a work in progress.
In the future, we may design a unified API and a unified output format.

### MXNet

When the following environment variable is set, MXNet generates a `profile.json` file after the code finishes executing.

```
export MXNET_PROFILER_AUTOSTART=1
```

You can view it in a browser using a trace viewer such as `chrome://tracing`. Here is a snapshot of sample output.
![img](https://raw.githubusercontent.com/dmlc/web-data/master/mxnet/tutorials/python/profiler/profiler_output_chrome.png)

### PyTorch

DJL integrates the PyTorch C++ profiler API and exposes it through the `JniUtils.startProfile` and `JniUtils.stopProfile(outputFile)` Java APIs.
`JniUtils.startProfile` takes three boolean arguments, in order: `useCuda`, `recordShape`, and `profileMemory`.
`useCuda` indicates whether the profiler times CUDA events using the cudaEvent API.
`recordShape` indicates whether information about input dimensions is collected.
`profileMemory` indicates whether the profiler reports memory usage.
`JniUtils.stopProfile` takes an `outputFile` argument of type `String`.

Wrap the code you want to profile between `JniUtils.startProfile` and `JniUtils.stopProfile`.
Here is an example:

```
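try (ZooModel<Image, Classifications> model = ModelZoo.loadModel(criteria)) {
    try (Predictor<Image, Classifications> predictor = model.newPredictor()) {
        // Dummy 3x224x224 input image; `manager` is assumed to be an existing NDManager
        Image image = ImageFactory.getInstance()
                .fromNDArray(manager.zeros(new Shape(3, 224, 224), DataType.UINT8));

        // Arguments: useCuda = false, recordShape = true, profileMemory = true
        JniUtils.startProfile(false, true, true);
        predictor.predict(image);
        // outputFile is the String passed to stopProfile for the profiler output
        JniUtils.stopProfile(outputFile);
    } catch (TranslateException e) {
        e.printStackTrace();
    }
}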
```
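In the example above, `stopProfile` is skipped if `predict` throws. If you want the stop call to run regardless, a `try`/`finally` wrapper is one option. The following is a minimal sketch that uses only the Java standard library: `ProfileScope`, `Body`, and `profiled` are hypothetical names that are not part of DJL, and the start and stop hooks are stand-ins for `JniUtils.startProfile(...)` and `JniUtils.stopProfile(outputFile)`.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of DJL: pairs a start hook with a stop hook.
class ProfileScope {
    /** A block of work that may throw a checked exception, such as a predict call. */
    interface Body {
        void run() throws Exception;
    }

    /** Runs body between start and stop; stop runs even if body throws. */
    static void profiled(Runnable start, Runnable stop, Body body) throws Exception {
        start.run();
        try {
            body.run();
        } finally {
            stop.run();
        }
    }

    public static void main(String[] args) {
        List<String> events = new ArrayList<>();
        try {
            profiled(
                    () -> events.add("start"), // stand-in for JniUtils.startProfile(false, true, true)
                    () -> events.add("stop"),  // stand-in for JniUtils.stopProfile(outputFile)
                    () -> {
                        throw new IllegalStateException("inference failed");
                    });
        } catch (Exception e) {
            events.add("caught");
        }
        System.out.println(events); // prints [start, stop, caught]
    }
}
```

With DJL on the classpath, the two hooks would presumably become lambdas that call the `JniUtils` methods, and the body would wrap `predictor.predict(image)`.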

The output consists of operator execution records.
Each record contains `name` (operator name), `dur` (time duration), `shape` (input shapes), and `cpu mem` (CPU memory footprint).

```
{
  "name": "aten::empty",
  "ph": "X",
  "ts": 24528.313000,
  "dur": 5.246000,
  "tid": 1,
  "pid": "CPU Functions",
  "shape": [[], [], [], [], [], []],
  "cpu mem": "0 b",
  "args": {}
}
```
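To get a quick overview of where time goes, the records can be summed by operator name. The sketch below is a rough illustration using only the Java standard library. It assumes the output file contains a sequence of records shaped like the one above, with `name` appearing before `dur` inside each record; `ProfileSummary` and `totalDurationByName` are hypothetical names, and a real JSON parser would be more robust than the regular expression used here.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of DJL: sums "dur" per operator "name".
class ProfileSummary {
    // Matches a "name" field followed, within the same record, by a "dur" field.
    // Assumes no braces appear between the two fields, as in the sample record.
    private static final Pattern RECORD = Pattern.compile(
            "\"name\"\\s*:\\s*\"([^\"]*)\"[^{}]*?\"dur\"\\s*:\\s*([0-9.eE+-]+)");

    /** Returns the total duration per operator name, sorted by name. */
    static Map<String, Double> totalDurationByName(String trace) {
        Map<String, Double> totals = new TreeMap<>();
        Matcher m = RECORD.matcher(trace);
        while (m.find()) {
            totals.merge(m.group(1), Double.parseDouble(m.group(2)), Double::sum);
        }
        return totals;
    }

    public static void main(String[] args) {
        // Made-up records for illustration only; they are not real profiler output.
        String sample = "["
                + "{\"name\": \"aten::empty\", \"ph\": \"X\", \"ts\": 1.0, \"dur\": 5.25},"
                + "{\"name\": \"aten::add\", \"ph\": \"X\", \"ts\": 2.0, \"dur\": 1.0},"
                + "{\"name\": \"aten::empty\", \"ph\": \"X\", \"ts\": 3.0, \"dur\": 2.5}"
                + "]";
        totalDurationByName(sample)
                .forEach((name, total) -> System.out.println(name + " " + total));
    }
}
```

In practice, the `trace` argument would be the contents of the file written by `JniUtils.stopProfile`, for example read with `java.nio.file.Files.readString`.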
1 change: 1 addition & 0 deletions docs/mkdocs.yml
@@ -76,6 +76,7 @@ nav:
- 'docs/development/memory_management.md'
- 'docs/development/inference_performance_optimization.md'
- 'docs/development/benchmark_with_djl.md'
- 'docs/development/profiler.md'
- DJL Community:
- 'docs/forums.md'
- 'leaders.md'