I tried running the BERT QA sample on a Jetson Orin Nano with JetPack 6.1.
I used BERT Base, because the BERT Large build gets killed while building the engine (possibly due to running out of memory). The BERT Base engine builds successfully:
```
[10/23/2024-13:27:53] [TRT] [I] [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +7, GPU +67, now: CPU 2160, GPU 6001 (MiB)
[10/23/2024-13:27:53] [TRT] [I] Local timing cache in use. Profiling results in this builder pass will not be stored.
[10/23/2024-13:28:39] [TRT] [I] Detected 3 inputs and 1 output network tensors.
[10/23/2024-13:28:42] [TRT] [I] Total Host Persistent Memory: 316288
[10/23/2024-13:28:42] [TRT] [I] Total Device Persistent Memory: 110592
[10/23/2024-13:28:42] [TRT] [I] Total Scratch Memory: 0
[10/23/2024-13:28:42] [TRT] [I] [BlockAssignment] Started assigning block shifts. This will take 164 steps to complete.
[10/23/2024-13:28:43] [TRT] [I] [BlockAssignment] Algorithm ShiftNTopDown took 3.28999ms to assign 5 blocks to 164 nodes requiring 1378304 bytes.
[10/23/2024-13:28:43] [TRT] [I] Total Activation Memory: 1378304
[10/23/2024-13:28:43] [TRT] [I] Total Weights Memory: 170059792
[10/23/2024-13:28:43] [TRT] [I] [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +0, GPU -1, now: CPU 2372, GPU 6707 (MiB)
[10/23/2024-13:28:43] [TRT] [I] Engine generation completed in 51.1302 seconds.
[10/23/2024-13:28:43] [TRT] [I] [MemUsageStats] Peak memory usage of TRT CPU/GPU memory allocators: CPU 4 MiB, GPU 384 MiB
[10/23/2024-13:28:43] [TRT] [I] [MemUsageStats] Peak memory usage during Engine building and serialization: CPU: 3087 MiB
[10/23/2024-13:28:43] [TRT] [I] build engine in 52.969 Sec
[10/23/2024-13:28:44] [TRT] [I] Saving Engine to engines/bert_base_128.engine
[10/23/2024-13:28:44] [TRT] [I] Done.
```
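Since the BERT Large build appears to be killed under memory pressure, one way to confirm this is to watch available memory from a second terminal while the engine builds. A minimal sketch (my own helper, not part of the demo; Linux-only, reads `/proc/meminfo`; the sample count and interval are arbitrary):

```python
import time

def available_mib():
    """Parse MemAvailable from /proc/meminfo and return it in MiB."""
    with open("/proc/meminfo") as f:
        for line in f:
            if line.startswith("MemAvailable:"):
                return int(line.split()[1]) // 1024  # /proc value is in kB
    raise RuntimeError("MemAvailable not found in /proc/meminfo")

if __name__ == "__main__":
    # Run this while the engine is building; if the number approaches
    # zero just before the build process dies, the OOM killer is the
    # likely culprit.
    for _ in range(3):
        print(f"available: {available_mib()} MiB")
        time.sleep(1)
```

If it is OOM, `dmesg` usually records an "Out of memory: Killed process ..." line as confirmation, and adding swap is a common workaround on the Orin Nano.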
Then I ran inference.py with the same sample passage given in the examples:

```
python3 inference.py -e engines/bert_base_128.engine -p "TensorRT is a high performance deep learning inference platform that delivers low latency and high throughput for apps such as recommenders, speech and image/video on NVIDIA GPUs. It includes parsers to import models, and plugins to support novel ops and layers before applying optimizations for inference. Today NVIDIA is open-sourcing parsers and plugins in TensorRT so that the deep learning community can customize and extend these components to take advantage of powerful TensorRT optimizations for your apps." -q "What is TensorRT?" -v models/fine-tuned/bert_tf_ckpt_base_qa_squad2_amp_128_v19.03.1/vocab.txt
```
It throws a segmentation fault:

```
[10/23/2024-13:30:07] [TRT] [I] Loaded engine size: 208 MiB
[10/23/2024-13:30:08] [TRT] [I] [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +8, GPU +70, now: CPU 317, GPU 4590 (MiB)
[10/23/2024-13:30:08] [TRT] [I] [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +7, GPU +64, now: CPU 109, GPU 4379 (MiB)
[10/23/2024-13:30:08] [TRT] [I] [MemUsageChange] TensorRT-managed allocation in IExecutionContext creation: CPU +0, GPU +1, now: CPU 0, GPU 163 (MiB)
Passage: TensorRT is a high performance deep learning inference platform that delivers low latency and high throughput for apps such as recommenders, speech and image/video on NVIDIA GPUs. It includes parsers to import models, and plugins to support novel ops and layers before applying optimizations for inference. Today NVIDIA is open-sourcing parsers and plugins in TensorRT so that the deep learning community can customize and extend these components to take advantage of powerful TensorRT optimizations for your apps.
Question: What is TensorRT?
Segmentation fault (core dumped)
```
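One way to narrow down where the crash happens is to run the script with Python's faulthandler enabled (`python3 -X faulthandler inference.py ...`), which prints the Python-level traceback when the process receives SIGSEGV. A small self-contained demonstration of the mechanism, using a deliberate null-pointer read via ctypes as a stand-in for the real crash:

```python
import subprocess
import sys

# Child script: enable faulthandler, then trigger a real SIGSEGV by
# reading from address 0 through ctypes.
CHILD = """
import faulthandler, ctypes
faulthandler.enable()   # dump the Python traceback on fatal signals
ctypes.string_at(0)     # NULL dereference -> segmentation fault
"""

proc = subprocess.run(
    [sys.executable, "-c", CHILD],
    capture_output=True,
    text=True,
)
# faulthandler writes "Fatal Python error: Segmentation fault" plus the
# current thread's stack to stderr before the process dies.
print(proc.stderr)
```

The dumped stack shows the last Python frame before the native crash (here `ctypes.string_at`); for the BERT demo it would point at the TensorRT call that faults, which is useful information to attach to a report like this.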
krishnarajk changed the title to "BertQA sample throws segmentation fault (TensorRT 10.3) when running on GPU Jetson Orin Nano" on Oct 23, 2024.
- I followed the model overview: https://github.com/NVIDIA/TensorRT/tree/release/10.3/demo/BERT#model-overview
- I don't use the OSS container; I installed these directly on the device.

Please help me out here.
Environment
TensorRT Version: 10.3
NVIDIA GPU: Ampere (Jetson Orin Nano)
NVIDIA Driver Version: JetPack 6.1
CUDA Version: 12.6
CUDNN Version:
Operating System: Ubuntu 22.04
Python Version (if applicable): 3.10