
BertQA sample throws segmentation fault (TensorRT 10.3) when running on GPU Jetson Orin Nano #4220

krishnarajk opened this issue Oct 23, 2024 · 0 comments


krishnarajk commented Oct 23, 2024

Description

I tried running the BertQA sample on a Jetson Orin Nano with JetPack 6.1.
I used BERT Base, because BERT Large gets killed while building the engine (possibly because of a memory issue).

```
[10/23/2024-13:27:53] [TRT] [I] [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +7, GPU +67, now: CPU 2160, GPU 6001 (MiB)
[10/23/2024-13:27:53] [TRT] [I] Local timing cache in use. Profiling results in this builder pass will not be stored.
[10/23/2024-13:28:39] [TRT] [I] Detected 3 inputs and 1 output network tensors.
[10/23/2024-13:28:42] [TRT] [I] Total Host Persistent Memory: 316288
[10/23/2024-13:28:42] [TRT] [I] Total Device Persistent Memory: 110592
[10/23/2024-13:28:42] [TRT] [I] Total Scratch Memory: 0
[10/23/2024-13:28:42] [TRT] [I] [BlockAssignment] Started assigning block shifts. This will take 164 steps to complete.
[10/23/2024-13:28:43] [TRT] [I] [BlockAssignment] Algorithm ShiftNTopDown took 3.28999ms to assign 5 blocks to 164 nodes requiring 1378304 bytes.
[10/23/2024-13:28:43] [TRT] [I] Total Activation Memory: 1378304
[10/23/2024-13:28:43] [TRT] [I] Total Weights Memory: 170059792
[10/23/2024-13:28:43] [TRT] [I] [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +0, GPU -1, now: CPU 2372, GPU 6707 (MiB)
[10/23/2024-13:28:43] [TRT] [I] Engine generation completed in 51.1302 seconds.
[10/23/2024-13:28:43] [TRT] [I] [MemUsageStats] Peak memory usage of TRT CPU/GPU memory allocators: CPU 4 MiB, GPU 384 MiB
[10/23/2024-13:28:43] [TRT] [I] [MemUsageStats] Peak memory usage during Engine building and serialization: CPU: 3087 MiB
[10/23/2024-13:28:43] [TRT] [I] build engine in 52.969 Sec
[10/23/2024-13:28:44] [TRT] [I] Saving Engine to engines/bert_base_128.engine
[10/23/2024-13:28:44] [TRT] [I] Done.
```

Then I used inference.py with the same sample given in the examples:

```
python3 inference.py -e engines/bert_base_128.engine -p "TensorRT is a high performance deep learning inference platform that delivers low latency and high throughput for apps such as recommenders, speech and image/video on NVIDIA GPUs. It includes parsers to import models, and plugins to support novel ops and layers before applying optimizations for inference. Today NVIDIA is open-sourcing parsers and plugins in TensorRT so that the deep learning community can customize and extend these components to take advantage of powerful TensorRT optimizations for your apps." -q "What is TensorRT?" -v models/fine-tuned/bert_tf_ckpt_base_qa_squad2_amp_128_v19.03.1/vocab.txt
```

It throws a segmentation fault:
```
[10/23/2024-13:30:07] [TRT] [I] Loaded engine size: 208 MiB
[10/23/2024-13:30:08] [TRT] [I] [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +8, GPU +70, now: CPU 317, GPU 4590 (MiB)
[10/23/2024-13:30:08] [TRT] [I] [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +7, GPU +64, now: CPU 109, GPU 4379 (MiB)
[10/23/2024-13:30:08] [TRT] [I] [MemUsageChange] TensorRT-managed allocation in IExecutionContext creation: CPU +0, GPU +1, now: CPU 0, GPU 163 (MiB)

Passage: TensorRT is a high performance deep learning inference platform that delivers low latency and high throughput for apps such as recommenders, speech and image/video on NVIDIA GPUs. It includes parsers to import models, and plugins to support novel ops and layers before applying optimizations for inference. Today NVIDIA is open-sourcing parsers and plugins in TensorRT so that the deep learning community can customize and extend these components to take advantage of powerful TensorRT optimizations for your apps.

Question: What is TensorRT?
Segmentation fault (core dumped)
```
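To narrow down where the crash happens, a minimal deserialize-only check might help: it loads the engine and creates an execution context without running any inference, so a crash here would point at engine loading rather than the sample's I/O code. This is only a sketch assuming the TensorRT 10.x Python API; `check_engine` is a hypothetical helper name, not part of the sample.

```python
# Sketch: deserialize the engine and create an execution context only,
# to see whether the segfault occurs before any inference work.
import os


def check_engine(path="engines/bert_base_128.engine"):
    try:
        import tensorrt as trt  # requires the TensorRT Python bindings on the Jetson
    except ImportError:
        return "tensorrt not installed"
    if not os.path.exists(path):
        return "engine file missing"
    logger = trt.Logger(trt.Logger.VERBOSE)  # verbose logging to catch the last step before a crash
    runtime = trt.Runtime(logger)
    with open(path, "rb") as f:
        engine = runtime.deserialize_cuda_engine(f.read())
    if engine is None:
        return "deserialization failed"
    context = engine.create_execution_context()
    return "ok" if context is not None else "context creation failed"


if __name__ == "__main__":
    print(check_engine())
```

If this script already segfaults, the problem is in engine deserialization on this JetPack/TensorRT combination rather than in inference.py itself.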
I followed the steps in https://github.com/NVIDIA/TensorRT/tree/release/10.3/demo/BERT#model-overview
I don't use the OSS container; I installed the dependencies directly on the device:
(Image: screenshot of installed versions attached in the original issue)

Please help me with this.

Environment

TensorRT Version: 10.3

NVIDIA GPU: Ampere (Jetson Orin Nano)

NVIDIA Driver Version: JetPack 6.1

CUDA Version: 12.6

CUDNN Version:

Operating System: Ubuntu 22.04

Python Version (if applicable): 3.10
