TensorRT Inference for HuggingFace Transformers 🤗

This repository demonstrates TensorRT inference with models developed using HuggingFace Transformers.

Currently, this repository supports the following models:

GPT2 (text generation task). The sample supports the following variants of GPT2: gpt2 (117M), gpt2-large (774M)

T5 (translation, premise task). The sample supports the following variants of T5: t5-small (60M), t5-base (220M), t5-large (770M)
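The supported variants can be captured in a small lookup table so that a typo in a variant name is caught before a long export or engine build starts. The sketch below is illustrative only: `SUPPORTED_VARIANTS` and `compare_command` are hypothetical names, not part of this repository. The GPT2 argument list mirrors the `compare` command documented in this README; reusing the same shape for T5 is an assumption.

```python
# Variants and parameter counts as listed in this README.
SUPPORTED_VARIANTS = {
    "GPT2": {"gpt2": "117M", "gpt2-large": "774M"},
    "T5": {"t5-small": "60M", "t5-base": "220M", "t5-large": "770M"},
}


def compare_command(model, variant, working_dir="temp"):
    """Build an argv list for `run.py compare`, rejecting unlisted variants.

    Hypothetical helper: the T5 invocation is assumed to follow the same
    form as the GPT2 command shown in this README.
    """
    variants = SUPPORTED_VARIANTS.get(model)
    if variants is None:
        raise ValueError(f"unsupported model: {model!r}")
    if variant not in variants:
        raise ValueError(
            f"unsupported {model} variant: {variant!r}; "
            f"expected one of {sorted(variants)}"
        )
    return ["python3", "run.py", "compare", model,
            "--variant", variant, "--working-dir", working_dir]


print(" ".join(compare_command("GPT2", "gpt2-large")))
```

The returned list can be passed to `subprocess.run` from the repository root once the setup step has been completed.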

Setup

pip3 install -r requirements.txt

How to run comparison script

python3 run.py compare GPT2 --variant [gpt2 | gpt2-large] --working-dir temp

The above script reports:

script	accuracy	decoder (sec)	encoder (sec)	full (sec)
frameworks	1	0.0292865	0.0174382	0.122532
trt	1	0.00494083	0.0068982	0.0239782
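
To read the table as speedups rather than raw timings, divide each framework runtime by the corresponding TensorRT runtime. The following is a minimal sketch using the sample figures above; the `speedups` helper is a hypothetical name introduced for illustration, and the numbers will differ by GPU, variant, and precision.

```python
# Timings in seconds, copied from the comparison table above.
frameworks = {"decoder": 0.0292865, "encoder": 0.0174382, "full": 0.122532}
trt = {"decoder": 0.00494083, "encoder": 0.0068982, "full": 0.0239782}


def speedups(baseline, optimized):
    """Return baseline/optimized runtime ratios for each shared stage."""
    return {
        stage: baseline[stage] / optimized[stage]
        for stage in baseline
        if stage in optimized
    }


for stage, ratio in speedups(frameworks, trt).items():
    print(f"{stage}: {ratio:.2f}x")
```

For the sample row this works out to roughly 5.9x for the decoder, 2.5x for the encoder, and 5.1x end to end.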

Testing

pytest

Pytest 4.6.x is recommended. Complete the setup step above in your Python environment before running the tests.

How to run functional and performance benchmark

python3 run.py run GPT2 [frameworks | trt] --variant [gpt2 | gpt2-large] --working-dir temp

Expected output:

NetworkCheckpointResult(network_results=[NetworkResult(
input='TensorRT is a Deep Learning compiler used for deep learning.\n',
output_tensor=tensor([   51, 22854, ....], device='cuda:0'),
semantic_output=['TensorRT is a Deep Learning compiler used for deep learning.\n\nThe main goal of the project is to create a tool that can be used to train deep learning algorithms.\n\n'],
median_runtime=[NetworkRuntime(name='gpt2_decoder', runtime=0.002254825085401535), NetworkRuntime(name='full', runtime=0.10705459117889404)],
models=NetworkModels(torch=None, onnx=[NetworkModel(name='gpt2_decoder', fpath='temp/GPT2/GPT2-gpt2-fp16.onnx')],
trt=[NetworkModel(name='gpt2_decoder', fpath='temp/GPT2/GPT2-gpt2-fp16.onnx.engine')]))], accuracy=1.0)
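
If only the printed result is available (for example, from a log file), the median runtimes can be pulled out with a regular expression. This is a sketch under two assumptions: that the printed format matches the sample above, and that the runtimes are in seconds, as in the comparison table. `RUNTIME_RE` and `parse_runtimes` are hypothetical names, not part of the repository.

```python
import re

# Excerpt of the printed result shown above.
sample = (
    "median_runtime=[NetworkRuntime(name='gpt2_decoder', "
    "runtime=0.002254825085401535), "
    "NetworkRuntime(name='full', runtime=0.10705459117889404)],"
)

# Matches NetworkRuntime(name='...', runtime=<float>) as printed above.
RUNTIME_RE = re.compile(
    r"NetworkRuntime\(name='([^']+)',\s*runtime=([0-9.eE+-]+)\)"
)


def parse_runtimes(text):
    """Map each NetworkRuntime name to its runtime (assumed seconds)."""
    return {name: float(value) for name, value in RUNTIME_RE.findall(text)}


runtimes = parse_runtimes(sample)
for name, seconds in runtimes.items():
    print(f"{name}: {seconds * 1000:.2f} ms")
```

When calling the repository code directly from Python, reading the fields of the returned result object would be more robust than parsing its printed form.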
