Model | Accuracy (INT8) | Accuracy (FP32) | Accuracy Ratio [(INT8-FP32)/FP32] | Performance (INT8) | Performance (FP32) | Performance Ratio [INT8/FP32] |
---|---|---|---|---|---|---|
bert_large_squad_static | 90.78% | 90.87% | -0.11% | 49.08 | 13.48 | 3.64x |
bert_base_mrpc_static | 82.35% | 83.09% | -0.89% | 497.28 | 151.16 | 3.29x |
bert_base_nli_mean_tokens_stsb_static | 89.23% | 89.55% | -0.36% | 546.97 | 151.77 | 3.60x |
bert_base_sparse_mrpc_static | 70.59% | 70.59% | 0.00% | 551.90 | 153.80 | 3.59x |
bert_mini_mrpc_static | 78.19% | 78.68% | -0.62% | 6962.58 | 3252.14 | 2.14x |
bert_mini_sst2_static | 87.16% | 86.93% | 0.26% | 6850.38 | 3218.98 | 2.13x |
distilbert_base_uncased_sst2_static | 90.14% | 90.25% | -0.12% | 1086.13 | 306.45 | 3.54x |
distilbert_base_uncased_mrpc_static | 83.82% | 84.07% | -0.30% | 1091.99 | 303.92 | 3.59x |
distilbert_base_uncased_emotion_static | 93.90% | 94.20% | -0.32% | 1081.35 | 306.33 | 3.53x |
minilm_l6_h384_uncased_sst2_static | 89.33% | 90.14% | -0.90% | 2594.77 | 1083.84 | 2.39x |
roberta_base_mrpc_static | 88.24% | 88.97% | -0.82% | 508.14 | 153.37 | 3.31x |
distilroberta_base_wnli_static | 56.34% | 56.34% | 0.00% | 1097.22 | 315.94 | 3.47x |
paraphrase_xlm_r_multilingual_v1_stsb_static | 86.66% | 87.23% | -0.65% | 552.44 | 153.74 | 3.59x |
finbert_financial_phrasebank_static | 82.57% | 82.80% | -0.28% | 999.94 | 292.55 | 3.42x |
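
The two ratio columns follow directly from the formulas given in the header. As a minimal sketch (the helper names `acc_ratio` and `perf_ratio` are hypothetical, chosen only for illustration), the values in the bert_base_mrpc_static row can be reproduced like this:

```python
def acc_ratio(int8_acc: float, fp32_acc: float) -> float:
    """Relative accuracy change, (INT8 - FP32) / FP32, expressed in percent."""
    return (int8_acc - fp32_acc) / fp32_acc * 100.0


def perf_ratio(int8_perf: float, fp32_perf: float) -> float:
    """Speedup of INT8 over FP32, INT8 / FP32."""
    return int8_perf / fp32_perf


# Inputs taken from the bert_base_mrpc_static row of the table.
print(f"{acc_ratio(82.35, 83.09):.2f}%")     # -0.89%
print(f"{perf_ratio(497.28, 151.16):.2f}x")  # 3.29x
```

Recomputing from the rounded table entries can differ from the published ratio in the last digit (for example in the bert_large_squad_static row), presumably because the published ratios were derived from unrounded measurements.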