Adding Quantizing with Accuracy Control using NNCF notebook (#19587)
sgolebiewski-intel authored Sep 4, 2023
1 parent d396dc0 commit e51cac6
Showing 94 changed files with 917 additions and 95 deletions.
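
Most of the per-file diffs shown below follow the same pattern: in each notebook ``.rst`` file, the ``.. _top:`` anchor that previously sat directly under the page title is moved down so that it immediately precedes the **Table of contents** heading. A minimal reStructuredText sketch of that pattern (the title and intro line are generic placeholders, not taken from any specific notebook):

Before:

    Notebook Title
    ==============

    .. _top:

    Introductory text about the notebook.

    **Table of contents**:

After:

    Notebook Title
    ==============

    Introductory text about the notebook.

    .. _top:

    **Table of contents**: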
6 changes: 5 additions & 1 deletion docs/notebooks/001-hello-world-with-output.rst
@@ -1,7 +1,7 @@
Hello Image Classification
==========================

.. _top:


This basic introduction to OpenVINO™ shows how to do inference with an
image classification model.
@@ -15,6 +15,10 @@ created, refer to the `TensorFlow to
OpenVINO <101-tensorflow-classification-to-openvino-with-output.html>`__
tutorial.



.. _top:

**Table of contents**:

- `Imports <#imports>`__
6 changes: 5 additions & 1 deletion docs/notebooks/003-hello-segmentation-with-output.rst
@@ -1,7 +1,7 @@
Hello Image Segmentation
========================

.. _top:


A very basic introduction to using segmentation models with OpenVINO™.

@@ -12,6 +12,10 @@ Zoo <https://github.com/openvinotoolkit/open_model_zoo/>`__ is used.
ADAS stands for Advanced Driver Assistance Services. The model
recognizes four classes: background, road, curb and mark.



.. _top:

**Table of contents**:

- `Imports <#imports>`__
6 changes: 5 additions & 1 deletion docs/notebooks/004-hello-detection-with-output.rst
@@ -1,7 +1,7 @@
Hello Object Detection
======================

.. _top:


A very basic introduction to using object detection models with
OpenVINO™.
@@ -18,6 +18,10 @@ corner, ``(x_max, y_max)`` are the coordinates of the bottom right
bounding box corner and ``conf`` is the confidence for the predicted
class.



.. _top:

**Table of contents**:

- `Imports <#imports>`__
@@ -1,7 +1,7 @@
Convert a TensorFlow Model to OpenVINO™
=======================================

.. _top:


| This short tutorial shows how to convert a TensorFlow
`MobileNetV3 <https://docs.openvino.ai/2023.1/omz_models_model_mobilenet_v3_small_1_0_224_tf.html>`__
@@ -13,7 +13,11 @@ Convert a TensorFlow Model to OpenVINO™
Runtime <https://docs.openvino.ai/2023.1/openvino_docs_OV_UG_OV_Runtime_User_Guide.html>`__
and do inference with a sample image.
| **Table of contents**:


| .. _top:
**Table of contents**:

- `Imports <#imports>`__
- `Settings <#settings>`__
6 changes: 5 additions & 1 deletion docs/notebooks/102-pytorch-onnx-to-openvino-with-output.rst
@@ -1,7 +1,7 @@
Convert a PyTorch Model to ONNX and OpenVINO™ IR
================================================

.. _top:


This tutorial demonstrates step-by-step instructions on how to do
inference on a PyTorch semantic segmentation model, using OpenVINO
@@ -35,6 +35,10 @@ plant, sheep, sofa, train, tv monitor**
More information about the model is available in the `torchvision
documentation <https://pytorch.org/vision/main/models/lraspp.html>`__



.. _top:

**Table of contents**:

- `Preparation <#preparation>`__
6 changes: 5 additions & 1 deletion docs/notebooks/102-pytorch-to-openvino-with-output.rst
@@ -1,7 +1,7 @@
Convert a PyTorch Model to OpenVINO™ IR
=======================================

.. _top:


This tutorial demonstrates step-by-step instructions on how to do
inference on a PyTorch classification model using OpenVINO Runtime.
@@ -31,6 +31,10 @@ but elevated to the design space level. The RegNet design space provides
simple and fast networks that work well across a wide range of flop
regimes.



.. _top:

**Table of contents**:

- `Prerequisites <#prerequisites>`__
@@ -1,7 +1,7 @@
Convert a PaddlePaddle Model to OpenVINO™ IR
============================================

.. _top:


This notebook shows how to convert a MobileNetV3 model from
`PaddleHub <https://github.com/PaddlePaddle/PaddleHub>`__, pre-trained
@@ -16,6 +16,10 @@ IR model.
Source of the
`model <https://www.paddlepaddle.org.cn/hubdetail?name=mobilenet_v3_large_imagenet_ssld&en_category=ImageClassification>`__.



.. _top:

**Table of contents**:

- `Preparation <#preparation>`__
4 changes: 3 additions & 1 deletion docs/notebooks/104-model-tools-with-output.rst
@@ -1,13 +1,15 @@
Working with Open Model Zoo Models
==================================

.. _top:


This tutorial shows how to download a model from `Open Model
Zoo <https://github.com/openvinotoolkit/open_model_zoo>`__, convert it
to OpenVINO™ IR format, show information about the model, and benchmark
the model.

.. _top:

**Table of contents**:

- `OpenVINO and Open Model Zoo Tools <#openvino-and-open-model-zoo-tools>`__
Expand Down
6 changes: 5 additions & 1 deletion docs/notebooks/105-language-quantize-bert-with-output.rst
@@ -1,7 +1,7 @@
Quantize NLP models with Post-Training Quantization ​in NNCF
============================================================

.. _top:


This tutorial demonstrates how to apply ``INT8`` quantization to the
Natural Language Processing model known as
@@ -24,6 +24,10 @@ and datasets. It consists of the following steps:
- Compare the performance of the original, converted and quantized
models.



.. _top:

**Table of contents**:

- `Imports <#imports>`__
6 changes: 4 additions & 2 deletions docs/notebooks/106-auto-device-with-output.rst
@@ -1,8 +1,6 @@
Automatic Device Selection with OpenVINO™
=========================================

.. _top:

The `Auto
device <https://docs.openvino.ai/2023.1/openvino_docs_OV_UG_supported_plugins_AUTO.html>`__
(or AUTO in short) selects the most suitable device for inference by
@@ -32,6 +30,10 @@ first inference.

auto



.. _top:

**Table of contents**:

- `Import modules and create Core <#import-modules-and-create-core>`__
@@ -1,8 +1,6 @@
Quantize Speech Recognition Models using NNCF PTQ API
=====================================================

.. _top:

This tutorial demonstrates how to use the NNCF (Neural Network
Compression Framework) 8-bit quantization in post-training mode (without
the fine-tuning pipeline) to optimize the speech recognition model,
@@ -21,6 +19,10 @@ steps:
- Compare performance of the original and quantized models.
- Compare Accuracy of the Original and Quantized Models.



.. _top:

**Table of contents**:

- `Download and prepare model <#download-and-prepare-model>`__
2 changes: 2 additions & 0 deletions docs/notebooks/108-gpu-device-with-output.rst
@@ -1,6 +1,8 @@
Working with GPUs in OpenVINO™
==============================



.. _top:

**Table of contents**:
6 changes: 4 additions & 2 deletions docs/notebooks/109-latency-tricks-with-output.rst
@@ -1,8 +1,6 @@
Performance tricks in OpenVINO for latency mode
===============================================

.. _top:

The goal of this notebook is to provide a step-by-step tutorial for
improving performance for inferencing in a latency mode. Low latency is
especially desired in real-time applications when the results are needed
@@ -51,6 +49,10 @@ optimize performance on OpenVINO IR files in
A similar notebook focused on the throughput mode is available
`here <109-throughput-tricks-with-output.html>`__.



.. _top:

**Table of contents**:

- `Data <#data>`__
6 changes: 5 additions & 1 deletion docs/notebooks/109-throughput-tricks-with-output.rst
@@ -1,7 +1,7 @@
Performance tricks in OpenVINO for throughput mode
==================================================

.. _top:


The goal of this notebook is to provide a step-by-step tutorial for
improving performance for inferencing in a throughput mode. High
@@ -46,6 +46,10 @@ optimize performance on OpenVINO IR files in
A similar notebook focused on the latency mode is available
`here <109-latency-tricks-with-output.html>`__.



.. _top:

**Table of contents**:

- `Data <#data>`__
6 changes: 4 additions & 2 deletions docs/notebooks/110-ct-scan-live-inference-with-output.rst
@@ -1,8 +1,6 @@
Live Inference and Benchmark CT-scan Data with OpenVINO™
========================================================

.. _top:

Kidney Segmentation with PyTorch Lightning and OpenVINO™ - Part 4
-----------------------------------------------------------------

@@ -30,6 +28,10 @@ notebook.
For demonstration purposes, this tutorial will download one converted CT
scan to use for inference.



.. _top:

**Table of contents**:

- `Imports <#imports>`__
@@ -1,8 +1,6 @@
Quantize a Segmentation Model and Show Live Inference
=====================================================

.. _top:

Kidney Segmentation with PyTorch Lightning and OpenVINO™ - Part 3
-----------------------------------------------------------------

@@ -55,6 +53,10 @@ demonstration purposes, this tutorial will download one converted CT
scan and use that scan for quantization and inference. For production
purposes, use a representative dataset for quantizing the model.



.. _top:

**Table of contents**:

- `Imports <#imports>`__
@@ -1,8 +1,6 @@
Migrate quantization from POT API to NNCF API
=============================================

.. _top:

This tutorial demonstrates how to migrate quantization pipeline written
using the OpenVINO `Post-Training Optimization Tool (POT) <https://docs.openvino.ai/2023.1/pot_introduction.html>`__ to
`NNCF Post-Training Quantization API <https://docs.openvino.ai/nightly/basic_quantization_flow.html>`__.
@@ -23,6 +21,9 @@ The tutorial consists from the following parts:
7. Compare performance FP32 and INT8 models



.. _top:

**Table of contents**:

- `Preparation <#preparation>`__
@@ -1,8 +1,6 @@
Post-Training Quantization of PyTorch models with NNCF
======================================================

.. _top:

The goal of this tutorial is to demonstrate how to use the NNCF (Neural
Network Compression Framework) 8-bit quantization in post-training mode
(without the fine-tuning pipeline) to optimize a PyTorch model for the
@@ -27,6 +25,9 @@ quantization, not demanding the fine-tuning of the model.
notebook.



.. _top:

**Table of contents**:

- `Preparations <#preparations>`__
@@ -1,7 +1,7 @@
Quantization of Image Classification Models
===========================================

.. _top:


This tutorial demonstrates how to apply ``INT8`` quantization to Image
Classification model using
@@ -21,6 +21,8 @@ This tutorial consists of the following steps:
- Compare performance of the original and quantized models.
- Compare results on one picture.

.. _top:

**Table of contents**:

- `Prepare the Model <#prepare-the-model>`__
4 changes: 3 additions & 1 deletion docs/notebooks/115-async-api-with-output.rst
@@ -1,7 +1,7 @@
Asynchronous Inference with OpenVINO™
=====================================

.. _top:


This notebook demonstrates how to use the `Async
API <https://docs.openvino.ai/nightly/openvino_docs_deployment_optimization_guide_common.html>`__
@@ -14,6 +14,8 @@ in parallel (for example, populating inputs or scheduling other
requests) rather than wait for the current inference to complete first.


.. _top:

**Table of contents**:

- `Imports <#imports>`__
4 changes: 3 additions & 1 deletion docs/notebooks/116-sparsity-optimization-with-output.rst
@@ -1,7 +1,7 @@
Accelerate Inference of Sparse Transformer Models with OpenVINO™ and 4th Gen Intel® Xeon® Scalable Processors
=============================================================================================================

.. _top:


This tutorial demonstrates how to improve performance of sparse
Transformer models with `OpenVINO <https://docs.openvino.ai/>`__ on 4th
@@ -21,6 +21,8 @@ consists of the following steps:
integration with Hugging Face Optimum.
- Compare sparse 8-bit vs. dense 8-bit inference performance.

.. _top:

**Table of contents**:

- `Prerequisites <#prerequisites>`__