Releases: pulp-platform/nemo
v0.0.8
The main feature included in this version is the addition of Statistics-Aware Weight Binning (SAWB, https://arxiv.org/pdf/1807.06964.pdf) as a technique for weight quantization during training.
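For reference, SAWB picks the weight clipping value analytically from the first and second moments of the weight distribution instead of learning it by backpropagation. The sketch below illustrates the idea in plain PyTorch; the coefficients c1 and c2 are precision-dependent constants from the paper, and the values used here are illustrative placeholders, not the ones NEMO uses internally.

```python
import torch

def sawb_clipping_value(w: torch.Tensor, c1: float = 3.2, c2: float = 2.1) -> torch.Tensor:
    """Statistics-Aware Weight Binning (SAWB): choose the weight clipping value
    alpha* analytically from the statistics of the weight tensor.
    c1 and c2 depend on the target bit-width; the defaults here are only
    placeholders for illustration."""
    e_w2 = (w ** 2).mean()    # second moment E[w^2]
    e_absw = w.abs().mean()   # first absolute moment E[|w|]
    return c1 * e_w2.sqrt() - c2 * e_absw
```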
v0.0.7
This release includes minor fixes on top of release v0.0.6, in order to be more conservative and guarantee that a 32-bit accumulator is always sufficient.
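For context, whether a 32-bit accumulator suffices can be bounded with a simple worst-case count of the bits produced by a dot product. The helper below is a back-of-the-envelope check under that assumption, not NEMO's internal logic.

```python
import math

def accumulator_bits(w_bits: int, x_bits: int, n_macs: int) -> int:
    """Worst-case accumulator width for a dot product of n_macs terms with
    w_bits-wide weights and x_bits-wide activations: each product needs up to
    w_bits + x_bits bits, and summing n_macs of them adds ceil(log2(n_macs))
    bits of headroom."""
    return w_bits + x_bits + math.ceil(math.log2(n_macs))

# e.g. a 3x3 convolution over 512 input channels with 8-bit weights/activations
assert accumulator_bits(8, 8, 3 * 3 * 512) <= 32   # 29 bits, fits in int32
```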
v0.0.6
This release fixes several problems related to BatchNormalization, targeting numerical inconsistencies between the QD and ID stages and substantially improving the stability of FQ training. In particular:
- BN parameters are no longer made as big as possible by default; they are now calibrated according to the BN output range. The BN additive parameter is no longer requantized.
- BatchNormalization in the QD/ID stages can be calibrated using statistics collected on the validation set (or with a reasonable default).
- BatchNormalization freezing now also disables gradients, unless explicitly requested not to do so.
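A minimal sketch of what "freezing BatchNormalization and disabling gradients" amounts to in plain PyTorch is shown below; freeze_bn is an illustrative helper, not part of NEMO's API.

```python
import torch.nn as nn

def freeze_bn(model: nn.Module, disable_grad: bool = True) -> None:
    """Put every BatchNorm layer in eval mode (freezing its running statistics)
    and, unless disable_grad=False, also stop gradient updates to its affine
    parameters. A later model.train() call re-enables the running-statistics
    update, so this would have to be reapplied afterwards."""
    for m in model.modules():
        if isinstance(m, (nn.BatchNorm1d, nn.BatchNorm2d)):
            m.eval()
            if disable_grad and m.affine:
                m.weight.requires_grad_(False)
                m.bias.requires_grad_(False)
```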
v0.0.5
v0.0.5 introduces new equalization strategies and switches from floor-based to round-based quantization for weights. Note that this breaks compatibility with networks that were fine-tuned with quantization in previous versions; compatibility can be restored by subtracting eps/2 from all weights.
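The eps/2 correction follows from the identity round(w/eps) = floor((w + eps/2)/eps) (up to round-half-to-even ties). The snippet below is a small illustration of this relationship, not NEMO code.

```python
import torch

def quantize_floor(w: torch.Tensor, eps: float) -> torch.Tensor:
    # pre-v0.0.5 behaviour: floor-based weight quantization
    return torch.floor(w / eps) * eps

def quantize_round(w: torch.Tensor, eps: float) -> torch.Tensor:
    # v0.0.5 behaviour: round-based weight quantization
    return torch.round(w / eps) * eps

# Subtracting eps/2 from weights fine-tuned under the old floor scheme makes
# the new round scheme reproduce the old quantization levels.
w, eps = torch.tensor([0.013, -0.270, 0.401]), 2.0 ** -6
assert torch.equal(quantize_round(w - eps / 2, eps), quantize_floor(w, eps))
```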
v0.0.4
This release focuses on numerical equivalence between the FQ, QD, and ID stages.
- FQ and QD stages: fix weight hardening so that it correctly represents the fact that clipping parameters (alpha in particular) will also have to be quantized.
- QD stage: fix the way quantized activations are computed; they now behave exactly like integer activations, except that the numerical format is not converted to int64, as this conversion would incur a very significant performance hit.
- FQ and QD stages: add support for input biasing and bias removal.
- FQ stage: introduce alpha and beta quantization in both inference and training. For beta, this is mostly cosmetic; for alpha, it is a substantial difference, because this parameter will have to be represented as an integer when deploying to the QD and ID stages.
- QD stage: prune useless BN parameters after Linear weight hardening (see the sketch after this list). BatchNormalization parameter quantization is easier when the parameters do not have outliers, as quantization is performed per tensor. Such outliers are common when the previous Linear layer's weights are small along the same channel, a condition that often results in them being always 0 after hardening. To ease BN quantization, we introduce a transformation that prunes BN parameters when they follow a channel with zeroed weights (i.e., one that is effectively unused). This is active by default when switching to the QD stage.
- various other minor fixes
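The BN pruning described above can be pictured with the following plain-PyTorch sketch; prune_dead_bn_channels is an illustrative helper, not NEMO's actual transformation.

```python
import torch
import torch.nn as nn

def prune_dead_bn_channels(conv: nn.Conv2d, bn: nn.BatchNorm2d) -> None:
    """Zero out the BN parameters of channels whose (hardened) convolution
    weights are entirely zero: such channels never carry a signal, yet their
    gamma/beta can be large outliers that hurt per-tensor BN quantization."""
    with torch.no_grad():
        # one flag per output channel: True if all weights of that channel are 0
        dead = conv.weight.abs().flatten(1).sum(dim=1) == 0
        bn.weight[dead] = 0.0        # multiplicative parameter (gamma)
        bn.bias[dead] = 0.0          # additive parameter (beta)
        bn.running_mean[dead] = 0.0
        bn.running_var[dead] = 1.0
```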
v0.0.3
- Automatically derive the DFQ equalization dict; moreover, also consider the activation alpha in DFQ equalization. This is useful when alpha is an "algorithmic" clipping derived, e.g., from a ReLU6, as opposed to a clipping imposed only post-calibration, as in the case of a ReLU.
- Also consider ConstantPad2d in bias removal
- Add convenience net.qd_stage and net.id_stage functions: these represent a more user-friendly way to switch between the FQ -> QD -> ID internal representations of the network (a usage sketch follows this list).
- Support BatchNorm1d in the QD and ID stages (via PACT_*BatchNorm2d): the PACT_*BatchNorm2d classes have not (yet) been renamed to reflect the fact that they are actually PACT_*BatchNormNd's now!
- Fix shape of reshape_before/after for PACT_Linear layers
- Add dropout_to_identity transform and a related flag in quantize_pact: this is useful when deploying networks that include Dropout layers, which are not useful at inference time and cause issues in the DeployGraph. Note that Dropout removal has to be done in the FullPrecision stage, i.e., before creating the deploy graph: this commit adds a flag to quantize_pact to do exactly that (it should probably not be active when doing retraining).
- Bump version to 0.0.3
- Fix mnist_test to make it easier to debug
- Work around a regression in mnist_test for equalize_dfq
- Other minor fixes
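A minimal usage sketch of the new stage-switching convenience calls is given below. The toy network and the eps_in keyword are assumptions made for illustration; quantize_pact, qd_stage and id_stage are the entry points named in the notes above, but the exact signatures may differ from what is shown.

```python
import torch
import torch.nn as nn
import nemo

# toy full-precision network, for illustration only
net = nn.Sequential(
    nn.Conv2d(1, 8, 3), nn.BatchNorm2d(8), nn.ReLU(),
    nn.Flatten(), nn.Linear(8 * 26 * 26, 10),
)

# full-precision -> FQ (fake-quantized) representation
net = nemo.transform.quantize_pact(net, dummy_input=torch.randn(1, 1, 28, 28))

# FQ -> QD -> ID, using the convenience functions added in this release
# (eps_in is assumed to describe the input quantum, e.g. 8-bit images)
net.qd_stage(eps_in=1.0 / 255)
net.id_stage()
```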
v0.0.2
v0.0.1
First release of NEMO. Includes:
- support for PyTorch 1.3 and 1.4
- per-layer quantization using a PACT-like strategy
- support for fake-quantized fine-tuning
- support for deployment to an integer version
- example of post-training quantization