diff --git a/.github/workflows/pytest.yml b/.github/workflows/pytest.yml
index ffd8173..01aaff7 100644
--- a/.github/workflows/pytest.yml
+++ b/.github/workflows/pytest.yml
@@ -19,7 +19,7 @@ jobs:
       # You can test your matrix by printing the current Python version
       - name: Install dependencies
         run: |
-          python -m pip install --upgrade pip wheel packaging
+          python -m pip install --upgrade pip
           pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cpu
           pip install -e .
       - name: Test with pytest
diff --git a/README.md b/README.md
index 182b305..284d4b6 100644
--- a/README.md
+++ b/README.md
@@ -4,8 +4,8 @@
 ![pytest](https://github.com/aleximmer/laplace/actions/workflows/pytest.yml/badge.svg)
 ![lint](https://github.com/aleximmer/laplace/actions/workflows/lint-ruff.yml/badge.svg)
 ![format](https://github.com/aleximmer/laplace/actions/workflows/format-ruff.yml/badge.svg)
-
+
 The laplace package facilitates the application of Laplace approximations for entire neural networks, subnetworks of neural networks, or just their last layer.
 The package enables posterior approximations, marginal-likelihood estimation, and various posterior predictive computations.
@@ -49,30 +49,31 @@ The [code](https://github.com/runame/laplace-redux) to reproduce the experiments

 ## Setup

-For full compatibility, install this package in a fresh virtual env.
-We assume Python >= 3.9 since lower versions are [(soon to be) deprecated](https://devguide.python.org/versions/).
-PyTorch version 2.0 and up is also required for full compatibility.
+> [!IMPORTANT]
+> We assume Python >= 3.9 since lower versions are [(soon to be) deprecated](https://devguide.python.org/versions/).
+> PyTorch version 2.0 and up is also required for full compatibility.
+
 To install laplace with `pip`, run the following:

 ```bash
-pip install --upgrade pip wheel packaging
-pip install git+https://github.com/aleximmer/laplace.git@0.2
+pip install laplace-torch
 ```

-> [!CAUTION]
-> Unfortunately, we lost our PyPI account and so running `pip install laplace-torch`
-> only installs the previous version (0.1)!
-
-For development purposes, clone the repository and then install:
+For development purposes, e.g., if you would like to make contributions,
+clone the repository and then install:

 ```bash
 # first install the build system:
 pip install --upgrade pip wheel packaging
-# then install the develop
+# then install the development version in editable mode:
 pip install -e ".[all]"
 ```

+> [!NOTE]
+> See the [contributing guidelines](#contributing).
+> We're looking forward to your contributions!
+
 ## Example usage

 ### Simple usage
@@ -112,9 +113,9 @@ la = Laplace(model, "classification", hessian_structure="diag")
 la.fit(train_loader)

 la.optimize_prior_precision(
-    method="gridsearch",
-    pred_type="glm",
-    link_approx="probit",
+    method="gridsearch",
+    pred_type="glm",
+    link_approx="probit",
     val_loader=val_loader
 )
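For orientation, here is the `optimize_prior_precision` call touched by the hunk above in a complete, runnable form. This is a minimal sketch: the toy model, dummy data, and loaders are illustrative stand-ins and not part of the README.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

from laplace import Laplace

# Illustrative stand-ins for a trained classifier and its data loaders.
X, y = torch.randn(128, 20), torch.randint(0, 2, (128,))
train_loader = DataLoader(TensorDataset(X, y), batch_size=32)
val_loader = DataLoader(TensorDataset(X, y), batch_size=32)
model = nn.Sequential(nn.Linear(20, 50), nn.ReLU(), nn.Linear(50, 2))

# Fit a diagonal-Hessian Laplace approximation, then tune the prior
# precision by grid search on the validation loader, as in the README snippet.
la = Laplace(model, "classification", hessian_structure="diag")
la.fit(train_loader)
la.optimize_prior_precision(
    method="gridsearch",
    pred_type="glm",
    link_approx="probit",
    val_loader=val_loader,
)

# Posterior-predictive probabilities for new inputs.
probs = la(torch.randn(4, 20), pred_type="glm", link_approx="probit")
```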
@@ -291,19 +292,18 @@ cases. Each method has pros and cons, please see
 [this discussion](https://github.com/aleximmer/Laplace/issues/217#issuecomment-2278311460)
 for details. In summary

-* Disable-grad: General method to perform Laplace on specific types of
+- Disable-grad: General method to perform Laplace on specific types of
   layer/parameter, e.g. in an LLM with LoRA. Can be used to emulate
   `LLLaplace` as well. Always use `subset_of_weights='all'` for this method.
-  * subnet selection by disabling grads is more efficient than
-    `SubnetLaplace` since it avoids calculating full Jacobians first
-  * disabling grads can only be performed on `Parameter` level and not for
-    individual weights, so this doesn't cover all cases that `SubnetLaplace`
-    offers such as `Largest*SubnetMask` or `RandomSubnetMask`
-* `LLLaplace`: last-layer specific code with improved performance (#145)
-* `SubnetLaplace`: more fine-grained partitioning such as
+  - subnet selection by disabling grads is more efficient than
+    `SubnetLaplace` since it avoids calculating full Jacobians first
+  - disabling grads can only be performed on `Parameter` level and not for
+    individual weights, so this doesn't cover all cases that `SubnetLaplace`
+    offers such as `Largest*SubnetMask` or `RandomSubnetMask`
+- `LLLaplace`: last-layer specific code with improved performance (#145)
+- `SubnetLaplace`: more fine-grained partitioning such as
   `LargestMagnitudeSubnetMask`
-
 ### Serialization

 As with plain `torch`, we support two ways to serialize data.
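The disable-grad route described in the list above can be sketched as follows. The two-layer toy model and loader are hypothetical stand-ins; freezing all but the last layer is what makes this emulate `LLLaplace`.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

from laplace import Laplace

model = nn.Sequential(nn.Linear(20, 50), nn.ReLU(), nn.Linear(50, 2))
train_loader = DataLoader(
    TensorDataset(torch.randn(64, 20), torch.randint(0, 2, (64,))), batch_size=16
)

# Freeze everything, then re-enable gradients only for the parameters that
# Laplace should treat probabilistically (here: the last layer).
for p in model.parameters():
    p.requires_grad_(False)
for p in model[-1].parameters():
    p.requires_grad_(True)

# Per the list above, always combine disable-grad with subset_of_weights="all".
la = Laplace(
    model, "classification", subset_of_weights="all", hessian_structure="diag"
)
la.fit(train_loader)
```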
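And a rough sketch of the two serialization routes this section introduces, continuing from the fitted object above. The `state_dict()`/`load_state_dict()` round trip and the file names are assumptions here, mirroring plain `torch`; only the `torch.load(..., map_location="cpu")` pattern is taken from the README.

```python
import torch

# Way 1: save only the Laplace state and restore it into a freshly
# constructed object (assumed to mirror torch's state_dict protocol).
torch.save(la.state_dict(), "la_state.bin")
la2 = Laplace(
    model, "classification", subset_of_weights="all", hessian_structure="diag"
)
la2.load_state_dict(torch.load("la_state.bin", map_location="cpu"))

# Way 2: serialize the entire object, as with any other torch object.
torch.save(la, "la_full.bin")
la3 = torch.load("la_full.bin", map_location="cpu")
```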
@@ -368,7 +368,7 @@ torch.load(..., map_location="cpu")

 The laplace package consists of two main components:

-1. The subclasses of [`laplace.BaseLaplace`](https://github.com/AlexImmer/Laplace/blob/main/laplace/baselaplace.py) that implement different sparsity structures: different subsets of weights (`'all'`, `'subnetwork'` and `'last_layer'`) and different structures of the Hessian approximation (`'full'`, `'kron'`, `'lowrank'`, `'diag'` and `'gp'`). This results in _ten_ currently available options: `laplace.FullLaplace`, `laplace.KronLaplace`, `laplace.DiagLaplace`, `laplace.FunctionalLaplace` the corresponding last-layer variations `laplace.FullLLLaplace`, `laplace.KronLLLaplace`, `laplace.DiagLLLaplace` and `laplace.FunctionalLLLaplace` (which are all subclasses of [`laplace.LLLaplace`](https://github.com/AlexImmer/Laplace/blob/main/laplace/lllaplace.py)), [`laplace.SubnetLaplace`](https://github.com/AlexImmer/Laplace/blob/main/laplace/subnetlaplace.py) (which only supports `'full'` and `'diag'` Hessian approximations) and `laplace.LowRankLaplace` (which only supports inference over `'all'` weights). All of these can be conveniently accessed via the [`laplace.Laplace`](https://github.com/AlexImmer/Laplace/blob/main/laplace/laplace.py) function.
+1. The subclasses of [`laplace.BaseLaplace`](https://github.com/AlexImmer/Laplace/blob/main/laplace/baselaplace.py) that implement different sparsity structures: different subsets of weights (`'all'`, `'subnetwork'` and `'last_layer'`) and different structures of the Hessian approximation (`'full'`, `'kron'`, `'lowrank'`, `'diag'` and `'gp'`). This results in _ten_ currently available options: `laplace.FullLaplace`, `laplace.KronLaplace`, `laplace.DiagLaplace`, and `laplace.FunctionalLaplace`; the corresponding last-layer variations `laplace.FullLLLaplace`, `laplace.KronLLLaplace`, `laplace.DiagLLLaplace`, and `laplace.FunctionalLLLaplace` (which are all subclasses of [`laplace.LLLaplace`](https://github.com/AlexImmer/Laplace/blob/main/laplace/lllaplace.py)); [`laplace.SubnetLaplace`](https://github.com/AlexImmer/Laplace/blob/main/laplace/subnetlaplace.py) (which only supports `'full'` and `'diag'` Hessian approximations); and `laplace.LowRankLaplace` (which only supports inference over `'all'` weights). All of these can be conveniently accessed via the [`laplace.Laplace`](https://github.com/AlexImmer/Laplace/blob/main/laplace/laplace.py) function.
 2. The backends in [`laplace.curvature`](https://github.com/AlexImmer/Laplace/blob/main/laplace/curvature/), which provide access to Hessian approximations of the corresponding sparsity structures, for example, the diagonal GGN.
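To make component 1 concrete: the `laplace.Laplace` function resolves the two knobs to one of the subclasses listed above. A minimal sketch, with a toy regression model as a stand-in:

```python
import torch.nn as nn

from laplace import KronLLLaplace, Laplace

model = nn.Sequential(nn.Linear(20, 50), nn.ReLU(), nn.Linear(50, 1))

# subset_of_weights and hessian_structure together pick the subclass,
# e.g. ("last_layer", "kron") resolves to laplace.KronLLLaplace.
la = Laplace(
    model,
    "regression",
    subset_of_weights="last_layer",
    hessian_structure="kron",
)
assert isinstance(la, KronLLLaplace)
```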