Optimizers support for parameter groups #523

Merged

merged 29 commits on Feb 22, 2022

Changes from 25 commits

Commits (29)
56f0ae2
Added overloads for addcdiv
NiklasGustafsson Feb 8, 2022
1ab1e49
Started migrating optimizer implementations to managed code.
NiklasGustafsson Feb 8, 2022
c5bfe57
Migrated Adam and AdamW implementations to managed code.
NiklasGustafsson Feb 8, 2022
875124b
Migrated the Adagrad optimizer to managed code.
NiklasGustafsson Feb 8, 2022
35cde78
The new optimizer construction APIs take named_parameters instead of …
NiklasGustafsson Feb 8, 2022
c7fe487
Migrating F# examples to new APIs.
NiklasGustafsson Feb 8, 2022
956c3ea
Adding is_complex()
NiklasGustafsson Feb 8, 2022
d051e4d
Started support for parameter groups.
NiklasGustafsson Feb 9, 2022
e25832c
WIP -- adding parameter groups support to optimizers.
NiklasGustafsson Feb 10, 2022
9895206
Further optimizer work.
NiklasGustafsson Feb 13, 2022
3407919
Manual merge.
NiklasGustafsson Feb 14, 2022
e348e8e
Moved more optimizers to support parameter groups.
NiklasGustafsson Feb 15, 2022
45e196a
Finished converting optimizers to managed code.
NiklasGustafsson Feb 15, 2022
a6c2cdc
Adjusted LR schedulers to handle parameter groups.
NiklasGustafsson Feb 16, 2022
61ae757
Added test for #516
NiklasGustafsson Feb 16, 2022
dec5dc4
Update version number.
NiklasGustafsson Feb 16, 2022
3ccdeb6
Reverted version number.
NiklasGustafsson Feb 16, 2022
091b463
Merge branch 'main' into optimizers
NiklasGustafsson Feb 16, 2022
9d861eb
Adding a couple of minor APIs that were missing.
NiklasGustafsson Feb 16, 2022
275400f
Update version number.
NiklasGustafsson Feb 16, 2022
dc022ee
Added is_leaf and retain_grad functions.
NiklasGustafsson Feb 17, 2022
376eeb6
Temporary fix.
NiklasGustafsson Feb 17, 2022
2dfb4f1
Adding "named_parameters" version for all optimizer factories.
NiklasGustafsson Feb 18, 2022
452de01
Merge branch 'main' into optimizers
NiklasGustafsson Feb 18, 2022
07f4657
Updates to release notes and developer guide.
NiklasGustafsson Feb 18, 2022
e0718d5
Initial round of PR responses.
NiklasGustafsson Feb 19, 2022
ada3ef3
More unit tests.
NiklasGustafsson Feb 19, 2022
ee43d9c
Manual merge.
NiklasGustafsson Feb 22, 2022
9aabb84
Updated version number and release notes.
NiklasGustafsson Feb 22, 2022
4 changes: 4 additions & 0 deletions DEVGUIDE.md
@@ -231,3 +231,7 @@ version of PyTorch then quite a lot of careful work needs to be done.

10. Remember to delete all massive artifacts from Azure DevOps and reset this `BuildLibTorchPackages` in [azure-pipelines.yml](azure-pipelines.yml)


## Building with Visual Studio

For builds to work properly with Visual Studio 2019 or 2022, you must start VS from the 'x64 Native Tools Command Prompt for VS 2022' (or 2019) so that the full build environment is set up correctly. Starting VS from the desktop or taskbar will not work properly.
2 changes: 2 additions & 0 deletions README.md
@@ -1,5 +1,7 @@
[![Build Status](https://dotnet.visualstudio.com/TorchSharp/_apis/build/status/dotnet.TorchSharp?branchName=main)](https://dotnet.visualstudio.com/TorchSharp/_build/latest?definitionId=174&branchName=main)

Please check the [Release Notes](RELEASENOTES.md) file for news on what's been updated in each new release.

__TorchSharp is now in the .NET Foundation!__

If you are using TorchSharp from NuGet, you should be using a version >= 0.95.1 of TorchSharp, and >= 1.10.0.1 of the libtorch-xxx redistributable packages. We recommend using one of the 'bundled' packages: TorchSharp-cpu, TorchSharp-cuda-windows, or TorchSharp-cuda-linux. They will pull in the right libtorch backends.
17 changes: 15 additions & 2 deletions RELEASENOTES.md
@@ -4,9 +4,22 @@ Releases, starting with 9/2/2021, are listed with the most recent release at the

## NuGet Version 0.96.1

__API Changes:__

__NOTE__: This release contains breaking changes.<br/>

The APIs that create optimizers now accept either 'parameters()' or 'named_parameters()'.<br/>
Support for parameter groups in most optimizers.<br/>
Support for parameter groups in LR schedulers.<br/>
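
To illustrate the new surface, here is a minimal C# sketch of constructing an optimizer from parameter groups. The `SGD.ParamGroup` constructor arguments and the `optim.SGD` overload are inferred from the F# example changed later in this PR; treat this as a sketch rather than the exact API.

```csharp
using System;
using TorchSharp.Modules;
using static TorchSharp.torch;

// Sketch only: two parameter groups with different hyper-parameters.
var model = nn.Linear(10, 10);

var optimizer = optim.SGD(new SGD.ParamGroup[] {
    new SGD.ParamGroup(model.parameters(), momentum: 1.0, dampening: 0.5),
    new SGD.ParamGroup(model.parameters(), momentum: 1.5, dampening: 0.1)
}, 0.01);

// The learning rate now lives on the groups rather than on the optimizer itself.
foreach (var group in optimizer.ParamGroups)
    Console.WriteLine(group.LearningRate);
```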

__Fixed Bugs:__

#495 Add support for OptimizerParamGroup<br/>
#509 Tensor.conj() not implemented<br/>
#510 Module.Load throws Mismatched state_dict sizes exception on BatchNorm1d<br/>
#515 what's reason for making register_module internal?<br/>
#516 AdamW bug on v0.96.0<br/>
#521 Can't set Tensor slice using indexing<br/>

## NuGet Version 0.96.0

@@ -20,9 +33,9 @@ Lower-cased names: Module.Train --> Module.train and Module.Eval --> Module.eval

__Fixed Bugs:__

#500 BatchNorm1d throws exception during eval with batch size of 1<br/>
#499 Setting Linear.weight is not reflected in 'parameters()'<br/>
#496 Wrong output shape of torch.nn.Conv2d with 2d stride overload<br/>
#499 Setting Linear.weight is not reflected in 'parameters()'<br/>
#500 BatchNorm1d throws exception during eval with batch size of 1<br/>

## NuGet Version 0.95.4

2 changes: 1 addition & 1 deletion build/BranchInfo.props
@@ -2,7 +2,7 @@
<PropertyGroup>
<MajorVersion>0</MajorVersion>
<MinorVersion>96</MinorVersion>
<PatchVersion>0</PatchVersion>
<PatchVersion>1</PatchVersion>
</PropertyGroup>

</Project>
3 changes: 2 additions & 1 deletion src/Examples/SequenceToSequence.cs
@@ -88,7 +88,8 @@ internal static void Main(string[] args)
var val_loss = evaluate(valid_data, model, loss, bptt, ntokens, optimizer);
sw.Stop();

Console.WriteLine($"\nEnd of epoch: {epoch} | lr: {optimizer.LearningRate:0.00} | time: {sw.Elapsed.TotalSeconds:0.0}s | loss: {val_loss:0.00}\n");
var pgFirst = optimizer.ParamGroups.First();
Console.WriteLine($"\nEnd of epoch: {epoch} | lr: {pgFirst.LearningRate:0.00} | time: {sw.Elapsed.TotalSeconds:0.0}s | loss: {val_loss:0.00}\n");
scheduler.step();
}

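
The change above replaces the old `optimizer.LearningRate` read with a per-group read via `ParamGroups`. The sketch below extends the same idea to writing the rate; it assumes `ParamGroup.LearningRate` has a setter, which is how the LR schedulers adjusted in this PR appear to drive per-group rates.

```csharp
using TorchSharp.Modules;

internal static class LrHelpers
{
    // Sketch: manual per-group learning-rate decay. Assumes ParamGroup.LearningRate
    // is settable; the exact shipped surface may differ.
    public static void DecayLearningRates(SGD optimizer, double factor = 0.95)
    {
        foreach (var group in optimizer.ParamGroups)
            group.LearningRate *= factor;
    }
}
```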
4 changes: 3 additions & 1 deletion src/Examples/TextClassification.cs
@@ -80,7 +80,9 @@ internal static void Main(string[] args)

sw.Stop();

Console.WriteLine($"\nEnd of epoch: {epoch} | lr: {optimizer.LearningRate:0.0000} | time: {sw.Elapsed.TotalSeconds:0.0}s\n");
var pgFirst = optimizer.ParamGroups.First();

Console.WriteLine($"\nEnd of epoch: {epoch} | lr: {pgFirst.LearningRate:0.00} | time: {sw.Elapsed.TotalSeconds:0.0}s\n");
scheduler.step();
}
}
16 changes: 14 additions & 2 deletions src/FSharp.Examples/SequenceToSequence.fs
@@ -8,6 +8,7 @@ open System.Diagnostics
open System.Collections.Generic

open TorchSharp
open TorchSharp.Modules
open type TorchSharp.torch.nn
open type TorchSharp.torch.optim

@@ -244,7 +245,17 @@

use model = new TransformerModel(ntokens, device)
let lr = 2.50
let optimizer = SGD(model.parameters(), lr)

let pgs = [|
SGD.ParamGroup(Parameters = model.parameters(), Options = SGD.Options(momentum = 1.0, dampening = 0.5));
SGD.ParamGroup(model.parameters(), momentum = 1.5, dampening = 0.1)
|]

let optimizer = SGD([|
SGD.ParamGroup(model.parameters(), momentum = 1.0, dampening = 0.5);
SGD.ParamGroup(model.parameters(), momentum = 1.5, dampening = 0.1)
|], lr)

let scheduler = lr_scheduler.StepLR(optimizer, 1, 0.95, last_epoch=15)

let totalTime = Stopwatch()
@@ -260,7 +271,8 @@ let run epochs =
let val_loss = evaluate model valid_data ntokens
sw.Stop()

let lrStr = optimizer.LearningRate.ToString("0.00")
let pgFirst = optimizer.ParamGroups.First()
let lrStr = pgFirst.LearningRate.ToString("0.00")
let elapsed = sw.Elapsed.TotalSeconds.ToString("0.0")
let lossStr = val_loss.ToString("0.00")

3 changes: 2 additions & 1 deletion src/FSharp.Examples/TextClassification.fs
@@ -151,7 +151,8 @@ let run epochs =

sw.Stop()

let lrStr = optimizer.LearningRate.ToString("0.0000")
let pgFirst = optimizer.ParamGroups.First()
let lrStr = pgFirst.LearningRate.ToString("0.0000")
let tsStr = sw.Elapsed.TotalSeconds.ToString("0.0")
printfn $"\nEnd of epoch: {epoch} | lr: {lrStr} | time: {tsStr}s\n"
scheduler.step() |> ignore
10 changes: 10 additions & 0 deletions src/Native/LibTorchSharp/THSTensor.cpp
@@ -1030,6 +1030,16 @@ int THSTensor_requires_grad(const Tensor tensor)
    CATCH_RETURN(int, 0, tensor->requires_grad());
}

void THSTensor_retain_grad(const Tensor tensor)
{
    CATCH(tensor->retain_grad(););
}

int64_t THSTensor_is_leaf(const Tensor tensor)
{
    CATCH_RETURN(int64_t, 0, tensor->is_leaf(););
}

Tensor THSTensor_reshape(const Tensor tensor, const int64_t* shape, const int length)
{
    CATCH_TENSOR(tensor->reshape(at::ArrayRef<int64_t>(shape, length)));
4 changes: 4 additions & 0 deletions src/Native/LibTorchSharp/THSTensor.h
@@ -586,6 +586,8 @@ EXPORT_API(Tensor) THSTensor_inverse(const Tensor tensor);

EXPORT_API(int) THSTensor_is_contiguous(const Tensor input);

EXPORT_API(int64_t) THSTensor_is_leaf(const Tensor tensor);

EXPORT_API(int) THSTensor_is_sparse(const Tensor tensor);

EXPORT_API(Tensor) THSTensor_isclose(const Tensor tensor, const Tensor other, const double rtol, const double atol, const bool equal_nan);
@@ -988,6 +990,8 @@ EXPORT_API(Tensor) THSTensor_remainder_scalar(const Tensor left, const Scalar right);

EXPORT_API(Tensor) THSTensor_remainder_scalar_(const Tensor left, const Scalar right);

EXPORT_API(void) THSTensor_retain_grad(const Tensor tensor);

EXPORT_API(Tensor) THSTensor_rsqrt(const Tensor tensor);

EXPORT_API(Tensor) THSTensor_rsqrt_(const Tensor tensor);
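
The `THSTensor_retain_grad` and `THSTensor_is_leaf` exports above back the managed-side additions from the commit "Added is_leaf and retain_grad functions". A hedged C# sketch of the intended usage follows; the managed member names and the `requires_grad` parameter spelling are assumptions, not taken from this diff.

```csharp
using System;
using static TorchSharp.torch;

// Sketch only: is_leaf, retain_grad and grad are assumed managed counterparts
// of the native exports; spellings may differ in the shipped API.
var x = ones(3, requires_grad: true);   // a leaf tensor created by the user
var y = x * 2;                          // an intermediate (non-leaf) tensor

Console.WriteLine(x.is_leaf);           // expected: True
Console.WriteLine(y.is_leaf);           // expected: False

y.retain_grad();                        // ask autograd to keep y's gradient
y.sum().backward();

var g = y.grad();                       // populated because retain_grad() was called
```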