Fix module initialization with other dtypes and simplify module registration #47
Fixes and improves initializing modules with other parameter types.

## Type of all modules fixed at creation time
To make container classes like `Sequential` work, we currently fix the dtype on module initialization even for parameterless modules like `Softmax`.
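
A minimal self-contained sketch of the idea, with toy types standing in for the real ones (none of these signatures are the actual API):

```scala
// Toy model for illustration only: DType, Tensor, Module, Softmax and
// Sequential are stand-ins, not the library's real definitions.
sealed trait DType
object Float32 extends DType
object Float64 extends DType
type Float32 = Float32.type
type Float64 = Float64.type

final case class Tensor[D <: DType](values: Vector[Double])

trait Module[D <: DType]:
  def apply(input: Tensor[D]): Tensor[D]

// The dtype is a type parameter fixed at construction, even though Softmax
// has no learnable parameters of its own. (dim only mirrors the shape of the
// real constructor; the toy ignores it and normalizes the whole vector.)
class Softmax[D <: DType](val dim: Int) extends Module[D]:
  def apply(input: Tensor[D]): Tensor[D] =
    val exps = input.values.map(math.exp)
    Tensor[D](exps.map(_ / exps.sum))

// Because every module carries its dtype in its type, a homogeneous container
// can require all of its children to agree on one dtype D.
class Sequential[D <: DType](modules: Module[D]*) extends Module[D]:
  def apply(input: Tensor[D]): Tensor[D] =
    modules.foldLeft(input)((acc, m) => m(acc))

// The dtype is committed to when the module is created:
val model = Sequential[Float32](Softmax[Float32](dim = 0))
```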

This is done instead of giving such modules a generic `apply` method.
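
For contrast, this is roughly what the generic alternative looks like in the same toy model (illustrative only):

```scala
// Generic alternative (toy sketch): each call to apply chooses its own D, so
// the module has no single fixed dtype and cannot implement Module[D] for one
// D, which is what the Sequential[D] above requires of its children.
class GenericSoftmax(val dim: Int):
  def apply[D <: DType](input: Tensor[D]): Tensor[D] =
    val exps = input.values.map(math.exp)
    Tensor[D](exps.map(_ / exps.sum))
```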

This is currently inconsistent with tensor creation ops like `torch.ones`, where we use a default parameter for the dtype, meaning it's easier to override at runtime, but we have a fixed default. Both designs have tradeoffs and we need to test what works well in practice. Perhaps we can even find a way to combine both approaches.

If we're able to find a better way, we might be able to change this back in the future.
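
For comparison, here is the default-dtype style of the tensor creation ops transplanted into the same toy model. The real ops use a default parameter for the dtype; the sketch stands in for that with a small overload, and none of these signatures are the actual ones:

```scala
// Toy stand-in for the tensor-creation style: float32 is only a default, and
// the caller can choose a different dtype at each call site.
def ones[D <: DType](size: Int, dtype: D): Tensor[D] =
  // dtype serves only as evidence for D in this sketch
  Tensor[D](Vector.fill(size)(1.0))

def ones(size: Int): Tensor[Float32] =
  ones(size, Float32)

val a = ones(3)           // Tensor[Float32], the fixed default
val b = ones(3, Float64)  // Tensor[Float64], overridden at the call site
```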

## DType conversion of modules
Module type conversion will require rethinking the module design, taking into account the (im)mutability of modules and things like mixed-precision training and perhaps quantization. For now, the type is fixed on module creation.
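
To make the (im)mutability point concrete in the toy model above: because the dtype is part of a module's static type, a conversion cannot mutate a module in place; it has to produce a new module value, and a module with parameters would also have to convert its tensors. A hypothetical conversion (nothing like this is part of this change) could look like:

```scala
// Hypothetical dtype conversion in the toy model: the result is a value of a
// different type, so the original module is left untouched rather than
// converted in place.
extension [D <: DType](m: Softmax[D])
  def toDType[D2 <: DType]: Softmax[D2] = Softmax[D2](m.dim)

val sm32 = Softmax[Float32](dim = 0)
val sm64: Softmax[Float64] = sm32.toDType[Float64]
```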

## Other changes