Understanding Trainer.initialize: batch size x number of inputs? #546

Answered by zachgk
hmf asked this question in Q&A

When you call trainer.initialize(shape), the shape you pass is the input shape that your model accepts. The main block receives that shape and uses it to compute the input shape of each of its child blocks, and from those the shapes of all parameters, so that everything can be initialized. The mapping from input shape to number of parameters is therefore part of the block class; you don't have to worry about it.

If your model input is a batch of RGB images of size 28x28, you would give it trainer.initialize(new Shape(batchSize, 3, 28, 28)). You could also give it trainer.initialize(new Shape(1, 3, 28, 28)) because the batch size doesn't affect parameter initialization. Or, if it is a model that accepts two images, then it…
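To make the point about batch size concrete, here is a minimal plain-Java sketch of how parameter shapes can be derived from an input shape. It is not DJL's implementation: the class `ShapePropagationSketch`, its methods, and the layer sizes (128 and 10 units) are hypothetical and chosen only for illustration. It assumes a flatten step followed by fully connected layers, each with a weight of size units x inputFeatures plus a bias of size units.

```java
public class ShapePropagationSketch {

    // Flatten (batch, d1, d2, ...) into (batch, d1 * d2 * ...).
    static long[] flatten(long[] shape) {
        long features = 1;
        for (int i = 1; i < shape.length; i++) {
            features *= shape[i];
        }
        return new long[] {shape[0], features};
    }

    // Hypothetical helper: total parameter count for a stack of fully
    // connected layers applied to the flattened input.
    static long countParameters(long[] inputShape, long... unitsPerLayer) {
        long[] current = flatten(inputShape);
        long total = 0;
        for (long units : unitsPerLayer) {
            long inputFeatures = current[1];
            // Weight (units x inputFeatures) plus bias (units).
            // current[0], the batch dimension, is never used here.
            total += units * inputFeatures + units;
            // The output shape becomes the input shape of the next layer.
            current = new long[] {current[0], units};
        }
        return total;
    }

    public static void main(String[] args) {
        long single = countParameters(new long[] {1, 3, 28, 28}, 128, 10);
        long batched = countParameters(new long[] {32, 3, 28, 28}, 128, 10);
        System.out.println("batch size 1:  " + single);
        System.out.println("batch size 32: " + batched);
        System.out.println("same count: " + (single == batched));
    }
}
```

Because the batch dimension is only carried along and never enters the parameter arithmetic, both calls yield the same count. That is why Shape(1, 3, 28, 28) and Shape(batchSize, 3, 28, 28) are interchangeable for initialization.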

Answer selected by hmf