-
Notifications
You must be signed in to change notification settings - Fork 55
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Adding section 9.1: Gated Recurrent Units (GRU) #92
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall LSTM some minor comment
" Y = H.dot(W_hq).add(b_q);\n", | ||
" outputs.add(Y);\n", | ||
" }\n", | ||
" return new Pair(outputs.size() > 1 ? NDArrays.concat(outputs) : outputs.get(0), new NDList(H));\n", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Doesn't it work on outputs.size() == 1?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nope, if I am not mistaken that's the reason I had done that. It threw an error otherwise.
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"public static NDArray normal(Shape shape, Device device) {\n", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
actually DJL will detect system GPU and use it if applicable. So I think you can remove the device parameter here. If you check TensorFlow implementation, they don't have this param as it do it in DJL way as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ohh I just saw randomNormal uses the NDArray device by default if you don't specify. Cool, I'll remove it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll be doing this in another PR as it covers multiple sections
Adding section 9.1: Gated Recurrent Units (GRU), based on original http://d2l.ai/chapter_recurrent-modern/gru.html