
Updated training? #37

Open

drzax opened this issue Mar 17, 2016 · 8 comments

Comments

drzax commented Mar 17, 2016

I'm very new to the whole world of neural networks so please forgive any silly questions.

Is there a way to update a network by training it on new text? The undocumented -init_from flag looks like it might do that, but I can't quite be sure.

@AlekzNet

I would rephrase the question as follows: when initializing from a checkpoint, what parameters/data can be changed, and what parameters will/must stay the same?

@jcjohnson (Owner)

When using -init_from the model options (model type, wordvec size, rnn size, rnn layers, dropout, batchnorm) will be ignored and the architecture from the existing checkpoint will be used instead.

In theory you could use a new dataset when training with -init_from, but you would have to make sure that it had the same vocabulary as the original dataset. To support that use case, we could change the preprocessing script to take an input vocabulary as an argument, allowing multiple datasets to be encoded with the same vocabulary.
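For illustration, a minimal sketch of what that preprocessing change might look like, assuming the vocabulary JSON stores an idx_to_token mapping (the --input-vocab flag, the .npy output, and the argument names here are hypothetical, not part of the current preprocess.py):

```python
import argparse
import json

import numpy as np

# Hypothetical sketch: encode a new text file with the vocabulary saved by a
# previous preprocessing run, so that a checkpoint trained on the old data can
# keep training on the new data via -init_from.
parser = argparse.ArgumentParser()
parser.add_argument('--input-txt', required=True)
parser.add_argument('--input-vocab', required=True)  # JSON from the original run
parser.add_argument('--output-npy', default='new_data.npy')
args = parser.parse_args()

with open(args.input_vocab) as f:
    idx_to_token = json.load(f)['idx_to_token']
token_to_idx = {tok: int(idx) for idx, tok in idx_to_token.items()}

with open(args.input_txt) as f:
    text = f.read()

unknown = set(text) - set(token_to_idx)
if unknown:
    # Characters outside the original vocabulary cannot be represented by the
    # old checkpoint's embedding table; here we simply drop them and warn.
    print('Dropping characters not in the original vocab: %r' % sorted(unknown))

encoded = np.array([token_to_idx[c] for c in text if c in token_to_idx],
                   dtype=np.int64)
np.save(args.output_npy, encoded)
```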

@AlekzNet

I noticed that "checkpoint_every" also does not change.

drzax (Author) commented Mar 17, 2016

Just so I'm clear then, the current purpose of -init_from is essentially to re-commence a failed or aborted training session with all the same settings?

@jcjohnson (Owner)

Yes, that's correct; the learning rate and learning rate decay could be different, though.


@AlekzNet

Just noticed that the letters/digits may be reassigned (idx_to_token) after subsequent runs of preprocess.py (unless I'm missing/confusing something else). I don't have the previous version of the .json file (I will save it next time), but now I'm getting lots of "don1t"s and "it1s" instead of "don't" and "it's" in the output. So it looks like "1" took the place of "'", "'" took the place of "!", etc.

Can the tokens be lexicographically ordered by preprocess.py, so such issues can be avoided?
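For illustration, a minimal sketch of a deterministic, lexicographically sorted vocabulary assignment (the file names here are hypothetical, and the 1-based indexing is an assumption). Note that even with sorting, adding or removing any character from the corpus would still shift the indices of other characters, so the mapping only stays stable while the character set itself is unchanged:

```python
import json

def build_vocab(text):
    # Assign indices in lexicographic order, so repeated preprocessing runs
    # over the same character set always produce the same idx_to_token mapping.
    # Indices start at 1 here to mimic a Lua/Torch-style 1-based convention
    # (an assumption, not necessarily what preprocess.py does).
    tokens = sorted(set(text))
    token_to_idx = {tok: i + 1 for i, tok in enumerate(tokens)}
    idx_to_token = {i + 1: tok for i, tok in enumerate(tokens)}
    return token_to_idx, idx_to_token

with open('input.txt') as f:
    text = f.read()

token_to_idx, idx_to_token = build_vocab(text)
with open('vocab.json', 'w') as f:
    json.dump({'idx_to_token': idx_to_token, 'token_to_idx': token_to_idx}, f)
```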

gwern commented Jun 15, 2016

I would appreciate it if the data generation were deterministic. When your RNN takes weeks to train and you decide you need to change something in the data, it'd be nice if you didn't have to start over.

@dgcrouse

We are working on a new accessory script that can encode new data using the JSON vocabulary from a previous dataset, to address this. Going fully deterministic can be really slow and cause lots of problems, so this is the next best thing.
