Allow users to generate texts longer than 1024 tokens #2

Open

minimaxir opened this issue Apr 18, 2019 · 7 comments · May be fixed by #87 or #199
Labels
enhancement New feature or request

Comments

@minimaxir
Owner

It likely isn't possible to do it at the generation level (like other frameworks), but we can hack it by:

  1. Generate full text.
  2. Feed latter half of previous text as a prefix.
  3. Repeat until done.
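
A minimal sketch of that loop, assuming gpt_2_simple's gpt2.generate() accepts prefix, length, and return_as_list and echoes the prefix at the start of each sample; the generate_long helper, the character-level split, and the iteration/length values here are illustrative, not library features:

```python
import gpt_2_simple as gpt2

# Assumes a finetuned model is already available in ./checkpoint/run1.
sess = gpt2.start_tf_sess()
gpt2.load_gpt2(sess)

def generate_long(prompt, iterations=4, chunk_tokens=400):
    """Generate past the 1024-token window by iterating: produce a chunk,
    then seed the next round with the latter half of the previous output."""
    full_text = prompt
    prefix = prompt
    for _ in range(iterations):
        sample = gpt2.generate(sess,
                               prefix=prefix,
                               length=chunk_tokens,
                               return_as_list=True)[0]
        # The returned sample starts with the prefix, so only the
        # continuation is appended to the running text.
        full_text += sample[len(prefix):]
        # The latter half of this round's output becomes the next prefix,
        # keeping it well under the 1024-token window.
        # (Character-level split for simplicity; splitting on tokens via
        # the GPT-2 encoder would be more precise.)
        prefix = sample[len(sample) // 2:]
    return full_text

print(generate_long("Once upon a time"))
```

Splitting the previous sample (rather than the whole accumulated text) keeps the prefix bounded to roughly half the context window, so each round still has room for new tokens.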
@minimaxir minimaxir added the enhancement New feature or request label Apr 18, 2019
@aletote

aletote commented May 9, 2019

Why half of the previous text and not all?

@yeldarby

Would this be something you'd accept a PR for? I'd be willing to give it a go next week.

@rafaeelaudibert

Is there any work being done on this? I would like to have this feature implemented.

@woctezuma
Contributor

woctezuma commented May 28, 2019

Why half of the previous text and not all?

I guess that the length is computed on the generated text, including the prefix.

If you feed the whole previous text as a prefix, then you would not be able to generate anything more if the length of the input is already at the max.

By feeding half of the previous text, you are guaranteed to have space(*) left for the rest of the generation process. This lets you circumvent the length constraint by iterating.

(*) at least half the length
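
Concretely, with a 1024-token window: feeding back the full 1024-token output leaves no room for new tokens, whereas feeding back only the latter 512 tokens leaves at least 512 tokens free for new text on every iteration.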

@minimaxir
Owner Author

I'm currently not working on this; if there's a PR, I'll merge it.

The hard part is that the 1024 limit is enforced at the tensor level; I'm not sure what would be needed to shift that so longer generation is handled efficiently (especially in the batch case).

@rafaeelaudibert

Yeah, the batch case is where I'm failing to get it working. I already have code that works for a single batch, but I can't figure out how to make it work properly and efficiently for multiple batches, so I can't create a PR right now.

Still, I'd be glad if someone created a PR that solves this problem.

@cedspam

cedspam commented Jun 5, 2019

Why half of the previous text and not all?
You need some space left in the model's sequence to generate new text, so the prefix can be at most the max length minus one token. It's a compromise between generation speed and how much context the generated text keeps.
