Try the demo.
I used download-urls.py to quickly download the HTML from poetryfoundation.org based on the URLs in romantic-urls.txt.
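The download step can be sketched roughly as follows. This is not the contents of download-urls.py, only a minimal illustration that assumes romantic-urls.txt holds one URL per line. The function names, the output directory argument, and the file naming scheme are all hypothetical.

```python
import os
import urllib.request
from urllib.parse import urlparse


def read_urls(path):
    """Return the non-empty, stripped lines of a URL list file."""
    with open(path, encoding="utf-8") as f:
        return [line.strip() for line in f if line.strip()]


def url_to_filename(url):
    """Derive a local file name from the last path segment of a URL (assumed scheme)."""
    slug = urlparse(url).path.rstrip("/").split("/")[-1] or "index"
    return slug + ".html"


def download_all(url_file, out_dir):
    """Fetch each URL and save the raw HTML under out_dir."""
    os.makedirs(out_dir, exist_ok=True)
    for url in read_urls(url_file):
        target = os.path.join(out_dir, url_to_filename(url))
        if os.path.exists(target):
            continue  # skip pages that were already downloaded
        with urllib.request.urlopen(url) as response:
            html = response.read()
        with open(target, "wb") as f:
            f.write(html)
```

Skipping files that already exist makes the script safe to re-run after an interrupted download.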
Then I used Parse Poetry.ipynb to parse the HTML and extract the title, author, and poem. There are some glitches here: newlines are rendered in some places where they shouldn't be, and are missing in some places where they should appear. This notebook saves a bunch of text files to output/ that include metadata as the first few lines.
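As an illustration of text files that carry metadata in their first few lines, here is a minimal sketch. The exact layout the notebook writes is not documented here, so the "title:" and "author:" header lines, the blank separator line, and both function names are assumptions.

```python
def format_poem(title, author, poem):
    """Serialize a poem as two metadata lines, a blank line, then the body."""
    return f"title: {title}\nauthor: {author}\n\n{poem}"


def parse_poem(text):
    """Invert format_poem: split the metadata header from the poem body."""
    # Only the first blank line separates header from body, so stanza
    # breaks inside the poem are preserved.
    header, _, body = text.partition("\n\n")
    meta = {}
    for line in header.splitlines():
        key, _, value = line.partition(": ")
        meta[key] = value
    return meta["title"], meta["author"], body
```

Keeping the metadata in a fixed header means the later steps can read the title and author without parsing the HTML again.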
Then I used Generate GPT-2.ipynb to generate poems based on random chunks from the poems and the seed words. This notebook saves files to poems.json and generated.json. To run this notebook, first get GPT-2 running, and drop the notebook in the gpt-2/src/ directory.
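One way to picture the "random chunks plus seed words" idea is the sketch below. It does not call GPT-2 and is not taken from the notebook. The chunk size, the way the seed word is appended, and the function names are all assumptions made for illustration.

```python
import random


def random_chunk(poem, n_lines=4, rng=random):
    """Return n_lines consecutive non-empty lines from a random position."""
    lines = [line for line in poem.splitlines() if line.strip()]
    if len(lines) <= n_lines:
        return "\n".join(lines)  # poem too short: use all of it
    start = rng.randrange(len(lines) - n_lines + 1)
    return "\n".join(lines[start:start + n_lines])


def build_prompt(poem, seed_word, n_lines=4, rng=random):
    """Combine a random chunk with a seed word to form a generation prompt."""
    return random_chunk(poem, n_lines, rng) + "\n" + seed_word
```

Passing a seeded `random.Random` instance as `rng` makes the chunk selection reproducible, which helps when comparing generated outputs.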
Both Python notebooks import from utils, which I have separately pushed here.
Finally, I load generated.json and poems.json with JavaScript in index.html and display the results.