Using Local Source Content
N.B. This functionality is experimental and may change.
Voyant allows users to open an existing corpus or to create their own corpus by pasting in texts or URLs, or by uploading documents. All of these corpus creation mechanisms require network transfers, which can significantly slow down the process.
VoyantServer allows you to provide a local source for documents, following a specific pattern. This is especially useful when you're integrating Voyant with an existing collection (it could be a subset of Gutenberg, for instance).
There are two important parameters for using local sources:
- localSource: a name (letter characters only) that defines the collection
- input: one or more URLs that have the filename of the local source to use
VoyantServer has a data directory (by default it's a first-level subdirectory within the zip archive that you downloaded; the location can also be overridden in the server-settings.txt file). Within that you can create a directory called trombone-local-sources (if it's not there already), and within that there would be a folder with the same name as the value you specify for localSource (you can have several such collections and local sources).
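The directory layout described above can be prepared with a short script. This is only a sketch: the function name `ensure_local_source_dir` is hypothetical, the data directory location is whatever applies to your installation, and "letter characters only" is assumed here to mean ASCII letters.

```python
import re
from pathlib import Path


def ensure_local_source_dir(data_dir, local_source):
    """Create (if needed) and return <data_dir>/trombone-local-sources/<local_source>."""
    # Assumption: "letter characters only" is read as ASCII letters A-Z and a-z.
    if not re.fullmatch(r"[A-Za-z]+", local_source):
        raise ValueError(f"localSource must contain letters only: {local_source!r}")
    target = Path(data_dir) / "trombone-local-sources" / local_source
    # parents=True also creates trombone-local-sources if it is not there already.
    target.mkdir(parents=True, exist_ok=True)
    return target


if __name__ == "__main__":
    # "data" stands in for the VoyantServer data directory of your installation.
    print(ensure_local_source_dir("data", "gutenberg"))
```

Calling the function once per collection name gives several local sources side by side under trombone-local-sources.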
The URL for the input assumes one of two formats:
- It's either the filename (the last part of the URL), which is looked up as a file directly under the localSource folder
- Or it's a subdirectory path, defined by whatever follows the localSource name in the URL, which is looked up under the localSource folder
Examples:
localSource=gutenberg
input=http://examples.com/austen.zip
There's a local file called data/trombone-local-sources/gutenberg/austen.zip (filename)
localSource=gutenberg
input=http://examples.com/texts/emma.txt
There's a local file called data/trombone-local-sources/gutenberg/emma.txt (filename)
localSource=gutenberg
input=http://examples.com/texts/gutenberg/19th/persuasion.pdf
There's a local file called data/trombone-local-sources/gutenberg/19th/persuasion.pdf (path)
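The mapping in these examples can be sketched in a few lines of Python. This is an illustration of the two rules as described here, not VoyantServer's actual implementation; the function names `local_relative_path` and `local_file_path` are hypothetical, and treating localSource as a whole path segment of the URL is an assumption.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse


def local_relative_path(local_source, input_url):
    """Return the path, relative to the localSource folder, that an input URL maps to."""
    segments = [s for s in urlparse(input_url).path.split("/") if s]
    if not segments:
        raise ValueError(f"no filename in URL: {input_url!r}")
    # Path format: localSource appears as a directory segment in the URL,
    # so everything after it is used as the relative path.
    if local_source in segments[:-1]:
        start = segments.index(local_source) + 1
        return "/".join(segments[start:])
    # Filename format: only the last part of the URL is used.
    return segments[-1]


def local_file_path(data_dir, local_source, input_url):
    """Join the data directory, trombone-local-sources, localSource and the relative path."""
    relative = local_relative_path(local_source, input_url)
    return PurePosixPath(data_dir) / "trombone-local-sources" / local_source / relative


if __name__ == "__main__":
    for url in (
        "http://examples.com/austen.zip",
        "http://examples.com/texts/emma.txt",
        "http://examples.com/texts/gutenberg/19th/persuasion.pdf",
    ):
        print(url, "->", local_file_path("data", "gutenberg", url))
```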
Note that if the local file can't be found, an attempt is made to fetch the given URL if it starts with http or https, so you can use this technique even if not all the files are locally available.
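The fallback order can be pictured as follows. This is a minimal sketch of the behaviour described in the note, not the server's code; the function name `choose_source` and its return values are hypothetical.

```python
from pathlib import Path


def choose_source(local_file, input_url):
    """Decide where a document would come from: the local file first, then the URL."""
    path = Path(local_file)
    if path.is_file():
        return ("local", str(path))
    # The remote fetch is only attempted for http and https URLs.
    if input_url.lower().startswith(("http://", "https://")):
        return ("remote", input_url)
    # Neither a local file nor a fetchable URL.
    return None
```

With this order, a collection that is only partly mirrored on disk still resolves every document, at the cost of a network transfer for the missing ones.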
Note also that with both URL formats it's possible to provide multiple input values:
localSource=gutenberg
input=http://examples.com/texts/gutenberg/19th/persuasion.txt
input=http://examples.com/texts/gutenberg/19th/emma.txt
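If the parameters are sent as a URL query string (an assumption; this page does not say how they are passed), repeated input values can be encoded with Python's standard library. The function name `build_query` is hypothetical.

```python
from urllib.parse import urlencode


def build_query(local_source, input_urls):
    """Encode localSource plus one input parameter per URL as a query string."""
    # A list of pairs (rather than a dict) lets the "input" key repeat.
    params = [("localSource", local_source)]
    params += [("input", url) for url in input_urls]
    return urlencode(params)


if __name__ == "__main__":
    print(build_query("gutenberg", [
        "http://examples.com/texts/gutenberg/19th/persuasion.txt",
        "http://examples.com/texts/gutenberg/19th/emma.txt",
    ]))
```

The resulting string would be appended after a `?` to the address of your VoyantServer instance.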
This functionality is not currently available in the production release but should be available with the next release.