GitHub - morsko1/tatoeba-parser: Export Eng sentences JSON data from Tatoeba project

tatoeba-parser

The parser processes the large sentence datasets published by the Tatoeba project
and generates several JSON files from them.
The output files are listed below, each with an example record:

'data.js' (rus-eng sentences)

{
    "id": 3,
    "textEng": "You can't achieve the impossible without attempting the absurd.",
    "textRus": "Нельзя достичь невозможного, не делая безумных попыток."
}
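How such a pair could be produced is sketched below. This is not the code from main.js; it is a minimal illustration that assumes sentences.csv is tab-separated with the columns sentence id, language code, and text, that links.csv is tab-separated with the columns sentence id and translation id, and that the "id" field is a running counter. The function names parseTsv and buildPairs are hypothetical.

```javascript
// Split tab-separated text into rows of fields, skipping empty lines.
// Assumption: the Tatoeba exports are tab-separated, one record per line.
function parseTsv(text) {
    return text
        .split('\n')
        .map((line) => line.replace(/\r$/, ''))
        .filter((line) => line.length > 0)
        .map((line) => line.split('\t'));
}

// Join English and Russian sentences through the links table.
// Hypothetical helper; the real main.js may work differently.
function buildPairs(sentencesTsv, linksTsv) {
    const eng = new Map();
    const rus = new Map();
    for (const [id, lang, text] of parseTsv(sentencesTsv)) {
        if (lang === 'eng') eng.set(id, text);
        if (lang === 'rus') rus.set(id, text);
    }

    const pairs = [];
    for (const [fromId, toId] of parseTsv(linksTsv)) {
        // Only the eng -> rus direction is kept, so a link that is listed
        // in both directions yields a single pair.
        if (eng.has(fromId) && rus.has(toId)) {
            pairs.push({
                id: pairs.length + 1, // assumed to be a running counter
                textEng: eng.get(fromId),
                textRus: rus.get(toId),
            });
        }
    }
    return pairs;
}
```

Holding both languages in Map objects keeps each link lookup constant-time, which matters when the links file has millions of rows.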

'data-eng.json'

{
    "id": "1288",
    "text": "I just don't know what to say.",
    "sentenceId": "8003976",
    "hasAudio": true
}

'data-rus.json'

{
    "id": "243",
    "text": "Один раз в жизни я делаю хорошее дело... И оно бесполезно.",
    "sentenceId": "5507120",
    "hasAudio": true
}

'data-eng-with-audio.json'

{
    "id": "1277",
    "text": "I have to go to sleep.",
    "sentenceId": "7960374",
    "hasAudio": true
}

'data-rus-with-audio.json'

{
    "id": "5430",
    "text": "Нелегко решать, что правильно, а что нет, но приходится это делать.",
    "sentenceId": "1596576",
    "hasAudio": true
}
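The per-language records and their "with-audio" subsets could be derived roughly as follows. This is a sketch under assumptions, not the project's actual code: it assumes sentences_with_audio.csv is tab-separated with the sentence id in the first column, that "sentenceId" is the Tatoeba sentence id, and that "id" is a running counter stored as a string. The function names are hypothetical, and whether the real script renumbers "id" in the filtered files is not stated in this README.

```javascript
// Collect the sentence ids listed in sentences_with_audio.csv.
// Assumption: one tab-separated record per line, sentence id first.
function readAudioIds(audioTsv) {
    const ids = new Set();
    for (const line of audioTsv.split('\n')) {
        const id = line.split('\t')[0].trim();
        if (id) ids.add(id);
    }
    return ids;
}

// Build records shaped like the examples for a single language.
// Assumption: sentences.csv columns are id, language code, text.
function buildRecords(sentencesTsv, lang, audioIds) {
    const records = [];
    for (const line of sentencesTsv.split('\n')) {
        const [sentenceId, sentenceLang, text] = line.replace(/\r$/, '').split('\t');
        if (sentenceLang !== lang || text === undefined) continue;
        records.push({
            id: String(records.length + 1), // assumed running counter
            text,
            sentenceId,
            hasAudio: audioIds.has(sentenceId),
        });
    }
    return records;
}

// Keep only the records that have an audio recording.
function onlyWithAudio(records) {
    return records.filter((record) => record.hasAudio);
}
```

For example, onlyWithAudio(buildRecords(sentencesText, 'eng', audioIds)) would give the kind of list stored in 'data-eng-with-audio.json'.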

Minified versions of these files are generated as well.
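The difference between a readable file and its minified counterpart can be illustrated with the standard JSON.stringify call. The sketch below only shows the two serializations; the function name serialize is hypothetical, and the indentation width and the naming of the minified files are assumptions that this README does not specify.

```javascript
// Serialize the same records twice: indented for reading,
// and without whitespace for a smaller file.
function serialize(records) {
    return {
        pretty: JSON.stringify(records, null, 4), // 4-space indent is an assumption
        minified: JSON.stringify(records),
    };
}
```

Both strings parse back to the same data, so consumers can pick whichever file suits them.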

How to:

1. Download the data from the following links:
   - sentences
   - links
   - sentences with audio
2. Unzip the downloaded files to get links.csv, sentences.csv, and sentences_with_audio.csv.
3. Create a new directory 'data-input' in the project's root folder.
4. Put links.csv, sentences.csv, and sentences_with_audio.csv into the 'data-input' directory.
5. Run 'npm start'.
6. The results are written to the 'data-output' directory.
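Before running the parser it can help to confirm that the input files are where the steps above expect them. The following check is a small sketch and not part of the project; the function findMissingInputs and the constant REQUIRED_FILES are hypothetical names, and only the directory and file names come from the steps above.

```javascript
const fs = require('fs');
const path = require('path');

// File names taken from the steps above.
const REQUIRED_FILES = ['links.csv', 'sentences.csv', 'sentences_with_audio.csv'];

// Return the required input files that are not present in
// <projectRoot>/data-input. An empty array means everything is in place.
function findMissingInputs(projectRoot) {
    const inputDir = path.join(projectRoot, 'data-input');
    return REQUIRED_FILES.filter(
        (name) => !fs.existsSync(path.join(inputDir, name))
    );
}
```

Calling findMissingInputs(process.cwd()) from the project's root folder lists whatever still needs to be downloaded or moved before 'npm start' is run.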

