Transcripts by file upload #969
Should we allow multiple transcript files? There might be an uploaded version and a machine-generated one; should we be able to have both? The element in the feed allows multiple in different formats, so we could allow the same.

I wonder if we should also keep track of where/which app generated a transcript. Like, did this VTT come from Apple, from Descript, from AWS?

Should we also support human-readable but not terribly machine-friendly formats like txt, MS Word docs, or even PDFs?

Would we store transcripts for individual audio file segments? I think if we generate the transcript from segments, those would get generated and stored next to the audio files in S3, like we do with other derivatives. But then what does that mean for the overall transcript? Seems like we'd need a (on demand? async?) process to make a combined transcript from them.

Perhaps a field on the MediaResource could indicate if there is a transcript for it, or we could allow a MediaResource to have a Transcript...

So what would the model/storage for this episode transcript look like? I'm thinking like an
I think this is its own table/model, for
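To make the "combined transcript from segments" question a bit more concrete, here is a rough Python sketch of one way such a process could work: shift each segment's cue times by the total duration of the segments before it. All names here (`Cue`, `SegmentTranscript`, `combine_segments`) are hypothetical and are not Feeder code; it also assumes each segment transcript knows its audio duration in seconds.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Cue:
    start: float  # seconds from the start of the segment
    end: float
    text: str


@dataclass
class SegmentTranscript:
    duration: float  # assumed: length of the audio segment, in seconds
    cues: List[Cue]


def combine_segments(segments: List[SegmentTranscript]) -> List[Cue]:
    """Merge per-segment cues into one episode-level cue list."""
    combined: List[Cue] = []
    offset = 0.0  # running total of the durations of earlier segments
    for segment in segments:
        for cue in segment.cues:
            # Shift segment-relative times to episode-relative times.
            combined.append(Cue(cue.start + offset, cue.end + offset, cue.text))
        offset += segment.duration
    return combined
```

Whether this runs on demand or async is a separate decision; the offset logic would be the same either way.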
Implementation (these could be broken up into separate PRs):
Per our discussion about Step 1 for transcripts, we want to get some basic support in Feeder. See the doc: https://docs.google.com/document/d/1-0bOHkWMd0LUKt-1sw5h-oUlqwAdVYmtqxPyk4UGEpE/edit#bookmark=id.9j3sg0ayyaju
Tasks (separate issues/PRs)
transcripts table/model #1047
CopyTranscriptTask as a subclass of Task #1048
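As a starting point for the transcripts table/model discussion, a minimal Python sketch of what the data could look like if an episode is allowed several transcripts, each with a format and an optional record of which app produced it. This is only an illustration of the questions raised above, not the actual Feeder schema: the class names, field names, and format values are all assumptions.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class TranscriptFormat(Enum):
    # Hypothetical format labels, not an agreed list.
    VTT = "vtt"
    SRT = "srt"
    TXT = "txt"
    PDF = "pdf"


@dataclass
class Transcript:
    url: str
    format: TranscriptFormat
    source: Optional[str] = None  # e.g. "Apple", "Descript", "AWS"; None if unknown
    machine_generated: bool = False


@dataclass
class Episode:
    title: str
    transcripts: List[Transcript] = field(default_factory=list)

    def add_transcript(self, transcript: Transcript) -> None:
        # Multiple transcripts are allowed, e.g. an upload plus a generated one.
        self.transcripts.append(transcript)

    def transcripts_by_format(self, fmt: TranscriptFormat) -> List[Transcript]:
        return [t for t in self.transcripts if t.format == fmt]
```

For example, an episode could hold an uploaded txt transcript alongside a machine-generated VTT, and the feed could pick whichever formats it wants to publish via `transcripts_by_format`.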