build read_audio function and add audio.json #22

Closed
wants to merge 13 commits into from
30 changes: 0 additions & 30 deletions .mlc-config.json

This file was deleted.

1 change: 1 addition & 0 deletions data/testcorpus/conversations.json
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
{"/ulwa1/ulwa014": {"meta": {}, "vectors": []}}
1 change: 1 addition & 0 deletions data/testcorpus/corpus.json
@@ -0,0 +1 @@
{}
1 change: 1 addition & 0 deletions data/testcorpus/index.json
@@ -0,0 +1 @@
{"utterances-index": {}, "speakers-index": {}, "conversations-index": {}, "overall-index": {}, "version": 1, "vectors": []}
1 change: 1 addition & 0 deletions data/testcorpus/speakers.json
@@ -0,0 +1 @@
{"Tang": {"meta": {}, "vectors": []}, "Yan": {"meta": {}, "vectors": []}}
10 changes: 10 additions & 0 deletions data/testcorpus/utterances.jsonl
@@ -0,0 +1,10 @@
{"id": "0", "conversation_id": "/ulwa1/ulwa014", "text": "U oughs inim t\u00ef samting yan", "speaker": "Tang", "meta": {}, "reply-to": null, "timestamp": 1332718704, "vectors": []}
{"id": "1", "conversation_id": "/ulwa1/ulwa014", "text": "mbam ndul ma wandam ana", "speaker": "Yan", "meta": {}, "reply-to": null, "timestamp": 1332732704, "vectors": []}
{"id": "2", "conversation_id": "/ulwa1/ulwa014", "text": "M\u00ef inim wandam bai anapa nd", "speaker": "Tang", "meta": {}, "reply-to": null, "timestamp": 1332743704, "vectors": []}
{"id": "3", "conversation_id": "/ulwa1/ulwa014", "text": "lunda we nd\u00efm\u00efne in", "speaker": "Yan", "meta": {}, "reply-to": null, "timestamp": 1332754704, "vectors": []}
{"id": "4", "conversation_id": "/ulwa1/ulwa014", "text": "k\u00efnakape ak\u00efnaka", "speaker": "Tang", "meta": {}, "reply-to": null, "timestamp": 1332765704, "vectors": []}
{"id": "5", "conversation_id": "/ulwa1/ulwa014", "text": "coughs nd\u00efm\u00efne we ndul wa le we nd\u00eft\u00ef ak\u00efnakape malimap mat\u00ef yawa mananda", "speaker": "Yan", "meta": {}, "reply-to": null, "timestamp": 1332776704, "vectors": []}
{"id": "6", "conversation_id": "/ulwa1/ulwa014", "text": "mananda", "speaker": "Tang", "meta": {}, "reply-to": null, "timestamp": 1332787704, "vectors": []}
{"id": "7", "conversation_id": "/ulwa1/ulwa014", "text": "da", "speaker": "Yan", "meta": {}, "reply-to": null, "timestamp": 1332788704, "vectors": []}
{"id": "8", "conversation_id": "/ulwa1/ulwa014", "text": "e k\u00efkal awi ak\u00efnakape", "speaker": "Tang", "meta": {}, "reply-to": null, "timestamp": 1332789704, "vectors": []}
{"id": "9", "conversation_id": "/ulwa1/ulwa014", "text": "at\u00efm inim.", "speaker": "Yan", "meta": {}, "reply-to": null, "timestamp": 1332999704, "vectors": []}
1 change: 1 addition & 0 deletions data/ulwa json corpus/conversations.json
@@ -0,0 +1 @@
{"/ulwa1/ulwa014": {"meta": {}, "vectors": []}}
1 change: 1 addition & 0 deletions data/ulwa json corpus/corpus.json
@@ -0,0 +1 @@
{}
1 change: 1 addition & 0 deletions data/ulwa json corpus/index.json
@@ -0,0 +1 @@
{"utterances-index": {}, "speakers-index": {}, "conversations-index": {}, "overall-index": {}, "version": 1, "vectors": []}
1 change: 1 addition & 0 deletions data/ulwa json corpus/speakers.json
@@ -0,0 +1 @@
{"Tang": {"meta": {}, "vectors": []}, "Yan": {"meta": {}, "vectors": []}}
10 changes: 10 additions & 0 deletions data/ulwa json corpus/utterances.jsonl
@@ -0,0 +1,10 @@
{"id": "0", "conversation_id": "/ulwa1/ulwa014", "text": "U oughs inim t\u00ef samting yan", "speaker": "Tang", "meta": {}, "reply-to": null, "timestamp": 1332718704, "vectors": []}
{"id": "1", "conversation_id": "/ulwa1/ulwa014", "text": "mbam ndul ma wandam ana", "speaker": "Yan", "meta": {}, "reply-to": null, "timestamp": 1332732704, "vectors": []}
{"id": "2", "conversation_id": "/ulwa1/ulwa014", "text": "M\u00ef inim wandam bai anapa nd", "speaker": "Tang", "meta": {}, "reply-to": null, "timestamp": 1332743704, "vectors": []}
{"id": "3", "conversation_id": "/ulwa1/ulwa014", "text": "lunda we nd\u00efm\u00efne in", "speaker": "Yan", "meta": {}, "reply-to": null, "timestamp": 1332754704, "vectors": []}
{"id": "4", "conversation_id": "/ulwa1/ulwa014", "text": "k\u00efnakape ak\u00efnaka", "speaker": "Tang", "meta": {}, "reply-to": null, "timestamp": 1332765704, "vectors": []}
{"id": "5", "conversation_id": "/ulwa1/ulwa014", "text": "coughs nd\u00efm\u00efne we ndul wa le we nd\u00eft\u00ef ak\u00efnakape malimap mat\u00ef yawa mananda", "speaker": "Yan", "meta": {}, "reply-to": null, "timestamp": 1332776704, "vectors": []}
{"id": "6", "conversation_id": "/ulwa1/ulwa014", "text": "mananda", "speaker": "Tang", "meta": {}, "reply-to": null, "timestamp": 1332787704, "vectors": []}
{"id": "7", "conversation_id": "/ulwa1/ulwa014", "text": "da", "speaker": "Yan", "meta": {}, "reply-to": null, "timestamp": 1332788704, "vectors": []}
{"id": "8", "conversation_id": "/ulwa1/ulwa014", "text": "e k\u00efkal awi ak\u00efnakape", "speaker": "Tang", "meta": {}, "reply-to": null, "timestamp": 1332789704, "vectors": []}
{"id": "9", "conversation_id": "/ulwa1/ulwa014", "text": "at\u00efm inim.", "speaker": "Yan", "meta": {}, "reply-to": null, "timestamp": 1332999704, "vectors": []}
11 changes: 11 additions & 0 deletions data/ulwa_testdata_convokit_format.csv
@@ -0,0 +1,11 @@
timestamp,speaker,text,translation,conversation_id,utterance_raw,reply_to
1332718704,Tang,U oughs inim tï samting yan,"Lorem ipsum dolor sit amet.",/ulwa1/ulwa014,U oughs inim tï samting yangama ul matï akïnakape,None
1332732704,Yan,mbam ndul ma wandam ana,"At neque fugit eum reprehenderit labore et exercitationem voluptatem. eos odio aspernatur.",/ulwa1/ulwa014,wimbam ndul ma wandam anapa ol welunda nïkap tu mananda yangama,None
1332743704,Tang,Mï inim wandam bai anapa nd,"a veritatis tempore sit vitae quaerat sed consequatur amet qui nisi facilis et perferendis nisi ut maiores consequatur.",/ulwa1/ulwa014,Mï inim wandam bai anapa ndïtï ka welunda unan,None
1332754704,Yan,lunda we ndïmïne in,,/ulwa1/ulwa014,ata welunda we ndïmïne ind,None
1332765704,Tang,kïnakape akïnaka,,/ulwa1/ulwa014,i akïnakape akïnakap,None
1332776704,Yan,coughs ndïmïne we ndul wa le we ndïtï akïnakape malimap matï yawa mananda,"Et illo facere vel magni necessitatibus est aspernatur numquam",/ulwa1/ulwa014,[coughs] I inim oughs ka lopop mananda bai kïkal yangama we ini,None
1332787704,Tang,mananda,,/ulwa1/ulwa014,n mananda ndïtï ka akïnakape wimbam,None
1332788704,Yan,da,,/ulwa1/ulwa014,da ndïtï ka,None
1332789704,Tang,e kïkal awi akïnakape,"onsequatur amet qui nisi facilis et perferendis nisi ut",/ulwa1/ulwa014,e kïkal awi akïnakape manï lï,None
1332999704,Yan,atïm inim.,"itae quaerat sed consequatur amet",/ulwa1/ulwa014,atïm inim.,None
11 changes: 11 additions & 0 deletions data/ulwa_testdata_sktalk_format.csv
@@ -0,0 +1,11 @@
begin,end,participant,utterance,translation,source,utterance_raw
00:00:00.917,00:00:05.604,Tang,U oughs inim tï samting yan,"Lorem ipsum dolor sit amet.",/ulwa1/ulwa014,U oughs inim tï samting yangama ul matï akïnakape
00:00:04.830,00:00:09.080,Yan,mbam ndul ma wandam ana,"At neque fugit eum reprehenderit labore et exercitationem voluptatem. eos odio aspernatur.",/ulwa1/ulwa014,wimbam ndul ma wandam anapa ol welunda nïkap tu mananda yangama
00:00:06.090,00:00:09.450,Tang,Mï inim wandam bai anapa nd,"a veritatis tempore sit vitae quaerat sed consequatur amet qui nisi facilis et perferendis nisi ut maiores consequatur.",/ulwa1/ulwa014,Mï inim wandam bai anapa ndïtï ka welunda unan
00:00:09.534,00:00:10.333,Yan,lunda we ndïmïne in,,/ulwa1/ulwa014,ata welunda we ndïmïne ind
00:00:10.333,00:00:11.143,Tang,kïnakape akïnaka,,/ulwa1/ulwa014,i akïnakape akïnakap
00:00:11.143,00:00:18.240,Yan,coughs ndïmïne we ndul wa le we ndïtï akïnakape malimap matï yawa mananda,"Et illo facere vel magni necessitatibus est aspernatur numquam",/ulwa1/ulwa014,[coughs] I inim oughs ka lopop mananda bai kïkal yangama we ini
00:00:11.477,00:00:12.205,Tang,mananda,,/ulwa1/ulwa014,n mananda ndïtï ka akïnakape wimbam
00:00:14.390,00:00:15.696,Yan,da,,/ulwa1/ulwa014,da ndïtï ka
00:00:17.972,00:00:20.722,Tang,e kïkal awi akïnakape,"onsequatur amet qui nisi facilis et perferendis nisi ut",/ulwa1/ulwa014,e kïkal awi akïnakape manï lï
00:00:18.240,00:00:21.970,Yan,atïm inim.,"itae quaerat sed consequatur amet",/ulwa1/ulwa014,atïm inim.
1 change: 1 addition & 0 deletions data/vamale json corpus/conversations.json
@@ -0,0 +1 @@
{"/vamale1/vamaleE": {"meta": {}, "vectors": []}, "/vamale1/vamaleie": {"meta": {}, "vectors": []}, "/vamale1/vamale-vie": {"meta": {}, "vectors": []}, "/vamale1/vamale-MS": {"meta": {}, "vectors": []}, "/vamale1/vamaleMS": {"meta": {}, "vectors": []}, "/vamale1/vamalen-MS": {"meta": {}, "vectors": []}, "/vamale1/vamale4": {"meta": {}, "vectors": []}}
1 change: 1 addition & 0 deletions data/vamale json corpus/corpus.json
@@ -0,0 +1 @@
{}
1 change: 1 addition & 0 deletions data/vamale json corpus/index.json
@@ -0,0 +1 @@
{"utterances-index": {}, "speakers-index": {}, "conversations-index": {}, "overall-index": {}, "version": 1, "vectors": []}
1 change: 1 addition & 0 deletions data/vamale json corpus/speakers.json
@@ -0,0 +1 @@
{"Ric": {"meta": {}, "vectors": []}, "Rie": {"meta": {}, "vectors": []}}
10 changes: 10 additions & 0 deletions data/vamale json corpus/utterances.jsonl
@@ -0,0 +1,10 @@
{"id": "0", "conversation_id": "/vamale1/vamaleE", "text": "Aoo angovi \u0263ase aoo vinjo thamauke va nyau yanle?", "speaker": "Ric", "meta": {}, "reply-to": null, "timestamp": 1332712704, "vectors": []}
{"id": "1", "conversation_id": "/vamale1/vamaleie", "text": "ta\u0263ua va thamauke \u0272akoe va angovi \u0263ase.", "speaker": "Rie", "meta": {}, "reply-to": null, "timestamp": 1332722704, "vectors": []}
{"id": "2", "conversation_id": "/vamale1/vamale-vie", "text": "Va vinjo ta\u0263ua daa thanga \u0272akoe daa \u0263ase angovi", "speaker": "Ric", "meta": {}, "reply-to": null, "timestamp": 1332732704, "vectors": []}
{"id": "3", "conversation_id": "/vamale1/vamale-MS", "text": "a nyau thapoke va \u0272akoe", "speaker": "Rie", "meta": {}, "reply-to": null, "timestamp": 1332742704, "vectors": []}
{"id": "4", "conversation_id": "/vamale1/vamaleMS", "text": "\u0272akoe va \u0263ananda konaa va \u0263ananda vinjo ", "speaker": "Ric", "meta": {}, "reply-to": null, "timestamp": 1332755704, "vectors": []}
{"id": "5", "conversation_id": "/vamale1/vamaleMS", "text": "o \u0263ase va angovi angovi nea", "speaker": "Rie", "meta": {}, "reply-to": null, "timestamp": 1332766704, "vectors": []}
{"id": "6", "conversation_id": "/vamale1/vamalen-MS", "text": "au nyau aoo thamauke \u0263ase daa ya", "speaker": "Ric", "meta": {}, "reply-to": null, "timestamp": 1332777704, "vectors": []}
{"id": "7", "conversation_id": "/vamale1/vamaleMS", "text": "uke! Daa \u0263ananda thamauke va \u0263ase t", "speaker": "Rie", "meta": {}, "reply-to": null, "timestamp": 1332788804, "vectors": []}
{"id": "8", "conversation_id": "/vamale1/vamale4", "text": "yanle konaa daa thamauke \u0263ananda va", "speaker": "Ric", "meta": {}, "reply-to": null, "timestamp": 1332799904, "vectors": []}
{"id": "9", "conversation_id": "/vamale1/vamale4", "text": "ga nyau va vinjo konaa daa \u0263anand", "speaker": "Rie", "meta": {}, "reply-to": null, "timestamp": 1332792704, "vectors": []}
11 changes: 11 additions & 0 deletions data/vamale_testdata_convokit_format.csv
@@ -0,0 +1,11 @@
timestamp,speaker,text,conversation_id,utterance_raw,reply_to
1332712704,Ric,Aoo angovi ɣase aoo vinjo thamauke va nyau yanle?,/vamale1/vamaleE,Aoo angovi ɣase aoo vinjo thamauke va nyau yanle?,None
1332722704,Rie,taɣua va thamauke ɲakoe va angovi ɣase.,/vamale1/vamaleie,ua va thamauke ɲakoe v,None
1332732704,Ric,Va vinjo taɣua daa thanga ɲakoe daa ɣase angovi,/vamale1/vamale-vie,aa thanga ɲakoe daa ɣase,None
1332742704,Rie,a nyau thapoke va ɲakoe,/vamale1/vamale-MS,thapoke va ɲak,None
1332755704,Ric,ɲakoe va ɣananda konaa va ɣananda vinjo ,/vamale1/vamaleMS,ɣnaa va ɣananda vin,None
1332766704,Rie,o ɣase va angovi angovi nea,/vamale1/vamaleMS,va angovi angovi,None
1332777704,Ric,au nyau aoo thamauke ɣase daa ya,/vamale1/vamalen-MS,limyau aoo thamauke ɣase daaa,None
1332788804,Rie,uke! Daa ɣananda thamauke va ɣase t,/vamale1/vamaleMS,ke! Daa ɣananda tham,None
1332799904,Ric,yanle konaa daa thamauke ɣananda va,/vamale1/vamale4,konaa daa,None
1332792704,Rie,ga nyau va vinjo konaa daa ɣanand,/vamale1/vamale4,au va vinjo konaa daa ɣa,None
11 changes: 11 additions & 0 deletions data/vamale_testdata_sktalk_format.csv
@@ -0,0 +1,11 @@
begin,end,participant,utterance,translation,source,utterance_raw
00:01:33.740,00:01:36.200,Ric,Aoo angovi ɣase aoo vinjo thamauke va nyau yanle?,,/vamale1/vamaleE,Aoo angovi ɣase aoo vinjo thamauke va nyau yanle?
00:02:25.065,00:02:26.445,Rie,taɣua va thamauke ɲakoe va angovi ɣase.,,/vamale1/vamaleie,ua va thamauke ɲakoe v
00:02:26.385,00:02:27.925,Ric,Va vinjo taɣua daa thanga ɲakoe daa ɣase angovi,,/vamale1/vamale-vie,aa thanga ɲakoe daa ɣase
00:03:03.530,00:03:05.130,Rie,a nyau thapoke va ɲakoe,,/vamale1/vamale-MS,thapoke va ɲak
00:03:05.595,00:03:08.355,Ric,ɲakoe va ɣananda konaa va ɣananda vinjo ,,/vamale1/vamaleMS,ɣnaa va ɣananda vin
00:03:08.710,00:03:12.720,Rie,o ɣase va angovi angovi nea,,/vamale1/vamaleMS,va angovi angovi
00:03:12.950,00:03:16.760,Ric,au nyau aoo thamauke ɣase daa ya,,/vamale1/vamalen-MS,limyau aoo thamauke ɣase daaa
00:03:17.070,00:03:19.470,Rie,uke! Daa ɣananda thamauke va ɣase t,,/vamale1/vamaleMS,ke! Daa ɣananda tham
00:15:44.815,00:15:46.395,Ric,yanle konaa daa thamauke ɣananda va,,/vamale1/vamale4,konaa daa
00:26:03.605,00:26:05.345,Rie,ga nyau va vinjo konaa daa ɣanand,,/vamale1/vamale4,au va vinjo konaa daa ɣa
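
The `begin` and `end` columns in the sktalk-format CSV files use `HH:MM:SS.mmm` timestamps. The sketch below shows one way such rows could be read and turned into millisecond offsets and durations. The function names `timestamp_to_ms` and `read_rows` are hypothetical and are not taken from this pull request; the sample row is the last row of the file above.

```python
import csv
import io

# Header and last data row copied from data/vamale_testdata_sktalk_format.csv.
SAMPLE_CSV = (
    "begin,end,participant,utterance,translation,source,utterance_raw\n"
    "00:26:03.605,00:26:05.345,Rie,ga nyau va vinjo konaa daa ɣanand,,"
    "/vamale1/vamale4,au va vinjo konaa daa ɣa\n"
)


def timestamp_to_ms(stamp):
    """Convert an 'HH:MM:SS.mmm' string to integer milliseconds."""
    hours, minutes, seconds = stamp.split(":")
    whole, _, frac = seconds.partition(".")
    # Pad or truncate the fractional part to exactly three digits.
    millis = int(frac.ljust(3, "0")[:3]) if frac else 0
    return ((int(hours) * 60 + int(minutes)) * 60 + int(whole)) * 1000 + millis


def read_rows(csv_text):
    """Read sktalk-style rows and attach begin/end offsets and duration in ms."""
    rows = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        row["begin_ms"] = timestamp_to_ms(row["begin"])
        row["end_ms"] = timestamp_to_ms(row["end"])
        row["duration_ms"] = row["end_ms"] - row["begin_ms"]
        rows.append(row)
    return rows


rows = read_rows(SAMPLE_CSV)
print(rows[0]["participant"], rows[0]["duration_ms"])
```

Integer milliseconds are used instead of floating-point seconds so that differences between timestamps stay exact.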