Skip to content

Commit

Permalink
Fix AN4 dataset links (#6926)
Browse files Browse the repository at this point in the history
* Fix an4 dataset link in docs

Signed-off-by: Vladimir Bataev <[email protected]>

* Remove broken a4 dataset links from tutorials

Signed-off-by: Vladimir Bataev <[email protected]>

---------

Signed-off-by: Vladimir Bataev <[email protected]>
  • Loading branch information
artbataev authored Jun 28, 2023
1 parent 3b4f37a commit 69747d8
Show file tree
Hide file tree
Showing 8 changed files with 8 additions and 8 deletions.
2 changes: 1 addition & 1 deletion docs/source/asr/datasets.rst
Original file line number Diff line number Diff line change
Expand Up @@ -126,7 +126,7 @@ AN4 Dataset
This is a small dataset recorded and distributed by Carnegie Mellon University. It consists of recordings of people spelling out
addresses, names, etc. Information about this dataset can be found on the `official CMU site <http://www.speech.cs.cmu.edu/databases/an4/>`_.

#. `Download and extract the dataset <http://www.speech.cs.cmu.edu/databases/an4/an4_sphere.tar.gz>`_ (which is labeled "NIST's Sphere audio (.sph) format (64M)".
#. `Download and extract the dataset <https://dldata-public.s3.us-east-2.amazonaws.com/an4_sphere.tar.gz>`_ (which is labeled "NIST's Sphere audio (.sph) format (64M)".

#. Convert the ``.sph`` files to ``.wav`` using sox, and build one training and one test manifest.

Expand Down
2 changes: 1 addition & 1 deletion tutorials/asr/ASR_for_telephony_speech.ipynb
Original file line number Diff line number Diff line change
Expand Up @@ -103,7 +103,7 @@
"# Download the dataset. This will take a few moments...\n",
"print(\"******\")\n",
"if not os.path.exists(data_dir + '/an4_sphere.tar.gz'):\n",
" an4_url = 'https://dldata-public.s3.us-east-2.amazonaws.com/an4_sphere.tar.gz' # for the original source, please visit http://www.speech.cs.cmu.edu/databases/an4/an4_sphere.tar.gz \n",
" an4_url = 'https://dldata-public.s3.us-east-2.amazonaws.com/an4_sphere.tar.gz'\n",
" an4_path = wget.download(an4_url, data_dir)\n",
" print(f\"Dataset downloaded at: {an4_path}\")\n",
"else:\n",
Expand Down
2 changes: 1 addition & 1 deletion tutorials/asr/ASR_with_NeMo.ipynb
Original file line number Diff line number Diff line change
Expand Up @@ -189,7 +189,7 @@
"# Download the dataset. This will take a few moments...\n",
"print(\"******\")\n",
"if not os.path.exists(data_dir + '/an4_sphere.tar.gz'):\n",
" an4_url = 'https://dldata-public.s3.us-east-2.amazonaws.com/an4_sphere.tar.gz' # for the original source, please visit http://www.speech.cs.cmu.edu/databases/an4/an4_sphere.tar.gz \n",
" an4_url = 'https://dldata-public.s3.us-east-2.amazonaws.com/an4_sphere.tar.gz'\n",
" an4_path = wget.download(an4_url, data_dir)\n",
" print(f\"Dataset downloaded at: {an4_path}\")\n",
"else:\n",
Expand Down
2 changes: 1 addition & 1 deletion tutorials/asr/ASR_with_Subword_Tokenization.ipynb
Original file line number Diff line number Diff line change
Expand Up @@ -372,7 +372,7 @@
"# Download the dataset. This will take a few moments...\r\n",
"print(\"******\")\r\n",
"if not os.path.exists(data_dir + '/an4_sphere.tar.gz'):\r\n",
" an4_url = 'https://dldata-public.s3.us-east-2.amazonaws.com/an4_sphere.tar.gz' # for the original source, please visit http://www.speech.cs.cmu.edu/databases/an4/an4_sphere.tar.gz \r\n",
" an4_url = 'https://dldata-public.s3.us-east-2.amazonaws.com/an4_sphere.tar.gz'\r\n",
" an4_path = wget.download(an4_url, data_dir)\r\n",
" print(f\"Dataset downloaded at: {an4_path}\")\r\n",
"else:\r\n",
Expand Down
2 changes: 1 addition & 1 deletion tutorials/asr/ASR_with_Transducers.ipynb
Original file line number Diff line number Diff line change
Expand Up @@ -137,7 +137,7 @@
"# Download the dataset. This will take a few moments...\n",
"print(\"******\")\n",
"if not os.path.exists(data_dir + '/an4_sphere.tar.gz'):\n",
" an4_url = 'https://dldata-public.s3.us-east-2.amazonaws.com/an4_sphere.tar.gz' # for the original source, please visit http://www.speech.cs.cmu.edu/databases/an4/an4_sphere.tar.gz \n",
" an4_url = 'https://dldata-public.s3.us-east-2.amazonaws.com/an4_sphere.tar.gz'\n",
" an4_path = wget.download(an4_url, data_dir)\n",
" print(f\"Dataset downloaded at: {an4_path}\")\n",
"else:\n",
Expand Down
2 changes: 1 addition & 1 deletion tutorials/asr/Online_Noise_Augmentation.ipynb
Original file line number Diff line number Diff line change
Expand Up @@ -135,7 +135,7 @@
"# Download the dataset. This will take a few moments...\n",
"print(\"******\")\n",
"if not os.path.exists(data_dir + '/an4_sphere.tar.gz'):\n",
" an4_url = 'https://dldata-public.s3.us-east-2.amazonaws.com/an4_sphere.tar.gz' # for the original source, please visit http://www.speech.cs.cmu.edu/databases/an4/an4_sphere.tar.gz \n",
" an4_url = 'https://dldata-public.s3.us-east-2.amazonaws.com/an4_sphere.tar.gz'\n",
" an4_path = wget.download(an4_url, data_dir)\n",
" print(f\"Dataset downloaded at: {an4_path}\")\n",
"else:\n",
Expand Down
2 changes: 1 addition & 1 deletion tutorials/asr/asr_adapters/ASR_with_Adapters.ipynb
Original file line number Diff line number Diff line change
Expand Up @@ -190,7 +190,7 @@
"# Download the dataset. This will take a few moments...\n",
"print(\"******\")\n",
"if not os.path.exists(data_dir + '/an4_sphere.tar.gz'):\n",
" an4_url = 'https://dldata-public.s3.us-east-2.amazonaws.com/an4_sphere.tar.gz' # for the original source, please visit http://www.speech.cs.cmu.edu/databases/an4/an4_sphere.tar.gz \n",
" an4_url = 'https://dldata-public.s3.us-east-2.amazonaws.com/an4_sphere.tar.gz'\n",
" an4_path = wget.download(an4_url, data_dir)\n",
" print(f\"Dataset downloaded at: {an4_path}\")\n",
"else:\n",
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -85,7 +85,7 @@
"# Download the dataset. This will take a few moments...\n",
"print(\"******\")\n",
"if not os.path.exists(data_dir + '/an4_sphere.tar.gz'):\n",
" an4_url = 'https://dldata-public.s3.us-east-2.amazonaws.com/an4_sphere.tar.gz' # for the original source, please visit http://www.speech.cs.cmu.edu/databases/an4/an4_sphere.tar.gz \n",
" an4_url = 'https://dldata-public.s3.us-east-2.amazonaws.com/an4_sphere.tar.gz'\n",
" an4_path = wget.download(an4_url, data_dir)\n",
" print(f\"Dataset downloaded at: {an4_path}\")\n",
"else:\n",
Expand Down

0 comments on commit 69747d8

Please sign in to comment.