Add documentation about code style and creating new recipes. (#27)
csukuangfj authored Aug 25, 2021
1 parent 96e7f5c commit 184dbb3
Showing 16 changed files with 304 additions and 29 deletions.
1 change: 0 additions & 1 deletion .flake8
@@ -5,7 +5,6 @@ max-line-length = 80
per-file-ignores =
# line too long
egs/librispeech/ASR/conformer_ctc/conformer.py: E501,
egs/librispeech/ASR/conformer_ctc/decode.py: E501,

exclude =
.git,
2 changes: 1 addition & 1 deletion .github/workflows/style_check.yml
@@ -45,7 +45,7 @@ jobs:

- name: Install Python dependencies
run: |
python3 -m pip install --upgrade pip black flake8
python3 -m pip install --upgrade pip black==21.6b0 flake8==3.9.2
- name: Run flake8
shell: bash
1 change: 1 addition & 0 deletions docs/requirements.txt
@@ -1 +1,2 @@
sphinx_rtd_theme
sphinx
67 changes: 67 additions & 0 deletions docs/source/contributing/code-style.rst
@@ -0,0 +1,67 @@
.. _follow the code style:

Follow the code style
=====================

We use the following tools to keep the code style as consistent as possible:

- `black <https://github.com/psf/black>`_, to format the code
- `flake8 <https://github.com/PyCQA/flake8>`_, to check the style and quality of the code
- `isort <https://github.com/PyCQA/isort>`_, to sort ``imports``

The following versions of the above tools are used:

- ``black == 21.6b0``
- ``flake8 == 3.9.2``
- ``isort == 5.9.2``

After running the following commands:

.. code-block::

  $ git clone https://github.com/k2-fsa/icefall
  $ cd icefall
  $ pip install pre-commit
  $ pre-commit install

the following checks will run **automatically** whenever you execute ``git commit``:

.. figure:: images/pre-commit-check.png
  :width: 600
  :align: center

  pre-commit hooks invoked by ``git commit`` (Failed).

If any of the above checks fails, your ``git commit`` will not succeed.
Please fix any issues reported by the check tools.

.. HINT::

  Some of the check tools, i.e., ``black`` and ``isort``, modify the files
  to be committed **in-place**. Please run ``git status`` after a failure
  to see which files have been modified by the tools before you make any
  further changes.

After fixing all the failures, run ``git commit`` again and
it should succeed this time:

.. figure:: images/pre-commit-check-success.png
  :width: 600
  :align: center

  pre-commit hooks invoked by ``git commit`` (Succeeded).
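
.. HINT::

  You can also run all the configured hooks against every file in the
  repository without making a commit. A minimal example, assuming you have
  already installed ``pre-commit`` as described above:

  .. code-block:: bash

    $ cd icefall
    $ pre-commit run --all-files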

If you want to check the style of your code before ``git commit``, you
can do the following:

.. code-block:: bash

  $ cd icefall
  $ pip install black==21.6b0 flake8==3.9.2 isort==5.9.2
  $ black --check your_changed_file.py
  $ black your_changed_file.py          # modify it in-place
  $
  $ flake8 your_changed_file.py
  $
  $ isort --check your_changed_file.py
  $ isort your_changed_file.py          # modify it in-place
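
If you have changed many files, you may prefer to check only the Python
files that differ from ``master``. The following loop is just a sketch; it
assumes your work lives on a local branch created from ``master``:

.. code-block:: bash

  $ cd icefall
  $ # Check every added/copied/modified Python file relative to master.
  $ for f in $(git diff --name-only --diff-filter=ACM master -- '*.py'); do
  >   black --check "$f"; flake8 "$f"; isort --check "$f"
  > done
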
45 changes: 45 additions & 0 deletions docs/source/contributing/doc.rst
@@ -0,0 +1,45 @@
Contributing to Documentation
=============================

We use `sphinx <https://www.sphinx-doc.org/en/master/>`_
for documentation.

Before writing documentation, you have to prepare the environment:

.. code-block:: bash

  $ cd docs
  $ pip install -r requirements.txt

After setting up the environment, you are ready to write documentation.
Please refer to `reStructuredText Primer <https://www.sphinx-doc.org/en/master/usage/restructuredtext/basics.html>`_
if you are not familiar with ``reStructuredText``.

After writing some documentation, you can build the documentation **locally**
to preview what it will look like once it is published:

.. code-block:: bash

  $ cd docs
  $ make html

The generated documentation is in ``docs/build/html`` and can be viewed
with the following commands:

.. code-block:: bash

  $ cd docs/build/html
  $ python3 -m http.server

It will print::

Serving HTTP on 0.0.0.0 port 8000 (http://0.0.0.0:8000/) ...

Open your browser, go to `<http://0.0.0.0:8000/>`_, and you will see
the following:

.. figure:: images/doc-contrib.png
  :width: 600
  :align: center

  View generated documentation locally with ``python3 -m http.server``.
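
While writing, it can be convenient to rebuild and reload the documentation
automatically whenever a source file changes. One way to do that is shown
below; it assumes you are willing to install ``sphinx-autobuild``, which is
an extra package not listed in ``requirements.txt``:

.. code-block:: bash

  $ cd docs
  $ pip install sphinx-autobuild   # extra dependency, not in requirements.txt
  $ sphinx-autobuild source build/html
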
156 changes: 156 additions & 0 deletions docs/source/contributing/how-to-create-a-recipe.rst
@@ -0,0 +1,156 @@
How to create a recipe
======================

.. HINT::

  Please read :ref:`follow the code style` to adjust your code style.

.. CAUTION::

``icefall`` is designed to be as Pythonic as possible. Please use
Python in your recipe if possible.

Data Preparation
----------------

We recommend that you prepare your training/test/validation datasets
with `lhotse <https://github.com/lhotse-speech/lhotse>`_.

Please refer to `<https://lhotse.readthedocs.io/en/latest/index.html>`_
for how to create a recipe in ``lhotse``.

.. HINT::

The ``yesno`` recipe in ``lhotse`` is a very good example.

Please refer to `<https://github.com/lhotse-speech/lhotse/pull/380>`_,
which shows how to add a new recipe to ``lhotse``.

Suppose you would like to add a recipe for a dataset named ``foo``.
You can do the following:

.. code-block::

  $ cd egs
  $ mkdir -p foo/ASR
  $ cd foo/ASR
  $ touch prepare.sh
  $ chmod +x prepare.sh

If your dataset is very simple, please follow
`egs/yesno/ASR/prepare.sh <https://github.com/k2-fsa/icefall/blob/master/egs/yesno/ASR/prepare.sh>`_
to write your own ``prepare.sh``.
Otherwise, please refer to
`egs/librispeech/ASR/prepare.sh <https://github.com/k2-fsa/icefall/blob/master/egs/librispeech/ASR/prepare.sh>`_
to prepare your data.
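
To give you a feel for what typically goes into ``prepare.sh``, here is a
minimal sketch for the hypothetical ``foo`` dataset. The stage layout mirrors
the existing recipes, while the commands in the comments (e.g.,
``lhotse download foo``) are only placeholders that depend on the ``lhotse``
recipe you wrote for your dataset:

.. code-block:: bash

  #!/usr/bin/env bash
  # A minimal sketch of egs/foo/ASR/prepare.sh for a hypothetical dataset.
  set -euo pipefail

  stage=0
  stop_stage=100

  mkdir -p data

  if [ $stage -le 0 ] && [ $stop_stage -ge 0 ]; then
    echo "Stage 0: Download data"
    # e.g., lhotse download foo data/download
  fi

  if [ $stage -le 1 ] && [ $stop_stage -ge 1 ]; then
    echo "Stage 1: Prepare foo manifests"
    # e.g., lhotse prepare foo data/download data/manifests
  fi

  if [ $stage -le 2 ] && [ $stop_stage -ge 2 ]; then
    echo "Stage 2: Compute features"
    # e.g., run a small Python script that uses lhotse to extract fbank features
  fi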


Training
--------

Assume you have a fancy model called ``bar`` for the ``foo`` recipe. You can
organize your files in the following way:

.. code-block::

  $ cd egs/foo/ASR
  $ mkdir bar
  $ cd bar
  $ touch README.md model.py train.py decode.py asr_datamodule.py pretrained.py

For instance, the ``yesno`` recipe has a ``tdnn`` model and its directory structure
looks like the following:

.. code-block:: bash

  egs/yesno/ASR/tdnn/
  |-- README.md
  |-- asr_datamodule.py
  |-- decode.py
  |-- model.py
  |-- pretrained.py
  `-- train.py

**File description**:

- ``README.md``

  It contains information about this recipe, e.g., how to run it, what the WER is, etc.

- ``asr_datamodule.py``

  It provides code to create PyTorch dataloaders with train/test/validation datasets.

- ``decode.py``

  It takes as inputs the checkpoints saved during the training stage to decode the test
  dataset(s).

- ``model.py``

  It contains the definition of your fancy neural network model.

- ``pretrained.py``

  We can use this script to do inference with a pre-trained model.

- ``train.py``

  It contains training code.

.. HINT::

  Please take a look at

  - `egs/yesno/tdnn <https://github.com/k2-fsa/icefall/tree/master/egs/yesno/ASR/tdnn>`_
  - `egs/librispeech/tdnn_lstm_ctc <https://github.com/k2-fsa/icefall/tree/master/egs/librispeech/ASR/tdnn_lstm_ctc>`_
  - `egs/librispeech/conformer_ctc <https://github.com/k2-fsa/icefall/tree/master/egs/librispeech/ASR/conformer_ctc>`_

  to get a feel for what the resulting files look like.

.. NOTE::

  Every model in a recipe is kept as self-contained as possible.
  We tolerate duplicate code among different recipes.

The training stage should be invocable by:

.. code-block::

  $ cd egs/foo/ASR
  $ ./bar/train.py
  $ ./bar/train.py --help
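
The exact command-line options are up to you; for example, a typical run
might look like the following, where the flags (``--num-epochs``,
``--exp-dir``) are purely illustrative names for whatever arguments you
define in ``./bar/train.py``:

.. code-block:: bash

  $ cd egs/foo/ASR
  $ ./bar/train.py --num-epochs 30 --exp-dir bar/exp   # illustrative flags only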

Decoding
--------

Please refer to

- `<https://github.com/k2-fsa/icefall/blob/master/egs/librispeech/ASR/conformer_ctc/decode.py>`_
  if your model is transformer/conformer based;
- `<https://github.com/k2-fsa/icefall/blob/master/egs/librispeech/ASR/tdnn_lstm_ctc/decode.py>`_
  if your model is TDNN/LSTM based, i.e., there is no attention decoder;
- `<https://github.com/k2-fsa/icefall/blob/master/egs/yesno/ASR/tdnn/decode.py>`_
  if there is no LM rescoring.

The decoding stage should be invocable by:

.. code-block::

  $ cd egs/foo/ASR
  $ ./bar/decode.py
  $ ./bar/decode.py --help

Pre-trained model
-----------------

Please demonstrate how to use your model for inference in ``egs/foo/ASR/bar/pretrained.py``.
If possible, please consider creating a Colab notebook to show that.
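
As a rough sketch, inference with ``pretrained.py`` might be driven like
this; the options shown (``--checkpoint`` and a list of sound files) are
placeholders for whatever interface you design:

.. code-block:: bash

  $ cd egs/foo/ASR
  $ # Placeholder options: load a trained checkpoint and transcribe a wave file.
  $ ./bar/pretrained.py --checkpoint bar/exp/pretrained.pt foo-test-001.wav
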
Binary file added docs/source/contributing/images/doc-contrib.png
Binary file added docs/source/contributing/images/pre-commit-check-success.png
Binary file added docs/source/contributing/images/pre-commit-check.png
22 changes: 22 additions & 0 deletions docs/source/contributing/index.rst
@@ -0,0 +1,22 @@
Contributing
============

Contributions to ``icefall`` are very welcome.
There are many possible ways to contribute, and
two of them are:

- To write documentation
- To write code

- (1) To follow the code style in the repository
- (2) To write a new recipe

On this page, we describe how to contribute documentation
and code to ``icefall``.

.. toctree::
:maxdepth: 2

doc
code-style
how-to-create-a-recipe
1 change: 1 addition & 0 deletions docs/source/index.rst
@@ -22,3 +22,4 @@ speech recognition recipes using `k2 <https://github.com/k2-fsa/k2>`_.

installation/index
recipes/index
contributing/index
16 changes: 0 additions & 16 deletions docs/source/installation/index.rst
@@ -202,22 +202,6 @@ The following shows an example about setting up the environment.
valtree-3.1.0 lhotse-0.8.0.dev-2a1410b-clean lilcom-1.1.1 numpy-1.21.2 packaging-21.0 pycparser-2.20 pyparsing-2.4.7 pyyaml-5.4.1 sor
tedcontainers-2.4.0 toolz-0.11.1 torchaudio-0.9.0 tqdm-4.62.1
**NOTE**: After installing ``lhotse``, you will encounter the following error:

.. code-block::
$ lhotse download --help
-bash: /ceph-fj/fangjun/test-icefall/bin/lhotse: python: bad interpreter: No such file or directory
The correct fix is:

.. code-block::
echo '#!/usr/bin/env python3' | cat - $(which lhotse) > /tmp/lhotse-bin
chmod +x /tmp/lhotse-bin
mv /tmp/lhotse-bin $(which lhotse)
(5) Download icefall
~~~~~~~~~~~~~~~~~~~~

14 changes: 7 additions & 7 deletions egs/librispeech/ASR/conformer_ctc/decode.py
@@ -78,16 +78,16 @@ def get_parser():
Supported values are:
- (1) 1best. Extract the best path from the decoding lattice as the
decoding result.
- (2) nbest. Extract n paths from the decoding lattice; the path with
the highest score is the decoding result.
- (2) nbest. Extract n paths from the decoding lattice; the path
with the highest score is the decoding result.
- (3) nbest-rescoring. Extract n paths from the decoding lattice,
rescore them with an n-gram LM (e.g., a 4-gram LM), the path with
the highest score is the decoding result.
- (4) whole-lattice-rescoring. Rescore the decoding lattice with an n-gram LM
(e.g., a 4-gram LM), the best path of rescored lattice is the
decoding result.
- (5) attention-decoder. Extract n paths from the LM rescored lattice,
the path with the highest score is the decoding result.
- (4) whole-lattice-rescoring. Rescore the decoding lattice with an
n-gram LM (e.g., a 4-gram LM), the best path of rescored lattice
is the decoding result.
- (5) attention-decoder. Extract n paths from the LM rescored
lattice, the path with the highest score is the decoding result.
- (6) nbest-oracle. Its WER is the lower bound of any n-best
rescoring method can achieve. Useful for debugging n-best
rescoring method.
4 changes: 2 additions & 2 deletions egs/librispeech/ASR/tdnn_lstm_ctc/decode.py
@@ -42,8 +42,8 @@
get_texts,
setup_logger,
store_transcripts,
write_error_stats,
str2bool,
write_error_stats,
)


@@ -98,7 +98,7 @@ def get_params() -> AttributeDict:
# - nbest
# - nbest-rescoring
# - whole-lattice-rescoring
"method": "1best",
"method": "whole-lattice-rescoring",
# num_paths is used when method is "nbest" and "nbest-rescoring"
"num_paths": 30,
}
2 changes: 1 addition & 1 deletion egs/yesno/ASR/README.md
@@ -10,5 +10,5 @@ get the following WER:
```

Please refer to
<https://icefal1.readthedocs.io/en/latest/recipes/yesno.html>
<https://icefall.readthedocs.io/en/latest/recipes/yesno.html>
for detailed instructions.