Information about the corpus data format has been moved to the chatterbot-corpus documentation
gunthercox committed Aug 24, 2017
1 parent 324c312 commit b611cbd
Showing 1 changed file with 6 additions and 65 deletions.
71 changes: 6 additions & 65 deletions docs/corpus.rst
@@ -3,6 +3,9 @@ ChatterBot Corpus

This is a :term:`corpus` of dialog data that is included in the chatterbot module.

Additional information about the ``chatterbot-corpus`` module can be found
in the `ChatterBot Corpus Documentation`_.

Corpus language availability
----------------------------

@@ -18,70 +21,8 @@ check out the `chatterbot_corpus/data`_ directory in the separate chatterbot-cor

https://github.com/gunthercox/chatterbot-corpus

The :code:`chatterbot-corpus` is distributed in its own Python package so that it can
be released and upgraded independently from the :code:`chatterbot` package.


Data Format
-----------

The data file contained in ChatterBot Corpus is formatted using `YAML`_ syntax.
This format is used because it is easily readable by both humans and machines.

.. list-table:: Corpus Properties
   :widths: 15 10 30
   :header-rows: 1

   * - Property
     - Required
     - Description
   * - categories
     - Required
     - A list of categories that describe the conversations.
   * - conversations
     - Optional
     - A list of conversations. Each conversation is denoted as a list.

Here is an example of the corpus data:

.. code-block:: yaml
   :name: corpus-example.yml

   categories:
   - english
   - greetings
   conversations:
   - - Hello
     - Hi
   - - Hello
     - Hi, how are you?
     - I am doing well.
   - - Good day to you sir!
     - Why thank you.
   - - Hi, How is it going?
     - It's going good, your self?
     - Mighty fine, thank you.

The values in this example have the following relationships.

.. list-table:: Evaluated statement relationships
   :widths: 15 40
   :header-rows: 1

   * - Statement
     - Response
   * - Hello
     - Hi
   * - Hello
     - Hi, how are you?
   * - Hi, how are you?
     - I am doing well.
   * - Good day to you sir!
     - Why thank you.
   * - Hi, How is it going?
     - It's going good, your self?
   * - It's going good, your self?
     - Mighty fine, thank you.
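
The statement relationships listed above can be reproduced with a short sketch. It assumes each statement is paired with the one that directly follows it in the same conversation, which is what the table shows. The function name ``statement_response_pairs`` is hypothetical and is not part of the ``chatterbot`` or ``chatterbot-corpus`` API; the conversations are written out as plain Python lists rather than loaded from the YAML file.

```python
# Conversations from the corpus example, written as plain Python lists.
conversations = [
    ["Hello", "Hi"],
    ["Hello", "Hi, how are you?", "I am doing well."],
    ["Good day to you sir!", "Why thank you."],
    ["Hi, How is it going?", "It's going good, your self?", "Mighty fine, thank you."],
]


def statement_response_pairs(conversations):
    """Pair each statement with the statement that follows it.

    Hypothetical helper for illustration only; pairs are formed within a
    single conversation and never across conversation boundaries.
    """
    pairs = []
    for conversation in conversations:
        # zip stops at the shorter sequence, so the last statement
        # of a conversation has no response.
        for statement, response in zip(conversation, conversation[1:]):
            pairs.append((statement, response))
    return pairs


pairs = statement_response_pairs(conversations)
for statement, response in pairs:
    print(f"{statement!r} -> {response!r}")
```

A conversation with three statements therefore yields two pairs, which is why the four conversations in the example produce six rows in the table.
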
The ``chatterbot-corpus`` is distributed in its own Python package so that it can
be released and upgraded independently from the ``chatterbot`` package.


Exporting your chat bot's database as a training corpus
@@ -104,4 +45,4 @@ Here is an example:
:language: python

.. _chatterbot_corpus/data: https://github.com/gunthercox/chatterbot-corpus/tree/master/chatterbot_corpus/data
.. _YAML: http://www.yaml.org/
.. _ChatterBot Corpus Documentation: http://chatterbot-corpus.readthedocs.io/
