Merge branch 'release'
yingzhanguri committed Jun 30, 2017
2 parents dc670ff + 053e3b4 commit dc42784
Showing 14 changed files with 1,740 additions and 28 deletions.
10 changes: 10 additions & 0 deletions NEWS.md
@@ -1,3 +1,13 @@
v0.31 (2017-06-30)
------------------

- The `psamm-import` tool has been moved from the `psamm-import` package to
the main PSAMM package. This means that to import SBML files the
`psamm-import` package is no longer needed. To use the model-specific Excel
importers, the `psamm-import` package is still needed. With this release
of PSAMM, the `psamm-import` package should be updated to at least 0.16.
- The tutorial was updated with additional sections on using gap-filling
procedures on models.

v0.30 (2017-06-23)
------------------
8 changes: 0 additions & 8 deletions README.rst
@@ -38,15 +38,7 @@ Use ``pip`` to install (it is recommended to use a Virtualenv_):
$ pip install psamm
-The ``psamm-import`` tool is developed in `a separate repository`_. After
-installing PSAMM the ``psamm-import`` tool can be installed using:
-
-.. code:: shell
-$ pip install git+https://github.com/zhanglab/psamm-import.git
.. _Virtualenv: https://virtualenv.pypa.io/
-.. _a separate repository: https://github.com/zhanglab/psamm-import

Documentation
-------------
7 changes: 5 additions & 2 deletions docs/install.rst
@@ -25,8 +25,11 @@ the environment by running
$ source env/bin/activate
-The *psamm-import* tool is developed in a separate Git repository. After
-installing PSAMM, the *psamm-import* tool can be installed using:
+The *psamm-import* tool is included in the main PSAMM repository. Some
+additional model specific importers for Excel format models associated
+with publications are maintained in a separate repository. After
+installing PSAMM, support for these import functions can be added through
+installing this additional program:

.. code-block:: shell
119 changes: 117 additions & 2 deletions docs/tutorial/curation.rst
@@ -473,11 +473,126 @@ a network based optimization to identify metabolites with no production pathways
(psamm-env) $ psamm-model gapcheck --method gapfind --unrestricted-exchange
These methods included in the ``gapcheck`` function can be used to identify various kinds of
'gaps' in a metabolic model network. `PSAMM` also includes two functions for filling these gaps
'gaps' in a metabolic model network. `PSAMM` also includes three functions for filling these gaps
through the addition of artificial reactions or reactions from a supplied database. The
functions ``gapfill`` and ``fastgapfill`` can be used to perform these gapfilling procedures
functions ``gapfill``, ``fastgapfill``, and ``completepath`` can be used to perform these gapfilling procedures
during the process of generating and curating a model.

GapFill
~~~~~~~
The ``gapfill`` function in PSAMM can be used to apply a GapFill algorithm based on [Kumar07]_ to a metabolic model
to search for and identify reactions that can be added into a model to unblock the production of a specific
compound or set of compounds. To provide an example of how to utilize this ``gapfill`` function a version of
the E. coli core model has been provided in the `tutorial-part-2/Gapfilling_Model/` directory. In this directory
is the E. coli core model with a small, incomplete pathway added that contains the following reactions:

.. code-block:: yaml
- id: rxn1
equation: succ_c[c] => a[c]
- id: rxn3
equation: b[c] => c[c] + d[c]
This small additional pathway converts succinate to an artificial compound 'a'. The other reaction can convert compound
'b' to 'c' and 'd'. There is no reaction to convert 'a' to 'b' though, and this can be considered a metabolic gap.
In an additional reaction database, but not included in the model itself, is an additional reaction:

- id: rxn2
equation: a[c] => b[c]


This reaction, if added, would be capable of unblocking the production of 'c' or 'd', by allowing for the conversion
of compound 'a' to 'b'. In most cases when performing gap-filling on a model a larger database of non-model reactions
could be used. For this test case the production of compound 'd[c]' could be unblocked by running the following command:

.. code-block:: shell
(psamm-env) $ psamm-model gapfill --compound d[c]
This produces output that first lists all of the reactions from the original metabolic model, then lists the
included gap-filling reactions with their associated penalty values, and lastly lists any reactions where the
gap-filling result suggests that the flux bounds of the reaction be changed. A sample of the output is shown below::

....
TPI Model 0 Dihydroxyacetone-phosphate[c] <=> Glyceraldehyde-3-phosphate[c]
rxn1 Model 0 Succinate[c] => a[c]
rxn3 Model 0 b[c] => c[c] + d[c]
rxn2 Add 1 a[c] => b[c]
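
The listing can be post-processed with a short script, for example to pull out only the suggested additions. The sketch below is not part of PSAMM: the helper name ``added_reactions`` is made up for illustration, and it assumes that the four columns (reaction ID, tag, penalty, equation) are separated by tabs.

```python
# Hypothetical helper, not part of PSAMM. Assumes tab-separated columns:
# reaction ID, tag ('Model' or 'Add'), penalty, equation.
def added_reactions(lines):
    """Return (reaction_id, penalty, equation) for rows tagged 'Add'."""
    added = []
    for line in lines:
        fields = line.rstrip('\n').split('\t')
        if len(fields) != 4:
            continue  # skip blank lines or anything that is not a result row
        reaction_id, tag, penalty, equation = fields
        if tag == 'Add':
            added.append((reaction_id, float(penalty), equation))
    return added


# Rows copied from the sample output above, with tabs between the columns.
sample = [
    'rxn1\tModel\t0\tSuccinate[c] => a[c]',
    'rxn3\tModel\t0\tb[c] => c[c] + d[c]',
    'rxn2\tAdd\t1\ta[c] => b[c]',
]
print(added_reactions(sample))
```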

Some additional options can be used to refine the gap-filling. The first of these is the ``--no-implicit-sinks``
option. If this option is used then the gap-filling will be performed with no
implicit sinks for compounds, meaning that all compounds produced need to be consumed by other reactions in the
metabolic model. By default, if this option is not used with the command, then implicit sinks are added for all
compounds in the model meaning that any compound that is produced in excess can be removed through the added sinks.

The other way to refine the gap-filling procedure is through defining specific penalty values for the addition of
reactions from different sources. Penalties can be set for specific reactions in a gap-filling database
through a tab separated file provided in the command using the ``--penalty`` option. Additionally penalty values
for all database reactions can be set using the ``--db-penalty`` option followed by a penalty value. Similarly
penalty values can be assigned to added transport reactions using the ``--tp-penalty`` option and to added
exchange reactions using the ``--ex-penalty`` option. An example of a command that applies these penalties
to a gap-filling simulation would be as follows:

.. code-block:: shell
(psamm-env) $ psamm-model gapfill --compound d[c] --ex-penalty 100 --tp-penalty 10 --db-penalty 1
The ``gapfill`` function in PSAMM can be used through the model construction process to help identify potential
new reactions to add to a model and to explore how metabolic gaps affect the capabilities of a metabolic
network.
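
A penalty file for the ``--penalty`` option can be generated with a few lines of Python. This is only a sketch under an assumption: each line is taken to hold a reaction ID and its penalty separated by a tab. The function name ``write_penalties``, the reaction ``rxn4``, the penalty values, and the file name ``penalties.tsv`` are all invented for the example.

```python
import csv


def write_penalties(penalties, path):
    """Write one 'reaction ID <tab> penalty' row per reaction.

    The two-column layout is an assumption about the tab-separated file
    read by --penalty, not a documented specification.
    """
    with open(path, 'w', newline='') as f:
        writer = csv.writer(f, delimiter='\t', lineterminator='\n')
        for reaction_id, penalty in sorted(penalties.items()):
            writer.writerow([reaction_id, penalty])


# Illustrative values only: rxn2 is cheap to add, rxn4 is discouraged.
write_penalties({'rxn2': 0.5, 'rxn4': 25}, 'penalties.tsv')
```

The resulting file could then be supplied with ``--penalty penalties.tsv`` alongside the other penalty options.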

FastGapFill
~~~~~~~~~~~

The ``fastgapfill`` function in `PSAMM` is a different gap-filling method that uses the FastGapFill algorithm
to attempt to generate a gap-filled model that is entirely flux consistent [Thiele14]_. The implementation
of this algorithm in `PSAMM` can be utilized for unblocking an entire metabolic model or for unblocking
specific reactions in a network. Unblocking all of the reactions in a model at the same time often
does not produce the most meaningful or easy to understand results, so performing this function on a
subset of reactions is preferable. To do this the ``--subset`` option can be used to provide a file that
contains a list of reactions to unblock. In this example that list would look like this:

.. code-block:: shell
rxn1
rxn3
This file can be provided to the command to unblock the small artificial pathway that was added to the E. coli core
model:


.. code-block:: shell
(psamm-env) $ psamm-model fastgapfill --subset subset.tsv
In this case the output from this command will show the following::

....
TPI Model 0 Dihydroxyacetone-phosphate[c] <=> Glyceraldehyde-3-phosphate[c]
rxn1 Model 0 Succinate[c] => a[c]
rxn3 Model 0 b[c] => c[c] + d[c]
EX_c[e] Add 1 c[e] <=>
EX_d[e] Add 1 d[e] <=>
EX_succ_c[e] Add 1 Succinate[e] <=>
TP_c[c]_c[e] Add 1 c[c] <=> c[e]
TP_d[c]_d[e] Add 1 d[c] <=> d[e]
TP_succ_c[c]_succ_c[e] Add 1 Succinate[c] <=> Succinate[e]
rxn2 Add 1 a[c] => b[c]

The output will first list the model reactions which are labeled with the 'Model' tag in the second column
of the output. `PSAMM` will then list any artificial exchange reactions and transporters, as well as any gap reactions
included from the larger database. These will be labeled with the 'Add' tag in the second column. When compared
to the ``gapfill`` results from the previous section it can be seen that the ``fastgapfill`` result suggests
some artificial transporters and exchange reactions for certain compounds. This is due to this method
trying to find a flux consistent gap-filling solution.
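
To see at a glance how many of the suggested additions are artificial, the added rows can be grouped by reaction type. The following is a rough sketch rather than a PSAMM feature: the helper name ``group_added`` is hypothetical, the ``EX_`` and ``TP_`` prefixes are simply read off the sample output above, and tab-separated columns are assumed.

```python
from collections import defaultdict


# Hypothetical helper, not part of PSAMM. The EX_/TP_ prefixes are taken
# from the sample output; tab-separated columns are an assumption.
def group_added(lines):
    """Group the IDs of 'Add' rows into exchange, transport and database."""
    groups = defaultdict(list)
    for line in lines:
        fields = line.rstrip('\n').split('\t')
        if len(fields) != 4 or fields[1] != 'Add':
            continue  # ignore model reactions and non-result lines
        reaction_id = fields[0]
        if reaction_id.startswith('EX_'):
            kind = 'exchange'
        elif reaction_id.startswith('TP_'):
            kind = 'transport'
        else:
            kind = 'database'
        groups[kind].append(reaction_id)
    return dict(groups)


# A few rows from the fastgapfill sample output, with tabs between columns.
rows = [
    'rxn3\tModel\t0\tb[c] => c[c] + d[c]',
    'EX_d[e]\tAdd\t1\td[e] <=>',
    'TP_d[c]_d[e]\tAdd\t1\td[c] <=> d[e]',
    'rxn2\tAdd\t1\ta[c] => b[c]',
]
print(group_added(rows))
```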

Penalty values can be assigned for different types of reactions in the same way as in the ``gapfill``
command: ``--ex-penalty`` for artificial exchange reactions, ``--tp-penalty`` for artificial transporters,
``--db-penalty`` for new database reactions, and a penalty file provided with the ``--penalty`` option for
specific reactions.

Search Functions in PSAMM
-------------------------

24 changes: 19 additions & 5 deletions docs/tutorial/import_export.rst
@@ -36,11 +36,25 @@ The ``psamm-import`` program supports the import of models in various formats.
For the SBML format, it supports the COBRA-compliant SBML specifications, the FBC
specifications, and the basic SBML specifications in levels 1, 2, and 3;
for the JSON format, it supports the import of JSON files directly from the
`BiGG`_ database or from locally downloaded versions;
the support for importing from Excel file is model specific and are available
for 17 published models. There is also a generic Excel import for models
produced by the ModelSEED pipeline. To see a list of these models or model
formats that are supported, use the command:
`BiGG`_ database or from locally downloaded versions.

Support for importing from Excel files is model-specific and is available
for 17 published models. This import requires the installation of the separate
``psamm-import`` package. There is also a generic Excel import for models
that were produced by older versions of ModelSEED. Models from the
current ModelSEED can be imported in the SBML format.

To install the ``psamm-import`` package for Excel format models use the following
command:

.. code-block:: shell
(psamm-env) $ pip install git+https://github.com/zhanglab/psamm-import.git
This install will make the Excel importers available from the command line when the
``psamm-import`` program is called.

To see a list of the models or model formats that are supported for import, use the command:

.. _BiGG: http://bigg.ucsd.edu

2 changes: 1 addition & 1 deletion psamm/command.py
@@ -489,7 +489,7 @@ def main(command_class=None, args=None):
If no command class is specified the user will be able to select a specific
command through the first command line argument. If the ``args`` are
provided, these should be a list of strings that will be used instead of
-``sys.argv[1]``. This is mostly useful for testing.
+``sys.argv[1:]``. This is mostly useful for testing.
"""

# Set up logging for the command line interface
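
The docstring correction matters because ``args`` replaces the whole argument list, not only the first argument. A minimal sketch of the same convention with ``argparse`` follows; the ``demo_main`` function and its options are invented for illustration and are not PSAMM's actual ``main``.

```python
import argparse


def demo_main(args=None):
    """Parse ``args``; when it is None, argparse reads sys.argv[1:]."""
    parser = argparse.ArgumentParser(prog='demo')
    parser.add_argument('command')
    parser.add_argument('--compound')
    return parser.parse_args(args)


# In a test, pass the full argument list rather than a single string.
namespace = demo_main(['gapfill', '--compound', 'd[c]'])
print(namespace.command, namespace.compound)
```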
17 changes: 8 additions & 9 deletions psamm/datasource/sbml.py
@@ -1403,12 +1403,9 @@ def convert_model_entries(
Args:
model: :class:`NativeModel`.
"""
-compartment_map = {}
-compound_map = {}
-reaction_map = {}
-
-def find_new_ids(entries, id_map, type_name):
+def find_new_ids(entries):
"""Create new IDs for entries."""
+id_map = {}
new_ids = set()
for entry in entries:
new_id = convert_id(entry)
@@ -1418,15 +1415,17 @@ def find_new_ids(entries, id_map, type_name):
else:
raise ValueError(
'Entity ID {!r} is not unique after conversion'.format(
-type_name, entry.id))
+entry.id))

id_map[entry.id] = new_id
new_ids.add(new_id)

return id_map

# Find new IDs for all entries
-find_new_ids(model.compartments, compartment_map, 'Compartment')
-find_new_ids(model.compounds, compound_map, 'Compound')
-find_new_ids(model.reactions, reaction_map, 'Reaction')
+compartment_map = find_new_ids(model.compartments)
+compound_map = find_new_ids(model.compounds)
+reaction_map = find_new_ids(model.reactions)

# Create new compartment entries
new_compartments = []
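
The refactoring above makes ``find_new_ids`` build and return its own map instead of filling one passed in by the caller. A self-contained sketch of that pattern is shown here. The ``Entry`` type and the ``convert_id`` rule are stand-ins invented for illustration, and the collision check is simplified because part of the real function is elided in this diff.

```python
import re
from collections import namedtuple

Entry = namedtuple('Entry', ['id'])  # stand-in for a model entry


def convert_id(entry):
    """Hypothetical conversion: replace non-word characters with '_'."""
    return re.sub(r'\W', '_', entry.id)


def find_new_ids(entries):
    """Return a dict mapping each old ID to its converted, unique ID."""
    id_map = {}
    new_ids = set()
    for entry in entries:
        new_id = convert_id(entry)
        if new_id in new_ids:
            # Two different old IDs collapsed to the same new ID.
            raise ValueError(
                'Entity ID {!r} is not unique after conversion'.format(
                    entry.id))
        id_map[entry.id] = new_id
        new_ids.add(new_id)
    return id_map


compound_map = find_new_ids([Entry('glc-D'), Entry('atp')])
print(compound_map)
```

Returning the map keeps each call independent, so the caller no longer has to pre-create three empty dictionaries before the helper is defined.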