Commit

Upload documentation from latest commit
actions-user committed May 15, 2024
1 parent 59f3be5 commit 2b004a2
Showing 16 changed files with 272 additions and 70 deletions.
4 changes: 2 additions & 2 deletions _sources/about.md
@@ -4,10 +4,10 @@ If you want to learn more about the idea behind `matchmaps`, along with some exa

> [MatchMaps: Non-isomorphous difference maps for X-ray crystallography](https://www.biorxiv.org/content/10.1101/2023.09.01.555333v2)
If what you're looking for is a user's guide, you can find that [here](quickstart.md). But if you're looking for more details about how `matchmaps` works, read on!
If you're looking for a user guide, you can find that [here](quickstart.md). But if you're looking for more details about how `matchmaps` works, read on!

## Abstract
Conformational change mediates the biological functions of macromolecules. Crystal-lographic measurements can map these changes with extraordinary sensitivity as a function of mutations, ligands, and time. The isomorphous difference map remains the gold standard for detecting structural differences between datasets. Isomorphous difference maps combine the phases of a chosen reference state with the observed changes in structure factor amplitudes to yield a map of changes in electron density. Such maps are much more sensitive to conformational change than structure refinement is, and are unbiased in the sense that observed differences do not depend on refinement of the perturbed state. However, even minute changes in unit cell properties can render isomorphous difference maps useless. This is unnecessary. Here we describe a generalized procedure for calculating observed difference maps that retains the high sensitivity to conformational change and avoids structure refinement of the perturbed state. We have implemented this procedure in an open-source python package, MatchMaps, that can be run in any software environment supporting PHENIX and CCP4. Through examples, we show that MatchMaps “rescues” observed difference electron density maps for poorly-isomorphous crystals, corrects artifacts in nominally isomorphous difference maps, and extends to detecting differences across copies within the asymmetric unit, or across altogether different crystal forms.
Conformational change mediates the biological functions of macromolecules. Crystallographic measurements can map these changes with extraordinary sensitivity as a function of mutations, ligands, and time. A popular method for detecting structural differences between crystallographic datasets is the isomorphous difference map. Isomorphous difference maps combine the phases of a chosen reference state with the observed changes in structure factor amplitudes to yield a map of changes in electron density. Such maps are much more sensitive to conformational change than structure refinement is, and are unbiased in the sense that observed differences do not depend on refinement of the perturbed state. However, even modest changes in unit cell properties can render isomorphous difference maps useless. This is unnecessary. Here we describe a generalized procedure for calculating observed difference maps that retains the high sensitivity to conformational change and avoids structure refinement of the perturbed state. We have implemented this procedure in an open-source python package, MatchMaps, that can be run in any software environment supporting PHENIX and CCP4. Through examples, we show that MatchMaps "rescues" observed difference electron density maps for poorly-isomorphous crystals, corrects artifacts in nominally isomorphous difference maps, and extends to detecting differences across copies within the asymmetric unit, or across altogether different crystal forms.

## Algorithm overview

4 changes: 3 additions & 1 deletion _sources/index.md
@@ -24,8 +24,10 @@ This software is part of the [Reciprocal Space Station](https://rs-station.githu
:hidden:
Quickstart guide <quickstart>
Command-line options <cli>
Troubleshooting and advanced usage <troubleshooting>
Visualizing results <visualization>
About the algorithm <about>
Full command-line API <cli>
```
16 changes: 2 additions & 14 deletions _sources/quickstart.md
@@ -30,15 +30,15 @@ Though `matchmaps` is a python package, it relies on two pieces of external soft

```{eval-rst}
.. note::
Please note that phenix 1.21 is **not** supported at this time. I hope to update `matchmaps` to support this in the near future. In the meantime, make sure you're using phenix 1.20 or earlier, and everything should work fine.
Please note that phenix 1.21 is **not** supported at this time. I hope to update `matchmaps` to support this in the near future. In the meantime, make sure you're using phenix 1.20 or earlier, and everything should work fine. As of `matchmaps 0.6.3`, an error should be thrown if you're using phenix 1.21 reminding you to downgrade.
```
When actually using `matchmaps` in the command-line, you'll need to have both ccp4 and phenix active. Doing that will look something like:
```bash
source /path/to/phenix/phenix_env.sh
/path/to/ccp4/start
```

At this point, you should be good to go! Please [file an issue on github](https://github.com/dennisbrookner/matchmaps/issues) is this is not working.
At this point, you should be good to go! Please [file an issue on github](https://github.com/rs-station/matchmaps/issues) if this is not working.

## Input files

@@ -116,24 +116,12 @@ After your `matchmaps` run completes successfully, it will write out a file (cal

If you'd then like to run `matchmaps` again with slightly different parameters, you can use this script as a starting point. No need to remember exactly which parameters you used the first time!

## Other useful options

- `--on-as-stationary`: The `matchmaps` algorithm always involves an alignment in real-space of the "on" and "off" maps. By default, the "off" map is stationary, and the "on" map is moved. This is typically desired, such that everything lines up with your "off" structural model. However, say that your structures are "apo" and "bound", and you would like to line up your maps with a "bound" structure (which you never have to supply to `matchmaps`!). In this case, you could use the `--on-as-stationary` flag.
- `--dmin`: The input `mtz` files are truncated to equal resolution by default. If you would like, the `mtz`s may be truncated even more stringently.
- `--unmasked-radius`: How far away from the protein model do you expect to see difference signal? Use this flag to change the behavior of the `_unmasked` difference map output to show more or less signal far from the protein. Defaults to 5 A. See [below](#important-map-outputs) for more details.
- `--no-bss`: If included, skip the bulk solvent scaling step of phenix.refine. Like `--unmasked-radius`, this option may be useful in situtations where you expect signal far away from your protein model. For example, bulk solvent scaling may "flatten" or otherwise alter signal for an unmodeled bound ligand.
- `--spacing`: This flag defines the approximate size of the voxels in your real-space maps. The default (0.5 A) is fine for most purposes. For making figures in PyMOL, you might want finer spacing (0.25 A or so); this comes at a cost of much larger file size. If your computer/coot is being really slow, you could consider increasing the spacing.
- `--verbose`: Use this option to print out to the terminal all of the log output from CCP4 and phenix. This is disabled by default because it's very annoying, but it can be useful for debugging purposes.
- `--rbr-selections`: When doing rigid-body refinement, refine as multiple explicitly defined rigid bodies rather than a single rigid body containing everything. This flag is admittedly a little finnicky; please [file an issue](https://github.com/rs-station/matchmaps/issues) if you have any trouble.

Note that most of the command-line options have short and long versions, e.g. `-i` vs. `--input-dir`. For clarity, the long names have been used exclusively on this page. The [full documentation](cli.md) lits all short and long options.

## Output files

Below is a quick tour of the output files that `matchmaps` will produce and what you might want to do with them.

### Important `.map` outputs

Let's assume that your input files are called `off.mtz` and `on.mtz`. The following files created by `matchmaps` may be of interest:

- `on_minus_off.map`: This is your difference map! It should contain positive and negative signal in the vicinity (>= 2 Angstroms) of your protein model.
49 changes: 49 additions & 0 deletions _sources/troubleshooting.md
@@ -0,0 +1,49 @@
# Troubleshooting and advanced usage

If you've read through the [quickstart guide](quickstart.md) and you still have questions or issues, read on for more information.

## Solvent masking and bulk-solvent scaling
The goal of `matchmaps` is to visualize difference density around the protein molecule. However, the question remains: what does "around" mean exactly? To address this ambiguity, `matchmaps` writes out two difference maps. The main difference map is solvent masked pretty strictly and only contains signal within 2 A of the protein model. The second difference map, which will include the suffix `_unmasked`, is solvent masked more laxly. The default masking radius for the `_unmasked` map is 5 A, but you can change this value using the `--unmasked-radius` flag. This might be useful if you expect to see signal farther away from your protein model, e.g. a bound ligand.
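
As a rough illustration of what a masking radius means, here is a toy sketch. It is not the `matchmaps` implementation: the function name `mask_within_radius` and the data layout are hypothetical, and the unit cell and periodicity are ignored entirely. It simply keeps the map points that lie within a given distance of any atom.

```python
import math

def mask_within_radius(points, atoms, radius):
    """Return one boolean per map point: True where the point lies within
    `radius` (in Angstroms) of at least one atom position."""
    return [
        any(math.dist(p, a) <= radius for a in atoms)
        for p in points
    ]

# Hypothetical toy data: one atom at the origin and three map points.
atoms = [(0.0, 0.0, 0.0)]
points = [(1.0, 0.0, 0.0), (0.0, 3.0, 0.0), (0.0, 0.0, 6.0)]

strict = mask_within_radius(points, atoms, radius=2.0)  # like the main map
lax = mask_within_radius(points, atoms, radius=5.0)     # like the `_unmasked` map
```

In this toy case, the point 3 A from the atom is dropped by the strict mask but kept by the lax one, which is the situation where a larger `--unmasked-radius` would matter.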

Another important parameter for visualizing difference density far from your protein model is bulk-solvent scaling (BSS). BSS is a typical feature of crystallographic refinement, and including it will typically result in better refinement statistics and nicer maps. By default, `phenix.refine` will perform BSS prior to rigid-body refinement. However, if you expect to see signal far away from your protein model, you may find that BSS will "flatten" or otherwise alter this signal. You can disable BSS using the `--no-bss` flag. I strongly recommend this when analyzing an apo/bound pair via `matchmaps`.

Also, for using `matchmaps` with bound ligands, see the `--on-as-stationary` flag [below](#miscellaneous-useful-options).

### Symmetry-related molecules

A side effect of the `matchmaps` real-space alignment approach is that while the ON and OFF models will end up aligned, the *symmetry mates* of the ON and OFF models will necessarily end up *misaligned*. This is a big reason that solvent masking is so important: otherwise, difference density for the misaligned symmetry mates will dominate your maps. Unfortunately, even after strict solvent masking, your final difference map is likely to contain this artifactual signal. The good news is that this artifactual signal is pretty easy to identify and disregard. The bad news is that if you're using a feature like Coot's "find difference peaks," you're likely to see these a lot.

In the medium-to-long term, I aim for the next generation of `matchmaps` masking approaches to resolve this issue more elegantly. If you have ideas about this, or if you just want to bug me about it, I would definitely welcome an [issue](https://github.com/rs-station/matchmaps/issues) and/or pull request on GitHub! In the shorter term, if this signal is becoming a large issue, you could try applying an even stricter solvent radius (say, 1 A) to the `--unmasked-radius` flag and see if that helps.

## Resolution cuts and error weighting
By default, `matchmaps` will simply truncate the two input reflection files to equal resolution. However, if you expect the highest-resolution reflections to still be noisy after this, or if your difference maps look very noisy, you might consider cutting resolution even further. You can do this by providing a resolution cut to the `--dmin` flag.
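
The truncation logic can be sketched as follows. This is an illustration under stated assumptions, not the package's code: the helper names `common_dmin` and `truncate` are hypothetical, each reflection is represented only by its d-spacing in Angstroms, and a user-supplied cutoff is assumed to only ever make the cut more stringent.

```python
def common_dmin(d_off, d_on, dmin=None):
    """Pick the resolution cutoff (in Angstroms) shared by two datasets.

    d_off, d_on: d-spacings of the reflections in each dataset.
    dmin: optional user-requested cutoff, standing in for `--dmin`.
    """
    # The smallest d-spacing is the highest resolution in a dataset; the
    # coarser of the two limits is the resolution that both datasets reach.
    cutoff = max(min(d_off), min(d_on))
    # Assumption: a user-supplied value can only tighten the cut.
    if dmin is not None:
        cutoff = max(cutoff, dmin)
    return cutoff

def truncate(d_spacings, cutoff):
    """Keep only reflections whose d-spacing is at least `cutoff`."""
    return [d for d in d_spacings if d >= cutoff]

# Hypothetical d-spacings for the "off" and "on" datasets.
d_off = [4.0, 2.5, 1.8, 1.5]
d_on = [4.0, 2.5, 1.8]

default_cut = common_dmin(d_off, d_on)             # equal-resolution default
stricter_cut = common_dmin(d_off, d_on, dmin=2.0)  # user asks for a 2.0 A cut
```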

Alternatively, rather than just truncating at a particular resolution, you can apply an error weighting to the reflections. Error weighting is performed immediately prior to the Fourier transform. Weights are computed via the following formula:

```{eval-rst}
.. math::
\frac{1}{1 + \alpha \frac{{(\sigma F)}^2}{{<\sigma F>}^2}}
```

where $\sigma F$ is the error estimate for the reflection, $<\sigma F>$ is the average error estimate across all reflections, and $\alpha$ is a weighting parameter. By default, $\alpha = 0$, i.e. no weighting. You can use the `--alpha` flag to supply your own value and thus include error-weighting.
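
To make the formula concrete, here is a minimal sketch that evaluates it for a handful of reflections. The function name `error_weights` and the example values are hypothetical, and this is only the arithmetic of the formula above, not how `matchmaps` stores or applies the weights.

```python
from statistics import fmean

def error_weights(sig_f, alpha=0.0):
    """Weight 1 / (1 + alpha * sigF^2 / <sigF>^2) for each reflection."""
    mean_sq = fmean(sig_f) ** 2  # square of the average error estimate
    return [1.0 / (1.0 + alpha * s**2 / mean_sq) for s in sig_f]

# Hypothetical error estimates for three reflections; their mean is 2.0.
sig_f = [1.0, 2.0, 3.0]

unweighted = error_weights(sig_f)           # alpha = 0: every weight is 1
weighted = error_weights(sig_f, alpha=1.0)  # noisier reflections get smaller weights
```

With a nonzero $\alpha$, a reflection whose error estimate equals the average gets weight $1 / (1 + \alpha)$, and reflections with larger-than-average error are down-weighted further.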

## Multiple protein chains

You may have multiple protein chains in your model. This presents you with some interesting opportunities for difference maps, but also some potential headaches.

### Refining chains individually

One possibility is that your `matchmaps` difference map contains essentially no signal for one protein chain, but strong signal throughout the other chain indicating a global motion. This signal is "real" -- it tells you that the relative packing of these two chains together is different in your two datasets -- but it probably doesn't make for a very useful map. Instead, you might consider using the `--rbr-selections` flag to rigid-body refine each chain separately. If your chains are A and B, then you would use the flag as `--rbr-selections A B`. `matchmaps` will then produce a difference map specific to each chain. This flag is admittedly a little finicky; please [file an issue](https://github.com/rs-station/matchmaps/issues) if you have any trouble.

### Comparing chains to each other

Assuming your structure is a homo-multimer or otherwise has non-crystallographic symmetry, another option is to use `matchmaps.ncs` to compare the protein chains within a dataset against each other. If you have multiple datasets, of course, you could still run `matchmaps.ncs` on each dataset and see how the difference map changes. See the full documentation for `matchmaps.ncs` [here](cli.md#matchmaps-ncs).

## Miscellaneous useful options

- `--on-as-stationary`: The `matchmaps` algorithm always involves an alignment in real-space of the "on" and "off" maps. By default, the "off" map is stationary, and the "on" map is moved. This is typically desired, such that everything lines up with your "off" structural model. However, say that your structures are "apo" and "bound", and you would like to line up your maps with a "bound" structure (which you never have to supply to `matchmaps`!). In this case, you could use the `--on-as-stationary` flag.
- `--spacing`: This flag defines the approximate size of the voxels in your real-space maps. The default (0.5 A) is fine for most purposes. For making figures in PyMOL, you might want finer spacing (0.25 A or so); this comes at a cost of much larger file size. If your computer/coot is being really slow, you could consider increasing the spacing.
- `--verbose`: Use this option to print out to the terminal all of the log output from CCP4 and phenix. This is disabled by default because it's very annoying, but it can be useful for debugging purposes.
- `--eff`: The `matchmaps` source code contains a hard-coded `.eff` template which is modified, written to a file, and passed to `phenix.refine`. For most cases, this `.eff` template should do the trick. However, if there's something specific that you would like phenix to do, you can pass your own custom `.eff` template to `matchmaps` via the `--eff` flag. There is a lot of potential for error here, because the code has very specific expectations for what the `.eff` file contains. If you're interested in trying this out, I would recommend that you use the [`.eff` template in the source code](https://github.com/rs-station/matchmaps/blob/7531ff1b13da91b01ede273fa9b1f5a99d72a5ca/src/matchmaps/_utils.py#L227) as a starting point. Don't hesitate to [file an issue on GitHub](https://github.com/rs-station/matchmaps/issues) if anything isn't working. Depending on what you're trying to do, we may decide to try and implement your desired functionality directly, so you don't need to provide a custom `.eff` in the long run.

2 changes: 1 addition & 1 deletion _sources/visualization.md
@@ -17,7 +17,7 @@ Happy difference mapping! From here, working with a `.map` file should be no dif

### Periodic boundary conditions in Coot

`matchmaps` always produces outputs in spacegroup P1, and thus will sometimes produces outputs in P1 and with an orthorhombic unit cell. Unfortunately, Coot assumes that P1 orthorhombic `.map` files are cryo-EM maps, and thus does not render periodic boundary conditions! This is an issue, because often we wish to visualize parts of the protein model which lie outside of the unit cell. As a workaround, any P1 orthorhombic maps produced by `matchmaps` will artifically set `alpha=90.006`. This difference is imperceptible, but is enough to convince Coot that the map is crystallographic in origin. In all likelihood, a user of `matchmaps` will not need to deal with this issue directly. However, it is good to keep in mind in case you are ever working with a `matchmaps` output for another purpose downstream, or if a related bug arises. Of course, if your maps in Coot are ever abruptly ending at the edge of the unit cells, please [file an issue on GitHub](https://github.com/rs-station/matchmaps/issues).
`matchmaps` always produces outputs in spacegroup P1, and thus sometimes produces outputs in P1 and with an orthorhombic unit cell. Unfortunately, Coot assumes that P1 orthorhombic `.map` files are cryo-EM maps, and thus does not render periodic boundary conditions! This is an issue, because often we wish to visualize parts of the protein model which lie outside of the unit cell. As a workaround, any P1 orthorhombic maps produced by `matchmaps` will artificially set `alpha=90.006`. This difference is imperceptible, but is enough to convince Coot that the map is crystallographic in origin. In all likelihood, a user of `matchmaps` will not need to deal with this issue directly. However, it is good to keep in mind in case you are ever working with a `matchmaps` output for another purpose downstream, or if a related bug arises. Of course, if your maps in Coot are ever abruptly ending at the edge of the unit cells, please [file an issue on GitHub](https://github.com/rs-station/matchmaps/issues).
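
The idea behind the workaround can be sketched in a few lines. This is a hypothetical illustration, not the code in `matchmaps`: the function name `nudge_orthorhombic_cell` and the tuple layout `(a, b, c, alpha, beta, gamma)` are assumptions made for the example.

```python
def nudge_orthorhombic_cell(cell, nudged_alpha=90.006):
    """Return a copy of (a, b, c, alpha, beta, gamma) with alpha nudged
    away from 90 degrees when all three angles are exactly 90."""
    a, b, c, alpha, beta, gamma = cell
    if alpha == beta == gamma == 90.0:
        alpha = nudged_alpha
    return (a, b, c, alpha, beta, gamma)

# Hypothetical cells: one orthorhombic, one monoclinic.
ortho = nudge_orthorhombic_cell((34.0, 45.0, 98.0, 90.0, 90.0, 90.0))
mono = nudge_orthorhombic_cell((34.0, 45.0, 98.0, 90.0, 105.2, 90.0))
```

If you process a `matchmaps` output downstream, a check like this is a reminder that an `alpha` of 90.006 in a P1 map may be this deliberate nudge rather than a real property of the crystal.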

## Working with `.map` files in PyMOL

2 changes: 1 addition & 1 deletion _static/documentation_options.js
@@ -1,5 +1,5 @@
const DOCUMENTATION_OPTIONS = {
VERSION: '0.1.dev1+g7531ff1',
VERSION: '0.1.dev1+ge96248f',
LANGUAGE: 'en',
COLLAPSE_INDEX: false,
BUILDER: 'html',