Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Interpolating many points in Python #323

Closed
jrobsontull opened this issue Jul 5, 2024 · 2 comments
Closed

Interpolating many points in Python #323

jrobsontull opened this issue Jul 5, 2024 · 2 comments

Comments

@jrobsontull
Copy link

jrobsontull commented Jul 5, 2024

We rely on Gemmi for a lot of operations involving cryo-EM and crystallography maps and often have to interpolate many points fast. We are happy with the interpolate_value but we are confused about usage of the interpolate_values function. We often have a numpy array of 3D coordinates that we want to interpolate values for and a slow approach of doing this would be:

import gemmi
import numpy as np

dmap = gemmi.read_ccp4_map('map.mrc')
coords = np.array([[1.0, 1.1, 1.2], [2.0, 2.1, 2.2]])
interpolated = [dmap.grid.interpolate_value(gemmi.Position(*coord)) for coord in coords]

We couldn't understand how to use interpolate_values in this context for faster interpolation of an array of points. These points represent positions of atoms in a structure. My understanding is that interpolate_values will modify the input array to interpolate points at each position of the input:

.def("interpolate_values",
However, we don't neccessarily want to interpolate over a grid that is the same shape as the map grid. Rather, we only want to interpolate an array of atomic coordinates. Is there a way to do this with interpolate_values?

Instead we ended up writing our own interpolation function to transform coordinates into the correct system but this doesn't handle NCS and crystallographic symmetry, like Gemmi's interpolate_value does:

import gemmi
import numpy as np

# Use numpy broadcasting to transform the points and interpolate all the values
def interpolate_values(arr: np.ndarray, trans_mat: np.ndarray) -> np.ndarray:
  # do work...
  return arr

# Transformation matrix for moving to different coordinate systems e.g. fractionalization
trans_mat = np.array([
  [1.0, 0.0, 0.0],
  [0.0, 1.0, 0.0],
  [0.0, 0.0, 1.0]
])

dmap = gemmi.read_ccp4_map('map.mrc')
coords = np.array([[1.0, 1.1, 1.2], [2.0, 2.1, 2.2]])
interpolated = interpolate_values(coords, trans_mat)

However, we're reluctant to build all the logic needed for crystallographic and non-crystallographic symmetry. We would love to use Gemmi more throughout our pipelines and were wondering if there is already a good solution. Also happy to look into opening a pull request if this doesn't exist.

@wojdyr
Copy link
Member

wojdyr commented Sep 23, 2024

I just added function interpolate_position_array, equivalent to the function from your PR #324 but in nanobind. Switching to nanobind decreased the overhead of calling functions – calling interpolate_value is now 3x faster, but it's still much slower than a new function, even for small arrays.

For now, I'm leaving the PR open, because I still need to add documentation and tests.

@wojdyr wojdyr closed this as completed Sep 23, 2024
@jrobsontull
Copy link
Author

Thanks for this @wojdyr. Do let me know if I can contribute to anything else. We make good use of this package at our company for working with volume data and look forward to new features!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

No branches or pull requests

2 participants