Commit

Merge pull request #138 from saezlab/dev
1.4.0
dbdimitrov authored Sep 2, 2024
2 parents 1746c5b + 3b14ecc commit 058a768
Showing 16 changed files with 1,297 additions and 1,291 deletions.
2 changes: 1 addition & 1 deletion .bumpversion.cfg
Original file line number Diff line number Diff line change
@@ -1,5 +1,5 @@
[bumpversion]
current_version = 1.3.0
current_version = 1.4.0
commit = True
tag = True
files = pyproject.toml liana/__init__.py
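As a rough illustration of what the `current_version` change above amounts to, the sketch below mimics a minor-version bump in plain Python. The `bump` helper is hypothetical and is not part of bumpversion itself; it only assumes a `major.minor.patch` scheme in which the lower parts reset to zero.

```python
def bump(version: str, part: str = "minor") -> str:
    """Hypothetical helper: bump one part of a major.minor.patch string."""
    major, minor, patch = (int(p) for p in version.split("."))
    if part == "major":
        return f"{major + 1}.0.0"
    if part == "minor":
        return f"{major}.{minor + 1}.0"
    if part == "patch":
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"unknown part: {part!r}")


# The release in this commit corresponds to a minor bump of 1.3.0.
print(bump("1.3.0", "minor"))
```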
4 changes: 2 additions & 2 deletions README.md
@@ -14,7 +14,7 @@ LIANA+ is a scalable framework that integrates and extends existing methods and

## Development & Contributions

We welcome suggestions, ideas, and contributions! Please use do not hesitate to contact us, or use the issues or the [LIANA+ Development project](https://github.com/orgs/saezlab/projects/16) to make suggestions.
We welcome suggestions, ideas, and contributions! Please do not hesitate to contact us, open issues, and check the [contributions guide](https://liana-py.readthedocs.io/en/latest/contributing.html).

## Vignettes
A set of extensive vignettes can be found in the [LIANA+ documentation](https://liana-py.readthedocs.io/en/latest/).
@@ -46,7 +46,7 @@ For further information please check LIANA's [API documentation](https://liana-p

## Cite LIANA+:

Dimitrov D., Schäfer P.S.L, Farr E., Rodriguez Mier P., Lobentanzer S., Dugourd A., Tanevski J., Ramirez Flores R.O. and Saez-Rodriguez J. 2023 LIANA+: an all-in-one cell-cell communication framework. BioRxiv. https://www.biorxiv.org/content/10.1101/2023.08.19.553863v1
Dimitrov D., Schäfer P.S.L, Farr E., Rodriguez Mier P., Lobentanzer S., Badia-i-Mompel P., Dugourd A., Tanevski J., Ramirez Flores R.O. and Saez-Rodriguez J. LIANA+ provides an all-in-one framework for cell-cell communication inference. Nat Cell Biol (2024). https://doi.org/10.1038/s41556-024-01469-w

Dimitrov, D., Türei, D., Garrido-Rodriguez M., Burmedi P.L., Nagai, J.S., Boys, C., Flores, R.O.R., Kim, H., Szalai, B., Costa, I.G., Valdeolivas, A., Dugourd, A. and Saez-Rodriguez, J. Comparison of methods and resources for cell-cell communication inference from single-cell RNA-Seq data. Nat Commun 13, 3224 (2022). https://doi.org/10.1038/s41467-022-30755-0

10 changes: 0 additions & 10 deletions docs/source/conf.py
@@ -39,16 +39,6 @@

html_theme = 'furo'
html_static_path = ["_static"]
html_theme_options = {
"light_css_variables": {
"color-brand-primary": "#2980B9",
"color-brand-content": "#2980B9",
},
"dark_css_variables": {
"color-brand-primary": "#2980B9",
"color-brand-content": "#2980B9",
},
}
html_context = dict(
display_github=True,
github_user='saezlab',
266 changes: 140 additions & 126 deletions docs/source/notebooks/basic_usage.ipynb

Large diffs are not rendered by default.

285 changes: 70 additions & 215 deletions docs/source/notebooks/misty.ipynb

Large diffs are not rendered by default.

738 changes: 360 additions & 378 deletions docs/source/notebooks/mofatalk.ipynb

Large diffs are not rendered by default.

4 changes: 2 additions & 2 deletions docs/source/reference.rst
@@ -1,8 +1,8 @@
Reference
----------

Dimitrov D., Schäfer P.S.L, Farr E., Rodriguez Mier P., Lobentanzer S., Dugourd A., Tanevski J., Ramirez Flores R.O. and Saez-Rodriguez J. 2023 LIANA+: an all-in-one cell-cell communication framework. BioRxiv. https://www.biorxiv.org/content/10.1101/2023.08.19.553863v1
Dimitrov D., Schäfer P.S.L, Farr E., Rodriguez Mier P., Lobentanzer S., Badia-i-Mompel P., Dugourd A., Tanevski J., Ramirez Flores R.O. and Saez-Rodriguez J. LIANA+ provides an all-in-one framework for cell-cell communication inference. Nat Cell Biol (2024). https://doi.org/10.1038/s41556-024-01469-w

Dimitrov, D., Türei, D., Garrido-Rodriguez M., Burmedi P.L., Nagai, J.S., Boys, C., Flores, R.O.R., Kim, H., Szalai, B., Costa, I.G., Valdeolivas, A., Dugourd, A. and Saez-Rodriguez, J. Comparison of methods and resources for cell-cell communication inference from single-cell RNA-Seq data. Nat Commun 13, 3224 (2022). https://doi.org/10.1038/s41467-022-30755-0

Similarly, please consider citing any of the methods and/or resources implemented in liana, that were particularly relevant for your research!
Similarly, please consider citing any of the methods and/or resources implemented in liana, that were relevant for your research!
11 changes: 11 additions & 0 deletions docs/source/release_notes.rst
@@ -1,6 +1,17 @@
Changelog
=============

1.4.0 (02.09.2024)

- Now published at Nat Cell Bio.

- Correctly referred to PK tutorial for orthology conversion

- Added ``batch_key`` and ``min_var_nbatches`` to control the way batches are selected in ``li.multi.lrs_to_views``.
This might result in minor differences in how many interactions are considered per view, as I also changed the order of filtering.

- Changed ``max_neighbours`` in ``li.ut.spatial_neighbors`` to be a fixed number (default=100), rather than a fraction of the spots as this was making RAM explode for large spatial formats.

1.3.0 (12.07.2024)

- Minor improvements to documentation, specifically changed to the furo theme. Resolved issues with latex not being rendered and plot sizes being off.
2 changes: 1 addition & 1 deletion liana/__init__.py
@@ -1,4 +1,4 @@
__version__ = '1.3.0'
__version__ = '1.4.0'

from liana import method as mt, plotting as pl, resource as rs, multi as mu, utils as ut, testing

8 changes: 6 additions & 2 deletions liana/method/sp/_misty/_misty_constructs.py
@@ -50,12 +50,13 @@ def genericMistyData(intra,
add_para=True,
spatial_key='spatial',
set_diag=False,
kernel = 'misty_rbf', ## TODO change to gaussian kernel
kernel = 'misty_rbf',
bandwidth = 100,
zoi = 0,
cutoff = 0.1,
add_juxta=True,
n_neighs = 6,
max_neighs = 18,
verbose=False
):

@@ -98,6 +99,8 @@ def genericMistyData(intra,
A bandwidth of 5 times the bandwidth of the paraview is used to ensure that the nearest neighbors are within the radius.
n_neighs : `int`, optional (default: 6)
The number of neighbors to consider when constructing the juxtaview.
max_neighs: `int`, optional (default: 18)
The maximum number of neighbors to consider when constructing the Paraview.
verbose : `bool`, optional (default: False)
Whether to print progress.
@@ -135,6 +138,7 @@ def genericMistyData(intra,
bandwidth=bandwidth,
kernel=kernel,
set_diag=set_diag,
max_neighbours=max_neighs,
inplace=False,
cutoff=cutoff,
zoi=zoi
@@ -165,7 +169,7 @@ def lrMistyData(adata,
nz_threshold=0.1,
use_raw = False,
layer = None,
spatial_key='spatial', ## TODO Change to Gaussian kernel
spatial_key='spatial',
kernel = 'misty_rbf',
bandwidth = 100,
set_diag = False,
41 changes: 23 additions & 18 deletions liana/multi/to_mudata.py
@@ -129,6 +129,8 @@ def lrs_to_views(adata: AnnData,
lrs_per_sample:int = 10,
samples_per_view: int = 3,
min_variance:int = 0,
min_var_nbatches = 1,
batch_key=None,
lr_sep: str = V.lr_sep,
cell_sep: str='&',
var_sep: str=':',
@@ -164,6 +166,11 @@ def lrs_to_views(adata: AnnData,
min_variance
Reflects the minimum required variance across samples for each interaction in each view.
NaNs are ignored when computing the variance.
batch_key
Key in `adata.obs` that represents the batch information. Used solely when computing the variance.
If batch_key is not `None`, the variance is computed per batch, and the ``
min_var_nbatches
Reflects the minimum number of batches (>=) that must have a variance above `min_variance` for an interaction to be included in the view.
%(lr_sep)s
cell_sep
Separator to use for the cell names in the views.
@@ -209,7 +216,10 @@ def lrs_to_views(adata: AnnData,
# concat columns (needed for MOFA)
liana_res['interaction'] = liana_res[ligand_key] + lr_sep + liana_res[receptor_key]
liana_res['ct_pair'] = liana_res[source_key] + cell_sep + liana_res[target_key]
liana_res = liana_res[[sample_key, 'ct_pair', 'interaction', score_key]]
keys = [sample_key, 'ct_pair', 'interaction', score_key]
if batch_key is not None:
keys.append(batch_key)
liana_res = liana_res[keys]

# get scores & invert if necessary
liana_res = process_scores(liana_res=liana_res,
@@ -218,9 +228,8 @@ def lrs_to_views(adata: AnnData,

# count samples per interaction
count_pairs = (liana_res.
drop(columns=score_key).
groupby(['interaction', 'ct_pair']).
count().
count()[[sample_key]].
rename(columns={sample_key: 'count'}).
reset_index()
)
@@ -232,8 +241,7 @@ def lrs_to_views(adata: AnnData,
liana_res = liana_res.merge(count_pairs.drop(columns='count') , how='inner')

# Keep only samples above a certain number of LRs
count_lrs = (liana_res.
drop(columns=score_key).
count_lrs = (liana_res[[sample_key, 'ct_pair', 'interaction']].
groupby([sample_key, 'ct_pair']).
count().
rename(columns={'interaction': 'count'}).
@@ -243,28 +251,26 @@ def lrs_to_views(adata: AnnData,
liana_res = liana_res.merge(count_lrs.drop(columns='count') , how='inner')

# convert to anndata views
views = liana_res['ct_pair'].unique()
views = tqdm(views, disable=not verbose)

lr_adatas = {}
views = tqdm(liana_res['ct_pair'].unique(), disable=not verbose)
for view in views:
lrs_per_ct = liana_res[liana_res['ct_pair']==view]
lrs_wide = lrs_per_ct.pivot(index='interaction',
columns=sample_key,
values=score_key)

index = 'interaction' if batch_key is None else ['interaction', batch_key]
# check variance
ints_to_keep = (lrs_per_ct.groupby(index).apply(lambda x: np.nanvar(x[score_key])) > min_variance).groupby('interaction').sum() >= min_var_nbatches
ints_to_keep = ints_to_keep[ints_to_keep].index

lrs_wide = lrs_per_ct[lrs_per_ct['interaction'].isin(ints_to_keep)].\
pivot(index='interaction',
columns=sample_key,
values=score_key)
lrs_wide.index = view + var_sep + lrs_wide.index
lrs_wide = lrs_wide.replace(np.nan, lr_fill)

if lrs_wide.shape[0] >= lrs_per_view: # check if enough LRs
temp = _dataframe_to_anndata(lrs_wide)

# keep only variables with variance > min_variance
temp = temp[:, np.nanvar(temp.X, axis=0) > min_variance]

if (temp.shape[0] >= samples_per_view): # check if enough samples
lr_adatas[view] = temp

# to mdata
mdata = MuData(lr_adatas)

@@ -350,7 +356,6 @@ def filter_view_markers(mdata: MuData,
def _process_meta(adata, mdata, sample_key, obs_keys):
if obs_keys is not None:
metadata = adata.obs[[sample_key, *obs_keys]].drop_duplicates()

sample_n = adata.obs[sample_key].nunique()
if metadata.shape[0] != sample_n:
raise ValueError('`obs_keys` must be unique per sample in `adata.obs`')
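The batch-aware variance check introduced in `lrs_to_views` can be sketched without pandas as follows. This is a dependency-free stand-in, not the code from the diff: `interactions_to_keep` and the toy rows are hypothetical, and `statistics.pvariance` is used in place of `np.nanvar` (both compute the population variance once NaNs are dropped). The point it tries to show is that variance is computed within each batch, so an interaction is kept only if at least `min_var_nbatches` batches vary on their own.

```python
import math
from collections import defaultdict
from statistics import pvariance


def interactions_to_keep(rows, min_variance=0.0, min_var_nbatches=1):
    """Hypothetical helper. `rows` holds (interaction, batch, score) tuples for one view."""
    # Collect non-NaN scores per (interaction, batch) group.
    scores = defaultdict(list)
    for interaction, batch, score in rows:
        if not math.isnan(score):
            scores[(interaction, batch)].append(score)

    # Count, per interaction, the batches whose variance exceeds the threshold.
    passing = defaultdict(int)
    for (interaction, _batch), values in scores.items():
        if pvariance(values) > min_variance:
            passing[interaction] += 1

    return sorted(i for i, n in passing.items() if n >= min_var_nbatches)


rows = [
    ("A^B", "b1", 0.1), ("A^B", "b1", 0.9),  # varies in b1 only
    ("A^B", "b2", 0.5), ("A^B", "b2", 0.5),
    ("C^D", "b1", 0.3), ("C^D", "b1", 0.3),  # constant in every batch
    ("C^D", "b2", 0.3), ("C^D", "b2", 0.3),
    ("E^F", "b1", 0.2), ("E^F", "b1", 0.4),  # varies in both batches
    ("E^F", "b2", 0.1), ("E^F", "b2", 0.7),
]

print(interactions_to_keep(rows, min_var_nbatches=1))
print(interactions_to_keep(rows, min_var_nbatches=2))
```

Raising `min_var_nbatches` makes the filter stricter: with the toy rows above, an interaction that varies in a single batch passes at 1 but not at 2.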
1 change: 1 addition & 0 deletions liana/tests/test_misty.py
@@ -36,6 +36,7 @@ def test_misty_bypass():
bandwidth=10,
add_juxta=True,
set_diag=True,
max_neighs=10,
cutoff=0)
misty(model=RandomForestModel, alphas=1, bypass_intra=True, seed=42, n_estimators=11)
assert np.isin(['juxta', 'para'], misty.uns['target_metrics'].columns).all()
46 changes: 46 additions & 0 deletions liana/tests/test_multi.py
@@ -57,6 +57,52 @@ def test_lrs_to_views():
assert len(mdata.varm_keys())==3


def test_lrs_to_views_batch():
adata = generate_toy_adata()
adata.obs['batch'] = 1
adata2 = adata.copy()
adata2.obs['batch'] = 2
adata2.obs['sample'] = adata2.obs['sample'].apply(lambda x: x+'2')
adata3 = adata.copy()
adata3.obs['sample'] = adata3.obs['sample'].apply(lambda x: x+'3')
adata = adata.concatenate([adata2, adata3], join='inner', batch_key='sample_number')

liana_res = sample_lrs(by_sample=True)
liana_res2 = liana_res.copy()
liana_res2['sample'] = liana_res['sample'].apply(lambda x: x+'2')
liana_res['batch']=1
liana_res2['batch']=2
liana_res3 = liana_res.copy()
liana_res3['sample'] = liana_res3['sample'].apply(lambda x: x+'3')
# add some variance
liana_res2['specificity_rank'] = liana_res2['specificity_rank'] + 0.1
liana_res3['specificity_rank'] = liana_res3['specificity_rank'] + 0.2
liana_res = pd.concat([liana_res, liana_res2, liana_res3])
adata.uns['liana_results'] = liana_res

mdata = lrs_to_views(adata=adata,
sample_key='sample',
score_key='specificity_rank',
uns_key = 'liana_results',
obs_keys = ['case', 'batch'],
source_key='source',
target_key='target',
ligand_key='ligand_complex',
receptor_key='receptor_complex',
lr_prop=0.1,
lrs_per_sample=1,
lrs_per_view=5,
samples_per_view=0,
min_variance=0,
batch_key='batch',
min_var_nbatches=1,
verbose=True
)

assert mdata.shape == (12, 16)
assert 'case' in mdata.obs.columns
assert 'batch' in mdata.obs.columns
assert len(mdata.varm_keys())==3

def test_adata_to_views():
"""Test adata_to_views."""
7 changes: 2 additions & 5 deletions liana/utils/spatial_neighbors.py
@@ -24,7 +24,7 @@ def _linear(distance_mtx, bandwidth):
def spatial_neighbors(adata: AnnData,
bandwidth=None,
cutoff=0.1,
max_neighbours=None,
max_neighbours=100,
kernel='gaussian',
set_diag=False,
zoi=0,
@@ -48,7 +48,7 @@ def spatial_neighbors(adata: AnnData,
Values below this cutoff will be set to 0.
max_neighbours
Maximum nearest neighbours to be considered when generating spatial connectivity weights.
Essentially, the maximum number of edges in the graph. Default is `None`, which will use n = adata.shape[0]/10.
Essentially, the maximum number of edges in the spatial connectivity graph.
kernel
Kernel function used to generate connectivity weights.
It controls the shape of the connectivity weights.
@@ -100,9 +100,6 @@ def spatial_neighbors(adata: AnnData,
else:
_reference = reference

if max_neighbours is None:
max_neighbours = int(adata.shape[0] / 10)

tree = NearestNeighbors(n_neighbors=max_neighbours + 1, # +1 to exclude self
algorithm='ball_tree',
metric='euclidean').fit(_reference)
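To see why the old default could exhaust memory, the sketch below counts the (spot, neighbour) pairs a k-nearest-neighbour query would return under each default. The `knn_entries` helper is hypothetical and only does the arithmetic; it assumes the query returns one entry per spot and neighbour, including the spot itself, as implied by the `max_neighbours + 1` in the diff.

```python
def knn_entries(n_spots, max_neighbours=None):
    """Hypothetical helper: number of (spot, neighbour) pairs, self included."""
    if max_neighbours is None:
        # Pre-1.4.0 default: a tenth of the spots.
        max_neighbours = int(n_spots / 10)
    return n_spots * (max_neighbours + 1)


for n in (1_000, 100_000, 1_000_000):
    old = knn_entries(n)                      # grows quadratically with n
    new = knn_entries(n, max_neighbours=100)  # grows linearly with n
    print(f"{n:>9} spots: old default {old:>15,} pairs, fixed 100 {new:>13,} pairs")
```

At 1,000 spots the two defaults coincide, since a tenth of the spots is already 100. Beyond that the old default grows with the square of the number of spots, while the fixed cap grows only linearly.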