Editorial changes to notebook comp_bio_data_integration_scvi.ipynb (#…
pablo-gar authored Apr 11, 2023
1 parent a494ae3 commit 093e181
Showing 1 changed file with 8 additions and 18 deletions.
@@ -1,20 +1,17 @@
{
"cells": [
{
-"attachments": {},
"cell_type": "markdown",
"id": "5b269fef",
"metadata": {},
"source": [
"# Integration of data from the Census\n",
"\n",
"The Census is a versioned container for the single-cell data hosted at [CELLxGENE Discover](https://cellxgene.cziscience.com/). The Census utilizes [SOMA](https://github.com/single-cell-data/SOMA/blob/main/abstract_specification.md) powered by [TileDB](https://tiledb.com/products/tiledb-embedded) for storing, accessing, and efficiently filtering data.\n",
"# Integrating multi-dataset slices of data\n",
"\n",
"The Census contains data from multiple studies providing an opportunity to perform inter-dataset analysis. To this end integration of data has to be performed first to account for batch effects.\n",
"\n",
"This notebook provides a demonstration for integrating two Census datasets using [`scvi-tools`](https://docs.scvi-tools.org/en/stable/index.html). The goal is not to provide an exhaustive guide on proper integration, but to showcase what information in the Census can inform data integration.\n",
"This notebook provides a demonstration for integrating two Census datasets using [`scvi-tools`](https://docs.scvi-tools.org/en/stable/index.html). **The goal is not to provide an exhaustive guide on proper integration, but to showcase what information in the Census can inform data integration.**\n",
"\n",
"We will go over the following:\n",
"**Contents**\n",
"\n",
"1. Finding and fetching data from mouse liver (10X Genomics and Smart-Seq2).\n",
"1. Gene-length normalization of Smart-Seq2 data.\n",
@@ -24,7 +21,7 @@
" 1. Integration with batch defined as `dataset_id` + `donor_id`.\n",
" 1. Integration with batch defined as `dataset_id` + `donor_id` + `assay_ontology_term_id` + `suspension_type`.\n",
"\n",
"## Finding and fetching data from mouse liver\n",
"## Finding and fetching data from mouse liver (10X Genomics and Smart-Seq2)\n",
"\n",
"Let's load all modules needed for this notebook."
]
@@ -61,12 +58,11 @@
]
},
{
-"attachments": {},
"cell_type": "markdown",
"id": "e13d1bdf",
"metadata": {},
"source": [
"Now we can open the Census, if you are not familiar with the basics of the Census API you should take a look at the notebook \"Learning about the CELLxGENE Census\" at `comp_bio_census_info.ipynb`."
"Now we can open the Census, if you are not familiar with the basics of the Census API you should take a look at the notebook [Learning about the CELLxGENE Census](https://cellxgene-census.readthedocs.io/en/latest/notebooks/analysis_demo/comp_bio_census_info.html)."
]
},
{
@@ -87,7 +83,6 @@
]
},
{
-"attachments": {},
"cell_type": "markdown",
"id": "af907e87",
"metadata": {},
@@ -229,7 +224,6 @@
]
},
{
-"attachments": {},
"cell_type": "markdown",
"id": "0b1a5ec0",
"metadata": {},
@@ -295,14 +289,13 @@
]
},
{
-"attachments": {},
"cell_type": "markdown",
"id": "13fcff57",
"metadata": {},
"source": [
"## Gene-length normalization of Smart-Seq2 data.\n",
"\n",
"Smart-seq2 read counts have to be normalized by gene length. For full details on gene-length normalization take a look at the notebook \"Normalizing full-length gene sequencing data from the Census\" at `comp_bio_normalizing_full_gene_sequencing.ipynb`.\n",
"Smart-seq2 read counts have to be normalized by gene length. For full details on gene-length normalization take a look at the notebook [Normalizing full-length gene sequencing data from the Census](https://cellxgene-census.readthedocs.io/en/latest/notebooks/analysis_demo/comp_bio_normalizing_full_gene_sequencing.html).\n",
"\n",
"Let's first get the gene lengths from `var.feature_length`."
]
@@ -413,7 +406,7 @@
"\n",
"Here we will use the \"single-cell Variational Inference\" model or scVI which uses a deep generative model for the integration of spatial transcriptomic data and scRNA-seq data.\n",
"\n",
"For comprehensive usage and best practices of scVI please refer to the [doc site](https://docs.scvi-tools.org/en/stable/index.html) of `scvi-tools`.\n",
"**For comprehensive usage and best practices of scVI please refer to the [doc site](https://docs.scvi-tools.org/en/stable/index.html) of `scvi-tools`.**\n",
"\n",
"### Inspecting data prior to integration\n",
"\n",
@@ -601,7 +594,6 @@
]
},
{
-"attachments": {},
"cell_type": "markdown",
"id": "cd3c3555",
"metadata": {},
@@ -806,7 +798,6 @@
]
},
{
-"attachments": {},
"cell_type": "markdown",
"id": "777576a8",
"metadata": {},
@@ -1013,7 +1004,6 @@
]
},
{
-"attachments": {},
"cell_type": "markdown",
"id": "d4682a88",
"metadata": {},
@@ -1046,7 +1036,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.6"
"version": "3.10.10"
},
"vscode": {
"interpreter": {
