Skip to content

Commit

Permalink
Add notebook guidelines (#316)
Browse files Browse the repository at this point in the history
  • Loading branch information
pablo-gar authored Apr 3, 2023
1 parent b3d1656 commit c9a5756
Showing 1 changed file with 123 additions and 0 deletions.
123 changes: 123 additions & 0 deletions docs/cell_census_notebook_guidelines.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,123 @@
# Cell Census API notebook/vignette editorial guidelines

API demonstration code that is part of the documentation should be deposited here:

- Python notebooks [`cell-census/api/python/notebooks`](https://github.com/chanzuckerberg/cell-census/tree/main/api/python/notebooks)
- R vignettes [`cell-census/api/r/CellCensus/vignettes`](https://github.com/chanzuckerberg/cell-census/tree/main/api/r/CellCensus/vignettes)


These assets are user-facing and are automatically rendered to the doc-sites, they are one of the primary means by which users get onboarded to the product. Thus the following guidelines need to be followed to ensure readability and a consistent experience.

## Guidelines

### Title

* It must use the highest-level markdown header `#`.
* Unless needed, it should not contain "Cell Census".
* It should be concise, self-explanatory, and if possible indicate an action.

Examples:

:white_check_mark: `# Exploring all data from a tissue.`

:white_check_mark: `# Query and extract slices of data.`

:x: `# Census datasets presence` *not self-explanatory, has "Census" in it.*

:x: `# Normalizing full-length gene sequencing data from the Cell Census` *has "Census" in it.*

### Introduction

Introductory text must be included right underneath the title.

* It must provide a one paragraph summary of the notebook's goals.
* It must not contain an explanation of the Cell Census.

Examples:

:white_check_mark:

> This notebook provides a demonstration for integrating two Cell Census datasets using scvi-tools. The goal is not to provide an exhaustive guide on proper integration, but to showcase what information in the Cell Census can inform data integration.
:x: *it contains a long explanation of what the Cell Census is, and the goal is not clear*

> The Cell Census is a versioned container for the single-cell data hosted at CELLxGENE Discover. The Cell Census utilizes SOMA powered by TileDB for storing, accessing, and efficiently filtering data.
>
>This notebook shows you how to learn about the Cell Census contents and how to query it.
### Table of Contents

Immediately after the introduction a table of contents must be provided:

* It must be placed under the bolded word "**Contents**" .
* It must contain an ordered list of the second-level headers (`##`) used for [Sections](#Sections).
* If necessary it may contain sub-lists corresponding to lower-level headers (`###`, etc)

Example:

:white_check_mark:

> **Contents**
>
> 1. Learning about the lung data.
> 2. Fetching all human lung data from the Cell Census.
> 3. Obtaining QC metrics for this data slice.
### Sections

The rest of the notebook/vignette content must be organized within sections:

* The section title must use the second-level markdown header `##`. **This is important as the python doc-site renders these in the sidebar and in the full view of all notebooks.**
* The section title should be concise, self-explanatory, and if possible indicate an action.
* The section's contents and presence or absence of sub-headers are left to the discretion of the writer.
* The section's non-code content should be kept as succinct as possible.


## Example notebook/vignette

```
# Integrating data with SCVI.
This notebook provides a demonstration for integrating two
Cell Census datasets using `scvi-tools`. The goal is not to
provide an exhaustive guide on proper integration, but to showcase
what information in the Cell Census can inform data integration.
**Contents**
1. Finding and fetching data from mouse liver.
2. Gene-length normalization of Smart-Seq2 data.
3. Integration with scvi-tools.
1. Inspecting data prior to integration.
2. Integration with batch defined as dataset_id.
3. Integration with batch defined as dataset_id + donor_id.
4. Integration with batch defined as dataset_id + donor_id + assay_ontology_term_id + suspension_type.
## Finding and fetching data from mouse liver
Let's load all modules needed for this notebook.
\code
import cell_census
import scanpy as sc
import numpy as np
import scvi
from scipy.sparse import csr_matrix
\code
Now we can open the Cell Census
\code
census = cell_census.open_soma(census_version="latest")
\code
In this notebook we will use Tabula Muris Senis data
from the liver as it contains cells from both 10X
Genomics and Smart-Seq2 technologies.
Let's query the datasets table of the Cell Census by
filtering on collection_name for "Tabula Muris Senis"
and dataset_title for "liver".
[...]
```

0 comments on commit c9a5756

Please sign in to comment.