Editorial changes to notebook census_axis_query.ipynb #371

Merged
merged 2 commits into from
Apr 11, 2023
33 changes: 20 additions & 13 deletions api/python/notebooks/api_demo/census_axis_query.ipynb
@@ -1,17 +1,23 @@
{
"cells": [
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"# Axis Query Example\n",
"# Querying axis metadata\n",
"\n",
"_Goal:_ demonstrate basic axis metadata handling using Pandas.\n",
"This notebook provides examples for basic axis metadata handling using Pandas. The Census stores `obs` (cell) and `var` (gene) metadata in `SOMADataFrame` objects via the [`TileDB-SOMA` API](https://github.com/single-cell-data/TileDB-SOMA) ([documentation](https://tiledbsoma.readthedocs.io/en/latest/)), which can be queried and read as a Pandas `DataFrame` using `TileDB-SOMA`. \n",
"\n",
"The Census stores obs (cell) metadata in a SOMA DataFrame, which can be queried and read as a Pandas DataFrame. The Census also has a convenience package which simplifies opening the census.\n",
"Note that Pandas `DataFrame` is an in-memory object, therefore queries should be small enough for results to fit in memory.\n",
"\n",
"**Contents**\n",
"\n",
"1. Opening the Census\n",
"1. Summarizing cell metadata\n",
" 1. Example: Summarize all cell types\n",
" 1. Example: Summarize a subset of cell types, selected with a `value_filter`\n",
"1. Full Census metadata stats\n",
"\n",
"Pandas DataFrame is an in-memory object. Take care that queries are small enough for results to fit in memory.\n",
"\n",
"## Opening the Census\n",
"\n",
@@ -47,17 +53,19 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"## Summarize Census cell metadata\n",
"## Summarizing cell metadata\n",
"\n",
"Once the Census is open you can use its `TileDB-SOMA` methods as it is itself a `SOMACollection`. You can thus access the metadata `SOMADataFrame` objects encoding cell and gene metadata.\n",
"\n",
"Tips:\n",
"\n",
"- You can read an _entire_ SOMA dataframe into a Pandas DataFrame using `soma_df.read().concat().to_pandas()`, allowing the use of the standard Pandas API.\n",
"- You can read an _entire_ `SOMADataFrame` into a Pandas `DataFrame` using `soma_df.read().concat().to_pandas()`, allowing the use of the standard Pandas API.\n",
"- Queries will be much faster if you request only the DataFrame columns required for your analysis (e.g., `column_names=[\"cell_type_ontology_term_id\"]`).\n",
"- You can also further refine query results by using a `value_filter`, which will filter the census for matching records.\n",
"\n",
"### Example 1 - Summarize all cell types\n",
"### Example: Summarize all cell types\n",
"\n",
"This example reads the cell metadata (obs) into a Pandas DataFrame, and summarizes in a variety of ways using Pandas API."
"This example reads the cell metadata (`obs`) into a Pandas DataFrame, and summarizes in a variety of ways using Pandas API."
]
},
{
@@ -119,7 +127,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"### Summarize a subset of cell types, selected with a `value_filter`\n",
"### Example: Summarize a subset of cell types, selected with a `value_filter`\n",
"\n",
"This example utilizes a SOMA \"value filter\" to read the subset of cells with `tissue_ontology_term_id` equal to `UBERON:0002048` (lung tissue), and summarizes the query result using Pandas."
]
@@ -254,7 +262,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"## Full census stats\n",
"## Full Census metadata stats\n",
"\n",
"This example queries all organisms in the Census, and summarizes the diversity of various metadata labels."
]
@@ -308,7 +316,6 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@@ -348,7 +355,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.6"
"version": "3.10.10"
},
"vscode": {
"interpreter": {
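
For reviewers who want a quick feel for the two summaries the notebook text describes (all cell types, then a subset selected by tissue), here is a minimal Pandas-only sketch. It does not open the Census: the `obs` table below is a hypothetical toy stand-in with made-up rows, and the variable names (`all_counts`, `lung`, `lung_counts`) are assumptions for illustration. In the notebook itself the subset is selected at read time with a `value_filter` rather than after loading, which avoids pulling unneeded rows into memory.

```python
import pandas as pd

# Hypothetical toy stand-in for the Census `obs` (cell) metadata.
# The rows are illustrative only and are not real Census data.
obs = pd.DataFrame(
    {
        "cell_type": ["T cell", "B cell", "T cell", "neuron", "B cell", "T cell"],
        "tissue_ontology_term_id": [
            "UBERON:0002048",  # lung, the term used in the notebook text
            "UBERON:0002048",
            "UBERON:0000178",
            "UBERON:0000955",
            "UBERON:0002048",
            "UBERON:0002048",
        ],
    }
)

# Summarize all cell types: number of cells per cell type.
all_counts = obs["cell_type"].value_counts()

# Summarize a subset: keep only lung cells, then count cell types.
# This in-memory mask mimics what a `value_filter` on
# `tissue_ontology_term_id` would select at read time, e.g. (assumed form)
# soma_df.read(value_filter=..., column_names=[...]).concat().to_pandas()
lung = obs[obs["tissue_ontology_term_id"] == "UBERON:0002048"]
lung_counts = lung["cell_type"].value_counts()

print(all_counts)
print(lung_counts)
```

Filtering after loading only works here because the toy table is tiny; for the real `obs` table, requesting just the needed columns and filtering in the read call keeps the result small enough to fit in memory, as the notebook text advises.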