Editorial changes to notebook census_axis_query.ipynb (#371)

* editorial changes to notebook

* Update api/python/notebooks/api_demo/census_axis_query.ipynb

Co-authored-by: Emanuele Bezzi <[email protected]>

---------

Co-authored-by: Emanuele Bezzi <[email protected]>
chanzuckerberg · Apr 11, 2023 · db60f13
1 parent 093e181
commit db60f13
Showing 1 changed file with 20 additions and 13 deletions.
diff --git a/api/python/notebooks/api_demo/census_axis_query.ipynb b/api/python/notebooks/api_demo/census_axis_query.ipynb
@@ -1,17 +1,23 @@
 {
  "cells": [
   {
-   "attachments": {},
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "# Axis Query Example\n",
+    "# Querying axis metadata\n",
     "\n",
-    "_Goal:_ demonstrate basic axis metadata handling using Pandas.\n",
+    "This notebook provides examples for basic axis metadata handling using Pandas. The Census stores `obs` (cell) and `var` (gene) metadata in `SOMADataFrame` objects via the [`TileDB-SOMA` API](https://github.com/single-cell-data/TileDB-SOMA) ([documentation](https://tiledbsoma.readthedocs.io/en/latest/)), which can be queried and read as a Pandas `DataFrame` using `TileDB-SOMA`. \n",
     "\n",
-    "The Census stores obs (cell) metadata in a SOMA DataFrame, which can be queried and read as a Pandas DataFrame. The Census also has a convenience package which simplifies opening the census.\n",
+    "Note that Pandas `DataFrame` is an in-memory object, therefore queries should be small enough for results to fit in memory.\n",
+    "\n",
+    "**Contents**\n",
+    "\n",
+    "1. Opening the Census\n",
+    "1. Summarizing cell metadata\n",
+    "   1. Example: Summarize all cell types\n",
+    "   1. Example: Summarize a subset of cell types, selected with a `value_filter`\n",
+    "1. Full Census metadata stats\n",
     "\n",
-    "Pandas DataFrame is an in-memory object. Take care that queries are small enough for results to fit in memory.\n",
     "\n",
     "## Opening the Census\n",
     "\n",
@@ -47,17 +53,19 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "## Summarize Census cell metadata\n",
+    "## Summarizing cell metadata\n",
+    "\n",
+    "Once the Census is open you can use its `TileDB-SOMA` methods as it is itself a `SOMACollection`. You can thus access the metadata `SOMADataFrame` objects encoding cell and gene metadata.\n",
     "\n",
     "Tips:\n",
     "\n",
-    "- You can read an _entire_ SOMA dataframe into a Pandas DataFrame using `soma_df.read().concat().to_pandas()`, allowing the use of the standard Pandas API.\n",
+    "- You can read an _entire_ `SOMADataFrame` into a Pandas `DataFrame` using `soma_df.read().concat().to_pandas()`, allowing the use of the standard Pandas API.\n",
     "- Queries will be much faster if you request only the DataFrame columns required for your analysis (e.g., `column_names=[\"cell_type_ontology_term_id\"]`).\n",
     "- You can also further refine query results by using a `value_filter`, which will filter the census for matching records.\n",
     "\n",
-    "### Example 1 - Summarize all cell types\n",
+    "### Example: Summarize all cell types\n",
     "\n",
-    "This example reads the cell metadata (obs) into a Pandas DataFrame, and summarizes in a variety of ways using Pandas API."
+    "This example reads the cell metadata (`obs`) into a Pandas DataFrame, and summarizes in a variety of ways using Pandas API."
    ]
   },
   {
@@ -119,7 +127,7 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "### Summarize a subset of cell types, selected with a `value_filter`\n",
+    "### Example: Summarize a subset of cell types, selected with a `value_filter`\n",
     "\n",
     "This example utilizes a SOMA \"value filter\" to read the subset of cells with `tissue_ontology_term_id` equal to `UBERON:0002048` (lung tissue), and summarizes the query result using Pandas."
    ]
@@ -254,7 +262,7 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "## Full census stats\n",
+    "## Full Census metadata stats\n",
     "\n",
     "This example queries all organisms in the Census, and summarizes the diversity of various metadata lables."
    ]
@@ -308,7 +316,6 @@
    ]
   },
   {
-   "attachments": {},
    "cell_type": "markdown",
    "metadata": {},
    "source": [
@@ -348,7 +355,7 @@
    "name": "python",
    "nbconvert_exporter": "python",
    "pygments_lexer": "ipython3",
-   "version": "3.10.6"
+   "version": "3.10.10"
   },
   "vscode": {
    "interpreter": {