diff --git a/DESCRIPTION b/DESCRIPTION
index d35fb57..a2deb87 100644
--- a/DESCRIPTION
+++ b/DESCRIPTION
@@ -7,7 +7,57 @@ Authors@R: c(
 "Stefano",
 "Mangiola",
 email = "mangiolastefano@gmail.com",
- role = c("aut", "cre")
+ role = c("aut", "cre", "rev")
+ ),
+ person(
+ "Michael",
+ "Milton",
+ email = "milton.m@wehi.edu.au",
+ role = c("aut", "rev")
+ ),
+ person(
+ "Martin",
+ "Morgan",
+ email = "Martin.Morgan@RoswellPark.org",
+ role = c("ctb", "rev")
+ ),
+ person(
+ "Vincent",
+ "Carey",
+ email = "stvjc@channing.harvard.edu",
+ role = c("ctb", "rev")
+ ),
+ person(
+ "Julie",
+ "Iskander",
+ email = "iskander.j@wehi.edu.au",
+ role = c( "rev")
+ ),
+ person(
+ "Tony",
+ "Papenfuss",
+ email = "papenfuss@wehi.edu.au",
+ role = c( "rev")
+ ),
+ person(
+ "Silicon Valley Foundation",
+ "CZF2019-002443",
+ role = c( "fnd")
+ ),
+ person(
+ "NIH NHGRI",
+ "5U24HG004059-18",
+ role = c( "fnd")
+ ),
+ person(
+ "Victoria Cancer Agnency",
+ "ECRF21036",
+ role = c( "fnd")
+ ),
+ person(
+ "NHMRC",
+ "1116955",
+ role = c( "fnd")
 ))
 Description: Provides access to a copy of the Human Cell Atlas, but with harmonised metadata. This allows for uniform querying across numerous
diff --git a/README.Rmd b/README.Rmd
index e443126..d5a81bb 100644
--- a/README.Rmd
+++ b/README.Rmd
@@ -3,9 +3,12 @@ title: "CuratedAtlasQueryR"
 output: github_document
 ---
-`CuratedAtlasQuery` is a query interface that allow the programmatic exploration and retrieval of the harmonised, curated and reannotated CELLxGENE single-cell human cell atlas. Data can be retrieved at cell, sample, or dataset levels based on filtering criteria.
+
+[![Lifecycle:maturing](https://img.shields.io/badge/lifecycle-maturing-blue.svg)](https://www.tidyverse.org/lifecycle/#maturing)
+
-# Query interface
+
+`CuratedAtlasQuery` is a query interface that allow the programmatic exploration and retrieval of the harmonised, curated and reannotated CELLxGENE single-cell human cell atlas. Data can be retrieved at cell, sample, or dataset levels based on filtering criteria.
 ```{r, include = FALSE}
 # Note: knit this to the repo readme file using:
@@ -16,8 +19,27 @@ knitr::opts_chunk$set(
 )
 ```
-```{r, echo=FALSE, out.height = "139px", out.width = "120px"}
-knitr::include_graphics("inst/logo.png")
+```{r, echo=FALSE, out.height = c("139px"), out.width = "120x" }
+knitr::include_graphics(c("inst/logo.png"))
+```
+
+```{r, echo=FALSE, out.height = c("58px"), out.width = c("155x", "129px", "202px", "219px")}
+knitr::include_graphics(c(
+ "inst/svcf_logo.jpeg",
+ "inst/czi_logo.png",
+ "inst/bioconductor_logo.jpg",
+ "inst/vca_logo.png"
+))
+```
+
+[website](https://stemangiola.github.io/CuratedAtlasQueryR)
+
+# Query interface
+
+## Installation
+
+```{r, eval=FALSE}
+devtools::install_github("stemangiola/CuratedAtlasQueryR")
 ```
 ## Load the package
@@ -38,7 +60,7 @@ get_metadata()
 ### Explore the tissue
-```{r, eval=FALSE}
+```{r}
 get_metadata() |> dplyr::distinct(tissue, file_id)
 ```
@@ -189,7 +211,7 @@ Through harmonisation and curation we introduced custom column, not present in t
 - `tissue_harmonised`: a coarser tissue name for better filtering
 - `age_days`: the number of days corresponding to the age
-- `cell_type_harmonised`: the consensus call identiti (for immune cells) using the original and three novel annotations using Seurat Azimuth and SingleR
+- `cell_type_harmonised`: the consensus call identity (for immune cells) using the original and three novel annotations using Seurat Azimuth and SingleR
 - `confidence_class`: an ordinal class of how confident `cell_type_harmonised` is. 1 is complete consensus, 2 is 3 out of four and so on.
 - `cell_annotation_azimuth_l2`: Azimuth cell annotation
 - `cell_annotation_blueprint_singler`: SingleR cell annotation using Blueprint reference
@@ -201,6 +223,15 @@ Through harmonisation and curation we introduced custom column, not present in t
 # RNA abundance
-The `raw` assay includes RNA abundance in the positive real scale (not transformed with non-linear functions, e.g. log sqrt). Originally CELLxGENE include a mix of scales and tranformations specified in the `x_normalization` column.
+The `raw` assay includes RNA abundance in the positive real scale (not transformed with non-linear functions, e.g. log sqrt). Originally CELLxGENE include a mix of scales and transformations specified in the `x_normalization` column.
 The `cpm` assay includes counts per million.
+
+---
+
+This project has been funded by
+
+- *Silicon Valley Foundation* CZF2019-002443
+- *Bioconductor core funding* NIH NHGRI 5U24HG004059-18
+- *Victoria Cancer Agency* ECRF21036
+- *Australian National Health and Medical Research Council* 1116955
diff --git a/README.md b/README.md
index bf37a2e..6526e2d 100644
--- a/README.md
+++ b/README.md
@@ -1,14 +1,29 @@
 CuratedAtlasQueryR
 ================
+
+
+[![Lifecycle:maturing](https://img.shields.io/badge/lifecycle-maturing-blue.svg)](https://www.tidyverse.org/lifecycle/#maturing)
+
+
 `CuratedAtlasQuery` is a query interface that allow the programmatic exploration and retrieval of the harmonised, curated and reannotated CELLxGENE single-cell human cell atlas. Data can be retrieved at cell, sample, or dataset levels based on filtering criteria.
+
+
+
+
+[website](https://stemangiola.github.io/CuratedAtlasQueryR)
+
 # Query interface
-
+## Installation
+
+``` r
+devtools::install_github("stemangiola/CuratedAtlasQueryR")
+```
 ## Load the package
@@ -24,8 +39,8 @@ library(stringr)
 ``` r
 get_metadata()
-#> # Source: table [?? x 56]
-#> # Database: sqlite 3.40.0 [/stornext/Home/data/allstaff/m/mangiola.s/.cache/R/CuratedAtlasQueryR/metadata.sqlite]
+#> # Source: table [?? x 56]
+#> # Database: DuckDB 0.6.2-dev1166 [unknown@Linux 3.10.0-1160.81.1.el7.x86_64:R 4.2.0/:memory:]
 #> .cell sampl…¹ .sample .samp…² assay assay…³ file_…⁴ cell_…⁵ cell_…⁶ devel…⁷
 #>
 #> 1 AAACCT… 8a0fe0… 5f20d7… D17PrP… 10x … EFO:00… 1e334b… basal … CL:000… 31-yea…
@@ -52,6 +67,21 @@ get_metadata()
 ``` r
 get_metadata() |> dplyr::distinct(tissue, file_id)
+#> # Source: SQL [?? x 2]
+#> # Database: DuckDB 0.6.2-dev1166 [unknown@Linux 3.10.0-1160.81.1.el7.x86_64:R 4.2.0/:memory:]
+#> tissue file_id
+#>
+#> 1 blood 07beec85-51be-4d73-bb80-8f85b7b643d5
+#> 2 blood 3431ab62-b11d-445f-a461-1408d2b29f8c
+#> 3 blood 5500774a-6ebe-4ddf-adce-90302b7cd007
+#> 4 blood 550760cb-ede9-4e6b-b6ab-7152f2ce29e1
+#> 5 blood a0396bf6-cd6d-42d9-b1b5-c66b19d312ae
+#> 6 cortex of kidney a1035da5-137b-4fac-8435-d1e4af20851c
+#> 7 blood a139b1d6-eba0-484d-860c-4fb810e17615
+#> 8 prefrontal cortex 27e51147-93c7-40c5-a6a3-da4b203e05ba
+#> 9 macula lutea proper 28d54b40-7a92-40cf-b164-a6c3158f55f6
+#> 10 fovea centralis 28d54b40-7a92-40cf-b164-a6c3158f55f6
+#> # … with more rows
 ```
 ``` r
@@ -277,7 +307,7 @@ present in the original CELLxGENE metadata
 - `tissue_harmonised`: a coarser tissue name for better filtering
 - `age_days`: the number of days corresponding to the age
-- `cell_type_harmonised`: the consensus call identiti (for immune cells)
+- `cell_type_harmonised`: the consensus call identity (for immune cells)
 using the original and three novel annotations using Seurat Azimuth and SingleR
 - `confidence_class`: an ordinal class of how confident
@@ -297,7 +327,16 @@ present in the original CELLxGENE metadata
 The `raw` assay includes RNA abundance in the positive real scale (not transformed with non-linear functions, e.g. log sqrt). Originally
-CELLxGENE include a mix of scales and tranformations specified in the
+CELLxGENE include a mix of scales and transformations specified in the
 `x_normalization` column.
 The `cpm` assay includes counts per million.
+
+------------------------------------------------------------------------
+
+This project has been funded by
+
+- *Silicon Valley Foundation* CZF2019-002443
+- *Bioconductor core funding* NIH NHGRI 5U24HG004059-18
+- *Victoria Cancer Agency* ECRF21036
+- *Australian National Health and Medical Research Council* 1116955
diff --git a/inst/bioconductor_logo.jpg b/inst/bioconductor_logo.jpg
new file mode 100644
index 0000000..5e1e5f5
Binary files /dev/null and b/inst/bioconductor_logo.jpg differ
diff --git a/inst/czi_logo.png b/inst/czi_logo.png
new file mode 100644
index 0000000..94014af
Binary files /dev/null and b/inst/czi_logo.png differ
diff --git a/inst/svcf_logo.jpeg b/inst/svcf_logo.jpeg
new file mode 100644
index 0000000..bbc166e
Binary files /dev/null and b/inst/svcf_logo.jpeg differ
diff --git a/inst/vca_logo.png b/inst/vca_logo.png
new file mode 100644
index 0000000..e1cce78
Binary files /dev/null and b/inst/vca_logo.png differ
diff --git a/vignettes/Introduction.Rmd b/vignettes/Introduction.Rmd
index 117ed3b..0f0d692 100644
--- a/vignettes/Introduction.Rmd
+++ b/vignettes/Introduction.Rmd
@@ -7,9 +7,11 @@ vignette: >
 %\VignetteEncoding{UTF-8}
 ---
-`CuratedAtlasQuery` is a query interface that allow the programmatic exploration and retrieval of the harmonised, curated and reannotated CELLxGENE single-cell human cell atlas. Data can be retrieved at cell, sample, or dataset levels based on filtering criteria.
+
+[![Lifecycle:maturing](https://img.shields.io/badge/lifecycle-maturing-blue.svg)](https://www.tidyverse.org/lifecycle/#maturing)
+
-# Query interface
+`CuratedAtlasQuery` is a query interface that allow the programmatic exploration and retrieval of the harmonised, curated and reannotated CELLxGENE single-cell human cell atlas. Data can be retrieved at cell, sample, or dataset levels based on filtering criteria.
 ```{r, include = FALSE}
 # Note: knit this to the repo readme file using:
@@ -20,9 +22,25 @@ knitr::opts_chunk$set(
 )
 ```
-```{r, echo=FALSE, out.height = "139px", out.width = "120px"}
-system.file("logo.png", package="CuratedAtlasQueryR") |>
- knitr::include_graphics()
+```{r, echo=FALSE, out.height = c("139px"), out.width = "120x" }
+knitr::include_graphics(c("../inst/logo.png"))
+```
+
+```{r, echo=FALSE, out.height = c("58px"), out.width = c("155x", "129px", "202px", "219px")}
+knitr::include_graphics(c(
+ "../inst/svcf_logo.jpeg",
+ "../inst/czi_logo.png",
+ "../inst/bioconductor_logo.jpg",
+ "../inst/vca_logo.png"
+))
+```
+
+# Query interface
+
+## Installation
+
+```{r, eval=FALSE}
+devtools::install_github("stemangiola/CuratedAtlasQueryR")
 ```
 ## Load the package
@@ -175,8 +193,7 @@ meta |>
 ```
 ```{r, echo=FALSE, message=FALSE, warning=FALSE}
-system.file("NCAM1_figure.png", package="CuratedAtlasQueryR") |>
- knitr::include_graphics()
+knitr::include_graphics("../inst/NCAM1_figure.png")
 ```
 # Cell metadata
@@ -197,7 +214,7 @@ Through harmonisation and curation we introduced custom column, not present in t
 - `tissue_harmonised`: a coarser tissue name for better filtering
 - `age_days`: the number of days corresponding to the age
-- `cell_type_harmonised`: the consensus call identiti (for immune cells) using the original and three novel annotations using Seurat Azimuth and SingleR
+- `cell_type_harmonised`: the consensus call identity (for immune cells) using the original and three novel annotations using Seurat Azimuth and SingleR
 - `confidence_class`: an ordinal class of how confident `cell_type_harmonised` is. 1 is complete consensus, 2 is 3 out of four and so on.
 - `cell_annotation_azimuth_l2`: Azimuth cell annotation
 - `cell_annotation_blueprint_singler`: SingleR cell annotation using Blueprint reference
@@ -209,6 +226,6 @@ Through harmonisation and curation we introduced custom column, not present in t
 # RNA abundance
-The `raw` assay includes RNA abundance in the positive real scale (not transformed with non-linear functions, e.g. log sqrt). Originally CELLxGENE include a mix of scales and tranformations specified in the `x_normalization` column.
+The `raw` assay includes RNA abundance in the positive real scale (not transformed with non-linear functions, e.g. log sqrt). Originally CELLxGENE include a mix of scales and transformations specified in the `x_normalization` column.
 The `cpm` assay includes counts per million.