usagereports

Intro

This is an R package with functions and templates to generate figures for data usage PDF reports or presentations. No real data lives here.

The functions in this package are prefixed according to their intent:

  • query_* : Query and compile data from data warehouse, portal assets, Google Analytics, etc.
  • to_* : Take data output from query_* and reshape it into the structure needed for specific plots or other forms.
  • plot_* : Generate plots that go into the report.
  • simd_* : Simulate example data for the corresponding plots.
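
As a rough illustration of how the prefixes fit together, the sketch below chains a simd_*, a to_*, and a plot_* step using base R only. All three function names, their arguments, and the column names are hypothetical and are not the package's actual API; the real functions may expect different shapes.

```r
# Hypothetical stand-ins for the simd_* -> to_* -> plot_* chain
# (illustrative only, not the package's real functions).

# simd_*: simulate download records of the kind a query_* function might return.
simd_example_downloads <- function(n = 100, seed = 1) {
  set.seed(seed)
  data.frame(
    project = sample(c("Project A", "Project B", "Project C"), n, replace = TRUE),
    user_id = sample(1:20, n, replace = TRUE),
    stringsAsFactors = FALSE
  )
}

# to_*: reshape raw records into the per-project counts a plot needs.
to_example_project_counts <- function(downloads) {
  counts <- as.data.frame(
    table(project = downloads$project),
    responseName = "downloads",
    stringsAsFactors = FALSE
  )
  # Sort so the most-downloaded project comes first.
  counts[order(-counts$downloads), ]
}

# plot_*: draw the figure from the reshaped data.
plot_example_project_counts <- function(counts) {
  barplot(
    counts$downloads,
    names.arg = counts$project,
    ylab = "Downloads",
    main = "Downloads by project (simulated)"
  )
}

downloads <- simd_example_downloads()
counts <- to_example_project_counts(downloads)
plot_example_project_counts(counts)
```

Keeping each stage as a separate function means a simulated data frame can be swapped for real query output without touching the reshaping or plotting code.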

Why have a reporting package?

  1. For a heavily guided workflow to put together a full biannual or annual PDF report deliverable for a sponsor or funder. See the supporting flowchart below; figures are numbered approximately in the order in which they appear in the "suggested" report format.
flowchart TD
    
    classDef fig fill:orange,stroke:#333,stroke-width:0px;
    class fig1,fig2,fig3,fig4,fig5,fig6,fig7,fig8,fig9,fig10 fig;
    style datawarehouse fill:#625191,stroke-width:0px
    style synapse fill:#125e81,stroke-width:0px
    style google fill:#e9b4ce,stroke-width:0px
    style datawarehouse color:white
    style synapse color:white


    subgraph datawarehouse
        dw[(db warehouse)] -- query_data_by_funding_agency --> files[[files]] 
        dw -- query_file_snapshot --> file_summary_data(file_summary_data)
        file_summary_data -- plot_bar_available_files --> fig2:::fig
        files -- to_deidentified_export --> data(data) 
        data -- plot_lollipop_download_by_project --> fig4:::fig
        data -- plot_downloads_datetime --> fig5:::fig
        data -- filter --> filtered_data(filtered_data)
        filtered_data -- plot_lollipop_download_by_project --> fig6:::fig
        filtered_data -- plot_downloads_datetime --> fig7:::fig
    end
    
    subgraph synapse
        studies(Portal - Studies) -- query_data_status_snapshots --> data_status(data_status)
        data_status -- to_sankey_data --> sankey_data(sankey_data)
        sankey_data(sankey_data) -- plot_sankey_status --> fig1:::fig

        filemeta(File meta) --> data_type_breakdown(data_type_breakdown)
        filemeta(File meta) --> data_assay_breakdown(data_assay_breakdown)
        filtered_data -- annotation_join --> filemeta
        data_assay_breakdown -- plot_bar_data_segment --> fig8:::fig
        data_type_breakdown -- plot_bar_data_segment --> fig9:::fig
        filtered_data -- to_summary_users --> data_user_summary(data_user_summary)
        data_user_summary -- plot_user_summary --> fig10:::fig
    end

    subgraph google
        studies --> project_stats
        GA[(Google Analytics)] -- query_ga --> project_stats(project_stats)
        project_stats -- plot_pageviews --> fig3:::fig
    end
    
  2. For built-in Synapse-default themes and color palettes.

  3. As a good starting place and conceptual catalog of interesting metrics/data products, even if you don't ultimately use any of the queries/plotting utils here. Consider contributing if you come up with something that others might also find useful.

  4. As a playground and learning resource for R data manipulation and R Markdown.

Templates

Examples of overall report

We have examples of past reports to better show how figures appear. There is even a Streamlit (Python) version of the template! But these are internal, so please reach out.

Data prep example templates

Synapse (teal domain) and Google Analytics (pink domain): Set up with rmarkdown::draft(file = "Data-prep-Syn-GA-YYYY-MM", template = "prepare-data-synapse-ga", package = "usagereports")

Installation

OS dependencies

This needs libsodium for encrypting/decrypting coded data.

  • deb: libsodium-dev (Debian, Ubuntu, etc)
  • brew: libsodium (macOS)

R dev package dependencies

This relies on a non-CRAN package that can be installed via devtools:

  • devtools::install_github("davidsjoberg/ggsankey")

Then:

  • devtools::install_github("nf-osi/usagereports")
  • (Or for potential contributors) Clone this repo and install locally with: devtools::install()

Snowflake connection dependencies (optional)

If you'd like to interact with Snowflake without leaving RStudio (which does allow a more seamless workflow for updating figures), see here.

However, this package is also pretty agnostic about which interface is used, so it is fine to just plug in data exported from the Worksheets UI or the VSCode extension.
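
For the export route, a minimal sketch of loading a CSV saved from the Worksheets UI might look like the following. The file contents and column names (`PROJECT`, `DOWNLOADS`) are made up for illustration, and the sketch writes its own temporary file so that it runs standalone. In practice you would point `read.csv()` at your exported file and pass the result on to the relevant to_* or plot_* function.

```r
# Stand-in for a file exported from Snowflake; the columns are hypothetical.
export_path <- tempfile(fileext = ".csv")
writeLines(c("PROJECT,DOWNLOADS", "Project A,12", "Project B,7"), export_path)

# Read the export. Exports often carry upper-case column names (an assumption
# about your setup), so normalize them to lower case before further use.
exported <- read.csv(export_path, stringsAsFactors = FALSE)
names(exported) <- tolower(names(exported))

str(exported)
```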

Development

Please note that you are using a pre-1.0 version of the package.

Contributing guide

  • Create a branch for changes and make a pull request against main.
  • To propose a new figure, it is recommended that you add a corresponding function to create example data to help users see what data is expected/what shape they need to get their data into.
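
A companion example-data function for a proposed figure could be as small as the sketch below. The function name, its arguments, and the "downloads per month" figure it supports are all hypothetical; the point is only that the returned data frame documents the columns and types the new plot would expect.

```r
# Hypothetical example-data function for a proposed "downloads per month" figure.
# Returns one row per month with the columns the (equally hypothetical) plot would use.
simd_example_monthly_downloads <- function(start = "2024-01-01", months = 6, seed = 42) {
  set.seed(seed)  # fixed seed so the example data is reproducible
  data.frame(
    month = seq(as.Date(start), by = "month", length.out = months),
    downloads = rpois(months, lambda = 50)
  )
}

example_data <- simd_example_monthly_downloads()
head(example_data)
```

Fixing the seed by default keeps example figures stable across runs, which makes review of a pull request easier.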
