Updated README for repo #1947

Open: wants to merge 3 commits into base: master
103 changes: 28 additions & 75 deletions README.md
@@ -1,92 +1,44 @@
<!-- markdownlint-disable no-bare-urls no-duplicate-header -->

# Data Analytics as a Service
# Advanced Analytics Workspace (AAW)

Data Analytics as a Service for the Government of Canada and external collaborators.
![AAW](https://github.com/StatCan/aaw/assets/8212170/ed9d44d9-3606-4097-b9a8-2ee715f13ebf)

## Frequently Asked Questions (FAQ)
Welcome to the documentation repository for the Advanced Analytics Workspace (AAW), brought to you by Data Analytics Services (DAS) at Statistics Canada. This documentation is tailored for users and developers of the AAW within the Government of Canada and external collaborators.

If your question does not appear in this document, please reach out to us on our [Slack Support Channel](https://statcan-aaw.slack.com/).
If you have any questions about the project, feel free to connect with us on our [**Slack Support Channel**](https://statcan-aaw.slack.com/).

### Who can access the AAW?
## AAW Repositories

- Anyone with a Statistics Canada (`@statcan.gc.ca`) email address can access the AAW.
1. [General](#advanced-analytics-workspace)
1. [Terraform](#terraform)
1. [Installation](#install-the-aaw-platform-and-infrastructure)
1. [Community Engagement](#community-engagement)
1. [Developer Notes](#developer-notes)

### What data formats are supported in the AAW?
## What AAW Can Do

The AAW includes tools that allow data science users to open almost any file. The AAW supports many commonly used file formats, including (but not limited to):

- csv
- xlsx
- json
- xml
- sas7bdat
- sqlite
- many others... just ask :-)

### How much does the AAW cost?

#### CPU Only

| **Use Case** | **Compute Resources** | | | **Time (Hours/Week)** | **Cost** | | |
|----------------------------|-----------------------|------------|-------|-----------------------|----------|-----------|------------|
| | _CPU_ | _RAM (GB)_ | _GPU_ | | _Weekly_ | _Monthly_ | _Annually_ |
| CPU: Occasional Use | 2 | 8 | 0 | 8 | 1.1367 | 4.88781 | 59.1084 |
| CPU: During Business Hours | 2 | 8 | 0 | 40 | 5.6835 | 24.43905 | 295.542 |
| CPU: 24/7 | 2 | 8 | 0 | 168 | 23.8707 | 102.64401 | 1241.2764 |

#### Add a GPU

| **Use Case** | **Compute Resources** | | | **Time (Hours/Week)** | **Cost** | | |
|----------------------------|-----------------------|------------|-------|-----------------------|----------|-----------|------------|
| | _CPU_ | _RAM (GB)_ | _GPU_ | | _Weekly_ | _Monthly_ | _Annually_ |
| GPU: Occasional Use        | 0                     | 0          | 1     | 8                     | 34.468   | 148.2124  | 1792.336   |
| GPU: During Business Hours | 0 | 0 | 1 | 40 | 172.34 | 741.062 | 8961.68 |
| GPU: 24/7 | 0 | 0 | 1 | 168 | 723.828 | 3112.4604 | 37639.056 |
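
The figures in both cost tables appear to follow one simple formula: weekly cost is hours per week times an hourly rate, monthly cost is weekly cost times 4.3, and annual cost is weekly cost times 52. The sketch below reproduces that arithmetic. The hourly rates and the name `estimate_cost` are assumptions, back-calculated from the table rows rather than stated in the README, and the currency is not given in the source.

```python
# Hourly rates inferred by dividing each table's weekly cost by its weekly hours.
# These are assumptions derived from the tables, not published prices.
CPU_RATE_PER_HOUR = 0.1420875  # 2 CPU + 8 GB RAM workspace (inferred)
GPU_RATE_PER_HOUR = 4.3085     # 1 GPU add-on (inferred)

# Multipliers that the table rows are consistent with.
WEEKS_PER_MONTH = 4.3
WEEKS_PER_YEAR = 52


def estimate_cost(hours_per_week, rate_per_hour):
    """Return weekly, monthly, and annual cost for a given weekly usage."""
    weekly = hours_per_week * rate_per_hour
    return {
        "weekly": weekly,
        "monthly": weekly * WEEKS_PER_MONTH,
        "annually": weekly * WEEKS_PER_YEAR,
    }


# Business-hours usage (40 hours/week) of a CPU workspace with a GPU added:
# the GPU table lists the GPU alone, so the two hourly rates are summed.
combined = estimate_cost(40, CPU_RATE_PER_HOUR + GPU_RATE_PER_HOUR)
```

Because the GPU rows list 0 CPU and 0 RAM, they read as the cost of the GPU alone; a GPU workspace would then cost the sum of the matching CPU and GPU rows, which is what `combined` computes under that reading.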

### What are the steps for getting Protected B data into MinIO?

- One must consult with FDI (F.A.I.R. Data Infrastructure) before Protected B Data can be loaded into MinIO. The FDI team owns an Azure Data Factory pipeline for moving data, typically from on premise, to an Azure Storage Account and MinIO is our S3 gateway to that storage account.

### Can we use Power BI on the AAW?

- At the moment, no. We are currently looking into solutions for sharing data between the AAW and CAE (which supports Power BI).

### Does using SAS entail different costs than the others? Are there a limited number of licenses or instances that can be run?

- SAS support is currently experimental and will rely on existing Statistics Canada SAS software licenses.

### How do you suspend your server (to save costs)?

- Press the suspend server button (square to the left of the garbage can icon to delete a server). This will suspend the workspace to save on costs.
- Please keep in mind all data that is not stored on persistent disks (persistent workspace volume or persistent data volumes) is deleted when a workspace is suspended. In particular data on non-persistent disks and data on the filesystem outside the workspace or data volume (for instance the `/tmp` folder) will be permanently lost.
- When suspended the workspace and data volumes remain locked and cannot be deleted or attached to another server.
- To resume a suspended server, press the resume button (triangle icon to the left of the garbage can). When resumed the server will have the same workspace and data volumes as before (with all data kept as-is if the volumes were persistent) and has the exact same specification (CPU, RAM, GPU, and other settings). Things stored outside the home directory and persistent data volumes (like conda virtual environments) will be gone and will need to be recreated if necessary.

### How do I add other people to my namespace (for collaboration)?

- As the workspace owner you can add and remove contributors, this will allow them to access your workspaces.

### Are there any pre-loaded data (datasets) in AAW that we can access and use for both R and Python notebooks?

- Our JupyterLab images come with some example notebooks and data, they can be found in `/aaw-contrib-jupyter-notebooks/`.
- Our R Studio image also has some example notebooks and data, they can be found in `/aaw-contrib-r-notebooks/`.
- Explore our JupyterLab images featuring example notebooks and data, located in `/aaw-contrib-jupyter-notebooks/`.
- The R Studio image also includes example notebooks and data, available in `/aaw-contrib-r-notebooks/`.
- Find more examples in the [Example Notebooks](https://statcan.github.io/aaw/en/1-Experiments/Notebooks/DrawData_EN.html) section.

## Presentations

We highly encourage you to watch our YouTube presentation given at Stratosphere:
We highly recommend watching our YouTube presentation delivered at Stratosphere:

- [YouTube](https://www.youtube.com/watch?v=quYuuEAqNm0)
- [SlideDeck](https://govcloud.blob.core.windows.net/docs/daaas-cncf.pdf)
- [AAW Onboarding Presentation (work in progress)](https://docs.google.com/presentation/d/12yTDlbMCmbg0ccdea2h0vwhs5YTa_GHm_3DieG5A-k8/edit#slide=id.g113e8bbc6e6_0_27)

## Security

A discussion about some of the security best practices in use by this platform:
Explore discussions about some of the security best practices employed by this platform:

- [aaw-security-proposal](https://github.com/StatCan/aaw-security-proposal)

## Advanced Analytics Workspace
## Repositories

### General Advanced Analytics Workspace

The following is a list of all the `general` related repositories for the Advanced Analytics Workspace project.

@@ -115,12 +67,12 @@ The following is a list of all the `general` related repositories for the Advanc
| [aaw-security-proposal](https://github.com/StatCan/aaw-security-proposal) | Proposal for the implementation of Protected B workloads in AAW | Public |
| [aaw-toleration-injector](https://github.com/StatCan/aaw-toleration-injector) | Kubernetes toleration injector with support for GPUs and Node Pools | Public |

## Terraform
### Terraform

The following is a list of all the `terraform` related repositories for the
Advanced Analytics Workspace project.

### Install the AAW Platform and Infrastructure
#### Install the AAW Platform and Infrastructure

```sh
## Installs AAW Platform and Infrastructure
@@ -150,7 +102,7 @@ Advanced Analytics Workspace project.
| AAW | [terraform-azure-statcan-aaw-region-environment](https://github.com/statcan/terraform-azure-statcan-aaw-region-environment) | Terraform module of Advanced Analytics Workspaces (AAW) per-region configuration of Azure |
| AAW | [terraform-statcan-aaw-platform](https://github.com/statcan/terraform-statcan-aaw-platform) | Terraform module for the Advanced Analytics Workspaces (AAW) platform |

### Install the Cloud Native Platform
#### Install the Cloud Native Platform

```sh
## Statistics Canada's Cloud Native Platform (CNP)
@@ -183,7 +135,7 @@ Advanced Analytics Workspace project.
| CNS | [terraform-statcan-kubernetes-core-platform](https://github.com/statcan/terraform-statcan-kubernetes-core-platform) | Terraform module for Statistics Canada Core Kubernetes Platform |
| CNS | [terraform-statcan-kubernetes-app-platform](https://github.com/statcan/terraform-statcan-kubernetes-app-platform) | Terraform module for Statistics Canada Kubernetes Application Platform |

### Misc
#### Misc

| Repository | Description | Visibility |
|-----------------------------------------------------------------------------------------------|------------------------------------------------------------|------------|
@@ -213,7 +165,8 @@ multilingual support and other UX related enhancements.
| [minio-console](https://github.com/StatCan/minio-console) | Multilingual support for MinIO Console | Public |
| [rstudio](https://github.com/StatCan/rstudio) | Multilingual support for RStudio | Public |

### Developer Notes:
Fix spelling by executing `fix-spelling-en` and `fix-spelling-fr`
Adding to the sensitive or insensitive category
Ignoring will simply ignore the error for this round. It will trigger again next execution.
## Developer Notes

- Fix spelling by executing `fix-spelling-en` and `fix-spelling-fr`
- Adding to the sensitive or insensitive category
- Ignoring will simply ignore the error for this round. It will trigger again next execution.