diff --git a/.gitbook/assets/service_architecture.png b/.gitbook/assets/service_architecture.png index 4facf7e3..3260972e 100644 Binary files a/.gitbook/assets/service_architecture.png and b/.gitbook/assets/service_architecture.png differ diff --git a/README.md b/README.md index a51f5eaa..c68e0ec4 100644 --- a/README.md +++ b/README.md @@ -17,6 +17,5 @@ description: How to deploy and operate Metaflow * [AWS CloudFormation Deployment](metaflow-on-aws/deployment-guide/aws-cloudformation-deployment.md) * [Manual Deployment](metaflow-on-aws/deployment-guide/manual-deployment.md) * [Operations Guide](metaflow-on-aws/operations-guide/) - - - + * [Metaflow Service Migration Guide](metaflow-on-aws/operations-guide/metaflow-service-migration-guide.md) + * [Metaflow UI Logical Replication Guide](metaflow-on-aws/operations-guide/metaflow-ui-logical-replication-guide.md) \ No newline at end of file diff --git a/SUMMARY.md b/SUMMARY.md index c58769f2..84985164 100644 --- a/SUMMARY.md +++ b/SUMMARY.md @@ -16,4 +16,4 @@ * [Manual Deployment](metaflow-on-aws/deployment-guide/manual-deployment.md) * [Operations Guide](metaflow-on-aws/operations-guide/README.md) * [Metaflow Service Migration Guide](metaflow-on-aws/operations-guide/metaflow-service-migration-guide.md) - + * [Metaflow UI Logical Replication Guide](metaflow-on-aws/operations-guide/metaflow-ui-logical-replication-guide.md) diff --git a/aws-deployment-guide/deployment-guide/aws-cloudformation-deployment.md b/aws-deployment-guide/deployment-guide/aws-cloudformation-deployment.md index 98a428e9..82fa89ee 100644 --- a/aws-deployment-guide/deployment-guide/aws-cloudformation-deployment.md +++ b/aws-deployment-guide/deployment-guide/aws-cloudformation-deployment.md @@ -17,6 +17,13 @@ The major components of the template are: * **AWS Identity and Access Management** - Dedicated roles obeying "principle of least privilege" access to resources such as AWS Batch and Amazon Sagemaker Notebook instances. 
* **AWS Lambda** _-_ An AWS Lambda function that automates any migrations needed for the Metadata service. +Additional optional components of the template are: + +* **AWS Cloudfront** - Content Delivery Network for Metaflow User Interface static assets. +* **Application Load Balancer** - Application Load Balancer for Metaflow User Interface. + +User Interface can be enabled via CloudFormation template Parameter. This step is covered under **Steps for AWS CloudFormation Deployment - Optional Metaflow User Interface** -section below. + ## Steps for AWS CloudFormation Deployment 1. Navigate to _Services_ and select _CloudFormation_ under the _Management and Governance_ heading \(or search for it in the search bar\) in your AWS console. @@ -45,3 +52,17 @@ Did you choose to enable _APIBasicAuth_ and/or _CustomRole_ and are wondering ho Once you have followed all these steps, you can [configure your metaflow installation](./#configuring-metaflow) using the outputs from the CloudFormation stack. +### Optional Metaflow User Interface (`EnableUI` -parameter) + +Did you choose to enable Metaflow User Interface and are wondering how it works? Below are some details on what needs to be done in order to deploy Metaflow User Interface. + +Please note: This section can be ignored if `EnableUI` -parameter is disabled (this is the default value). + +User Interface is provided as part of the `metaflow-cfn-template.yml` template and doesn't require any additional +configuration besides enabling the `EnableUI` -parameter. You can follow the [AWS CloudFormation Deployment](https://admin-docs.metaflow.org/metaflow-on-aws/deployment-guide/aws-cloudformation-deployment#steps-for-aws-cloudformation-deployment) instructions. 
+ +Once deployed the Cloudformation Stack will provide two outputs: +- `UIServiceUrl` - Application Load Balancer endpoint +- `UIServiceCloudfrontUrl` - Cloudfront distribution (using ALB) endpoint with HTTPS enabled (preferred) + +Please note: Metaflow User Interface doesn't provide any authentication by default. \ No newline at end of file diff --git a/metaflow-on-aws-1/deployment-guide/aws-cloudformation-deployment.md b/metaflow-on-aws-1/deployment-guide/aws-cloudformation-deployment.md index 98a428e9..82fa89ee 100644 --- a/metaflow-on-aws-1/deployment-guide/aws-cloudformation-deployment.md +++ b/metaflow-on-aws-1/deployment-guide/aws-cloudformation-deployment.md @@ -17,6 +17,13 @@ The major components of the template are: * **AWS Identity and Access Management** - Dedicated roles obeying "principle of least privilege" access to resources such as AWS Batch and Amazon Sagemaker Notebook instances. * **AWS Lambda** _-_ An AWS Lambda function that automates any migrations needed for the Metadata service. +Additional optional components of the template are: + +* **AWS Cloudfront** - Content Delivery Network for Metaflow User Interface static assets. +* **Application Load Balancer** - Application Load Balancer for Metaflow User Interface. + +User Interface can be enabled via CloudFormation template Parameter. This step is covered under **Steps for AWS CloudFormation Deployment - Optional Metaflow User Interface** -section below. + ## Steps for AWS CloudFormation Deployment 1. Navigate to _Services_ and select _CloudFormation_ under the _Management and Governance_ heading \(or search for it in the search bar\) in your AWS console. @@ -45,3 +52,17 @@ Did you choose to enable _APIBasicAuth_ and/or _CustomRole_ and are wondering ho Once you have followed all these steps, you can [configure your metaflow installation](./#configuring-metaflow) using the outputs from the CloudFormation stack. 
+### Optional Metaflow User Interface (`EnableUI` -parameter) + +Did you choose to enable Metaflow User Interface and are wondering how it works? Below are some details on what needs to be done in order to deploy Metaflow User Interface. + +Please note: This section can be ignored if `EnableUI` -parameter is disabled (this is the default value). + +User Interface is provided as part of the `metaflow-cfn-template.yml` template and doesn't require any additional +configuration besides enabling the `EnableUI` -parameter. You can follow the [AWS CloudFormation Deployment](https://admin-docs.metaflow.org/metaflow-on-aws/deployment-guide/aws-cloudformation-deployment#steps-for-aws-cloudformation-deployment) instructions. + +Once deployed the Cloudformation Stack will provide two outputs: +- `UIServiceUrl` - Application Load Balancer endpoint +- `UIServiceCloudfrontUrl` - Cloudfront distribution (using ALB) endpoint with HTTPS enabled (preferred) + +Please note: Metaflow User Interface doesn't provide any authentication by default. \ No newline at end of file diff --git a/metaflow-on-aws/deployment-guide/aws-cloudformation-deployment.md b/metaflow-on-aws/deployment-guide/aws-cloudformation-deployment.md index 98a428e9..82fa89ee 100644 --- a/metaflow-on-aws/deployment-guide/aws-cloudformation-deployment.md +++ b/metaflow-on-aws/deployment-guide/aws-cloudformation-deployment.md @@ -17,6 +17,13 @@ The major components of the template are: * **AWS Identity and Access Management** - Dedicated roles obeying "principle of least privilege" access to resources such as AWS Batch and Amazon Sagemaker Notebook instances. * **AWS Lambda** _-_ An AWS Lambda function that automates any migrations needed for the Metadata service. +Additional optional components of the template are: + +* **AWS Cloudfront** - Content Delivery Network for Metaflow User Interface static assets. +* **Application Load Balancer** - Application Load Balancer for Metaflow User Interface. 
+ +User Interface can be enabled via CloudFormation template Parameter. This step is covered under **Steps for AWS CloudFormation Deployment - Optional Metaflow User Interface** -section below. + ## Steps for AWS CloudFormation Deployment 1. Navigate to _Services_ and select _CloudFormation_ under the _Management and Governance_ heading \(or search for it in the search bar\) in your AWS console. @@ -45,3 +52,17 @@ Did you choose to enable _APIBasicAuth_ and/or _CustomRole_ and are wondering ho Once you have followed all these steps, you can [configure your metaflow installation](./#configuring-metaflow) using the outputs from the CloudFormation stack. +### Optional Metaflow User Interface (`EnableUI` -parameter) + +Did you choose to enable Metaflow User Interface and are wondering how it works? Below are some details on what needs to be done in order to deploy Metaflow User Interface. + +Please note: This section can be ignored if `EnableUI` -parameter is disabled (this is the default value). + +User Interface is provided as part of the `metaflow-cfn-template.yml` template and doesn't require any additional +configuration besides enabling the `EnableUI` -parameter. You can follow the [AWS CloudFormation Deployment](https://admin-docs.metaflow.org/metaflow-on-aws/deployment-guide/aws-cloudformation-deployment#steps-for-aws-cloudformation-deployment) instructions. + +Once deployed the Cloudformation Stack will provide two outputs: +- `UIServiceUrl` - Application Load Balancer endpoint +- `UIServiceCloudfrontUrl` - Cloudfront distribution (using ALB) endpoint with HTTPS enabled (preferred) + +Please note: Metaflow User Interface doesn't provide any authentication by default. 
\ No newline at end of file diff --git a/metaflow-on-aws/operations-guide/metaflow-ui-logical-replication-guide.md b/metaflow-on-aws/operations-guide/metaflow-ui-logical-replication-guide.md new file mode 100644 index 00000000..727db776 --- /dev/null +++ b/metaflow-on-aws/operations-guide/metaflow-ui-logical-replication-guide.md @@ -0,0 +1,41 @@ +# Metaflow UI Logical Replication Guide + +Administrators may consider logical replication for the UI service so that load and interference originating from the UI service have minimal impact on existing infrastructure. A read replica is not an option for the UI service, since the service is responsible for creating table triggers in the database, which requires write access. + +Logical replication of a table typically starts with taking a snapshot of the data on the publisher database and copying it to the subscriber. Once that is done, changes on the publisher are sent to the subscriber as they occur, in real time. The subscriber applies the data in the same order as the publisher, so transactional consistency is guaranteed for publications within a single subscription. + +Below you will find high-level instructions on how to create a publication and a subscription between two database instances. + +Please note that PostgreSQL 10.0 or above is required for logical replication.
+ +## Create publication + +**Prerequisites** +* The publisher database's [WAL level](https://www.postgresql.org/docs/10/runtime-config-wal.html) should be set to `logical` + +```sql +CREATE PUBLICATION metaflow_ui_publication + FOR TABLE flows_v3, runs_v3, steps_v3, tasks_v3, artifact_v3, metadata_v3; +``` + +Read more about [creating a publication.](https://www.postgresql.org/docs/10/sql-createpublication.html) + +## Create subscription + +**Prerequisites** +* The subscriber database should have its migrations up to date + * All tables should exist and their schema should be identical to the publisher's +* The subscriber should be able to connect to the publisher database + +```sql +CREATE SUBSCRIPTION metaflow_ui_subscription + CONNECTION 'host=publication.database.host port=5432 user=postgres password=postgres dbname=postgres' + PUBLICATION metaflow_ui_publication + WITH (enabled = true, copy_data = true); +``` + +Where +* **enabled = true** - Starts the replication process immediately +* **copy_data = true** - Copies all existing data from the publication + +Read more about [creating a subscription.](https://www.postgresql.org/docs/10/sql-createsubscription.html) diff --git a/overview/service-architecture.md b/overview/service-architecture.md index e52e80ed..5a1680a3 100644 --- a/overview/service-architecture.md +++ b/overview/service-architecture.md @@ -6,7 +6,7 @@ To benefit from the centralized [experiment tracking and sharing via Client API] ## Shared Mode Architecture -The diagram below shows an overview of services used by Metaflow in the shared mode. The services outlined in yellow are required: Development Environment, Datastore, and Metaflow Service and its database. The services outlined with dashed lines are optional. +The diagram below shows an overview of services used by Metaflow in the shared mode. The services outlined in yellow are required: Development Environment, Datastore, and Metaflow Service and its database.
The services outlined with dashed lines are optional: Compute Cluster, Production Scheduler and User Interface. ![](../.gitbook/assets/service_architecture.png) @@ -69,7 +69,13 @@ In the administrator’s point of view, an object store like S3 is effectively m ## Optional Services -The following two services are optional. They provide a way to scale out Metaflow executions and deploy Metaflow workflows in a highly available production scheduler. If your organization doesn’t require elastic scalability and occasional downtime for scheduled workflow executions is acceptable, you may ignore these services. +The following services are optional. They provide a way to scale out Metaflow executions and deploy Metaflow workflows in a highly available production scheduler. The User Interface provides a way to monitor workloads efficiently in real time within the user's own browser environment. + +You may ignore these optional services if: + +* Your organization doesn’t require elastic scalability +* Occasional downtime for scheduled workflow executions is acceptable +* The User Interface is not relevant to your organization ### Compute Cluster @@ -94,6 +100,14 @@ The user can deploy their Metaflow workflow to Production Scheduler with a singl Currently, Metaflow supports [AWS Step Functions as the Production Scheduler](https://docs.metaflow.org/going-to-production-with-metaflow/scheduling-metaflow-flows). For more background about production schedulers, see [the release blog post for Step Functions integration](https://medium.com/@NetflixTechBlog/unbundling-data-science-workflows-with-metaflow-and-aws-step-functions-d454780c6280). +### **User Interface** + +Metaflow provides an optional UI which enables the user to monitor workflows efficiently in real time within the user’s own browser environment. + +The UI service doesn’t have any built-in support for authentication.
We assume that the service is typically deployed inside a (virtual) private network that provides a secure operating environment. + +Optionally, the administrator may choose to replicate the main database to make sure that any load or interference caused by the UI service will not affect the Metadata service operation. Read more about [logical replication](../metaflow-on-aws/operations-guide/metaflow-ui-logical-replication-guide.md). + ## Security Considerations Metaflow relies on the security mechanisms and policies provided by the deployment environment, for instance, VPCs, Security Groups and IAM on AWS. These mechanisms allow you to define as fine-grained security policies as required by your organization.