Do not merge this one - [DSRE-6] - Upgrade Airflow wtmo to 2.1 #1334

Closed
wants to merge 7 commits

Conversation

@haroldwoo (Contributor) commented Jul 8, 2021

From: https://airflow.apache.org/announcements/

"May 21, 2021

I’m happy to announce that Apache Airflow 2.1.0 was just released. This one includes a raft of fixes and other small improvements, but some notable additions include:

  • A new DAG Calendar View to show the status of your DAG runs across time more easily
  • The cross-DAG-dependencies view (which used to be an external plugin) is now part of core
  • Mask passwords and sensitive info in task logs and UI (finally!)
  • Improvements to webserver start-up time (mostly around time spent syncing DAG permissions)

Please note that this release no longer includes the HTTP extra provider by default, as we discovered that it pulls in an LGPL dependency (via the requests module of all places), so it is now optional."

Cloud Composer versions (just for reference)
https://cloud.google.com/composer/docs/concepts/versioning/composer-versions

More details in DSRE-6

Commit messages (because these will eventually be squashed):

  • dag concurrency is now max_active_tasks_per_dag
  • http provider no longer installed by default
  • remove apply_defaults decorator where the operator inherits directly from BaseOperator
  • default_queue has been moved from celery to operators config section
  • Default min_file_process_interval set to 30 after Airflow 2.0.1, because scheduling decisions have been moved from the DagFileProcessor to the scheduler
  • Change maintainer for Dockerfile
  • patch security vulnerabilities, modify airflow.cfg to be 2.0-compliant
  • Fix upgrade_check 'No additional argument allowed in BaseOperator' (kwargs)
  • Modify all airflow.contrib.operators* import paths (see the import sketch after this list).
  • replace hook imports with the 2.0 equivalents
  • modify custom task sensor's exception message to specify that it's Mozilla-specific
  • Remove scheduler run duration config option for 2.0 now that scheduler is considered 'more stable'
  • removed import mechanism deprecated in 2.0; imports now need to be explicit
  • changes to sensor imports
  • fix AirflowException import path
  • Normalize all instances of bigquery_conn_id and google_cloud_storage_conn_id to gcp_conn_id,
    except for our backported bigquery 1_10_2 code, where new code also references multiple conn ids.
    The LTV DAG references this backported code, so it is not changed.
    This commit also replaces GoogleCloudStorageDeleteOperator with the newer GCSDeleteObjectsOperator for 2.0
    (in places I missed with a previous commit)
  • rename dataproc jars argument, increase master disk size to match the new default for dataproc clusters (500 GB -> 1 TB)
  • update pip install endpoints for airflow extras
  • rm apply_defaults decorator, as in 2.0 it is automatically added to all operators via the BaseOperator metaclass (see the sketch after this list)
  • Add configuration for masking connection and variable values in the UI
  • Add new config value for default pool size but commented out since it will no-op
  • breaking changes in oauth google provider for 2.0
  • Bringing airflow.cfg file up to current 2.1, by diffing with our current config and looking at deployed env vars.
  • make requirements pip compile and 2.1 ready
  • Containers will now run (authlib~=0.15.3)
  • adding flask-admin to pip requirements
  • modify airflow connections commands to work with 2.0
  • add webserver config for dev mode that bypasses authentication for local dev (see the webserver_config.py sketch after this list)
  • rm deprecated webserver rbac config and logic from plugins
  • add airflow-worker.pid to .gitignore
  • add fivetran provider
  • fix bin/test-parse
  • progress on deprecating old custom moz_dataproc_operator
  • Support new Dataproc Operator create/delete, by using cluster generator to construct object and making custom mods.
  • rm moz_dataproc_operator now that functionality was moved to utils/dataproc.py
  • airflow variables cli changed
  • BigQueryToCloudStorageOperator deprecated in favor of BigQueryToGCSOperator
  • add comments and modify dag serialization intervals for local dev testing
  • fix sleep op
  • remove references to get_default_executor as it no longer exists in 2.0 at that import location. It is also called by base classes when not specified, so this was redundant
  • replace deprecated AwsHook with AwsBaseHook
  • default_queue has been moved from celery to operators config section
  • rm release_telemetry_aggregates (not in use), and some fixes to import errors for glam and exps live
  • modify deprecated S3ToGoogleCloudStorageTransferOperator to use new operator
  • fix prio DAGs; deprecated GoogleCloudStorageToGoogleCloudStorageOperator, and GoogleCloudStorageHook is now GCSHook
  • can only concat tuples (not lists) to templated fields now (see the template_fields sketch after this list)
  • This is potentially part of the fix for the 'could not deserialize key data' load errors. You need to load a valid gcp_conn_id into the UI to make some of these errors disappear, because DAG loading now actually tries to load the key
  • Rm references to GoogleCloudBaseHook. We relied on it to parse the project_id out of the gcp_conn_id, and in local dev testing Airflow 1.10.x didn't parse these connections. In 2.1, Airflow does parse them and raises errors because we inject dummy data. Another option would be to construct better dummy data, or preferably not load any DAGs at all (removing the need for dummy vars/connections) and have local developers work only on the DAG they're interested in
  • Modify prio DAGs to not use GoogleCloudBaseHook to parse out the project_id
  • more changes to support prio changes from the previous commit
  • nit to consolidate vars
  • fix docker-compose and comment out dev dags for testing
  • fix prio dag, test changes to dags with awsbasehook
  • remove emoji that prevented the webapp from starting up (UnicodeEncodeError: 'latin-1')
  • Variable.get is now Variable.get_val if you want to retrieve from the metastore db
  • fix prio DAG typo env_var to env_vars; change dataproc_spark_jars etc. to the new values dataproc_jars and related
  • Variable.get no longer works as intended: it resolves through secrets backends. Variable.get_val is no longer a class method, so an instance has to be created first and the method called on it (see the Variable sketch after this list)
  • in 2.0 the dataproc cluster_config object keys have changed; this needs to be tested for num_local_ssds and enable_component_gateway
  • Make variable class construction more explicit
  • explicitly set project_id to short-circuit GCS hook project_id lookups that fail locally with dummy connections
  • rm backported BQ operator, update LTV to use the new operator
  • Fix backfill (alpha) plugin, only basic testing performed
  • Just an update to a comment in glam.py noting that SubDagOperators don't use SequentialExecutor anymore
  • Fix monkey-patch gevent warnings (details in DSRE-6) in airflow_local_settings.py by importing gevent and calling monkey.patch_all()
  • Fix mozmenu ui plugin
  • Upgrade from 2.1.0 to 2.1.1 because of a secret masking issue affecting UI logs
  • GkePodOperator will now inherit from the upstream GKEStartPodOperator. Removing old legacy patch code
  • Bump google cloud dataproc provider to 2.5.0 to address dataproc v1beta2 client code mismatch with clusterConfig when creating spark clusters
  • Modify dataproc code to use patched versions. This wasn't the actual issue, but leaving this code here due to the deprecation of the dataproc_v1beta2 services/API anyway. Pinning the working versions of dataproc code as well
  • Fix Variable.get calls to not return None
  • Modify dataproc to use upstream code, keep patched code to avoid redeployment when google deprecates beta apis
  • Make local value of scheduler dag_dir_list_interval 30, and 300 in prod via env var
  • add dags/.airflowignore empty file

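For reference, a minimal sketch of the 1.10 -> 2.x import migrations covered by the commits above. The exact modules touched in this PR may differ; these are representative provider-package paths:

    # Airflow 1.10.x contrib paths (removed in 2.0)
    # from airflow.contrib.operators.bigquery_to_gcs import BigQueryToCloudStorageOperator
    # from airflow.contrib.hooks.gcs_hook import GoogleCloudStorageHook
    # from airflow.contrib.hooks.aws_hook import AwsHook

    # Airflow 2.x provider-package equivalents
    from airflow.providers.google.cloud.transfers.bigquery_to_gcs import BigQueryToGCSOperator
    from airflow.providers.google.cloud.hooks.gcs import GCSHook
    from airflow.providers.amazon.aws.hooks.base_aws import AwsBaseHook
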
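The apply_defaults change amounts to deleting the decorator; a minimal sketch (MyCustomOperator and my_param are hypothetical, not from this PR):

    from airflow.models import BaseOperator

    # In 2.x, BaseOperator's metaclass applies default_args automatically,
    # so the @apply_defaults decorator on __init__ is no longer needed.
    class MyCustomOperator(BaseOperator):
        def __init__(self, my_param=None, **kwargs):
            super().__init__(**kwargs)
            self.my_param = my_param
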
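The templated-fields change is about tuple concatenation; a sketch assuming a hypothetical operator subclass:

    from airflow.providers.google.cloud.operators.gcs import GCSDeleteObjectsOperator

    class MyGcsOperator(GCSDeleteObjectsOperator):  # hypothetical subclass
        # template_fields is a tuple in 2.x, so concatenating a list raises
        # TypeError; concatenate a tuple instead.
        template_fields = GCSDeleteObjectsOperator.template_fields + ("extra_arg",)
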
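A sketch of the Variable behavior described above ("my_key" is a hypothetical variable name):

    from airflow.models import Variable
    from airflow.utils.session import create_session

    # Variable.get() resolves through any configured secrets backends first.
    value = Variable.get("my_key", default_var=None)

    # get_val() is an instance method, so fetch the row from the metastore
    # and call it on that instance.
    with create_session() as session:
        var = session.query(Variable).filter(Variable.key == "my_key").one_or_none()
        raw = var.get_val() if var is not None else None
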
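One common way to bypass auth for local dev is via the standard Flask-AppBuilder knobs; this is a sketch, not necessarily this PR's exact webserver_config.py:

    # webserver_config.py -- local development only, never in production
    from flask_appbuilder.security.manager import AUTH_DB

    AUTH_TYPE = AUTH_DB
    # Give unauthenticated visitors the Admin role so the login screen is skipped.
    AUTH_ROLE_PUBLIC = "Admin"
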
@haroldwoo added the wip label Jul 8, 2021
@haroldwoo changed the title "WIP - Upgrade Airflow wtmo to 2.0" → "WIP - [DSRE-6] - Upgrade Airflow wtmo to 2.0" Jul 8, 2021
@haroldwoo changed the title "WIP - [DSRE-6] - Upgrade Airflow wtmo to 2.0" → "[DSRE-6] - Upgrade Airflow wtmo to 2.0" Jul 8, 2021
@haroldwoo changed the title "[DSRE-6] - Upgrade Airflow wtmo to 2.0" → "[DSRE-6] - Upgrade Airflow wtmo to 2.1" Aug 17, 2021
@haroldwoo removed the wip label Oct 12, 2021
@scholtzan (Contributor) left a comment:

Looks good to me

@@ -1,5 +1,5 @@
 from airflow import DAG
-from airflow.operators.sensors import ExternalTaskSensor
+from airflow.sensors.external_task import ExternalTaskSensor
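For context, a minimal hedged sketch of the 2.x sensor at its new import path (the DAG and task ids are hypothetical, not from this diff):

    from datetime import datetime
    from airflow import DAG
    from airflow.sensors.external_task import ExternalTaskSensor

    with DAG("example_dag", start_date=datetime(2021, 1, 1), schedule_interval=None) as dag:
        wait_upstream = ExternalTaskSensor(
            task_id="wait_for_upstream",     # hypothetical
            external_dag_id="upstream_dag",  # hypothetical upstream DAG id
            external_task_id="final_task",   # hypothetical upstream task id
        )
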
@@ -1,7 +1,7 @@
 import datetime
 
 from airflow import DAG
-from airflow.contrib.hooks.aws_hook import AwsHook
+from airflow.providers.amazon.aws.hooks.base_aws import AwsBaseHook
 from operators.task_sensor import ExternalTaskCompletedSensor
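AwsBaseHook differs from the old AwsHook mainly in requiring a client_type (or resource_type); a hedged sketch ("aws_default" and "s3" are illustrative values):

    from airflow.providers.amazon.aws.hooks.base_aws import AwsBaseHook

    # get_credentials() returns the frozen (access key, secret key, token)
    # credentials for the connection.
    hook = AwsBaseHook(aws_conn_id="aws_default", client_type="s3")
    creds = hook.get_credentials()
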
Contributor:

Doesn't need to be part of this PR, but once we switch to 2.1 we can drop the custom sensor here and just use the default ExternalTaskSensor.

@haroldwoo (author):

Aw, our GitHub/Jira integration was supposed to change that to a URL link to https://mozilla-hub.atlassian.net/browse/DSRE-200.

I guess it only works in the description field.

dags/incline_dash.py (review thread outdated, resolved)
AwsBaseHook(aws_conn_id=aws_conn_id, client_type='s3').get_credentials() if aws_conn_id else (),
)
if value is not None
}
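The excerpt above is the tail of a dict comprehension; a hedged reconstruction for readability (the env-var names, the zip structure, and the connection id are assumptions, not taken from the diff):

    from airflow.providers.amazon.aws.hooks.base_aws import AwsBaseHook

    aws_conn_id = "aws_prod"  # hypothetical connection id

    env = {
        key: value
        for key, value in zip(
            # hypothetical env-var names for the credential triple
            ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY", "AWS_SESSION_TOKEN"),
            AwsBaseHook(aws_conn_id=aws_conn_id, client_type="s3").get_credentials()
            if aws_conn_id
            else (),
        )
        if value is not None
    }
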
Contributor:

This is a bit verbose, but I think the explicitness is a good thing, so that future developers working on this DAG can know the dev behavior just by looking at the content of the DAG.

dags/probe_scraper.py (review thread outdated, resolved)
@haroldwoo force-pushed the ta-2.0 branch 2 times, most recently from 47377b2 to cbf4a21 on October 12, 2021 23:49
@haroldwoo (author):

Sorry, this is moving to #1377; I screwed up the branch somehow.

@haroldwoo changed the title "[DSRE-6] - Upgrade Airflow wtmo to 2.1" → "Do not merge this one - [DSRE-6] - Upgrade Airflow wtmo to 2.1" Oct 13, 2021
@haroldwoo closed this Oct 18, 2021