Add Apache Beam operators - refactor operator - common Dataflow logic #976

Closed
wants to merge 22 commits into from

Commits on Feb 3, 2021

  1. Clean-up JS code in UI templates (apache#14019)

    - Use template literals instead of '+' for forming strings, when applicable
    - remove unused variables (gantt.html)
    - remove unused function arguments, when applicable
    XD-DENG authored Feb 3, 2021
    14805cc
  2. Add Apache Beam operators (apache#12814)

    Tobiasz Kędzierski authored Feb 3, 2021
    1872d87
  3. ffd05e6
  4. Prepare to release a new wave of providers. (apache#14013)

    For the regular providers, the vast majority are at version `1.0.1` and
    contain only documentation updates. This way we will have a consistent set
    of documentation (including commit history), and when we release
    to PyPI, the READMEs will be much smaller and link to the documentation.
    
    We have two new providers (version 1.0.0):
    
    * neo4j
    * apache.beam
    
    There are a few providers with changes:
    
    Breaking changes (2.0.0):
    
    * google
    * slack
    
    Feature changes (1.1.0):
    
    * amazon
    * exasol
    * http
    * microsoft.azure
    * openfaas
    * sftp
    * snowflake
    * ssh
    
    There were also a few providers with 'real' bugfixes (1.0.1):
    
    * apache.hive
    * cncf.kubernetes
    * docker
    * elasticsearch
    * exasol
    * mysql
    * openfaas
    * papermill
    * presto
    * sendgrid
    * sqlite
    
    The "backport packages" documentation is prepared only for those
    providers that had actual bugfixes, features, or breaking changes:
    
    ```
    amazon apache.hive cncf.kubernetes docker elasticsearch exasol google
    http microsoft.azure mysql openfaas papermill presto sendgrid sftp
    slack snowflake sqlite ssh
    ```
    
    Only those will be generated with the `2021.2.5` CalVer version.
    potiuk authored Feb 3, 2021
    88bdcfa

Commits on Feb 4, 2021

  1. Make the role assigned to anonymous users customizable (apache#14042)

    Fixes the issue wherein regardless of what role anonymous users are assigned (via the `AUTH_ROLE_PUBLIC` env var), they can't see any DAGs.
    
    Cause of the current behavior:
    Anonymous users are handled as a special case by Airflow's DAG-related security methods (`.has_access()` and `.get_accessible_dags()`). Rather than using the `AUTH_ROLE_PUBLIC` value to look up role permissions, the methods reject all access to view or edit DAGs.
    
    Changes in this PR:
    Rather than hardcoding permission rules inside the security methods, this change checks the `AUTH_ROLE_PUBLIC` value and gives anonymous users all permissions linked to the designated role. 
    
    **This places security in the hands of the Airflow users. If the value is set to `Admin`, anonymous users will have full admin functionality.**
    
    This also changes how the `Public` role is created. Currently, the `Public` role is created automatically by Flask App Builder. This PR explicitly declares `Public` as a default role with no permissions in `security.py`. This change makes it easier to test.
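
The role-based check described above can be sketched in Python as follows. This is an illustration only, not Airflow's actual code: `ROLE_PERMISSIONS`, the permission tuples, and this standalone `has_access` function are hypothetical names, and passing `None` for the roles is an assumed way of marking an anonymous user.

```python
# Hypothetical role table; the real roles and permissions live in Airflow's security manager.
ROLE_PERMISSIONS = {
    "Public": set(),  # default role declared with no permissions
    "Viewer": {("can_read", "DAG")},
    "Admin": {("can_read", "DAG"), ("can_edit", "DAG")},
}


def has_access(action, resource, user_roles, auth_role_public="Public"):
    """Return True if any of the user's roles grants (action, resource).

    Anonymous users (user_roles is None, an assumption of this sketch) are
    given the role named by auth_role_public instead of being rejected outright.
    """
    roles = user_roles if user_roles is not None else [auth_role_public]
    return any(
        (action, resource) in ROLE_PERMISSIONS.get(role, set()) for role in roles
    )


anonymous_default = has_access("can_read", "DAG", None)
anonymous_as_viewer = has_access("can_read", "DAG", None, auth_role_public="Viewer")
anonymous_as_admin = has_access("can_edit", "DAG", None, auth_role_public="Admin")
```

With the default `Public` role an anonymous user gets nothing, while mapping the public role to `Admin` grants everything that role has, which is the trade-off the bolded warning above points out.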
    
    closes: apache#13340
    jhtimmins authored Feb 4, 2021
    78aa921
  2. Retry critical methods in Scheduler loop in case of OperationalError (apache#14032)
    
    In the case of OperationalError (caused by deadlocks or network blips), the scheduler will now retry those methods 3 times.
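
A retry of this kind might look like the sketch below. It is not the scheduler's implementation: the decorator name `retry_on_operational_error` is hypothetical, the local `OperationalError` class stands in for the database driver's exception so the example has no dependencies, and reading "3 times" as three attempts in total is an assumption.

```python
import functools


class OperationalError(Exception):
    """Stand-in for the database OperationalError (assumption for this sketch)."""


def retry_on_operational_error(max_attempts=3):
    """Re-run the wrapped function when it raises OperationalError."""

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(1, max_attempts + 1):
                try:
                    return func(*args, **kwargs)
                except OperationalError:
                    # Give up and re-raise once the last attempt has failed.
                    if attempt == max_attempts:
                        raise

        return wrapper

    return decorator


calls = {"count": 0}


@retry_on_operational_error(max_attempts=3)
def flaky_critical_section():
    """Fails twice with a simulated deadlock, then succeeds."""
    calls["count"] += 1
    if calls["count"] < 3:
        raise OperationalError("simulated deadlock")
    return "ok"


result = flaky_critical_section()
```

Only `OperationalError` is retried here; any other exception propagates on the first failure, so genuine bugs are not hidden by the retry loop.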
    
    closes apache#11899
    closes apache#13668
    kaxil authored Feb 4, 2021
    914e9ce
  3. Fix broken SLA Mechanism (apache#14056)

    closes apache#14050
    
    We were not de-serializing `BaseOperator.sla` properly, hence
    we were returning float instead of `timedelta` object.
    
    Example: 100.0 instead of timedelta(seconds=100)
    
    Because of a check in `_manage_sla` in `SchedulerJob` and `DagFileProcessor`,
    SLA processing was being skipped.
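
The failure mode can be reproduced with a short sketch. The exact form of the check is not quoted in this commit message, so the `isinstance` guard below is an assumption, and `should_check_sla` and `deserialize_sla` are hypothetical helper names.

```python
from datetime import timedelta


def should_check_sla(sla):
    """Assumed shape of the guard: only timedelta values are treated as an SLA."""
    return isinstance(sla, timedelta)


def deserialize_sla(value):
    """Turn the serialized number of seconds back into a timedelta."""
    return timedelta(seconds=value)


# Serializing stores the SLA as a float number of seconds.
serialized = timedelta(seconds=100).total_seconds()

skipped_before_fix = not should_check_sla(serialized)
checked_after_fix = should_check_sla(deserialize_sla(serialized))
```

A bare float fails the type check and the SLA is silently skipped; converting it back to a `timedelta` on deserialization makes the same check pass.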
    
    SchedulerJob:
    https://github.com/apache/airflow/blob/88bdcfa0df5bcb4c489486e05826544b428c8f43/airflow/jobs/scheduler_job.py#L1766-L1768
    
    DagFileProcessor:
    https://github.com/apache/airflow/blob/88bdcfa0df5bcb4c489486e05826544b428c8f43/airflow/jobs/scheduler_job.py#L395-L397
    kaxil authored Feb 4, 2021
    604a37e
  4. 84ef24c
  5. e2a06a3
  6. d45739f
  7. eb78a8b
  8. Correctly capture debug logs in plugin tests. (apache#14058)

    This fixes the test test_should_load_plugins_from_property, which is currently quarantined as a "Heisentest".
    
    Current behavior:
    The test currently fails because the records that it expects to find in the logger are not present.
    
    Cause:
    While the test sets the log level to `DEBUG`, it doesn't specify which logger to update. Python loggers are namespaced (typically by module name), but the name has to be given explicitly. When no logger name is specified, the lookup returns the root logger.
    
    The test therefore updates the log level of the root logger, but `plugins_manager.py` defines a namespaced logger, `log = logging.getLogger(__name__)`, that is used throughout the file. Since a different logger is used, its original log level (in this case `INFO`) still applies. `INFO` is a higher level than `DEBUG`, so the calls to `log.debug()` are filtered out, and when the test looks for log records it finds an empty list.
    
    Fix:
    Just specify which logger to update when modifying the log level in the test.
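
The effect can be demonstrated with the standard `logging` module alone. This is a sketch, not the test from the PR: the logger name `demo.plugins_manager` and the `ListHandler` class are hypothetical, and the explicit `INFO` level on the namespaced logger mirrors the situation described above.

```python
import logging


class ListHandler(logging.Handler):
    """Collects emitted records in a list so they can be inspected."""

    def __init__(self):
        super().__init__()
        self.records = []

    def emit(self, record):
        self.records.append(record)


root = logging.getLogger()  # no name given, so this is the root logger
plugin_log = logging.getLogger("demo.plugins_manager")  # hypothetical namespaced logger
plugin_log.setLevel(logging.INFO)  # level configured on the namespaced logger

handler = ListHandler()
plugin_log.addHandler(handler)

# Changing only the root logger does not affect a logger with its own level.
previous_root_level = root.level
root.setLevel(logging.DEBUG)
plugin_log.debug("loading plugin")
captured_before = len(handler.records)

# Updating the namespaced logger itself lets the debug record through.
plugin_log.setLevel(logging.DEBUG)
plugin_log.debug("loading plugin")
captured_after = len(handler.records)

root.setLevel(previous_root_level)  # restore the root logger's level
```

In a pytest test, the equivalent fix is to pass the logger name when raising the level, for example through the `logger` argument of `caplog.set_level`.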
    jhtimmins authored Feb 4, 2021
    e80ad5a
  9. 2a3960f
  10. Update to Pytest 6.0 (apache#14065)

    And pytest 6 removed a class that the rerunfailures plugin was using, so
    we have to upgrade that too.
    ashb authored Feb 4, 2021
    10c026c
  11. Remove permissions to read Configurations for User and Viewer roles (apache#14067)
    
    Only `Admin` or `Op` roles should have permissions to view Configurations.
    
    Previously, users with the `User` or `Viewer` role were able to get/view configurations using
    the REST API or the Webserver. From Airflow 2.0.1, only users with the `Admin` or `Op` role can
    get/view configurations.
    kaxil authored Feb 4, 2021
    3909232
  12. Fix DAGs mount path in Kubernetes worker pod when gitSync is enabled (apache#13826)
    
    * Update pod-template-file.kubernetes-helm-yaml
    
    * Fix ssh-key access issue
    
    This change allows dags.gitSync.containerName to read ssh-key from file system.
    Similar to this https://github.com/varunvora/airflow/blob/ce0e6280d2ea39838e9f0617625cd07a757c3461/chart/templates/scheduler/scheduler-deployment.yaml#L92
    It solves apache#13680 issue for private repositories.
    
    Co-authored-by: Denis Krivenko <[email protected]>
    varunvora and dnskr authored Feb 4, 2021
    5f74219
  13. Small docs readme update (apache#14062)

    * Add instruction for running docs locally
    
    * Fix RST syntax
    
    * Update docs/README.rst
    
    Co-authored-by: Kaxil Naik <[email protected]>
    leahecole and kaxil authored Feb 4, 2021
    f6cfc41
  14. 26c2f4d
  15. Fix Kerberos network creation on older docker-compose (apache#14070)

    `attachable` is only a property of compose version 3.1 files, but we are
    on 2.2 still.
    
    This was failing on self-hosted runners with an error
    `networks.example.com value Additional properties are not allowed
    ('attachable' was unexpected)`
    ashb authored Feb 4, 2021
    5dbabdd

Commits on Feb 5, 2021

  1. 94a5e08
  2. Refactor operator - common dataflow logic

    Tobiasz Kędzierski committed Feb 5, 2021
    161a196
  3. fixup! Refactor operator - common dataflow logic

    Tobiasz Kędzierski committed Feb 5, 2021
    dfa5232