Merge branch 'main' into patch-1
tatiana authored Dec 14, 2023
2 parents 0707c36 + d062543 commit 5ce4c42
Showing 32 changed files with 1,300 additions and 285 deletions.
28 changes: 16 additions & 12 deletions .pre-commit-config.yaml
@@ -22,7 +22,7 @@ repos:
- id: end-of-file-fixer
- id: mixed-line-ending
- id: pretty-format-json
args: ['--autofix']
args: ["--autofix"]
- id: trailing-whitespace
- repo: https://github.com/codespell-project/codespell
rev: v2.2.6
@@ -54,7 +54,7 @@ repos:
- --py37-plus
- --keep-runtime-typing
- repo: https://github.com/astral-sh/ruff-pre-commit
rev: v0.1.6
rev: v0.1.7
hooks:
- id: ruff
args:
@@ -63,30 +63,34 @@ repos:
rev: 23.11.0
hooks:
- id: black
args: [ "--config", "./pyproject.toml" ]
args: ["--config", "./pyproject.toml"]
- repo: https://github.com/asottile/blacken-docs
rev: 1.16.0
hooks:
- id: blacken-docs
alias: black
additional_dependencies: [black>=22.10.0]
- repo: https://github.com/pre-commit/mirrors-mypy
rev: 'v1.7.1'
rev: "v1.7.1"

hooks:
- id: mypy
name: mypy-python
additional_dependencies: [types-PyYAML, types-attrs, attrs, types-requests, types-python-dateutil, apache-airflow]
args: [--config-file, "./pyproject.toml"]
additional_dependencies:
[
types-PyYAML,
types-attrs,
attrs,
types-requests,
types-python-dateutil,
apache-airflow,
]
files: ^cosmos
- repo: https://github.com/pycqa/flake8
rev: 6.1.0
hooks:
- id: flake8
entry: pflake8
additional_dependencies: [pyproject-flake8]

ci:
autofix_commit_msg: 🎨 [pre-commit.ci] Auto format from pre-commit.com hooks
autoupdate_commit_msg: ⬆ [pre-commit.ci] pre-commit autoupdate
skip:
- mypy # build of https://github.com/pre-commit/mirrors-mypy:types-PyYAML,types-attrs,attrs,types-requests,
- mypy # build of https://github.com/pre-commit/mirrors-mypy:types-PyYAML,types-attrs,attrs,types-requests,
#types-python-dateutil,[email protected] for python@python3 exceeds tier max size 250MiB: 262.6MiB
27 changes: 22 additions & 5 deletions CHANGELOG.rst
@@ -1,7 +1,7 @@
Changelog
=========

1.3.0a2 (2023-11-23)
1.3.0a3 (2023-12-07)
--------------------

Features
@@ -10,6 +10,24 @@ Features
* Add ``ProfileMapping`` for Snowflake encrypted private key path by @ivanstillfront in #608
* Add support for Snowflake encrypted private key environment variable by @DanMawdsleyBA in #649
* Add ``DbtDocsGCSOperator`` for uploading dbt docs to GCS by @jbandoro in #616
* Add support to select using (some) graph operators when using ``LoadMode.CUSTOM`` and ``LoadMode.DBT_MANIFEST`` by @tatiana in #728
* Add cosmos/propagate_logs Airflow config support for disabling log pr… by @agreenburg in #648
* Add operator_args ``full_refresh`` as a templated field by @joppevos in #623
* Expose environment variables and dbt variables in ``ProjectConfig`` by @jbandoro in #735

Enhancements

* Make Pydantic an optional dependency by @pixie79 in #736
* Create a symbolic link to ``dbt_packages`` when ``dbt_deps`` is False when using ``LoadMode.DBT_LS`` by @DanMawdsleyBA in #730
* Support no ``profile_config`` for ``ExecutionMode.KUBERNETES`` and ``ExecutionMode.DOCKER`` by @MrBones757 and @tatiana in #681 and #731
* Add ``aws_session_token`` for Athena mapping by @benjamin-awd in #663

Others

* Replace flake8 for Ruff by @joppevos in #743
* Reduce code complexity to 8 by @joppevos in #738
* Update conflict matrix between Airflow and dbt versions by @tatiana in #731
* Speed up integration tests by @jbandoro in #732


1.2.5 (2023-11-23)
@@ -46,14 +64,13 @@ Others
* Docs: add execution config to MWAA code example by @ugmuka in #674
* Docs: highlight DAG examples in docs by @iancmoritz and @jlaneve in #695


1.2.3 (2023-11-09)
------------------

Features
Bug fix

* Add ``ProfileMapping`` for Vertica by @perttus in #540
* Add ``ProfileMapping`` for Snowflake encrypted private key path by @ivanstillfront in #608
* Add ``DbtDocsGCSOperator`` for uploading dbt docs to GCS by @jbandoro in #616
* Fix reusing config across TaskGroups/DAGs by @tatiana in #664


1.2.2 (2023-11-06)
2 changes: 1 addition & 1 deletion cosmos/__init__.py
@@ -5,7 +5,7 @@
Contains dags, task groups, and operators.
"""
__version__ = "1.3.0a2"
__version__ = "1.3.0a3"


from cosmos.airflow.dag import DbtDag
13 changes: 12 additions & 1 deletion cosmos/airflow/graph.py
@@ -18,6 +18,7 @@
from cosmos.core.graph.entities import Task as TaskMetadata
from cosmos.dbt.graph import DbtNode
from cosmos.log import get_logger
from typing import Union


logger = get_logger(__name__)
@@ -271,7 +272,17 @@ def build_airflow_graph(
for leaf_node_id in leaves_ids:
tasks_map[leaf_node_id] >> test_task

# Create the Airflow task dependencies between non-test nodes
create_airflow_task_dependencies(nodes, tasks_map)


def create_airflow_task_dependencies(
nodes: dict[str, DbtNode], tasks_map: dict[str, Union[TaskGroup, BaseOperator]]
) -> None:
"""
Create the Airflow task dependencies between non-test nodes.
:param nodes: Dictionary mapping dbt nodes (node.unique_id to node)
:param tasks_map: Dictionary mapping dbt nodes (node.unique_id to Airflow task)
"""
for node_id, node in nodes.items():
for parent_node_id in node.depends_on:
# depending on the node type, it will not have mapped 1:1 to tasks_map
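The extracted `create_airflow_task_dependencies` helper walks every node and links each parent's task to the node's own task. The hunk is cut off before the loop body, so what follows is only a minimal sketch of that pattern, not the Cosmos implementation. `Node`, `Task`, and `link_tasks` are hypothetical stand-ins for `DbtNode`, an Airflow task, and the helper; the membership check is an assumption based on the comment that nodes do not always map 1:1 to `tasks_map`.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical stand-in for DbtNode: only the fields the loop reads."""

    unique_id: str
    depends_on: list[str] = field(default_factory=list)


@dataclass
class Task:
    """Hypothetical stand-in for an Airflow task; `>>` records a downstream edge."""

    task_id: str
    downstream: list[str] = field(default_factory=list)

    def __rshift__(self, other: Task) -> Task:
        self.downstream.append(other.task_id)
        return other


def link_tasks(nodes: dict[str, Node], tasks_map: dict[str, Task]) -> None:
    """Link parent tasks to child tasks, skipping nodes that have no task."""
    for node_id, node in nodes.items():
        for parent_id in node.depends_on:
            # Assumption: a node or its parent may have been filtered out
            # or folded into another task, so both lookups are guarded.
            if node_id in tasks_map and parent_id in tasks_map:
                tasks_map[parent_id] >> tasks_map[node_id]


# Usage: "source.raw" has no task of its own, so only a -> b is linked.
nodes = {
    "model.a": Node("model.a", depends_on=["source.raw"]),
    "model.b": Node("model.b", depends_on=["model.a"]),
}
tasks_map = {"model.a": Task("a"), "model.b": Task("b")}
link_tasks(nodes, tasks_map)
```

Pulling the loop into its own function, as the commit does, keeps `build_airflow_graph` shorter and lets the wiring be tested without building a full DAG.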
22 changes: 20 additions & 2 deletions cosmos/config.py
@@ -7,6 +7,7 @@
import tempfile
from dataclasses import InitVar, dataclass, field
from pathlib import Path
import warnings
from typing import Any, Iterator, Callable

from cosmos.constants import DbtResourceType, TestBehavior, ExecutionMode, LoadMode, TestIndirectSelection
@@ -39,10 +40,11 @@ class RenderConfig:
:param load_method: The parsing method for loading the dbt model. Defaults to AUTOMATIC
:param select: A list of dbt select arguments (e.g. 'config.materialized:incremental')
:param exclude: A list of dbt exclude arguments (e.g. 'tag:nightly')
:param selector: Name of a dbt YAML selector to use for parsing. Only supported when using ``load_method=LoadMode.DBT_LS``.
:param dbt_deps: Configure to run dbt deps when using dbt ls for dag parsing
:param node_converters: a dictionary mapping a ``DbtResourceType`` into a callable. Users can control how to render dbt nodes in Airflow. Only supported when using ``load_method=LoadMode.DBT_MANIFEST`` or ``LoadMode.DBT_LS``.
:param dbt_executable_path: The path to the dbt executable for dag generation. Defaults to dbt if available on the path.
:param env_vars: A dictionary of environment variables for rendering. Only supported when using ``LoadMode.DBT_LS``.
:param env_vars: (Deprecated since Cosmos 1.3 use ProjectConfig.env_vars) A dictionary of environment variables for rendering. Only supported when using ``LoadMode.DBT_LS``.
:param dbt_project_path Configures the DBT project location accessible on the airflow controller for DAG rendering. Mutually Exclusive with ProjectConfig.dbt_project_path. Required when using ``load_method=LoadMode.DBT_LS`` or ``load_method=LoadMode.CUSTOM``.
"""

@@ -51,15 +53,21 @@ class RenderConfig:
load_method: LoadMode = LoadMode.AUTOMATIC
select: list[str] = field(default_factory=list)
exclude: list[str] = field(default_factory=list)
selector: str | None = None
dbt_deps: bool = True
node_converters: dict[DbtResourceType, Callable[..., Any]] | None = None
dbt_executable_path: str | Path = get_system_dbt()
env_vars: dict[str, str] = field(default_factory=dict)
env_vars: dict[str, str] | None = None
dbt_project_path: InitVar[str | Path | None] = None

project_path: Path | None = field(init=False)

def __post_init__(self, dbt_project_path: str | Path | None) -> None:
if self.env_vars:
warnings.warn(
"RenderConfig.env_vars is deprecated since Cosmos 1.3 and will be removed in Cosmos 2.0. Use ProjectConfig.env_vars instead.",
DeprecationWarning,
)
self.project_path = Path(dbt_project_path) if dbt_project_path else None

def validate_dbt_command(self, fallback_cmd: str | Path = "") -> None:
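The `__post_init__` change above deprecates a dataclass field by warning only when the caller actually sets it. A reduced sketch of the same pattern follows; `RenderSettings` is a hypothetical stand-in for `RenderConfig`, and the `stacklevel=3` argument is an addition that is not in the commit.

```python
from __future__ import annotations

import warnings
from dataclasses import dataclass


@dataclass
class RenderSettings:
    """Hypothetical, reduced stand-in for RenderConfig."""

    # Defaulting to None (rather than an empty dict) lets the class tell
    # "not provided" apart from "provided".
    env_vars: dict[str, str] | None = None

    def __post_init__(self) -> None:
        # Truthiness check: None and {} are both treated as "not set".
        if self.env_vars:
            warnings.warn(
                "RenderSettings.env_vars is deprecated; "
                "use the project-level env_vars instead.",
                DeprecationWarning,
                # Assumption, not in the commit: skip __post_init__ and the
                # generated __init__ so the warning points at the caller.
                stacklevel=3,
            )


# Usage: only the first construction should emit a warning.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    RenderSettings(env_vars={"DBT_TARGET": "dev"})
    RenderSettings()
```

Switching the default from `field(default_factory=dict)` to `None`, as the diff does, matches the `dict[str, str] | None` type used for the new `ProjectConfig.env_vars` argument.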
@@ -96,6 +104,11 @@ class ProjectConfig:
:param manifest_path: The absolute path to the dbt manifest file. Defaults to None
:param project_name: Allows the user to define the project name.
Required if dbt_project_path is not defined. Defaults to the folder name of dbt_project_path.
:param env_vars: Dictionary of environment variables that are used for both rendering and execution. Rendering with
env vars is only supported when using ``RenderConfig.LoadMode.DBT_LS`` load mode.
:param dbt_vars: Dictionary of dbt variables for the project. This argument overrides variables defined in your dbt_project.yml
file. The dictionary is dumped to a yaml string and passed to dbt commands as the --vars argument. Variables are only
supported for rendering when using ``RenderConfig.LoadMode.DBT_LS`` and ``RenderConfig.LoadMode.CUSTOM`` load mode.
"""

dbt_project_path: Path | None = None
@@ -113,6 +126,8 @@ def __init__(
snapshots_relative_path: str | Path = "snapshots",
manifest_path: str | Path | None = None,
project_name: str | None = None,
env_vars: dict[str, str] | None = None,
dbt_vars: dict[str, str] | None = None,
):
# Since we allow dbt_project_path to be defined in ExecutionConfig and RenderConfig
# dbt_project_path may not always be defined here.
@@ -136,6 +151,9 @@ def __init__(
if manifest_path:
self.manifest_path = Path(manifest_path)

self.env_vars = env_vars
self.dbt_vars = dbt_vars

def validate_project(self) -> None:
"""
Validates necessary context is present for a project.
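The `ProjectConfig` docstring says `dbt_vars` is dumped to a YAML string and passed to dbt commands as `--vars`. The code that does this is not part of the visible diff, so the following is only a sketch of the idea. `build_vars_args` is a hypothetical helper, and it serialises with the standard library's `json.dumps` instead of a YAML library; JSON output should also parse as YAML flow syntax, but that is a substitution for the YAML dump the docstring describes.

```python
from __future__ import annotations

import json


def build_vars_args(dbt_vars: dict[str, str] | None) -> list[str]:
    """Return the extra CLI arguments for dbt variables, or [] if none are set."""
    if not dbt_vars:
        return []
    # One string argument holding the whole mapping, e.g. '{"key": "value"}'.
    return ["--vars", json.dumps(dbt_vars)]


# Usage: append the arguments to a dbt command line built as a list.
command = ["dbt", "ls"] + build_vars_args({"start_date": "2023-12-01"})
```

Building the command as a list rather than a single shell string avoids having to quote the serialised mapping.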