Add testing #22

Merged · 31 commits · Aug 11, 2022
Changes from 27 commits
c6b517d
Add working script to run macro
JCZuurmond Dec 28, 2021
b3b0a0a
Add comment about adapters
JCZuurmond Dec 28, 2021
5bb9728
Try using a project instead of runtime config
JCZuurmond Dec 28, 2021
4e31232
Remove spark credentials and Project
JCZuurmond Dec 28, 2021
37aa6e6
Use connection from soda spark
JCZuurmond Dec 28, 2021
69ed207
Add test requirements
JCZuurmond Dec 28, 2021
236286d
Add pytest ini
JCZuurmond Aug 5, 2022
d72ebf2
Move everything into pytest fixtures
JCZuurmond Dec 28, 2021
18170a1
Copy connection
JCZuurmond Dec 29, 2021
25e5806
Remove pytest-dbt-core code
JCZuurmond Jan 28, 2022
409d827
Add pytest dbt core as test requirement
JCZuurmond Jan 28, 2022
56c848a
Add workflow for testing
JCZuurmond Jan 28, 2022
1560e5e
Bump pytest dbt core version
JCZuurmond May 27, 2022
3014264
Add profile to dbt project
JCZuurmond May 27, 2022
153708b
Add profiles
JCZuurmond May 27, 2022
f9b0db7
Add profiles dir when running pytest
JCZuurmond May 27, 2022
52307bb
Remove redundant from future import annotations
JCZuurmond May 27, 2022
cb447a2
Bump pytest-dbt-core version
JCZuurmond Jul 22, 2022
8b7eb8f
Change version
JCZuurmond Jul 22, 2022
6dfd9f7
Add pyspark dependency
JCZuurmond Jul 22, 2022
91b6bb1
Change pyspark dependency to dbt-spark session
JCZuurmond Jul 22, 2022
30112b0
Change required by to dbt-spark
JCZuurmond Jul 23, 2022
59f2139
Add test docstring
JCZuurmond Aug 5, 2022
74482a7
Make test less strict
JCZuurmond Aug 5, 2022
df29346
Create and delete table with fixture
JCZuurmond Aug 5, 2022
ffe50cb
Fix typo
JCZuurmond Aug 5, 2022
8b13fda
Add section about testing to the documentation
JCZuurmond Aug 5, 2022
29e88d6
Move test macros into tests/unit
JCZuurmond Aug 11, 2022
ee25a3e
Run unit tests only in Github action
JCZuurmond Aug 11, 2022
bbc7923
Merge dev and test requirements
JCZuurmond Aug 11, 2022
d380607
Move conftest into functional
JCZuurmond Aug 11, 2022
29 changes: 29 additions & 0 deletions .github/workflows/workflow.yml
@@ -0,0 +1,29 @@
name: Test

on:
  pull_request:
  push:
    branches:
      - main

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2

      - name: Set up Python 3.9
        uses: actions/setup-python@v2
        with:
          python-version: 3.9

      - name: Install dependencies
        shell: bash
        run: |
          sudo apt-get install libsasl2-dev
          python -m pip install --upgrade pip
          python -m pip install -r test-requirements.txt

      - name: Run pytest
        shell: bash
        run: DBT_PROFILES_DIR=$PWD pytest tests/
Collaborator

I see - this tries to run tests/functional/test_utils.py as well (added in #25), which has other requirements (specified in dev-requirements.txt).

Options:

  1. Unify: Add `python -m pip install -r dev-requirements.txt` to the "Install dependencies" step above (or, better yet, unify dev-requirements + test-requirements into a single file) and try running the tests with `--profile session`. I imagine that a few may fail (as in dbt-spark), but the rest should succeed, and we can mark the failing ones to skip on the "session" profile (a sketch of such a marker follows this comment).
  2. Separate: Treat these as two totally independent testing frameworks, which is really what they are. Create tests/functional for the tests using the dbt-core functional-testing framework, and tests/unit for the tests using the pytest-dbt-core unit-testing framework. Have clearly separate READMEs and files used by each: separate requirements, separate profile setup, all that jazz.

What do you think? I lean toward option 2, "separate": then tests/unit could be a clear, self-contained, real-world demonstration of the potential of pytest-dbt-core.
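
For the skip-marking mentioned in option 1, here is a rough sketch, modeled on the `skip_profile` marker used in dbt-spark's own test suite. The marker, its registration, and the test name are assumptions for illustration, not part of this PR:

``` python
import pytest

# Assumed custom marker, as in dbt-spark's functional tests; it would need
# to be registered in pytest.ini and acted on by a hook in a local conftest.py.
@pytest.mark.skip_profile("session")
def test_macro_against_real_cluster() -> None:
    ...
```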

Contributor

+1 to option 2, "separate" (especially different folders tests/unit vs. tests/functional).

Separate folders won't undermine the functionality at all, and it will make the delineation between the two frameworks clearer.

Side note: I don't see a problem with them sharing a single dev-requirements.txt though.

Contributor Author

I moved test_macros.py into a unit subfolder, and I merged the test-requirements.txt into the dev-requirements.txt. I think it is confusing to have a test-requirements.txt that does not contain all the dependencies for the functional tests.

I had to move the conftest.py into the functional subdirectory because it was interfering with the unit tests.
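
For context, a minimal sketch of what the relocated `tests/functional/conftest.py` might provide, assuming dbt-core's functional-testing framework and its `dbt_profile_target` fixture; the values mirror the `profiles.yml` added below, but the exact fixture body is an assumption, not taken from this PR:

``` python
import pytest


@pytest.fixture(scope="class")
def dbt_profile_target():
    # Mirrors profiles.yml: a local in-process Spark session.
    return {
        "type": "spark",
        "method": "session",
        "host": "NA",  # not used, but required by `dbt-spark`
        "schema": "test",
    }
```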

37 changes: 36 additions & 1 deletion README.md
@@ -32,7 +32,7 @@ dispatch:

### Note to maintainers of other packages

The spark-utils package may be able to provide compatibility for your package, especially if your package leverages dbt-utils macros for cross-database compatibility. This package _does not_ need to be specified as a dependency of your package in `packages.yml`. Instead, you should encourage anyone using your package on Apache Spark / Databricks to:
- Install `spark_utils` alongside your package
- Add a `dispatch` config in their root project, like the one above

@@ -56,6 +56,41 @@ We welcome contributions to this repo! To contribute a new feature or a fix,
please open a Pull Request with 1) your changes and 2) updated documentation for
the `README.md` file.

## Testing

The macros are tested with [`pytest`](https://docs.pytest.org) and
[`pytest-dbt-core`](https://pypi.org/project/pytest-dbt-core/). For example,
the [`get_tables` macro is tested](./tests/test_macros.py) by:

1. Creating a test table (test setup):
   ``` python
   spark_session.sql(f"CREATE TABLE {table_name} (id int) USING parquet")
   ```
2. Calling the macro generator:
   ``` python
   tables = macro_generator()
   ```
3. Asserting the test condition:
   ``` python
   assert simple_table in tables
   ```
4. Deleting the test table (test cleanup; steps 1 and 4 are combined into one fixture, shown below):
   ``` python
   spark_session.sql(f"DROP TABLE IF EXISTS {table_name}")
   ```

A macro is fetched using the
[`macro_generator`](https://pytest-dbt-core.readthedocs.io/en/latest/dbt_spark.html#usage)
fixture, providing the macro name through
[indirect parameterization](https://docs.pytest.org/en/7.1.x/example/parametrize.html?highlight=indirect#indirect-parametrization):

``` python
@pytest.mark.parametrize(
    "macro_generator", ["macro.spark_utils.get_tables"], indirect=True
)
def test_create_table(macro_generator: MacroGenerator) -> None:
    ...
```
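
Steps 1 and 4 are implemented together as a single `yield` fixture in [`tests/test_macros.py`](./tests/test_macros.py), shown here for reference:

``` python
import uuid

import pytest
from pyspark.sql import SparkSession


@pytest.fixture
def simple_table(spark_session: SparkSession) -> str:
    """Create and delete a simple table used for testing."""
    table_name = f"default.table_{uuid.uuid4()}".replace("-", "_")
    spark_session.sql(f"CREATE TABLE {table_name} (id int) USING parquet")
    yield table_name  # the test runs here; cleanup follows
    spark_session.sql(f"DROP TABLE IF EXISTS {table_name}")
```

Locally, the suite runs the same way as in the CI workflow above: `DBT_PROFILES_DIR=$PWD pytest tests/`.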

----

### Getting started with dbt + Spark
3 changes: 2 additions & 1 deletion dbt_project.yml
@@ -1,5 +1,6 @@
name: 'spark_utils'
profile: 'sparkutils'
version: '0.3.0'
config-version: 2
require-dbt-version: [">=1.2.0", "<2.0.0"]
macro-paths: ["macros"]
8 changes: 8 additions & 0 deletions profiles.yml
@@ -0,0 +1,8 @@
sparkutils:
  target: test
  outputs:
    test:
      type: spark
      method: session
      schema: test
      host: NA # not used, but required by `dbt-spark`
4 changes: 4 additions & 0 deletions pytest.ini
@@ -6,3 +6,7 @@ env_files =
    test.env
testpaths =
    tests/functional
spark_options =
    spark.app.name: spark-utils
    spark.executor.instances: 1
    spark.sql.catalogImplementation: in-memory
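
The `spark_options` block is read by the `pytest-spark` plugin when it builds the shared `spark_session` fixture. Roughly, it is equivalent to the following sketch; the local master is an assumption, not spelled out in the ini:

``` python
from pyspark.sql import SparkSession

# Approximately the session pytest-spark builds from the ini options above.
spark = (
    SparkSession.builder.master("local[1]")  # assumed local master
    .appName("spark-utils")
    .config("spark.executor.instances", "1")
    .config("spark.sql.catalogImplementation", "in-memory")
    .getOrCreate()
)
```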
4 changes: 4 additions & 0 deletions test-requirements.txt
@@ -0,0 +1,4 @@
pytest~=6.2.5
pytest-spark~=0.6.0
pytest-dbt-core~=0.1.0
dbt-spark[session]~=1.1.0
26 changes: 26 additions & 0 deletions tests/test_macros.py
@@ -0,0 +1,26 @@
import uuid

import pytest
from dbt.clients.jinja import MacroGenerator
from pyspark.sql import SparkSession


@pytest.fixture
def simple_table(spark_session: SparkSession) -> str:
"""Create and delete a simple table used for testing."""
table_name = f"default.table_{uuid.uuid4()}".replace("-", "_")
spark_session.sql(f"CREATE TABLE {table_name} (id int) USING parquet")
yield table_name
spark_session.sql(f"DROP TABLE IF EXISTS {table_name}")


@pytest.mark.parametrize(
    "macro_generator", ["macro.spark_utils.get_tables"], indirect=True
)
def test_create_table(
    macro_generator: MacroGenerator, simple_table: str
) -> None:
    """The `get_tables` macro should return the created table."""
    tables = macro_generator()
    assert simple_table in tables