diff --git a/website/dbt-versions.js b/website/dbt-versions.js
index dfeebbd766e..ae057c461f7 100644
--- a/website/dbt-versions.js
+++ b/website/dbt-versions.js
@@ -4,6 +4,10 @@ exports.versions = [
     EOLDate: "2023-04-27", // TODO estimated for now
     isPrerelease: true
   },
+  {
+    version: "1.1",
+    EOLDate: "2024-05-01"
+  },
   {
     version: "1.0",
     EOLDate: "2023-12-03"
@@ -19,8 +23,12 @@ exports.versions = [
 ]
 
 exports.versionedPages = [
-  {
-    "page": "docs/guides/migration-guide/upgrading-to-v1.1",
-    "firstVersion": "1.1",
-  }
+  {
+    "page": "docs/contributing/testing-a-new-adapter",
+    "firstVersion": "1.1",
+  },
+  {
+    "page": "docs/guides/migration-guide/upgrading-to-v1.1",
+    "firstVersion": "1.1",
+  }
 ]
diff --git a/website/docs/docs/contributing/building-a-new-adapter.md b/website/docs/docs/contributing/building-a-new-adapter.md
index 9dba9dc2853..528ab17e3e0 100644
--- a/website/docs/docs/contributing/building-a-new-adapter.md
+++ b/website/docs/docs/contributing/building-a-new-adapter.md
@@ -347,13 +347,13 @@ See examples:
 
 #### `__version__.py`
 
-To assure that `dbt --version` provides the latest dbt core version the adapter supports, be sure include a `__version__.py` file. The filepath will be `dbt/adapters//__version__.py`. We recommend using the latest dbt core version and as the adapter is made compatiable with later versions, this file will need to be updated. For a sample file, check out this [example](https://github.com/dbt-labs/dbt-core/blob/develop/plugins/snowflake/dbt/adapters/snowflake/__version__.py).
+To ensure that `dbt --version` reports the latest dbt core version the adapter supports, be sure to include a `__version__.py` file at `dbt/adapters/<adapter_name>/__version__.py`. We recommend starting from the latest dbt core version; as the adapter is made compatible with later versions, this file will need to be updated.
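For illustration, this file is typically just a single module-level `version` string; a minimal sketch (the path and exact version value below are assumptions, and the version should match the dbt-core version your adapter currently supports):

```python
# dbt/adapters/<adapter_name>/__version__.py  (hypothetical path)
version = "1.1.0"
```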
For a sample file, check out this [example](https://github.com/dbt-labs/dbt-core/blob/develop/plugins/snowflake/dbt/adapters/snowflake/__version__.py). Note that both of these files are included in the bootstrapped output of the `create_adapter_plugins.py` script, so they will be created for you when you use it.

### Testing your new adapter

-You can use a pre-configured [dbt adapter test suite](https://github.com/dbt-labs/dbt-adapter-tests) to test that your new adapter works. These tests include much of dbt's basic functionality, with the option to override or disable functionality that may not be supported on your adapter.
+This has moved to its own page: ["Testing a new adapter"](testing-a-new-adapter)

### Documenting your new adapter
diff --git a/website/docs/docs/contributing/testing-a-new-adapter.md b/website/docs/docs/contributing/testing-a-new-adapter.md
new file mode 100644
index 00000000000..73f2a6e2a18
--- /dev/null
+++ b/website/docs/docs/contributing/testing-a-new-adapter.md
@@ -0,0 +1,489 @@
+---
+title: "Testing a new adapter"
+id: "testing-a-new-adapter"
+---
+
+:::info
+
+Previously, we offered a packaged suite of tests for dbt adapter functionality: [`pytest-dbt-adapter`](https://github.com/dbt-labs/dbt-adapter-tests). We are deprecating that suite in favor of the newer testing framework outlined in this document.
+
+:::
+
+This document has two sections:
+1. ["About the testing framework"](#about-the-testing-framework) describes the standard framework that we maintain for using pytest together with dbt. It includes an example that shows the anatomy of a simple test case.
+2. ["Testing your adapter"](#testing-your-adapter) offers a step-by-step guide for using our out-of-the-box suite of "basic" tests, which will validate that your adapter meets a baseline of dbt functionality.
+ +## Prerequisites +- Your adapter must be compatible with dbt-core **v1.1** or newer +- You should be familiar with **pytest**: https://docs.pytest.org/ + +## About the testing framework + +dbt-core offers a standard framework for running pre-built functional tests, and for defining your own tests. The core testing framework is built using `pytest`, a mature and standard library for testing Python projects. + +The **[`tests` module](https://github.com/dbt-labs/dbt-core/tree/HEAD/core/dbt/tests)** within `dbt-core` includes basic utilities for setting up pytest + dbt. These are used by all "pre-built" functional tests, and make it possible to quickly write your own tests. + +Those utilities allow you to do three basic things: +1. **Quickly set up a dbt "project."** Define project resources via methods such as `models()` and `seeds()`. Use `project_config_update()` to pass configurations into `dbt_project.yml`. +2. **Define a sequence of dbt commands.** The most important utility is `run_dbt()`, which returns the [results](dbt-classes#result-objects) of each dbt command. It takes a list of CLI specifiers (subcommand + flags), as well as an optional second argument, `expect_pass=False`, for cases where you expect the command to fail. +3. **Validate the results of those dbt commands.** For example, `check_relations_equal()` asserts that two database objects have the same structure and content. You can also write your own `assert` statements, by inspecting the results of a dbt command, or querying arbitrary database objects with `project.run_sql()`. + +While all utilities are intended to be reusable, you won't need all of them for every test. In the example below, we'll show a simple test case that uses only a few utilities. + +### Example: a simple test case + +This example will show you the anatomy of a test case using dbt + pytest. We will create reusable components, combine them to form a dbt "project", and define a sequence of dbt commands. 
Then, we'll use Python `assert` statements to ensure those commands succeed (or fail) as we expect. + +In ["Getting started running basic tests,"](#getting-started-running-basic-tests) we'll offer step-by-step instructions for installing and configuring `pytest`, so that you can run it on your own machine. For now, it's more important to see how the pieces of a test case fit together. + +This example includes a seed, a model, and two tests—one of which will fail. + +1. Define Python strings that will represent the file contents in your dbt project. Defining these in a separate file enables you to reuse the same components across different test cases. The pytest name for this type of reusable component is "fixture." + + + +```python +# seeds/my_seed.csv +my_seed_csv = """ +id,name,some_date +1,Easton,1981-05-20T06:46:51 +2,Lillian,1978-09-03T18:10:33 +3,Jeremiah,1982-03-11T03:59:51 +4,Nolan,1976-05-06T20:21:35 +""".lstrip() + +# models/my_model.sql +my_model_sql = """ +select * from {{ ref('my_seed') }} +union all +select null as id, null as name, null as some_date +""" + +# models/my_model.yml +my_model_yml = """ +version: 2 +models: + - name: my_model + columns: + - name: id + tests: + - unique + - not_null # this test will fail +""" +``` + + + +2. Use the "fixtures" to define the project for your test case. Following the default pytest configurations, the file name must begin with `test_`, and the class name must begin with `Test`. + + + +```python +import pytest +from dbt.tests.util import run_dbt + +# our file contents +from tests.functional.example.fixtures import ( + my_seed_csv, + my_model_sql, + my_model_yml, +) + +# class must begin with 'Test' +class TestExample: + """ + Methods in this class will be of two types: + 1. Fixtures defining the dbt "project" for this test case. + These are scoped to the class, and reused for all tests in the class. + 2. Actual tests, whose names begin with 'test_'. 
        These define sequences of dbt commands and 'assert' statements.
    """

    # configuration in dbt_project.yml
    @pytest.fixture(scope="class")
    def project_config_update(self):
        return {
            "name": "example",
            "models": {"+materialized": "view"}
        }

    # everything that goes in the "seeds" directory
    @pytest.fixture(scope="class")
    def seeds(self):
        return {
            "my_seed.csv": my_seed_csv,
        }

    # everything that goes in the "models" directory
    @pytest.fixture(scope="class")
    def models(self):
        return {
            "my_model.sql": my_model_sql,
            "my_model.yml": my_model_yml,
        }

    # continues below
```

3. Now that we've set up our project, it's time to define a sequence of dbt commands and assertions. These methods will be defined in the same file, on the same class (`TestExample`). In pytest, the methods defining actual tests must have names that begin with `test_`.

```python
    # continued from above

    # The actual sequence of dbt commands and assertions
    # pytest will take care of all "setup" + "teardown"
    def test_run_seed_test(self, project):
        """
        Seed, then run, then test. We expect one of the tests to fail.
        An alternative pattern is to use pytest "xfail" (see below)
        """
        # seed seeds
        results = run_dbt(["seed"])
        assert len(results) == 1
        # run models
        results = run_dbt(["run"])
        assert len(results) == 1
        # test tests
        results = run_dbt(["test"], expect_pass=False)  # expect failing test
        assert len(results) == 2
        # validate that the results include one pass and one failure
        result_statuses = sorted(r.status for r in results)
        assert result_statuses == ["fail", "pass"]

    @pytest.mark.xfail
    def test_build(self, project):
        """Expect a failing test"""
        # do it all
        results = run_dbt(["build"])
```

4. Our test is ready to run! The last step is to invoke `pytest` from your command line. We'll walk through the actual setup and configuration of `pytest` in the next section.
```sh
$ python3 -m pytest tests/functional/test_example.py
=========================== test session starts ============================
platform ... -- Python ..., pytest-..., pluggy-...
rootdir: ...
plugins: ...

tests/functional/test_example.py .X                                  [100%]

======================= 1 passed, 1 xpassed in 1.38s =======================
```

You can find more ways to run tests, along with a full command reference, in the [pytest usage docs](https://docs.pytest.org/how-to/usage.html).

We've found the `-s` flag (or `--capture=no`) helpful to print logs from the underlying dbt invocations, and to step into an interactive debugger if you've added one. You can also use environment variables to set [global dbt configs](global-configs), such as `DBT_DEBUG` (to show debug-level logs).

## Testing your adapter

Anyone who installs `dbt-core` and wishes to define their own test cases can use the framework presented in the first section. The framework is especially useful for testing standard dbt behavior across different databases.

To that end, we have built and made available a [package of reusable adapter test cases](https://github.com/dbt-labs/dbt-core/tree/HEAD/tests/adapter), for creators and maintainers of adapter plugins. These test cases cover basic expected functionality, as well as functionality that frequently requires different implementations across databases.

For the time being, this package is also located within the `dbt-core` repository, but separate from the `dbt-core` Python package.

### Categories of tests

In the course of creating and maintaining your adapter, it's likely that you will end up implementing tests that fall into three broad categories:

1. **Basic tests** that every adapter plugin is expected to pass, defined in `tests.adapter.basic`. Given differences across data platforms, these may require slight modification or reimplementation.
Significantly overriding or disabling these tests should only be done with good reason, since each represents basic functionality that dbt users are accustomed to having. For example, if your adapter does not support incremental models, you should disable the test _and_ note that limitation in documentation, READMEs, and usage guides that accompany your adapter.

2. **Optional tests**, for second-order functionality that is common across plugins, but not required for basic use. Your plugin can opt into these test cases by inheriting existing ones, or reimplementing them with adjustments. More tests will be added as we convert older tests defined on dbt-core and mature plugins to use the standard framework.

3. **Custom tests**, for behavior that is specific to your adapter / data platform. Each data warehouse has its own specialties and idiosyncrasies. We encourage you to use the same `pytest`-based framework, utilities, and fixtures to write your own custom tests for functionality that is unique to your adapter.

If you run into an issue with the core framework, or the basic/optional test cases—or if you've written a custom test that you believe would be relevant and useful for other adapter plugin developers—please open an issue or PR in the `dbt-core` repository on GitHub.

## Getting started running basic tests

In this section, we'll walk through the three steps to start running our basic test cases on your adapter plugin:

1. Install dependencies
2. Set up and configure pytest
3. Define test cases

### Install dependencies

You should already have a virtual environment with `dbt-core` and your adapter plugin installed.
You'll also need to install:
- `pytest`
- `dbt-tests-adapter`, the set of basic test cases
- (optional) [`pytest` plugins](https://docs.pytest.org/en/7.0.x/reference/plugin_list.html); we'll use `pytest-dotenv` below

During initial development, the `dbt-tests-adapter` package is not yet on PyPI, so please install it directly from GitHub:
```bash
pip install "git+https://github.com/dbt-labs/dbt-core.git#egg=dbt-tests-adapter&subdirectory=tests/adapter"
```

Or specify all dependencies in a requirements file like:

```txt
pytest
pytest-dotenv
git+https://github.com/dbt-labs/dbt-core.git#egg=dbt-tests-adapter&subdirectory=tests/adapter
```

```sh
pip install -r dev_requirements.txt
```

### Set up and configure pytest

First, set yourself up to run `pytest` by creating a file named `pytest.ini` at the root of your repository:

```ini
[pytest]
filterwarnings =
    ignore:.*'soft_unicode' has been renamed to 'soft_str'*:DeprecationWarning
    ignore:unclosed file .*:ResourceWarning
env_files =
    test.env  # uses pytest-dotenv plugin
              # this allows you to store env vars for database connection in a file named test.env
              # rather than passing them in every CLI command, or setting in `PYTEST_ADDOPTS`
              # be sure to add "test.env" to .gitignore as well!
testpaths =
    tests/functional  # name per convention
```

Then, create a configuration file, `conftest.py`, within your tests directory. In it, you'll want to define all necessary profile configuration for connecting to your data platform in local development and continuous integration. We recommend setting these values with environment variables, since this file will be checked into version control.
```python
import pytest
import os

# Import the standard functional fixtures as a plugin
# Note: fixtures with session scope need to be local
pytest_plugins = ["dbt.tests.fixtures.project"]

# The profile dictionary, used to write out profiles.yml
# dbt will supply a unique schema per test, so we do not specify 'schema' here
@pytest.fixture(scope="class")
def dbt_profile_target():
    return {
        'type': '<adapter_name>',
        'threads': 1,
        'host': os.getenv('HOST_ENV_VAR_NAME'),
        'user': os.getenv('USER_ENV_VAR_NAME'),
        ...
    }
```

### Define test cases

As in the example above, each test case is defined as a class, and has its own "project" setup. To get started, you can import all basic test cases and try running them without changes.

```python
import pytest

from dbt.tests.adapter.basic.test_base import BaseSimpleMaterializations
from dbt.tests.adapter.basic.test_singular_tests import BaseSingularTests
from dbt.tests.adapter.basic.test_singular_tests_ephemeral import BaseSingularTestsEphemeral
from dbt.tests.adapter.basic.test_empty import BaseEmpty
from dbt.tests.adapter.basic.test_ephemeral import BaseEphemeral
from dbt.tests.adapter.basic.test_incremental import BaseIncremental
from dbt.tests.adapter.basic.test_generic_tests import BaseGenericTests
from dbt.tests.adapter.basic.test_snapshot_check_cols import BaseSnapshotCheckCols
from dbt.tests.adapter.basic.test_snapshot_timestamp import BaseSnapshotTimestamp

class TestSimpleMaterializationsMyAdapter(BaseSimpleMaterializations):
    pass


class TestSingularTestsMyAdapter(BaseSingularTests):
    pass


class TestSingularTestsEphemeralMyAdapter(BaseSingularTestsEphemeral):
    pass


class TestEmptyMyAdapter(BaseEmpty):
    pass


class TestEphemeralMyAdapter(BaseEphemeral):
    pass


class TestIncrementalMyAdapter(BaseIncremental):
    pass


class TestGenericTestsMyAdapter(BaseGenericTests):
    pass


class TestSnapshotCheckColsMyAdapter(BaseSnapshotCheckCols):
    pass


class TestSnapshotTimestampMyAdapter(BaseSnapshotTimestamp):
    pass
```

Finally, run pytest:
```sh
python3 -m pytest tests/functional
```

### Modifying test cases

You may need to make slight modifications in a specific test case to get it passing on your adapter. The mechanism to do this is simple: rather than simply inheriting the "base" test with `pass`, you can redefine any of its fixtures or test methods.

For instance, on Redshift, we need to explicitly cast a column in the fixture input seed to use data type `varchar(64)`:

```python
import pytest
from dbt.tests.adapter.basic.files import seeds_base_csv, seeds_added_csv, seeds_newcolumns_csv
from dbt.tests.adapter.basic.test_snapshot_check_cols import BaseSnapshotCheckCols

# set the datatype of the name column in the 'added' seed so it
# can hold the '_update' that's added
schema_seed_added_yml = """
version: 2
seeds:
  - name: added
    config:
      column_types:
        name: varchar(64)
"""

class TestSnapshotCheckColsRedshift(BaseSnapshotCheckCols):
    # Redshift defines the 'name' column such that it's not big enough
    # to hold the '_update' added in the test.
    @pytest.fixture(scope="class")
    def seeds(self):
        return {
            "base.csv": seeds_base_csv,
            "added.csv": seeds_added_csv,
            "seeds.yml": schema_seed_added_yml,
        }
```

As another example, the `dbt-bigquery` adapter asks users to "authorize" replacing a table with a view by supplying the `--full-refresh` flag. The reason: in the table materialization logic, a view by the same name must first be dropped; if the table query fails, the model will be missing. Because of this possibility, the "base" test case offers a `require_full_refresh` switch on the `test_config` fixture class.
For BigQuery, we'll switch it on:

```python
import pytest
from dbt.tests.adapter.basic.test_base import BaseSimpleMaterializations

class TestSimpleMaterializationsBigQuery(BaseSimpleMaterializations):
    @pytest.fixture(scope="class")
    def test_config(self):
        # effect: add '--full-refresh' flag in requisite 'dbt run' step
        return {"require_full_refresh": True}
```

It's always worth asking whether the required modifications represent gaps in perceived or expected dbt functionality. Are these simple implementation details, which any user of this database would understand? Are they limitations worth documenting?

If, on the other hand, they represent poor assumptions in the "basic" test cases, which fail to account for a common pattern in other types of databases, please open an issue or PR in the `dbt-core` repository on GitHub.

### Running with multiple profiles

Some databases support multiple connection methods, which map to meaningfully different functionality behind the scenes. For instance, the `dbt-spark` adapter supports connections to Apache Spark clusters _and_ Databricks runtimes; the latter supports additional functionality out of the box, enabled by the Delta file format. The `dbt-spark` test suite handles this by defining a `--profile` command-line option in its `conftest.py`, and using it to select the connection target for a given test run:
```python
import os
import pytest

def pytest_addoption(parser):
    parser.addoption("--profile", action="store", default="apache_spark", type=str)


# Register the 'skip_profile' marker, used below via
# @pytest.mark.skip_profile('apache_spark')
def pytest_configure(config):
    config.addinivalue_line(
        "markers",
        "skip_profile(profile): skip test for the given profile",
    )

@pytest.fixture(scope="session")
def dbt_profile_target(request):
    profile_type = request.config.getoption("--profile")
    if profile_type == "databricks_sql_endpoint":
        target = databricks_sql_endpoint_target()
    elif profile_type == "apache_spark":
        target = apache_spark_target()
    else:
        raise ValueError(f"Invalid profile type '{profile_type}'")
    return target

def apache_spark_target():
    return {
        "type": "spark",
        "host": "localhost",
        ...
    }

def databricks_sql_endpoint_target():
    return {
        "type": "spark",
        "host": os.getenv("DBT_DATABRICKS_HOST_NAME"),
        ...
    }
```

If there are tests that _shouldn't_ run for a given profile:

```python
# Snapshots require access to the Delta file format, available on our Databricks connection,
# so let's skip on Apache Spark
@pytest.mark.skip_profile('apache_spark')
class TestSnapshotCheckColsSpark(BaseSnapshotCheckCols):
    @pytest.fixture(scope="class")
    def project_config_update(self):
        return {
            "seeds": {
                "+file_format": "delta",
            },
            "snapshots": {
                "+file_format": "delta",
            }
        }
```

Finally:
```sh
python3 -m pytest tests/functional --profile apache_spark
python3 -m pytest tests/functional --profile databricks_sql_endpoint
```
diff --git a/website/docs/docs/guides/migration-guide/upgrading-to-v1.1.md b/website/docs/docs/guides/migration-guide/upgrading-to-v1.1.md
index 26302b9096b..fa3f3320ef5 100644
--- a/website/docs/docs/guides/migration-guide/upgrading-to-v1.1.md
+++ b/website/docs/docs/guides/migration-guide/upgrading-to-v1.1.md
@@ -22,7 +22,9 @@ There are no breaking changes for
end users of dbt. We are committed to providin

### For maintainers of adapter plugins

-The abstractmethods `get_response` and `execute` now only return `connection.AdapterReponse` in type hints. Previously, they could return a string. We encourage you to update your methods to return an object of class `AdapterResponse`, or implement a subclass specific to your adapter. This also gives you the opportunity to add fields specific to your adapter's query execution, such as `rows_affected` or `bytes_processed`.
+We have reworked the testing suite for adapter plugin functionality. For details on the new testing suite, see: ["Testing a new adapter"](testing-a-new-adapter)
+
+The abstract methods `get_response` and `execute` now only return `connection.AdapterResponse` in type hints. Previously, they could return a string. We encourage you to update your methods to return an object of class `AdapterResponse`, or implement a subclass specific to your adapter. This also gives you the opportunity to add fields specific to your adapter's query execution, such as `rows_affected` or `bytes_processed`.

### For consumers of dbt artifacts (metadata)

@@ -32,9 +34,10 @@ The manifest schema version will be updated to v5. The only change is to the def

_Note: If you're contributing docs for a new or updated feature in v1.1, please link those docs changes below!_

+- [**Incremental models**](configuring-incremental-models) can now accept a list of multiple columns as their `unique_key`, for models that need a combination of columns to uniquely identify each row. This is supported by the most common data warehouses, for incremental strategies that make use of the `unique_key` config (`merge` and `delete+insert`).
- [**Generic tests**](resource-properties/tests) can define custom names. This is useful to "prettify" the synthetic name that dbt applies automatically. It's needed to disambiguate the case when the same generic test is defined multiple times with different configurations.
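As a sketch of the multi-column `unique_key` config noted above (the model, columns, and referenced upstream model here are hypothetical):

```sql
-- models/daily_user_stats.sql (hypothetical model)
{{
    config(
        materialized='incremental',
        unique_key=['user_id', 'date_day']
    )
}}

select user_id, date_day, count(*) as events
from {{ ref('app_events') }}
group by 1, 2
```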
-### Plugins
+### For users of specific adapter plugins

- **dbt-bigquery:** Support for finer-grained configuration of query timeout and retry when defining your [connection profile](bigquery-profile).
- **dbt-spark:** Added support for a [`session` connection method](spark-profile#session), for use with a PySpark session, to support rapid iteration when developing advanced or experimental functionality. This connection method is not recommended for new users, and it is not supported in dbt Cloud.
diff --git a/website/sidebars.js b/website/sidebars.js
index 512d7a6b5d3..230ddda69d7 100644
--- a/website/sidebars.js
+++ b/website/sidebars.js
@@ -111,6 +111,7 @@ const sidebarSettings = {
       items: [
         "docs/contributing/contributor-license-agreements",
         "docs/contributing/building-a-new-adapter",
+        "docs/contributing/testing-a-new-adapter",
         "docs/contributing/slack-rules-of-the-road",
       ],
     },