Feature: Union schema compatibility #29

Merged
merged 11 commits into from
Oct 12, 2023
Changes from 9 commits
3 changes: 2 additions & 1 deletion .buildkite/hooks/pre-command
@@ -21,4 +21,5 @@ export CI_SNOWFLAKE_DBT_USER=$(gcloud secrets versions access latest --secret="C
export CI_SNOWFLAKE_DBT_WAREHOUSE=$(gcloud secrets versions access latest --secret="CI_SNOWFLAKE_DBT_WAREHOUSE" --project="dbt-package-testing-363917")
export CI_DATABRICKS_DBT_HOST=$(gcloud secrets versions access latest --secret="CI_DATABRICKS_DBT_HOST" --project="dbt-package-testing-363917")
export CI_DATABRICKS_DBT_HTTP_PATH=$(gcloud secrets versions access latest --secret="CI_DATABRICKS_DBT_HTTP_PATH" --project="dbt-package-testing-363917")
export CI_DATABRICKS_DBT_TOKEN=$(gcloud secrets versions access latest --secret="CI_DATABRICKS_DBT_TOKEN" --project="dbt-package-testing-363917")
export CI_DATABRICKS_DBT_TOKEN=$(gcloud secrets versions access latest --secret="CI_DATABRICKS_DBT_TOKEN" --project="dbt-package-testing-363917")
export CI_DATABRICKS_DBT_CATALOG=$(gcloud secrets versions access latest --secret="CI_DATABRICKS_DBT_CATALOG" --project="dbt-package-testing-363917")
3 changes: 2 additions & 1 deletion .buildkite/pipeline.yml
@@ -58,7 +58,7 @@ steps:
commands: |
bash .buildkite/scripts/run_models.sh redshift

- label: ":bricks: Run Tests - Databricks"
- label: ":databricks: Run Tests - Databricks"
key: "run_dbt_databricks"
plugins:
- docker#v3.13.0:
@@ -69,5 +69,6 @@ steps:
- "CI_DATABRICKS_DBT_HOST"
- "CI_DATABRICKS_DBT_HTTP_PATH"
- "CI_DATABRICKS_DBT_TOKEN"
- "CI_DATABRICKS_DBT_CATALOG"
commands: |
bash .buildkite/scripts/run_models.sh databricks
26 changes: 24 additions & 2 deletions CHANGELOG.md
@@ -1,7 +1,29 @@
# dbt_microsoft_ads_source v0.UPDATE.UPDATE
# dbt_microsoft_ads_source v0.8.0
[PR #29](https://github.com/fivetran/dbt_microsoft_ads_source/pull/29) includes the following updates:
## Breaking changes
- Updated the following identifiers for consistency with the source name and compatibility with the union schema feature:

## Under the Hood:
| current | previous |
|----------|----------|
|microsoft_ads_account_performance_daily_report_identifier | microsoft_ads_account_daily_report_identifier |
|microsoft_ads_ad_group_performance_daily_report_identifier | microsoft_ads_ad_group_daily_report_identifier|
|microsoft_ads_ad_performance_daily_report_identifier | microsoft_ads_ad_daily_report_identifier|
|microsoft_ads_campaign_performance_daily_report_identifier | microsoft_ads_campaign_daily_report_identifier|
|microsoft_ads_keyword_performance_daily_report_identifier | microsoft_ads_keyword_daily_report_identifier|
|microsoft_ads_search_query_performance_daily_report_identifier | microsoft_ads_search_query_daily_report_identifier|

- If you are using any of the previous identifier variables, be sure to rename them to the current names!
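
As an illustration of the rename, a project that previously overrode the account report identifier would move its value to the new variable name. This is a minimal sketch for a root `dbt_project.yml`; the table name `my_account_report` is a hypothetical placeholder, not something defined by this package.

```yml
vars:
  # Old variable name, no longer referenced in src_microsoft_ads.yml as of v0.8.0:
  # microsoft_ads_account_daily_report_identifier: "my_account_report"

  # New variable name; "my_account_report" is a hypothetical table name.
  microsoft_ads_account_performance_daily_report_identifier: "my_account_report"
```

The same pattern applies to the other renamed identifiers in the table above.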

## Feature update 🎉
- Unioning capability! This adds the ability to union source data from multiple microsoft_ads connectors. Refer to the [README](https://github.com/fivetran/dbt_microsoft_ads_source/blob/main/README.md) for more details.

## Under the hood 🚘
- Updated tmp models to union source data using the `fivetran_utils.union_data` macro.
- To distinguish which source each field comes from, added `source_relation` column in each staging model and applied the `fivetran_utils.source_relation` macro.
- Updated tests to account for the new `source_relation` column.

[PR #26](https://github.com/fivetran/dbt_microsoft_ads_source/pull/26) includes the following updates:
- Incorporated the new `fivetran_utils.drop_schemas_automation` macro into the end of each Buildkite integration test job.
- Updated the pull request [templates](/.github).
# dbt_microsoft_ads_source v0.7.0
14 changes: 13 additions & 1 deletion README.md
@@ -44,7 +44,7 @@ If you are **not** using the [Microsoft Ads transformation package](https://git
```yaml
packages:
- package: fivetran/microsoft_ads_source
version: [">=0.7.0", "<0.8.0"]
version: [">=0.8.0", "<0.9.0"]
```
## Step 3: Define database and schema variables
By default, this package runs using your destination and the `microsoft_ads` schema. If this is not where your Microsoft Ads data is (for example, if your microsoft_ads schema is named `microsoft_ads_fivetran`), add the following configuration to your root `dbt_project.yml` file:
@@ -58,6 +58,18 @@ vars:
## (Optional) Step 4: Additional configurations
<details><summary>Expand for configurations</summary>

### Union multiple connectors
If you have multiple microsoft_ads connectors in Fivetran and would like to use this package on all of them simultaneously, we have provided functionality to do so. The package will union all of the data together and pass the unioned table into the transformations. You will be able to see which source each record came from in the `source_relation` column of each model. To use this functionality, set either the `microsoft_ads_union_schemas` or the `microsoft_ads_union_databases` variable (not both) in your root `dbt_project.yml` file:

```yml
vars:
microsoft_ads_union_schemas: ['microsoft_ads_usa','microsoft_ads_canada'] # use this if the data is in different schemas/datasets of the same database/project
microsoft_ads_union_databases: ['microsoft_ads_usa','microsoft_ads_canada'] # use this if the data is in different databases/projects but uses the same schema name
```
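
Because only one of the two variables may be set, a working configuration would typically contain just one of the lines from the block above. A minimal sketch, assuming two connectors that write to different schemas of the same database (the schema names are the illustrative ones from the example above):

```yml
vars:
  # Set only one union variable; do not also set microsoft_ads_union_databases.
  microsoft_ads_union_schemas: ['microsoft_ads_usa', 'microsoft_ads_canada']
```
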
Please be aware that the native `source.yml` connection set up in the package will not function when the union schema/database feature is utilized. Although the data will be correctly combined, you will not observe the sources linked to the package models in the Directed Acyclic Graph (DAG). This happens because the package includes only one defined `source.yml`.

To connect your multiple schema/database sources to the package models, follow the steps outlined in the [Union Data Defined Sources Configuration](https://github.com/fivetran/dbt_fivetran_utils/tree/releases/v0.4.latest#union_data-source) section of the Fivetran Utils documentation for the union_data macro. This will ensure a proper configuration and correct visualization of connections in the DAG.

### Passing Through Additional Metrics
By default, this package will select `clicks`, `impressions`, and `cost` from the source reporting tables to store into the staging models. If you would like to pass through additional metrics to the staging models, add the below configurations to your `dbt_project.yml` file. These variables allow for the pass-through fields to be aliased (`alias`) if desired, but not required. Use the below format for declaring the respective pass-through variables:

2 changes: 1 addition & 1 deletion dbt_project.yml
@@ -1,5 +1,5 @@
name: 'microsoft_ads_source'
version: '0.7.0'
version: '0.8.0'

config-version: 2
require-dbt-version: [">=1.3.0", "<2.0.0"]
2 changes: 1 addition & 1 deletion docs/catalog.json

Large diffs are not rendered by default.

8 changes: 4 additions & 4 deletions docs/index.html

Large diffs are not rendered by default.

2 changes: 1 addition & 1 deletion docs/manifest.json

Large diffs are not rendered by default.

2 changes: 1 addition & 1 deletion docs/run_results.json

Large diffs are not rendered by default.

14 changes: 7 additions & 7 deletions integration_tests/ci/sample.profiles.yml
@@ -16,13 +16,13 @@ integration_tests:
pass: "{{ env_var('CI_REDSHIFT_DBT_PASS') }}"
dbname: "{{ env_var('CI_REDSHIFT_DBT_DBNAME') }}"
port: 5439
schema: microsoft_ads_source_integration_tests
schema: microsoft_ads_source_integration_tests_2
threads: 8
bigquery:
type: bigquery
method: service-account-json
project: 'dbt-package-testing'
schema: microsoft_ads_source_integration_tests
schema: microsoft_ads_source_integration_tests_2
threads: 8
keyfile_json: "{{ env_var('GCLOUD_SERVICE_KEY') | as_native }}"
snowflake:
@@ -33,7 +33,7 @@ integration_tests:
role: "{{ env_var('CI_SNOWFLAKE_DBT_ROLE') }}"
database: "{{ env_var('CI_SNOWFLAKE_DBT_DATABASE') }}"
warehouse: "{{ env_var('CI_SNOWFLAKE_DBT_WAREHOUSE') }}"
schema: microsoft_ads_source_integration_tests
schema: microsoft_ads_source_integration_tests_2
threads: 8
postgres:
type: postgres
@@ -42,13 +42,13 @@ integration_tests:
pass: "{{ env_var('CI_POSTGRES_DBT_PASS') }}"
dbname: "{{ env_var('CI_POSTGRES_DBT_DBNAME') }}"
port: 5432
schema: microsoft_ads_source_integration_tests
schema: microsoft_ads_source_integration_tests_2
threads: 8
databricks:
catalog: null
catalog: "{{ env_var('CI_DATABRICKS_DBT_CATALOG') }}"
host: "{{ env_var('CI_DATABRICKS_DBT_HOST') }}"
http_path: "{{ env_var('CI_DATABRICKS_DBT_HTTP_PATH') }}"
schema: microsoft_ads_source_integration_tests
threads: 2
schema: microsoft_ads_source_integration_tests_2
threads: 8
token: "{{ env_var('CI_DATABRICKS_DBT_TOKEN') }}"
type: databricks
16 changes: 8 additions & 8 deletions integration_tests/dbt_project.yml
@@ -1,22 +1,22 @@
name: 'microsoft_ads_source_integration_tests'
version: '0.7.0'
version: '0.8.0'

profile: 'integration_tests'
config-version: 2

vars:
microsoft_ads_schema: microsoft_ads_source_integration_tests
microsoft_ads_schema: microsoft_ads_source_integration_tests_2
microsoft_ads_account_history_identifier: "microsoft_ads_account_history_data"
microsoft_ads_account_daily_report_identifier: "microsoft_ads_account_performance_daily_report_data"
microsoft_ads_account_performance_daily_report_identifier: "microsoft_ads_account_performance_daily_report_data"
microsoft_ads_ad_group_history_identifier: "microsoft_ads_ad_group_history_data"
microsoft_ads_ad_group_daily_report_identifier: "microsoft_ads_ad_group_performance_daily_report_data"
microsoft_ads_ad_group_performance_daily_report_identifier: "microsoft_ads_ad_group_performance_daily_report_data"
microsoft_ads_ad_history_identifier: "microsoft_ads_ad_history_data"
microsoft_ads_ad_daily_report_identifier: "microsoft_ads_ad_performance_daily_report_data"
microsoft_ads_ad_performance_daily_report_identifier: "microsoft_ads_ad_performance_daily_report_data"
microsoft_ads_campaign_history_identifier: "microsoft_ads_campaign_history_data"
microsoft_ads_campaign_daily_report_identifier: "microsoft_ads_campaign_performance_daily_report_data"
microsoft_ads_campaign_performance_daily_report_identifier: "microsoft_ads_campaign_performance_daily_report_data"
microsoft_ads_keyword_history_identifier: "microsoft_ads_keyword_history_data"
microsoft_ads_keyword_daily_report_identifier: "microsoft_ads_keyword_performance_daily_report_data"
microsoft_ads_search_daily_report_identifier: "microsoft_ads_search_performance_daily_report_data"
microsoft_ads_keyword_performance_daily_report_identifier: "microsoft_ads_keyword_performance_daily_report_data"
microsoft_ads_search_query_performance_daily_report_identifier: "microsoft_ads_search_performance_daily_report_data"

seeds:
microsoft_ads_source_integration_tests:
3 changes: 3 additions & 0 deletions models/docs.md
@@ -158,3 +158,6 @@ The time zone associated with this record.
The position of the ad associated with this record. For more information, refer to Microsoft [documentation](https://help.ads.microsoft.com/apex/index/22/en/14009).
{% enddocs %}

{% docs source_relation %}
The source of the record if the unioning functionality is being used. If not, this field will be empty.
{% enddocs %}
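
When sources are unioned, the same key can legitimately appear once per connector, so uniqueness tests generally need to include `source_relation`. A hedged sketch of such a test follows; the model name, the key columns, and the use of `dbt_utils.unique_combination_of_columns` (which requires the dbt_utils package) are assumptions for illustration and are not taken from this diff.

```yml
version: 2

models:
  - name: stg_microsoft_ads__account_daily_report  # hypothetical model name
    tests:
      - dbt_utils.unique_combination_of_columns:
          combination_of_columns:
            - source_relation
            - date        # hypothetical key column
            - account_id  # hypothetical key column
```
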
14 changes: 7 additions & 7 deletions models/src_microsoft_ads.yml
@@ -1,7 +1,7 @@
version: 2

sources:
- name: microsoft_ads
- name: microsoft_ads # This source will only be used if you are using a single microsoft_ads source connector. If multiple sources are being unioned, their tables will be directly referenced via adapter.get_relation.
schema: "{{ var('microsoft_ads_schema', 'bingads') }}"
database: "{% if target.type != 'spark'%}{{ var('microsoft_ads_database', target.database) }}{% endif %}"

@@ -32,7 +32,7 @@ sources:
description: '{{ doc("currency_code") }}'

- name: account_performance_daily_report
identifier: "{{ var('microsoft_ads_account_daily_report_identifier', 'account_performance_daily_report') }}"
identifier: "{{ var('microsoft_ads_account_performance_daily_report_identifier', 'account_performance_daily_report') }}"
description: Each record in this table represents the daily performance by account and all non-metric columns.
columns:
- name: date
@@ -82,7 +82,7 @@ sources:
description: '{{ doc("ad_group_status") }}'

- name: ad_group_performance_daily_report
identifier: "{{ var('microsoft_ads_ad_group_daily_report_identifier', 'ad_group_performance_daily_report') }}"
identifier: "{{ var('microsoft_ads_ad_group_performance_daily_report_identifier', 'ad_group_performance_daily_report') }}"
description: Each record in this table represents the daily performance by account, campaign, ad group and all non-metric columns.
columns:
- name: date
@@ -140,7 +140,7 @@ sources:
description: '{{ doc("ad_type") }}'

- name: ad_performance_daily_report
identifier: "{{ var('microsoft_ads_ad_daily_report_identifier', 'ad_performance_daily_report') }}"
identifier: "{{ var('microsoft_ads_ad_performance_daily_report_identifier', 'ad_performance_daily_report') }}"
description: Each record in this table represents the daily performance by account, campaign, ad group, ad and all non-metric columns.
columns:
- name: date
@@ -198,7 +198,7 @@ sources:
description: '{{ doc("campaign_status") }}'

- name: campaign_performance_daily_report
identifier: "{{ var('microsoft_ads_campaign_daily_report_identifier', 'campaign_performance_daily_report') }}"
identifier: "{{ var('microsoft_ads_campaign_performance_daily_report_identifier', 'campaign_performance_daily_report') }}"
description: Each record in this table represents the daily performance by account, campaign and all non-metric columns.
columns:
- name: date
@@ -248,7 +248,7 @@ sources:
description: '{{ doc("keyword_status") }}'

- name: keyword_performance_daily_report
identifier: "{{ var('microsoft_ads_keyword_daily_report_identifier', 'keyword_performance_daily_report') }}"
identifier: "{{ var('microsoft_ads_keyword_performance_daily_report_identifier', 'keyword_performance_daily_report') }}"
description: Each record in this table represents the daily performance by account, campaign, ad group, ad, keyword and all non-metric columns.
columns:
- name: date
@@ -289,7 +289,7 @@ sources:
description: '{{ doc("spend") }}'

- name: search_query_performance_daily_report
identifier: "{{ var('microsoft_ads_search_daily_report_identifier', 'search_query_performance_daily_report') }}"
identifier: "{{ var('microsoft_ads_search_query_performance_daily_report_identifier', 'search_query_performance_daily_report') }}"
description: Each record in this table represents the daily performance by account, campaign, ad group, ad, keyword and all non-metric columns.
columns:
- name: date