Fix tracking bug for jinja sampling #4048

nathaniel-may · 2021-10-12T20:50:13Z

resolves #4038

Description

Comparison for tracking events should be done before we apply any changes. See ticket description for details.

Reviewers

I just reordered existing code so that the sample comparison is the first thing that's done in the function instead of the last.

Checklist

I have signed the CLA
I have run this code in development and it appears to resolve the stated issue
This PR includes tests, or tests are not required/relevant for this PR
I have updated the CHANGELOG.md and added information about my change

jtcohen6

The ordering here needs to be subtly different depending on the conditional branch we're following. In all cases, I'm using the reproduction case outlined in #4038.

If static parsing is enabled, and we're taking a stable sample: We need to compare jinja_sample_node + jinja_sample_config to node + config after we've updated the node's refs, sources, and config (i.e. after line 129). Otherwise, jinja_sample_node will include refs that our real node hasn't had the chance to add.
If we're taking an experimental parser sample (totally turned off for now): We can compare statically_parsed and experimental_sample anywhere along the way. But if we want to compare experimental_sample to the node itself, it needs to be before we've run update_parsed_node_config on that node, since that will add in refs from hooks. That's the problem in v0.21 right now.
If static parsing is disabled, or if the static/experimental parser doesn't return a dict, we're not taking any samples.
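To make the ordering issue in the first case concrete, here is a minimal sketch. It does not use dbt's real classes: `ToyNode`, `apply_static_results`, and `refs_match` are hypothetical stand-ins, and the only assumption carried over from the case above is that the jinja sample already holds the refs while the real node only receives them once the static results are applied.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ToyNode:
    # Hypothetical stand-in for a parsed model node; not dbt's real class.
    refs: List[List[str]] = field(default_factory=list)


def apply_static_results(node: ToyNode, static_refs: List[List[str]]) -> None:
    # Stand-in for the step that copies statically extracted refs onto the node.
    node.refs.extend(static_refs)


def refs_match(sample: ToyNode, node: ToyNode) -> bool:
    # Order-insensitive comparison of the two ref lists.
    return sorted(sample.refs) == sorted(node.refs)


static_refs = [["model_a"], ["model_b"]]

# The jinja sample is fully rendered, so it already carries both refs.
jinja_sample = ToyNode(refs=[["model_a"], ["model_b"]])
node = ToyNode()

# Comparing before the node is updated reports a mismatch that is not real.
compared_too_early = refs_match(jinja_sample, node)

# Comparing after the update gives the answer we actually care about.
apply_static_results(node, static_refs)
compared_after_update = refs_match(jinja_sample, node)

print(compared_too_early, compared_after_update)
```

The early comparison fails only because the real node has not been populated yet, which is the kind of spurious mismatch the tracking event should not report.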

So I'm not sure that this reordering is the right move for main going forward, now that the static parser is on and actually mutating node by default. The biggest change we need to make is where we fire _get_exp_sample_result, both in main and in 0.21.latest.

If it simplifies the problem I'd be open to treating each branch separately, with separate PRs.

core/dbt/parser/models.py

nathaniel-may · 2021-10-15T18:20:53Z

Chatted with @jtcohen6 and we figured out that we actually want to do this comparison on fully formed nodes. The experimental parser will likely try to expand what the existing static parser can handle, so we cannot naively compare the two parsers' outputs with each other. Instead, we will have to populate a complete copy of the node with everything from the experimental parse and check whether it matches the real node. Thankfully, this is a much sounder way to approach this comparison anyway.

nathaniel-may · 2021-10-15T19:39:38Z

Tried running the debugger with your project @jtcohen6 and I think something's still wrong. Let's connect on Monday.

jtcohen6

@nathaniel-may I just tried this out locally with the simple reproduction case, and it worked perfectly! The new approach also makes a lot of sense conceptually, and feels like something we could easily extend to other parsing methods (and even other node types).

It will be a bit tricky to backport this to 0.21.latest, given all the other changes we've made. As much as we can replicate the exact same approach over there—create an entire (deep)copy of the node, perform all mutations on it in isolation, then compare—I think that will serve us well to avoid spurious mismatches in v0.21.1.
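As a rough illustration of the copy, mutate in isolation, then compare approach, here is a small sketch. All names in it (`ToyNode`, `populate`, `sample_matches`) are hypothetical, and it uses the standard library's `copy.deepcopy` as a stand-in for whatever copy method the real node classes provide; it is not the code in `core/dbt/parser/models.py`.

```python
import copy
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class ToyNode:
    # Hypothetical stand-in for a parsed model node; not dbt's real class.
    refs: List[List[str]] = field(default_factory=list)
    config: Dict[str, str] = field(default_factory=dict)


def populate(node: ToyNode, result: Dict[str, Any]) -> None:
    # Hypothetical helper: apply one parser's results to a node in place.
    node.refs.extend(result["refs"])
    node.config.update(result["config"])


def sample_matches(node: ToyNode, real: Dict[str, Any], sample: Dict[str, Any]) -> bool:
    # Copy the node before anything mutates it, so the sample starts from
    # the same state as the real node.
    sample_node = copy.deepcopy(node)
    populate(sample_node, sample)  # mutations stay on the isolated copy
    populate(node, real)           # the real node is updated as usual
    # Compare only once both nodes are fully populated.
    return sample_node == node


base = ToyNode(config={"materialized": "view"})
real_result = {"refs": [["model_a"]], "config": {"materialized": "table"}}
missing_ref = {"refs": [], "config": {"materialized": "table"}}

match = sample_matches(copy.deepcopy(base), real_result, real_result)
mismatch = sample_matches(copy.deepcopy(base), real_result, missing_ref)

print(match, mismatch)
```

Because the sample only ever touches its own copy, a mismatch here reflects a real difference between the two parse results rather than a difference in when the comparison happened. Note that the dataclass equality used in this sketch is sensitive to ref order, which a real comparison might want to relax.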

core/dbt/parser/models.py

jtcohen6

~~let's ship it!~~ (see below)

core/dbt/parser/models.py

jtcohen6

thanks for the detailed work teasing this one apart!

fix jinja sampling for static parser automatic commit by git-black, original commits: 21a7b71

nathaniel-may requested review from jtcohen6 and gshank October 12, 2021 20:50

cla-bot bot added the cla:yes label Oct 12, 2021

jtcohen6 reviewed Oct 14, 2021

core/dbt/parser/models.py (outdated, resolved)

core/dbt/parser/models.py (outdated, resolved)

core/dbt/parser/models.py (resolved)

nathaniel-may force-pushed the static-parse-tracking-bug branch from 44003e7 to f8be866 Compare October 14, 2021 17:10

jtcohen6 reviewed Oct 15, 2021

core/dbt/parser/models.py (outdated, resolved)

jtcohen6 approved these changes Oct 18, 2021

core/dbt/parser/models.py (outdated, resolved)

nathaniel-may force-pushed the static-parse-tracking-bug branch from dc01115 to 95a93c5 Compare October 18, 2021 14:28

nathaniel-may mentioned this pull request Oct 18, 2021

[backport] Tracking fix #4093

Merged

4 tasks

jtcohen6 approved these changes Oct 19, 2021

jtcohen6 reviewed Oct 19, 2021

core/dbt/parser/models.py (resolved)

gshank reviewed Oct 19, 2021

core/dbt/parser/models.py (outdated, resolved)

Nathaniel May added 13 commits October 20, 2021 10:29

fix jinja sampling

8e5420b

refactor into populate method

6f5ffcb

fully populate experimental sample for accurate comparison

5c6d49c

add deepcopy method

9f7d2e9

handle experimental parser errors when comparing

3554e94

make all comparison at the end

9b03e6d

minor simplification

1a1a2b3

fix tracking messages

ad0c669

add back failure tracking messages

dc61c3a

prefer partial_deepcopy to deepcopy

6789bd5

fix typo

d7a895e

fix branching

49ecd49

don't conduct a stable sample against an experimental parser run

c17e70b

nathaniel-may force-pushed the static-parse-tracking-bug branch from 2671756 to c17e70b Compare October 20, 2021 14:29

nathaniel-may requested review from gshank and jtcohen6 October 20, 2021 14:43

gshank approved these changes Oct 20, 2021

jtcohen6 approved these changes Oct 20, 2021

nathaniel-may merged commit 21a7b71 into main Oct 20, 2021

nathaniel-may deleted the static-parse-tracking-bug branch October 20, 2021 16:38

iknox-fa pushed a commit that referenced this pull request Feb 8, 2022

Fix tracking bug for jinja sampling (#4048)

a945047

fix jinja sampling for static parser automatic commit by git-black, original commits: 21a7b71
