Refactored waiting function for Tableau Jobs #17034

ciancolo · 2021-07-15T15:02:08Z

Hello all,
as suggested in PR #16937, I created a new PR with the following changes:

Added functions to TableauHook to get a job's status and to check whether a Tableau job has completed
Refactored the waiting check in TableauRefreshWorkbookOperator, removing its use of TableauJobStatusSensor
Refactored the job status check in TableauJobStatusSensor
Updated tests

Thank you

airflow/providers/amazon/CHANGELOG.rst

eladkal · 2021-07-15T19:06:26Z

airflow/providers/tableau/hooks/tableau.py

+        while finish_code == TableauJobFinishCode.PENDING:
+            finish_code = self.get_job_status(job_id=job_id)


Do we risk overwhelming the Tableau API here?

Yeah, you're right, especially for a long refresh. But now I am a little confused: the original request was to remove the sensor from the operator and implement the waiting code in the hook. Why not just use the sensor and its already implemented poke function instead of creating a new one?

What do you think?

We have the same issue of waiting for a specific status in other operators.
For example, EC2StartInstanceOperator allows passing check_interval to the wait_for_state function in the hook:

airflow/airflow/providers/amazon/aws/operators/ec2_start_instance.py

Lines 65 to 69 in 1960e37

ec2_hook.wait_for_state(
    instance_id=self.instance_id,
    target_state="running",
    check_interval=self.check_interval,
)

That way we give the user the power to decide how long to wait between checks.

BTW, I think it would be nice to change waiting_until_succeeded to a generic wait_for_state, as implemented in the EC2Hook. That would allow more flexibility, and users could use it for further custom operators if they need to. For our use case we can wait until TableauJobFinishCode.SUCCESS.

OK, now it's clearer. I will implement the changes.
Thank you.

uranusjr · 2021-07-15T22:17:23Z

Many whitespace changes besides amazon.rst are also irrelevant (or even wrong, like removing the blank line after the first line in a docstring). It’d be best if you could revert all of those.

ciancolo · 2021-07-16T08:24:05Z

@uranusjr @eladkal Yeah, I'm sorry. I had a problem in the docs build and tried to fix it. I will just remove the changes. It was very strange, because I hadn't modified the Amazon docs before the docs-build error.

ciancolo · 2021-07-16T11:05:20Z

I pushed the requested changes.

Just a comment: the function wait_for_state returns a bool because it is not certain that a particular job will reach the desired state. For example, if the refresh job fails, that job will never reach the SUCCESS state. So the function returns True if the desired state is reached and False otherwise.
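The behaviour described in this comment can be sketched as follows. This is a minimal illustration under stated assumptions, not the code merged in this PR: `TableauJobFinishCode` is a stand-in enum (its PENDING and CANCELED values are assumed), `get_status` is a hypothetical callable standing in for the hook's `get_job_status`, and the `sleep` parameter exists only so the example can run without real delays.

```python
import time
from enum import Enum
from typing import Callable


class TableauJobFinishCode(Enum):
    """Stand-in enum; the PENDING and CANCELED values are assumptions."""

    PENDING = -1
    SUCCESS = 0
    ERROR = 1
    CANCELED = 2


def wait_for_state(
    get_status: Callable[[], TableauJobFinishCode],
    target_state: TableauJobFinishCode,
    check_interval: float,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll until the job leaves PENDING, pausing between polls.

    Returns True if the job finished in target_state, False otherwise.
    """
    finish_code = get_status()
    while finish_code == TableauJobFinishCode.PENDING:
        # Pause between requests so the API is not polled in a tight loop.
        sleep(check_interval)
        finish_code = get_status()
    return finish_code == target_state
```

With this shape, a job that ends in ERROR or CANCELED makes the call return False instead of looping forever, and the caller decides how to react.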

eladkal · 2021-07-17T20:45:31Z

tests/providers/tableau/hooks/test_tableau.py

+        # Test SUCCESS
+        mock_tableau_server.jobs.get_by_id.return_value.finish_code = 0
+        with TableauHook(tableau_conn_id='tableau_test_password') as tableau_hook:
+            tableau_hook.server = mock_tableau_server
+            jobs_status = tableau_hook.get_job_status(job_id='j1')
+            assert jobs_status == TableauJobFinishCode.SUCCESS
+
+        # Test ERROR
+        mock_tableau_server.jobs.get_by_id.return_value.finish_code = 1
+        with TableauHook(tableau_conn_id='tableau_test_password') as tableau_hook:
+            tableau_hook.server = mock_tableau_server
+            jobs_status = tableau_hook.get_job_status(job_id='j1')
+            assert jobs_status == TableauJobFinishCode.ERROR


Let's also add a test case for CANCELED.
Also, we have a repeated pattern here; it's probably better to use a parameterized test.
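A table-driven version of the repeated pattern might look like the sketch below. It is only an illustration: `get_job_status` here is a hypothetical stand-in function rather than the real `TableauHook` method, the enum values for PENDING and CANCELED are assumptions, and a plain loop is used so the example has no test-framework dependency. In a pytest suite the same `CASES` table could be passed to `pytest.mark.parametrize`.

```python
from enum import Enum
from unittest import mock


class TableauJobFinishCode(Enum):
    """Stand-in enum; the PENDING and CANCELED values are assumptions."""

    PENDING = -1
    SUCCESS = 0
    ERROR = 1
    CANCELED = 2


def get_job_status(server, job_id):
    """Hypothetical stand-in: map the server's raw finish code to the enum."""
    return TableauJobFinishCode(int(server.jobs.get_by_id(job_id).finish_code))


# One row per case: (raw finish code from the server, expected enum member).
CASES = [
    (0, TableauJobFinishCode.SUCCESS),
    (1, TableauJobFinishCode.ERROR),
    (2, TableauJobFinishCode.CANCELED),
]


def check_get_job_status(finish_code, expected):
    """Run one case against a mocked server object."""
    mock_server = mock.Mock()
    mock_server.jobs.get_by_id.return_value.finish_code = finish_code
    assert get_job_status(mock_server, job_id="j1") == expected
    mock_server.jobs.get_by_id.assert_called_once_with("j1")


for finish_code, expected in CASES:
    check_get_job_status(finish_code, expected)
```

Adding a new finish code then means adding one row to `CASES` rather than copying a whole test body.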

eladkal · 2021-07-17T21:01:53Z

tests/providers/tableau/hooks/test_tableau.py

+                side_effect=[
+                    TableauJobFinishCode.PENDING,
+                    TableauJobFinishCode.PENDING,
+                    TableauJobFinishCode.ERROR,
+                ],


Why PENDING twice?

~~I wanted to simulate a more realistic case, just to be more secure that everything was ok.~~

I misread before. You're right: in this case it is not necessary.
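To illustrate why a single PENDING entry is enough, here is a small sketch of how a `side_effect` list drives a polling loop. The strings stand in for the `TableauJobFinishCode` members, and `wait_until_finished` is a hypothetical helper written for this example, not a function from the PR.

```python
from unittest import mock

# Each call to the mock returns the next item of the side_effect list.
get_job_status = mock.Mock(side_effect=["PENDING", "ERROR"])


def wait_until_finished(get_status):
    """Hypothetical helper: poll until the status is no longer PENDING."""
    polls = 1
    status = get_status()
    while status == "PENDING":
        status = get_status()
        polls += 1
    return status, polls


result = wait_until_finished(get_job_status)
```

One PENDING value already forces the loop body to run once before the final status is returned, so a second PENDING exercises no additional branch.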

airflow/providers/tableau/hooks/tableau.py

ciancolo · 2021-07-19T09:07:30Z

Pushed requested changes.

Just a comment: I didn't parametrize the test_wait_for_state function because I think it is more readable and clearer in its current state, with the different cases kept separate.

uranusjr

I think this is good in the sense that it should do what you want it to do, but I’m not familiar enough with Tableau to comment further.

eladkal

LGTM
Please fix/explain the duplication (there were 4 tests with the duplicated values; you addressed only 1 of them).

Other than that looks OK

tests/providers/tableau/hooks/test_tableau.py

eladkal · 2021-07-20T13:15:30Z

@potiuk, since you approved the PR that this one was split out from, do you have any further comments?
If not I'm happy to merge.

issues addressed

github-actions · 2021-07-21T12:09:01Z

The PR is likely OK to be merged with just subset of tests for default Python and Database versions without running the full matrix of tests, because it does not modify the core of Airflow. If the committers decide that the full tests matrix is needed, they will add the label 'full tests needed'. Then you should rebase to the latest main or amend the last commit of the PR, and push it with --force-with-lease.

boring-cyborg bot added the area:providers label Jul 15, 2021

eladkal reviewed Jul 15, 2021

ciancolo force-pushed the feature-job-waiting-tableau-hook branch from 36ee4e6 to 8c0c9ac Compare July 16, 2021 11:00

eladkal requested changes Jul 17, 2021

uranusjr reviewed Jul 19, 2021

uranusjr previously requested changes Jul 19, 2021

Michele Zanchi added 11 commits July 19, 2021 10:43

Implemented new function for get job status and waiting in Tableau Ho…

9a35616

…ok and updated Sensor and Operator.

Updated test for Tableau.

0e05efa

Mods from pre-commit scripts.

1d04cc9

Fixed docs Tableau code.

ab74f60

Generalized waiting_for_succeeded in wait_for_state in Tableau Hook.

f2c3e88

Updated test for new functionality in Tableau Hook.

2636456

Restored original format of docs Tableau scripts.

26a8924

Changed docs for wait_for_state and get_job_status in Tableau Hook.

f27cefb

Modified test in paremetrizated test and added CANCELED case for test…

5f13b90

…_get_job_status.

Changes from pre-commit scripts.

94c4ac6

Added case CANCELED for test_wait_for_state in Tableau Hook tests.

e79c63f

ciancolo force-pushed the feature-job-waiting-tableau-hook branch from 8c0c9ac to e79c63f Compare July 19, 2021 09:05

uranusjr reviewed Jul 20, 2021

eladkal self-requested a review July 20, 2021 06:17

eladkal approved these changes Jul 20, 2021

github-actions bot added the okay to merge It's ok to merge this PR as it does not require more tests label Jul 21, 2021

eladkal merged commit 29b6be8 into apache:main Jul 21, 2021
