Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Last Run links on home page UI not correct with RBAC UI #10434

Closed
alexbegg opened this issue Aug 20, 2020 · 9 comments · Fixed by #10595
Closed

Last Run links on home page UI not correct with RBAC UI #10434

alexbegg opened this issue Aug 20, 2020 · 9 comments · Fixed by #10595
Assignees
Labels
kind:bug This is a clearly a bug

Comments

@alexbegg
Copy link
Contributor

alexbegg commented Aug 20, 2020

Apache Airflow version:
1.10.11

Kubernetes version (if you are using kubernetes) (use kubectl version):
Client Version: version.Info{Major:"1", Minor:"18", GitVersion:"v1.18.6", GitCommit:"dff82dc0de47299ab66c83c626e08b245ab19037", GitTreeState:"clean", BuildDate:"2020-07-16T00:04:31Z", GoVersion:"go1.14.4", Compiler:"gc", Platform:"darwin/amd64"}
Server Version: version.Info{Major:"1", Minor:"17", GitVersion:"v1.17.4", GitCommit:"8d8aa39598534325ad77120c120a22b3a990b5ea", GitTreeState:"clean", BuildDate:"2020-03-13T06:39:58Z", GoVersion:"go1.13.8", Compiler:"gc", Platform:"linux/amd64"}

Environment:

  • Cloud provider or hardware configuration: Azure Kubernetes Service
  • OS (e.g. from /etc/os-release): Debian GNU/Linux 10 (buster)
  • Kernel (e.g. uname -a): Linux airflow-web-65cb7d9cb8-qzcbv 4.15.0-1089-azure Http protocol sensor #99~16.04.1-Ubuntu SMP Fri Jun 5 15:30:32 UTC 2020 x86_64 GNU/Linux
  • Install tools: Helm chart "stable/airflow"
  • Others:

What happened:

The "Last Run" link on the home page with RBAC UI is not taking you to the graph of that last run (if it has a time that is NOT 00:00:00), instead it takes you to a graph with no activity shown. It looks like it is because it is not URL-encoding the execution_date in the URL.

What you expected to happen:

Clicking a "Last Run" link when "Graph" is your default view it should take you to the graph of that last run, instead it a graph with no activity shown.

How to reproduce it:

Using RBAC UI, click on a link in the "Last Run" column (that has a time that is NOT 00:00:00) it will take you to a graph view with no activity shown. The URL will show as ending in a date similar to execution_date=2020-08-19T05:00:00+00:00 and if you change all the : to %3A and + to %2B it should take you to the correct last run.

Anything else we need to know:

The URL for the link currently has execution_date=2020-08-19T05:00:00+00:00, whereas if you go to the actual graph page and select that same DAG run the URL is insead execution_date=2020-08-19T05%3A00%3A00%2B00%3A00 which works.

If the link's url has the execution date URL-encoded then the link should work.

@alexbegg alexbegg added the kind:bug This is a clearly a bug label Aug 20, 2020
@alexbegg
Copy link
Contributor Author

alexbegg commented Aug 20, 2020

I believe the fix may be to change the following line from last_run to encodeURIComponent(last_run). I am still trying to figure out how to test this out and make my first PR, but if someone beats me to it go ahead:

.attr("href", "{{ url_for('Airflow.graph') }}?dag_id=" + encodeURIComponent(dag_id) + "&execution_date=" + last_run)

@kaxil
Copy link
Member

kaxil commented Aug 21, 2020

Assigned this issue to you :)

@alexbegg
Copy link
Contributor Author

alexbegg commented Aug 27, 2020

@kaxil I can't reproduce this with Breeze. I ran master branch, v1-10-stable branch (with rbac = True), and even 1.10.11 tag (with rbac = True, to match my Kubernetes deployment) with Breeze and this issue does not come up, even though the issue does come up in my Kubernetes deployment.

I am going to have to recreate the same setup in breeze and see if I can reproduce this issue...

Here is what I found so far after clicking the "Last Run" link:

  • ❌ : status is None - Airflow 1.10.11 in my Kubernetes cluster, with execution_date not URL-encoded in the URL:
    Screen Shot 2020-08-26 at 5 21 10 PM

  • ✅ : status is success - Airflow 1.10.11 in my Kubernetes cluster, with execution_date manually URL-encoded in the URL:
    Screen Shot 2020-08-26 at 5 20 27 PM

  • ✅ : status is success - Airflow 1.10.12 running with Breeze on v1-10-stable branch, with execution_date not URL-encoded in the URL:
    Screen Shot 2020-08-26 at 5 21 40 PM

  • ✅ : status is success - Airflow 1.10.11 running with Breeze on 1.10.11 tag, with execution_date not URL-encoded in the URL:
    Screen Shot 2020-08-26 at 6 14 52 PM

@alexbegg
Copy link
Contributor Author

alexbegg commented Aug 27, 2020

I think I am able to replicate it now. It is only happening if the execution_date time is NOT 00:00:00 (in my first screenshot of my last comment, it was 07:00:00, and in my description it was 05:00:00) because when the "graph" view fails to show the DAG run it seems to be internally processed as if the time is 00:00:00 (maybe stripping off or failing to understand the time), which is why I could not reproduce the problem with a DAG run that actually is for 00:00:00.

If I view the "log" table of the metadata db it clearly shows the problem (I added the "URL" to the end for reference)

URL Route with +: /graph?dag_id=utility_sf_sync_deleted_recs&execution_date=2020-08-11T07:00:00+00:00

id dttm dag_id task_id event execution_date owner extra
14073 2020-08-27 04:10:57.656712+00 utility_sf_sync_deleted_recs graph 2020-08-11 00:00:00+00 (❌ wrong!) admin [('dag_id', 'utility_sf_sync_deleted_recs'), ('execution_date', '2020-08-11T07:00:00 00:00')]

URL Route with + encoded to %2B: /graph?dag_id=utility_sf_sync_deleted_recs&execution_date=2020-08-11T07:00:00%2B00:00

id dttm dag_id task_id event execution_date owner extra
14072 2020-08-27 04:10:10.846183+00 utility_sf_sync_deleted_recs graph 2020-08-11 07:00:00+00 (✅ correct!) admin [('dag_id', 'utility_sf_sync_deleted_recs'), ('execution_date', '2020-08-11T07:00:00+00:00')]

Also notice the extra column, the top wrong one has a space instead of a + in the last item, but the correct one has the + in the value.

@alexbegg
Copy link
Contributor Author

alexbegg commented Aug 27, 2020

I tracked it down. The problem is an invalid value of request.args.get('execution_date') when execution_date is passed in the URL unencoded resulting in it not returning the correct pendulum datetime when used in pendulum.parse() which is used in many places inwww_rbac python methods. The problem is not with the : either, only with the unescaped +.

I edited the code to log out the request.args.get('execution_date') value on each request, and I found:

  • Using execution_date=2020-08-11T07:00:00+00:00 in the URL, and request.args.get('execution_date') resulting in 2020-08-11T07:00:00 00:00 (notice the space)

    >>> pendulum.parse('2020-08-11T07:00:00 00:00')
    <Pendulum [2020-08-11T00:00:00+00:00]>
    

    ❌ Wrong time! It ends up being 00:00:00.

  • Using execution_date=2020-08-11T07:00:00%2B00:00 in the URL, and request.args.get('execution_date') resulting in 2020-08-11T07:00:00+00:00 (notice the+)

    >>> pendulum.parse('2020-08-11T07:00:00+00:00')
    <Pendulum [2020-08-11T07:00:00+00:00]>
    

    ✅ Correct time! It ends up being 07:00:00.

Encoding the execution_date in the URL should fix the issue. Making the PR now.

@kaxil
Copy link
Member

kaxil commented Aug 27, 2020

Thanks @alexbegg for the detailed investigation and the PR :)

@alexbegg
Copy link
Contributor Author

@kaxil that was my first PR in this repo so I am just asking, does anything need to be done on my part for this to be picked up and cherry-picked into 1.10.x? It needs to be fixed in airflow/www_rbac/templates/airflow/dags.html there too.

@JeffryMAC
Copy link

@alexbegg do you think you can help investigating #10556 ? seems you are familiar with the last_run and it's representation in the UI

@kaxil
Copy link
Member

kaxil commented Sep 25, 2020

@kaxil that was my first PR in this repo so I am just asking, does anything need to be done on my part for this to be picked up and cherry-picked into 1.10.x? It needs to be fixed in airflow/www_rbac/templates/airflow/dags.html there too.

I have added "1.10.13" milestone to the PR :) that should be enough

cfei18 pushed a commit to cfei18/incubator-airflow that referenced this issue Mar 5, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind:bug This is a clearly a bug
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants