Cannot access dataflint UI through Knox gateway URL using the #20

Open
ef1236 opened this issue Oct 29, 2024 · 5 comments

ef1236 commented Oct 29, 2024

The bug is the same as described in [this bug report](https://github.com//issues/13).

When I try to access DataFlint through a Knox gateway, the page calls the API without the gateway prefix:

GET https:///api/v1/applications/<application_id>/1/environment 404 (Not Found)

It should call: https:///gateway//sparkhistory/history/<application_id>/1/environment/

It looks like version 0.2.5 should have fixed this, but I still encounter the same problem.

@menishmueli
Contributor

@DanielAronovich please take a look, isn't it the same path you simulated with a proxy 1:1?

@DanielAronovich
Contributor

@ef1236, you are correct, and it should have been solved.

I am on it.

Do the history server and the DataFlint tab work well, and it only breaks when you press "TO HISTORY SERVER"?

Does the "To spark UI" button above it work well?

Thanks!

@ef1236
Author

ef1236 commented Oct 30, 2024

@DanielAronovich Hey, thanks for the quick reply.

We are using Knox with Cloudera.
The history server dataflint tab only works when I don't go through the Knox gateway.

When I access the sparkhistory server directly using the hostname and port:
Hostname:18489/history/application_/dataflint it queries for the environment endpoint correctly:
Hostname:18489/api/v1/applications/application_/environment

But when I go through the Knox gateway, the page URL is correct but the page queries the environment endpoint with the wrong URL.
When I open example.example/gateway/cdp-proxy/spark3history/history/application_/dataflint,
it queries
example.example/api/v1/applications/application/environment
instead of
example.example/gateway/cdp-proxy/spark3history/api/v1/applications/application/environment
which would have worked (I queried it myself).
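
To make the difference between the two requests concrete, here is a minimal sketch of how the environment endpoint would need to be composed with and without a proxy prefix. The helper `environmentUrl`, the `https`/`http` schemes, and the application ID `application_123` are assumptions for illustration only; this is not DataFlint's actual code.

```typescript
// Hypothetical helper: joins an origin, an optional proxy prefix, and the
// Spark History Server environment endpoint for one application.
function environmentUrl(origin: string, proxyPrefix: string, appId: string): string {
  // Drop trailing slashes so the prefix and the API path join with exactly one "/".
  const prefix = proxyPrefix.replace(/\/+$/, "");
  return `${origin}${prefix}/api/v1/applications/${appId}/environment`;
}

// Direct access: no prefix, so the API lives at the root of the host.
const directUrl = environmentUrl("http://Hostname:18489", "", "application_123");

// Access through Knox: the gateway prefix has to stay in front of /api/v1.
const knoxUrl = environmentUrl(
  "https://example.example",
  "/gateway/cdp-proxy/spark3history",
  "application_123",
);

console.log(directUrl);
console.log(knoxUrl);
```

The failing request in this report corresponds to calling the helper with an empty prefix even though the page was served from under the gateway path.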

@menishmueli
Contributor

I wrote a fix and released version 0.2.6. I believe it will work this time.
You can look at the fix here: 4786a1b

@ef1236 let me know if version 0.2.6 fixed the problem so we can close this issue

@ef1236
Author

ef1236 commented Nov 7, 2024

Hey, 0.2.6 did not fix the problem, but I think I found the cause in the code.
In spark-ui/src/utils/UrlUtils.ts, line 38, the pathToRemove regex should drop the leading ".*". With it, the regex matches from the very start of the pathname, so the gateway prefix is removed along with the /history/... suffix.

// old
const pathToRemove = /.*\/history\/[^/]+\/dataflint\/?$/;
// new
const pathToRemove = /\/history\/[^/]+\/dataflint\/?$/;

Then it only matches the part starting at /history and not the whole pathname.

I created a pull request with the change.
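
As a quick check of the two patterns, the sketch below applies both to a pathname shaped like the Knox URL reported earlier in this thread. The application ID `application_123` and the helper `stripSuffix` are made up for illustration; only the two regular expressions come from the comment above.

```typescript
// Hypothetical pathname, modeled on the Knox gateway URL from this thread.
const knoxPathname = "/gateway/cdp-proxy/spark3history/history/application_123/dataflint";

// The two patterns quoted above.
const oldPattern = /.*\/history\/[^/]+\/dataflint\/?$/;
const newPattern = /\/history\/[^/]+\/dataflint\/?$/;

// Illustrative helper (not from the DataFlint codebase): removes whatever the
// pattern matches and returns what is left as the base path.
function stripSuffix(pathname: string, pattern: RegExp): string {
  return pathname.replace(pattern, "");
}

// The leading ".*" makes the match start at index 0, so everything is removed.
const oldBase = stripSuffix(knoxPathname, oldPattern);

// Without ".*", the match starts at "/history/", so the gateway prefix survives.
const newBase = stripSuffix(knoxPathname, newPattern);

console.log(JSON.stringify(oldBase)); // ""
console.log(JSON.stringify(newBase)); // "/gateway/cdp-proxy/spark3history"
```

For a direct pathname such as /history/application_123/dataflint, both patterns leave an empty base path, which would explain why the problem only shows up behind the gateway.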

@ef1236 ef1236 mentioned this issue Nov 7, 2024