This repository has been archived by the owner on Feb 12, 2024. It is now read-only.

[cetic/nifi] Readiness Probe Fails #47

Closed
RabbidDog opened this issue Feb 28, 2020 · 5 comments
Labels
bug Something isn't working help wanted Extra attention is needed

Comments

@RabbidDog

Describe the bug
The NiFi readiness probe doesn't succeed.
During our cluster installation we deploy, delete, and redeploy NiFi multiple times, and we use persistent storage for NiFi. Randomly, after a redeployment, NiFi stops working.
We are not doing anything different in our redeployments.

What happened:
The readiness probe fails with the following message

Readiness probe failed: Node not found with CONNECTED state. Full cluster state:
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
* Trying 10.233.105.161...
* TCP_NODELAY set
* Connected to nifi-0.nifi-headless.dap-core.svc.cluster.local (10.233.105.161) port 8080 (#0)
> GET /nifi-api/controller/cluster HTTP/1.1
> Host: nifi-0.nifi-headless.dap-core.svc.cluster.local:8080
> User-Agent: curl/7.52.1
> Accept: */*
>
< HTTP/1.1 404 Not Found
< Date: Fri, 28 Feb 2020 13:56:06 GMT
< X-Frame-Options: SAMEORIGIN
< Content-Security-Policy: frame-ancestors 'self'
< X-XSS-Protection: 1; mode=block
< Content-Type: application/json
< Vary: Accept-Encoding
< Content-Length: 59
< Server: Jetty(9.4.19.v20190610)
<
{ [59 bytes data]
* Curl_http_done: called premature == 0
100    59  100    59    0     0   6072      0 --:--:-- --:--:-- --:--:--  6555
* Connection #0 to host nifi-0.nifi-headless.dap-core.svc.cluster.local left intact
parse error: Invalid numeric literal at line 1, column 5
parse error: Invalid numeric literal at line 1, column 5
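For context, the chart's readiness probe performs a check roughly like the sketch below: it queries the NiFi cluster API and verifies that this node reports a CONNECTED status. This is an illustrative reconstruction based on the output above (curl plus jq against /nifi-api/controller/cluster on port 8080), not the chart's exact script.

# sketch of the kind of check the readiness probe performs (illustrative, not the chart's script)
STATE=$(curl -s "http://$(hostname -f):8080/nifi-api/controller/cluster" \
  | jq -r ".cluster.nodes[] | select(.address == \"$(hostname -f)\") | .status")
if [ "$STATE" != "CONNECTED" ]; then
  # same failure message as in the probe output above
  echo "Node not found with CONNECTED state. Full cluster state:"
  curl -s "http://$(hostname -f):8080/nifi-api/controller/cluster" | jq .
  exit 1
fi

The 404 in the log shows that /nifi-api/controller/cluster is not being served on that node, so the probe's JSON parsing step has nothing valid to work with, which is presumably where the two parse error lines come from.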
We also see these kinds of errors:

Multi-Attach error for volume "pvc-366c853a-2e59-4443-a1bd-0f8a7d9c2d51" Volume is already exclusively attached to one node and can't be attached to another
What you expected to happen:
The readiness probe should pass.

How to reproduce it (as minimally and precisely as possible):
Hard to say, because we don't know what causes it. The best bet to reproduce it would be to deploy NiFi with persistent volumes, then delete the deployment and redeploy it again. The issue might show up, as sketched below.
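For illustration, a redeploy cycle along these lines should exercise the scenario. The release name and values flag are placeholders (the namespace matches the one in the probe output above), not taken verbatim from our setup:

# deploy NiFi from the chart with persistence enabled
helm install nifi cetic/nifi --namespace dap-core --set persistence.enabled=true
# delete the release; PVCs created for persistence are typically left behind
helm delete nifi --namespace dap-core
# redeploy against the same persistent volumes; the readiness probe may now fail
helm install nifi cetic/nifi --namespace dap-core --set persistence.enabled=true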

Anything else we need to know:

@appunni-m

Same issue with version 0.4.2.

@alexnuttinck added the bug (Something isn't working) label Jun 24, 2020
@fzalila

fzalila commented Nov 9, 2020

same issue for me.

@banzo added the help wanted (Extra attention is needed) label Feb 11, 2021
@webdevops-admin

Are there any updates on this issue?

@twinklekumarp

Hi, any updates on this issue? We are also looking for something related to cluster monitoring.

@wknickless
Contributor

wknickless commented May 10, 2022

@RabbidDog @appunni-dishq @fzalila @webdevops-admin @twinklekumarp I don't think this is a NiFi or Helm chart issue. Whenever you see something like Multi-Attach error for volume "pvc-366c853a-2e59-4443-a1bd-0f8a7d9c2d51" Volume is already exclusively attached to one node and can't be attached to another it means Kubernetes can't provide the pod access to the persistent data it wants.

When this happens on our production Kubernetes clusters I have to reach out to our Kubernetes cluster administrators to find and fix the problem manually.

For more background, see https://blog.mayadata.io/recover-from-volume-multi-attach-error-in-on-prem-kubernetes-clusters
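For illustration, the manual recovery described there boils down to finding and removing the stale VolumeAttachment so the volume can attach to the node where the new pod is scheduled. A rough sketch, assuming a CSI-provisioned volume (the attachment name below is a placeholder):

# find the VolumeAttachment that still references the stuck PV
kubectl get volumeattachments | grep pvc-366c853a-2e59-4443-a1bd-0f8a7d9c2d51
# confirm it points at a node that no longer hosts the pod
kubectl describe volumeattachment csi-<attachment-id>
# delete the stale attachment; the attach/detach controller can then
# attach the volume to the node running the new pod
kubectl delete volumeattachment csi-<attachment-id>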

@banzo closed this as completed in 7a98fd2 May 15, 2022
wknickless pushed a commit to wknickless/helm-nifi that referenced this issue May 24, 2022