add troubleshooting for port listen issues (kubernetes#9185)
jrhunger authored and jaehnri committed Jan 2, 2023
docs/troubleshooting.md

a. *.k8s.io -> To ensure you can pull any images from registry.k8s.io
b. *.gcr.io -> GCP services are used for image hosting. This is part of the domains suggested by GCP to allow and ensure users can pull images from their container registry services.
c. *.appspot.com -> This is a Google domain, part of the domains used for GCR.

## Unable to listen on port (80/443)
One possible reason for this error is lack of permission to bind to the port. Ports 80, 443, and any other port below 1024 are privileged ports on Linux, which historically could only be bound by root. The ingress-nginx-controller uses the CAP_NET_BIND_SERVICE [Linux capability](https://man7.org/linux/man-pages/man7/capabilities.7.html) so it can bind these ports as a normal user (www-data / 101). This involves two components:
1. In the image, the /nginx-ingress-controller file has the cap_net_bind_service capability added (e.g. via [setcap](https://man7.org/linux/man-pages/man8/setcap.8.html))
2. The NET_BIND_SERVICE capability is added to the container in the containerSecurityContext of the deployment.
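
A quick way to confirm the second component (assuming the default `ingress-nginx` namespace and `ingress-nginx-controller` deployment names; adjust for your install). The output should list NET_BIND_SERVICE under `add` (exact formatting varies by kubectl version):
```console
$ kubectl -n ingress-nginx get deployment ingress-nginx-controller \
    -o jsonpath='{.spec.template.spec.containers[0].securityContext.capabilities}'
```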

If you encounter this on one or some node(s) but not others, try purging the image on the affected node(s) and pulling a fresh copy, in case the underlying image layers were corrupted and the capability was lost on the executable.
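
On a node running containerd, for example, this can be done with `crictl` (a sketch assuming direct shell access to the node; other runtimes have equivalent commands, and `crictl rmi` will refuse to remove an image while a container still uses it, so delete the controller pod on that node first):
```console
$ crictl rmi ##_CONTROLLER_IMAGE_##
$ crictl pull ##_CONTROLLER_IMAGE_##
```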

### Create a test pod
The /nginx-ingress-controller process exits/crashes when encountering this error, making it difficult to troubleshoot what is happening inside the container. To get around this, start an equivalent container running "sleep 3600", and exec into it for further troubleshooting. For example:
```yaml
apiVersion: v1
kind: Pod
metadata:
  name: ingress-nginx-sleep
  namespace: default
  labels:
    app: nginx
spec:
  containers:
    - name: nginx
      image: ##_CONTROLLER_IMAGE_##
      resources:
        requests:
          memory: "512Mi"
          cpu: "500m"
        limits:
          memory: "1Gi"
          cpu: "1"
      command: ["sleep"]
      args: ["3600"]
      ports:
        - containerPort: 80
          name: http
          protocol: TCP
        - containerPort: 443
          name: https
          protocol: TCP
      securityContext:
        allowPrivilegeEscalation: true
        capabilities:
          add:
            - NET_BIND_SERVICE
          drop:
            - ALL
        runAsUser: 101
  restartPolicy: Never
  nodeSelector:
    kubernetes.io/hostname: ##_NODE_NAME_##
  tolerations:
    - key: "node.kubernetes.io/unschedulable"
      operator: "Exists"
      effect: NoSchedule
```
* update the namespace if applicable/desired
* replace `##_NODE_NAME_##` with the problematic node (or remove the nodeSelector section if the problem is not confined to one node)
* replace `##_CONTROLLER_IMAGE_##` with the same image your ingress-nginx deployment uses
* confirm that the securityContext section matches what is in place for the ingress-nginx-controller pods in your cluster

Apply the YAML and open a shell into the pod.
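For example, assuming the manifest above was saved as `ingress-nginx-sleep.yaml` (the controller image ships a shell; fall back to `/bin/sh` if `bash` is unavailable):
```console
$ kubectl apply -f ingress-nginx-sleep.yaml
$ kubectl exec -it ingress-nginx-sleep -- /bin/bash
```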
Try to manually run the controller process:
```console
$ /nginx-ingress-controller
```
You should get the same error as from the ingress controller pod logs.

Confirm the capabilities are properly surfacing into the pod:
```console
$ grep CapBnd /proc/1/status
CapBnd: 0000000000000400
```
The above value has only cap_net_bind_service enabled, per the securityContext in the YAML, which adds that capability and drops all others. (CAP_NET_BIND_SERVICE is capability number 10, so its bit is 1 << 10 = 0x400.) If you get a different value, you can decode it on another Linux box (capsh is not available in this container) as shown below, then work out why the specified capabilities are not propagating into the pod/container.
```console
$ capsh --decode=0000000000000400
0x0000000000000400=cap_net_bind_service
```

### Create a test pod as root
(Note: this may be blocked by PodSecurityPolicy, PodSecurityAdmission/Standards, OPA Gatekeeper, etc., in which case you will need an appropriate workaround for testing, e.g. deploying in a new namespace without the restrictions; see the example after the list below.)
To test further you may want to install additional utilities, etc. Modify the pod yaml by:
* changing runAsUser from 101 to 0
* removing the "drop: ALL" section from the capabilities.
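
If Pod Security Admission is what blocks the root pod, one workaround is a throwaway namespace labeled for the `privileged` profile (the namespace name here is arbitrary):
```console
$ kubectl create namespace ingress-test
$ kubectl label namespace ingress-test pod-security.kubernetes.io/enforce=privileged
```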

Some things to try after shelling into this container:

Try running the controller as the www-data (101) user; the setuid bit set by `chmod 4755` makes the binary run with the effective UID of its owner (www-data in the image):
```console
$ chmod 4755 /nginx-ingress-controller
$ /nginx-ingress-controller
```
Examine the errors to see whether there is still an issue listening on the port, or whether the process got past the bind and moved on to other errors expected from running outside its normal context.
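
To confirm the setuid bit took effect, check for an `s` in the owner execute position and `www-data` as the file owner:
```console
$ ls -l /nginx-ingress-controller
```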

Install the libcap package and check capabilities on the file:
```console
$ apk add libcap
(1/1) Installing libcap (2.50-r0)
Executing busybox-1.33.1-r7.trigger
OK: 26 MiB in 41 packages
$ getcap /nginx-ingress-controller
/nginx-ingress-controller cap_net_bind_service=ep
```
(If the capability is missing, see above about purging the image on the node and re-pulling it.)
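
With libcap installed you can also re-add the capability by hand and re-run the test; this is illustrative only, as the durable fix is a clean image pull:
```console
$ setcap cap_net_bind_service=+ep /nginx-ingress-controller
$ getcap /nginx-ingress-controller
/nginx-ingress-controller cap_net_bind_service=ep
```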

Strace the executable to see what system calls are being executed when it fails:
```console
$ apk add strace
(1/1) Installing strace (5.12-r0)
Executing busybox-1.33.1-r7.trigger
OK: 28 MiB in 42 packages
$ strace /nginx-ingress-controller
execve("/nginx-ingress-controller", ["/nginx-ingress-controller"], 0x7ffeb9eb3240 /* 131 vars */) = 0
arch_prctl(ARCH_SET_FS, 0x29ea690) = 0
...
```
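
To narrow the output to the relevant calls, filter for the network syscall class; if the capability is the problem, the expectation is a `bind` call returning `EACCES` for port 80 (`-f` follows the nginx worker processes the controller spawns):
```console
$ strace -f -e trace=%net /nginx-ingress-controller
```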
