rke2-ingress-nginx does not watch Ingress resources without IngressClassName set #6510

Closed
hyeluoh opened this issue Aug 8, 2024 · 18 comments

@hyeluoh

hyeluoh commented Aug 8, 2024

Environmental Info:
RKE2 Version: v1.30.3

rke2 version v1.30.3+rke2r1
go version go1.22.5 X:boringcrypto
Node(s) CPU architecture, OS, and Version:

arm64 centos 7.9
Cluster Configuration:

1 server 3 agents
Describe the bug:

After upgrading from RKE2 v1.29.7 to v1.30.3, services within the Kubernetes cluster that are accessed through Ingress are returning 404 errors.
Steps To Reproduce:

  • update rke2
  • When I entered the nginx-ingress-controller pod, I found that the domain configuration for the inaccessible service was missing.
  • When I deleted and then reconfigured the Ingress settings for the corresponding domain, the service was restored.

Expected behavior:

Actual behavior:

Additional context / logs:

@brandond
Member

brandond commented Aug 9, 2024

Can you provide an example showing how exactly you'd configured the ingress settings? What specifically was missing?

@nugzarg

nugzarg commented Aug 13, 2024

The reason for this issue is the missing annotation ingressclass.kubernetes.io/is-default-class: "true" on the nginx IngressClass.
I'm not sure, but it seems that the nginx IngressClass was set as default automatically in the previous version, which is no longer the case.
A simple workaround is to set this annotation manually, or to change the nginx-ingress Helm chart configuration. Example:

apiVersion: helm.cattle.io/v1
kind: HelmChartConfig
metadata:
  name: rke2-ingress-nginx
  namespace: kube-system
spec:
  valuesContent: |-
    controller:
      ingressClassResource:
        name: nginx
        enabled: true
        default: true

@brandond
Member

brandond commented Aug 13, 2024

That should be handled when the chart is upgraded, via the .global.systemDefaultIngressClass chart value that is injected into the chart values. Have you customized the ingress chart deployment in any other way? Can you provide the output of kubectl get helmchart -n kube-system rke2-ingress-nginx -o yaml?

@nugzarg

nugzarg commented Aug 14, 2024

Yes, I have customized the Helm chart of nginx ingress. Here is the customized HelmChartConfig:

---
apiVersion: helm.cattle.io/v1
kind: HelmChartConfig
metadata:
  name: rke2-ingress-nginx
  namespace: kube-system
spec:
  valuesContent: |-
    controller:
      allowSnippetAnnotations: true
      enableAnnotationValidations: true
      hostNetwork: false
      hostPort:
        enabled: false
      service:
        enabled: true
        type: NodePort
        nodePorts:
          http: 32080
          https: 32443
          tcp:
            8080: 32808
        externalTrafficPolicy: Local
      dnsPolicy: ClusterFirst
      ingressClassResource:
        name: nginx
        enabled: true
        default: true
      ingressClass: nginx
      metrics:
        enabled: true
        serviceMonitor:
          enabled: true
      config:
        use-forwarded-headers: true
        compute-full-forwarded-for: true
        proxy-body-size: "200m"
        fail_timeout: "5s"
        enable-modsecurity: true
        enable-owasp-modsecurity-crs: true
        modsecurity-snippet: |
          SecAuditLog /dev/stdout
          SecAuditLogFormat JSON
        log-format-escape-json: "true"
        log-format-upstream: '{ 
          "time_local": "$time_local",
          "time_iso8601": "$time_iso8601",
          "network": {
             "forwarded_ip": "$http_x_forwarded_for", 
             "forwarded_original_ip": "$http_x_original_forwarded_for", 
             "real_ip": "$http_x_real_ip"
           },
          "user":{"name":"$remote_user"},
          "user_agent":{"original":"$http_user_agent"},
          "http":{
            "version": "$server_protocol",
            "request":{
              "body":{"bytes":$body_bytes_sent},
              "bytes": $request_length,
              "method":"$request_method",
              "referrer":"$http_referer"
            },
            "response":{
              "body":{"bytes":$body_bytes_sent},
              "bytes": $bytes_sent,
              "status_code":$status,
              "time":$request_time
            },
            "upstream": {
              "bytes": $upstream_response_length,
              "status_code":"$upstream_status",
              "time":$upstream_response_time,
              "address": "$upstream_addr",
              "name": "$proxy_upstream_name"
            }
          },
          "url":{
            "domain":"$host",
            "path":"$uri",
            "query":"$args",
            "original":"$request_uri"
          }
        }'

The ingressClassResource: section was not set before the RKE2 upgrade.

@brandond
Member

Please show the HelmChart, not the HelmChartConfig.

@nugzarg

nugzarg commented Aug 14, 2024

Here is the output of kubectl get helmchart -n kube-system rke2-ingress-nginx -o yaml:

apiVersion: helm.cattle.io/v1
kind: HelmChart
metadata:
  annotations:
    helm.cattle.io/chart-url: https://rke2-charts.rancher.io/assets/rke2-ingress-nginx/rke2-ingress-nginx-4.8.200.tgz
    helmcharts.cattle.io/managed-by: helm-controller
    objectset.rio.cattle.io/applied: H4sIAAAAAAAA/4yS3W7TQBCFXwXNLY7/1nESS1wUR0AEolJaKvVyvJ7Ei9e70c7G/FR5d7RpqhYKoZfjOXPm8+y5A9ypG3KsrIEKOtJDLNF7TbGyyZhBBL0yLVTwgfRQd+g8RDCQxxY9QnUHaIz16JU1HMo/HGSYmOydDube77hKEtdTPjk2OHZoZEcuSJGZPN93ldk6Yp6YrTLfHz5BBLb5StIz+dgp+2SNCoRn+vabITfZjj1U0Av+7RejVx+Vad9ctK09t+LewuBAUMFzyBdN8g5lGO/3DU34B3sa4BCBxob08Xr/suiQO6hgJkWzmRWbRV7mJc5oKposQzGfLsoNzbNpVhZpIUQbTM+RnmHhHclA0ljr2TvcQbVBzRTB8cVqazwZH/JQ8Kp+O6xe51/qC7H2sp6mt+ZmuH53K9Z9u/5k1p3cLvkyb8X7n8PVPBsS25flIr8e5fRSQQRMPqzaatugjqXesydXr5ZrqCBL4yKP0zhNshKiv2jG4n+q5eerk0TEaZylzwV2QBVif6pjbSXqR1m43RI9LpWDCpIRXaJVk5wye4zlo5jJjUrSE3zxAHY4HH4FAAD//ySeAJRoAwAA
    objectset.rio.cattle.io/id: ""
    objectset.rio.cattle.io/owner-gvk: k3s.cattle.io/v1, Kind=Addon
    objectset.rio.cattle.io/owner-name: rke2-ingress-nginx
    objectset.rio.cattle.io/owner-namespace: kube-system
  creationTimestamp: "2021-12-22T07:44:12Z"
  finalizers:
  - wrangler.cattle.io/helm-controller
  - wrangler.cattle.io/on-helm-chart-remove
  generation: 64
  labels:
    objectset.rio.cattle.io/hash: 7c3bf74f92626a7e53b11a38596fe8151640433d
  name: rke2-ingress-nginx
  namespace: kube-system
  resourceVersion: "1224480281"
  uid: e854442e-1d62-439e-86fe-ba1ebbfd7711
spec:
  bootstrap: false
  chartContent: 
  set:
    global.clusterCIDR: 10.42.0.0/16
    global.clusterCIDRv4: 10.42.0.0/16
    global.clusterDNS: 10.43.0.10
    global.clusterDomain: cluster.local
    global.rke2DataDir: /var/lib/rancher/rke2
    global.serviceCIDR: 10.43.0.0/16
status:
  jobName: helm-install-rke2-ingress-nginx

@brandond
Member

brandond commented Aug 14, 2024

Your chart's set values appear to be missing some things that should be injected for ALL system charts, ref:

func setChartValues(manifestsDir, ingressController string, nodeConfig *daemonconfig.Node, cfg cmds.Agent) error {
	chartValues := map[string]string{
		"global.clusterCIDR": util.JoinIPNets(nodeConfig.AgentConfig.ClusterCIDRs),
		"global.clusterCIDRv4": util.JoinIP4Nets(nodeConfig.AgentConfig.ClusterCIDRs),
		"global.clusterCIDRv6": util.JoinIP6Nets(nodeConfig.AgentConfig.ClusterCIDRs),
		"global.clusterDNS": util.JoinIPs(nodeConfig.AgentConfig.ClusterDNSs),
		"global.clusterDomain": nodeConfig.AgentConfig.ClusterDomain,
		"global.rke2DataDir": cfg.DataDir,
		"global.serviceCIDR": util.JoinIPNets(nodeConfig.AgentConfig.ServiceCIDRs),
		"global.systemDefaultIngressClass": ingressController,
		"global.systemDefaultRegistry": nodeConfig.AgentConfig.SystemDefaultRegistry,
		"global.cattle.systemDefaultRegistry": nodeConfig.AgentConfig.SystemDefaultRegistry,

What is the output of grep -E 'chart-url|global' /var/lib/rancher/rke2/server/manifests/rke2-ingress-nginx.yaml? You should see something like this:

root@rke2-server-1:/# grep -E 'chart-url|global' /var/lib/rancher/rke2/server/manifests/rke2-ingress-nginx.yaml
    helm.cattle.io/chart-url: https://rke2-charts.rancher.io/assets/rke2-ingress-nginx/rke2-ingress-nginx-4.10.102.tgz
    global.clusterCIDR: 10.42.0.0/16
    global.clusterCIDRv4: 10.42.0.0/16
    global.clusterDNS: 10.43.0.10
    global.clusterDomain: cluster.local
    global.rke2DataDir: /var/lib/rancher/rke2
    global.serviceCIDR: 10.43.0.0/16
    global.systemDefaultIngressClass: ingress-nginx

If you see the global.systemDefaultIngressClass value in the chart on disk, but not in the resource deployed to the cluster, please check for apply errors in your rke2-server log.

If you don't see it there... then something else weird is going on, and we'll want to look at your server's config.yaml.

@nugzarg

nugzarg commented Aug 14, 2024

grep -E 'chart-url|global' /var/lib/rancher/rke2/server/manifests/rke2-ingress-nginx.yaml
Output:

    global.clusterCIDR: 10.42.0.0/16
    global.clusterCIDRv4: 10.42.0.0/16
    global.clusterDNS: 10.43.0.10
    global.clusterDomain: cluster.local
    global.rke2DataDir: /var/lib/rancher/rke2
    global.serviceCIDR: 10.43.0.0/16

cat /etc/rancher/rke2/config.yaml

node-name: "master1.kube.example.com"
node-ip: "1.2.3.4"
node-taint:
- "CriticalAddonsOnly=true:NoExecute"
kubelet-arg:
- feature-gates=SizeMemoryBackedVolumes=true
- seccomp-default=true
- pod-max-pids=2048
cni:
- cilium
disable:
- rke2-kube-proxy

@brandond
Member

And you're sure you're on v1.30.3+rke2r1 on all your nodes? Can you provide rke2-server logs from journald?

@nugzarg

nugzarg commented Aug 14, 2024

No, my cluster is on v1.28.8+rke2r1. I just tried to upgrade the first master node to v1.30.3+rke2r1. The upgrade triggered an nginx-ingress upgrade to v1.10.1-hardened1. After the nginx-ingress upgrade, no Ingress rule was working. I received a 404 error for all requests, because no rule had ingress class set and there was no default ingress class. After that I decided to downgrade the node to v1.28.8+rke2r1 (there was a second issue with ModSecurity not working, and it was too much for me). The downgrade triggered an nginx-ingress Helm chart downgrade to nginx-1.9.6-hardened1, and everything is working now.

@brandond
Member

brandond commented Aug 14, 2024

After that I decided to downgrade the node to v1.28.8+rke2r1

This is the first time you've mentioned that you are no longer running the version you listed when creating the issue. It would have been good to mention that, as none of the information I asked for is going to be of any use if you're not running the new version any longer.

because no rule had ingress class set and there was no default ingress class.

rke2-ingress-nginx should have been set as the default ingress class by the chart value I was having you check for.

@hyeluoh
Author

hyeluoh commented Aug 15, 2024

I have upgraded to version v1.30.3. I checked the nginx configuration in the ingress controller, and the domain that I cannot access is missing from it, even though the Ingress resource for that domain exists. My workaround is to recreate the Ingress.

@hyeluoh
Author

hyeluoh commented Aug 15, 2024

No, my cluster is on v1.28.8+rke2r1. I just tried to upgrade the first master node to v1.30.3+rke2r1. The upgrade triggered an nginx-ingress upgrade to v1.10.1-hardened1. After the nginx-ingress upgrade, no Ingress rule was working. I received a 404 error for all requests, because no rule had ingress class set and there was no default ingress class. After that I decided to downgrade the node to v1.28.8+rke2r1 (there was a second issue with ModSecurity not working, and it was too much for me). The downgrade triggered an nginx-ingress Helm chart downgrade to nginx-1.9.6-hardened1, and everything is working now.

I think I've encountered the same situation as you. I suspect it might be related to the upgrade of the nginx-ingress version.
The rke2 cluster was upgraded from v1.28.11 to v1.30.3.

hyeluoh changed the title from "After upgrading RKE2 from v1.29.7 to v1.30.3, I found that the domain configured through Ingress was inaccessible." to "After upgrading RKE2 from v1.28.10 to v1.30.3, I found that the domain configured through Ingress was inaccessible." on Aug 15, 2024
@hyeluoh
Author

hyeluoh commented Aug 15, 2024

Here is a comparison between the configurations of the two versions.

[root@gt-test-10-117 ~]# diff v1.28.10/nginx.conf v1.30.3/nginx.conf 
2c2
< # Configuration checksum: 8565260235095128852
---
> # Configuration checksum: 6186567879202657564
82,88d81
< 		ok, res = pcall(require, "monitor")
< 		if not ok then
< 		error("require failed: " .. tostring(res))
< 		else
< 		monitor = res
< 		end
< 		
111,112d103
< 		monitor.init_worker(10000)
< 		
144c135
< 	server_names_hash_bucket_size   64;
---
> 	server_names_hash_bucket_size   32;
237c228
< 	# PEM sha: d590dd180ecf6844ce48d03a12a9a92119ff026f
---
> 	# PEM sha: 3ee252d5fb4aa5dc6c2eac0505aee1592180d1c0
276a268,269
> 		http2 on;
> 		
279,280c272,273
< 		listen 443 default_server reuseport backlog=511 ssl http2 ;
< 		listen [::]:443 default_server reuseport backlog=511 ssl http2 ;
---
> 		listen 443 default_server reuseport backlog=511 ssl;
> 		listen [::]:443 default_server reuseport backlog=511 ssl;
330,331d322
< 				monitor.call()
< 				
403a395,396
> 			# Custom Response Headers
> 			
433,568d425
< 	
< 	## start server test.k8s.com
< 	server {
< 		server_name test.k8s.com ;
< 		
< 		listen 80  ;
< 		listen [::]:80  ;
< 		listen 443  ssl http2 ;
< 		listen [::]:443  ssl http2 ;
< 		
< 		set $proxy_upstream_name "-";
< 		
< 		ssl_certificate_by_lua_block {
< 			certificate.call()
< 		}
< 		
< 		location / {
< 			
< 			set $namespace      "nginx";
< 			set $ingress_name   "nginx-web-ingress";
< 			set $service_name   "";
< 			set $service_port   "";
< 			set $location_path  "/";
< 			set $global_rate_limit_exceeding n;
< 			
< 			rewrite_by_lua_block {
< 				lua_ingress.rewrite({
< 					force_ssl_redirect = false,
< 					ssl_redirect = true,
< 					force_no_ssl_redirect = false,
< 					preserve_trailing_slash = false,
< 					use_port_in_redirects = false,
< 					global_throttle = { namespace = "", limit = 0, window_size = 0, key = { }, ignored_cidrs = { } },
< 				})
< 				balancer.rewrite()
< 				plugins.run()
< 			}
< 			
< 			# be careful with `access_by_lua_block` and `satisfy any` directives as satisfy any
< 			# will always succeed when there's `access_by_lua_block` that does not have any lua code doing `ngx.exit(ngx.DECLINED)`
< 			# other authentication method such as basic auth or external auth useless - all requests will be allowed.
< 			#access_by_lua_block {
< 			#}
< 			
< 			header_filter_by_lua_block {
< 				lua_ingress.header()
< 				plugins.run()
< 			}
< 			
< 			body_filter_by_lua_block {
< 				plugins.run()
< 			}
< 			
< 			log_by_lua_block {
< 				balancer.log()
< 				
< 				monitor.call()
< 				
< 				plugins.run()
< 			}
< 			
< 			port_in_redirect off;
< 			
< 			set $balancer_ewma_score -1;
< 			set $proxy_upstream_name "upstream-default-backend";
< 			set $proxy_host          $proxy_upstream_name;
< 			set $pass_access_scheme  $scheme;
< 			
< 			set $pass_server_port    $server_port;
< 			
< 			set $best_http_host      $http_host;
< 			set $pass_port           $pass_server_port;
< 			
< 			set $proxy_alternative_upstream_name "";
< 			
< 			client_max_body_size                    1m;
< 			
< 			proxy_set_header Host                   $best_http_host;
< 			
< 			# Pass the extracted client certificate to the backend
< 			
< 			# Allow websocket connections
< 			proxy_set_header                        Upgrade           $http_upgrade;
< 			
< 			proxy_set_header                        Connection        $connection_upgrade;
< 			
< 			proxy_set_header X-Request-ID           $req_id;
< 			proxy_set_header X-Real-IP              $remote_addr;
< 			
< 			proxy_set_header X-Forwarded-For        $remote_addr;
< 			
< 			proxy_set_header X-Forwarded-Host       $best_http_host;
< 			proxy_set_header X-Forwarded-Port       $pass_port;
< 			proxy_set_header X-Forwarded-Proto      $pass_access_scheme;
< 			proxy_set_header X-Forwarded-Scheme     $pass_access_scheme;
< 			
< 			proxy_set_header X-Scheme               $pass_access_scheme;
< 			
< 			# Pass the original X-Forwarded-For
< 			proxy_set_header X-Original-Forwarded-For $http_x_forwarded_for;
< 			
< 			# mitigate HTTPoxy Vulnerability
< 			# https://www.nginx.com/blog/mitigating-the-httpoxy-vulnerability-with-nginx/
< 			proxy_set_header Proxy                  "";
< 			
< 			# Custom headers to proxied server
< 			
< 			proxy_connect_timeout                   5s;
< 			proxy_send_timeout                      60s;
< 			proxy_read_timeout                      60s;
< 			
< 			proxy_buffering                         off;
< 			proxy_buffer_size                       4k;
< 			proxy_buffers                           4 4k;
< 			
< 			proxy_max_temp_file_size                1024m;
< 			
< 			proxy_request_buffering                 on;
< 			proxy_http_version                      1.1;
< 			
< 			proxy_cookie_domain                     off;
< 			proxy_cookie_path                       off;
< 			
< 			# In case of errors try the next upstream server before returning an error
< 			proxy_next_upstream                     error timeout;
< 			proxy_next_upstream_timeout             0;
< 			proxy_next_upstream_tries               3;
< 			
< 			proxy_pass http://upstream_balancer;
< 			
< 			proxy_redirect                          off;
< 			
< 		}
< 		
< 	}
< 	## end server test.k8s.com

Here is the comparison of the Ingress configuration after the upgrade.

[root@gt-test-10-117 ~]# diff v1.28.10/nginx-web-ingress.yaml  v1.30.3/nginx-web-ingress.yaml 
8c8
<   resourceVersion: "20077"
---
>   resourceVersion: "22143"
14,16c14
<   loadBalancer:
<     ingress:
<     - ip: 192.168.10.117
---
>   loadBalancer: {}

@brandond

@brandond
Member

Are you upgrading directly from v1.28.10 to v1.30.3? That is not supported; you are expected to step through intermediate minors (v1.29) when upgrading.

I am not sure that's related though. Please see the information that was asked for above, regarding the HelmChart resource, both on disk and in the cluster.

@serhiynovos

@brandond after upgrading to RKE2 v1.30.3+rke2r1 I'm also facing this issue. I checked the IngressClass and it has the ingressclass.kubernetes.io/is-default-class: 'true' annotation:

apiVersion: networking.k8s.io/v1
kind: IngressClass
metadata:
  annotations:
    ingressclass.kubernetes.io/is-default-class: 'true'
    meta.helm.sh/release-name: rke2-ingress-nginx
    meta.helm.sh/release-namespace: kube-system
  creationTimestamp: '2024-01-25T22:48:15Z'
  generation: 1
  labels:
    app.kubernetes.io/component: controller
    app.kubernetes.io/instance: rke2-ingress-nginx
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/name: rke2-ingress-nginx
    app.kubernetes.io/part-of: rke2-ingress-nginx
    app.kubernetes.io/version: 1.10.1
    helm.sh/chart: rke2-ingress-nginx-4.10.102
  managedFields:
    - apiVersion: networking.k8s.io/v1
      fieldsType: FieldsV1
      fieldsV1:
        f:metadata:
          f:annotations:
            .: {}
            f:ingressclass.kubernetes.io/is-default-class: {}
            f:meta.helm.sh/release-name: {}
            f:meta.helm.sh/release-namespace: {}
          f:labels:
            .: {}
            f:app.kubernetes.io/component: {}
            f:app.kubernetes.io/instance: {}
            f:app.kubernetes.io/managed-by: {}
            f:app.kubernetes.io/name: {}
            f:app.kubernetes.io/part-of: {}
            f:app.kubernetes.io/version: {}
            f:helm.sh/chart: {}
        f:spec:
          f:controller: {}
      manager: helm
      operation: Update
      time: '2024-08-26T20:10:50Z'
  name: nginx
  resourceVersion: '196680132'
  uid: 93ef06ed-bf17-4c2f-aa1a-0a4619cf1f62
spec:
  controller: k8s.io/ingress-nginx

@brandond
Member

brandond commented Aug 26, 2024

In newer releases of RKE2, the ingress-nginx IngressClass is set as default, and any new Ingress resources created on these versions will have the ingressClassName assigned during creation, if the attribute is not set.

If you're upgrading from earlier releases, and did not explicitly set the ingressClassName on your Ingress resources, the default ingress class WILL NOT be set on your existing resources, and on affected releases of RKE2, ingress-nginx will no longer handle these Ingresses.
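
For illustration, an Ingress of roughly this shape would be affected. This is only a sketch: the names example-web, example.com and example-svc are hypothetical and do not come from this thread. The point is that spec.ingressClassName is absent, and since the default class is only applied at creation time, an existing resource like this keeps no class after the upgrade.

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: example-web          # hypothetical name
  namespace: default
spec:
  # No ingressClassName is set here, so this Ingress relies on the
  # controller picking up class-less Ingresses.
  # ingressClassName: <name shown by `kubectl get ingressclass`>
  rules:
  - host: example.com        # hypothetical host
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: example-svc   # hypothetical Service
            port:
              number: 80
```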

The fix is to either:

  • Modify your Ingress resources to set ingressClassName: ingress-nginx
  • Update the rke2-ingress-nginx chart values to set watchIngressWithoutClass: true
  • Upgrade to a newer release of RKE2. On newer releases, the watchIngressWithoutClass chart value is automatically set to true if ingress-nginx is the default ingress controller.
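
The chart-values option could look roughly like the following HelmChartConfig. This is a sketch that assumes the rke2-ingress-nginx chart exposes the value under the controller key, as the upstream ingress-nginx chart does. If you already have a HelmChartConfig named rke2-ingress-nginx, merge the key into its existing valuesContent rather than creating a second resource.

```yaml
apiVersion: helm.cattle.io/v1
kind: HelmChartConfig
metadata:
  name: rke2-ingress-nginx
  namespace: kube-system
spec:
  valuesContent: |-
    controller:
      # Assumption: same key path as the upstream ingress-nginx chart.
      # Makes the controller also handle Ingresses with no ingressClassName.
      watchIngressWithoutClass: true
```

For the first option, note that the IngressClass object shown earlier in this thread is named nginx, so it is worth confirming the name with kubectl get ingressclass before editing your Ingress resources.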

@mdrahman-suse
Contributor

Validated the fixes on the latest releases, closing this issue.
