
tidb-controller-manager panics if tidb-cluster is configured with AdditionalVolumeMounts #5011

Closed
sergiomcalzada opened this issue May 17, 2023 · 2 comments · Fixed by #5058

Comments

@sergiomcalzada

Bug Report

What version of Kubernetes are you using?

Client Version: v1.24.10
Kustomize Version: v4.5.4
Server Version: v1.24.10

What version of TiDB Operator are you using?

TiDB Operator Version: version.Info{GitVersion:"v1.4.4", GitCommit:"4b1c55c9aabf0410e276efcb460e37488f203b66", GitTreeState:"clean", BuildDate:"2023-03-13T06:41:33Z", GoVersion:"go1.19.7", Compiler:"gc", Platform:"linux/amd64"}

What storage classes exist in the Kubernetes cluster and what are used for PD/TiKV pods?

N/A

What's the status of the TiDB cluster pods?

The controller manager is in CrashLoopBackOff.

What did you do?

Updated the operator from 1.3.7 to 1.4.4

What did you expect to see?

The operator continues working.

What did you see instead?

The operator is in CrashLoopBackOff.

NOTE: Both the BackupSchedule and the TidbCluster have that additional volume tidb/tidb-backup-pvc mounted. There is a warning about matching the desired volume size, but we didn't change the size; in fact, no size is configured for it.

W0517 10:43:33.428815       1 phase.go:69] volume tidb/tidb-backup-pvc modification is not allowed: can't match desired volume
E0517 10:43:33.429089       1 runtime.go:78] Observed a panic: "invalid memory address or nil pointer dereference" (runtime error: invalid memory address or nil pointer dereference)
goroutine 650 [running]:
k8s.io/apimachinery/pkg/util/runtime.logPanic({0x2a88e40?, 0x48c9990})
	k8s.io/[email protected]/pkg/util/runtime/runtime.go:74 +0x99
k8s.io/apimachinery/pkg/util/runtime.HandleCrash({0x0, 0x0, 0xc000433f90?})
	k8s.io/[email protected]/pkg/util/runtime/runtime.go:48 +0x75
panic({0x2a88e40, 0x48c9990})
	runtime/panic.go:884 +0x212
github.com/pingcap/tidb-operator/pkg/manager/volumes.observeVolumeStatus({0x34df500, 0xc000641b90}, {0xc000b39500, 0x3, 0x4?}, {0xc001c33ef0, 0x1, 0x1})
	github.com/pingcap/tidb-operator/pkg/manager/volumes/sync_volume_status.go:85 +0x3ec
github.com/pingcap/tidb-operator/pkg/manager/volumes.SyncVolumeStatus({0x34df500, 0xc000641b90}, {0x34cb688, 0xc000432170}, 0xc0008d9100, {0x2e59b1b, 0x4})
	github.com/pingcap/tidb-operator/pkg/manager/volumes/sync_volume_status.go:49 +0x2a5
github.com/pingcap/tidb-operator/pkg/manager/member.(*tikvMemberManager).syncTiKVClusterStatus(0xc000673da0, 0xc0008d9100, 0xc00497c000)
	github.com/pingcap/tidb-operator/pkg/manager/member/tikv_member_manager.go:902 +0xd45
github.com/pingcap/tidb-operator/pkg/manager/member.(*tikvMemberManager).syncStatefulSetForTidbCluster(0xc000673da0, 0xc0008d9100)
	github.com/pingcap/tidb-operator/pkg/manager/member/tikv_member_manager.go:211 +0x33b
github.com/pingcap/tidb-operator/pkg/manager/member.(*tikvMemberManager).Sync(0xc000673da0, 0xc0008d9100)
	github.com/pingcap/tidb-operator/pkg/manager/member/tikv_member_manager.go:134 +0x2db
github.com/pingcap/tidb-operator/pkg/controller/tidbcluster.(*defaultTidbClusterControl).updateTidbCluster(0xc0002727e0, 0xc0008d9100)
	github.com/pingcap/tidb-operator/pkg/controller/tidbcluster/tidb_cluster_control.go:220 +0x1e7
github.com/pingcap/tidb-operator/pkg/controller/tidbcluster.(*defaultTidbClusterControl).UpdateTidbCluster(0xc0002727e0, 0xc0008d9100)
	github.com/pingcap/tidb-operator/pkg/controller/tidbcluster/tidb_cluster_control.go:114 +0xb0
github.com/pingcap/tidb-operator/pkg/controller/tidbcluster.(*Controller).syncTidbCluster(...)
	github.com/pingcap/tidb-operator/pkg/controller/tidbcluster/tidb_cluster_controller.go:166
github.com/pingcap/tidb-operator/pkg/controller/tidbcluster.(*Controller).sync(0xc000fa83c0, {0xc000ded270, 0x9})
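
Reading the trace, the panic is raised inside observeVolumeStatus (pkg/manager/volumes/sync_volume_status.go:85), right after the "can't match desired volume" warning for the additional volume tidb/tidb-backup-pvc. A plausible reconstruction of the failure, purely as a hedged sketch (every name below other than observeVolumeStatus is an illustrative assumption, not the operator's actual API), is that the status loop looks up the "desired" definition for each PVC, gets nil back for a PVC the operator does not manage, and then dereferences it:

// Hypothetical, simplified reconstruction of the failing pattern; the real
// logic lives in pkg/manager/volumes/sync_volume_status.go.
package volumes

import corev1 "k8s.io/api/core/v1"

// desiredVolume describes a volume the operator manages (e.g. the TiKV data
// volume or entries from storageVolumes). A PVC referenced only through
// additionalVolumes has no matching entry.
type desiredVolume struct {
	Name string
	Size corev1.ResourceList
}

// desiredVolumeFor returns nil when the PVC is not managed by the operator.
func desiredVolumeFor(desired []desiredVolume, pvc *corev1.PersistentVolumeClaim) *desiredVolume {
	for i := range desired {
		if desired[i].Name == pvc.Name {
			return &desired[i]
		}
	}
	return nil
}

// observeVolumeStatus aggregates per-volume status from the pods' PVCs. The
// nil check is the point of the sketch: without it, an additional volume such
// as tidb/tidb-backup-pvc makes the loop dereference a nil pointer and the
// controller-manager crashes, matching the stack trace above.
func observeVolumeStatus(pvcs []*corev1.PersistentVolumeClaim, desired []desiredVolume) map[string]int {
	bound := map[string]int{}
	for _, pvc := range pvcs {
		d := desiredVolumeFor(desired, pvc)
		if d == nil {
			continue // skip volumes the operator does not manage
		}
		if pvc.Status.Phase == corev1.ClaimBound {
			bound[d.Name]++
		}
	}
	return bound
}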

Some extracts from the config:

apiVersion: pingcap.com/v1alpha1
kind: TidbCluster
spec:
  tikv:
    additionalVolumeMounts:
    - mountPath: /backup
      name: backup
    additionalVolumes:
    - name: backup
      persistentVolumeClaim:
        claimName: tidb-backup-pvc
---
apiVersion: pingcap.com/v1alpha1
kind: BackupSchedule
metadata:
  name: tidb-backup-schedule
spec:  
  # https://docs.pingcap.com/tidb-in-kubernetes/stable/backup-restore-overview#backupschedule-cr-fields
  #pause: true
  maxBackups: 12
  #maxReservedTime: "168h" # one week
  #https://crontab.guru/
  #schedule: "* */2 * * *" # At every minute past every 2nd hour
  #schedule: "0 * * * *" # At minute 0
  #schedule: "0 */4 * * *" # At minute 0 past every 4th hour
  schedule: "0 */1 * * *" # every hour
  #schedule: "0 0 * * *" # At 00:00 every day

  backupTemplate:
    br:
      cluster: tidb
      clusterNamespace: ${namespace}
    cleanPolicy: Delete
    local:
      #prefix: backup
      volume:
        name: backup
        persistentVolumeClaim:
          claimName: tidb-backup-pvc
      volumeMount:
        name: backup
        mountPath: /backup    
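
For context on why the user-supplied PVC is seen by the operator at all: additionalVolumes and additionalVolumeMounts from the CR are copied into the generated TiKV pod spec, so tidb-backup-pvc ends up attached to the TiKV pods next to the operator-managed data volume, and the volume-status sync then walks over it. A minimal sketch of that wiring, with appendAdditionalVolumes being a hypothetical helper rather than the operator's real function:

package member

import corev1 "k8s.io/api/core/v1"

// appendAdditionalVolumes mirrors, in simplified form, what happens with
// spec.tikv.additionalVolumes and additionalVolumeMounts: the user-supplied
// volume (backed by tidb-backup-pvc above) is appended to the pod's volumes
// and mounted into a container at /backup.
func appendAdditionalVolumes(pod *corev1.PodSpec, vols []corev1.Volume, mounts []corev1.VolumeMount) {
	pod.Volumes = append(pod.Volumes, vols...)
	if len(pod.Containers) > 0 {
		// In the real operator the mounts go to the TiKV container; here we
		// simply use the first container of the pod spec.
		pod.Containers[0].VolumeMounts = append(pod.Containers[0].VolumeMounts, mounts...)
	}
}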
    
@csuzhangxc
Member

@sergiomcalzada thanks for your report. It seems to be caused by the volume-modify feature introduced in v1.4, which doesn't handle additionalVolumes correctly.

@liubog2008
Member

AdditionalVolumes are not managed by the operator, so we will not support modifying additional volumes. The panic bug will be fixed in #5058
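
The actual change in #5058 is not shown here, but the direction described above (leaving additional volumes untouched) could look roughly like the sketch below: filter out PVCs the operator does not manage before running the volume-modification and status logic. The managed-by label check is an assumption for illustration, not necessarily how the fix is implemented.

package volumes

import corev1 "k8s.io/api/core/v1"

// filterManagedPVCs keeps only PVCs that the operator itself created, so
// user-supplied additional volumes like tidb-backup-pvc never reach the
// volume-modification / status logic. The label key and value are assumed
// here for illustration.
func filterManagedPVCs(pvcs []*corev1.PersistentVolumeClaim) []*corev1.PersistentVolumeClaim {
	managed := make([]*corev1.PersistentVolumeClaim, 0, len(pvcs))
	for _, pvc := range pvcs {
		if pvc.Labels["app.kubernetes.io/managed-by"] == "tidb-operator" {
			managed = append(managed, pvc)
		}
	}
	return managed
}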
