Bug: index out range when collaset spec.replicas < partition #135

Closed
ColdsteelRail opened this issue Dec 20, 2023 · 2 comments · Fixed by #144
Labels: help wanted, kind/bug

@ColdsteelRail
Member

Minimal reproduce step

  • Reproduction: when a CollaSet is scaled down, spec.replicas can end up smaller than the partition value, which crashes the controller.
apiVersion: apps.kusionstack.io/v1alpha1
kind: CollaSet
metadata:
  name: server
  namespace: operating-tutorial
spec:
  replicas: 2 # scale down from 3 to 2
  selector:
    matchLabels:
      app: server
  updateStrategy:
    podUpgradePolicy: InPlaceIfPossible
    rollingUpdate:
      byPartition:
        partition: 3
  template:
    metadata:
      labels:
        app: server
    spec:
      containers:
      - image: wu8685/echo:1.3
        name: server
        command:
        - /server
        resources:
          limits:
            cpu: "0.1"
            ephemeral-storage: 100Mi
            memory: 100Mi
          requests:
            cpu: "0.1"
            ephemeral-storage: 100Mi
            memory: 100Mi
        readinessProbe:
          httpGet:
            path: /healthz
            port: 8080
          initialDelaySeconds: 5
          periodSeconds: 3

What did you expect to see?

  • The invalid spec is validated or rejected.
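
A minimal sketch of the kind of check this expectation describes, assuming the rule is simply "partition must not exceed replicas". The function name `validatePartition` and its signature are hypothetical; this is not the project's actual webhook code and not necessarily what the fix in #144 does.

```go
package main

import "fmt"

// validatePartition is a hypothetical validation helper (an assumption for
// illustration, not part of the operating codebase). It rejects a spec whose
// partition is negative or larger than its replicas.
func validatePartition(replicas, partition int32) error {
	if partition < 0 {
		return fmt.Errorf("partition must be non-negative, got %d", partition)
	}
	if partition > replicas {
		return fmt.Errorf("partition (%d) must not exceed replicas (%d)", partition, replicas)
	}
	return nil
}

func main() {
	// The values from the reproduction above: replicas 2, partition 3.
	if err := validatePartition(2, 3); err != nil {
		fmt.Println("rejected:", err)
	}
}
```

With a check like this, the manifest above would be refused at admission time instead of reaching the reconcile loop.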

What did you see instead

  • kusionstack-controller-manager crashes
NAMESPACE            NAME                                              READY   STATUS    RESTARTS        AGE
kusionstack-system   kusionstack-controller-manager-6b6db85868-mgxc2   0/1     Error     8 (5m24s ago)   22m
  • Key log output
panic: runtime error: slice bounds out of range [:3] with capacity 2

goroutine 328 [running]:
kusionstack.io/operating/pkg/controllers/collaset/synccontrol.decidePodToUpdateByPartition(0xc000158a00, {0xc00094d8c0?, 0x1b78b80?, 0x2})
	/home/runner/work/operating/operating/pkg/controllers/collaset/synccontrol/update.go:108 +0xb2
kusionstack.io/operating/pkg/controllers/collaset/synccontrol.decidePodToUpdate(0xc00094d850?, {0xc00094d8c0?, 0x2?, 0xc00094c790?})
	/home/runner/work/operating/operating/pkg/controllers/collaset/synccontrol/update.go:85 +0x4f
kusionstack.io/operating/pkg/controllers/collaset/synccontrol.(*RealSyncControl).Update(0xc0003e7e80, 0xc000158a00, {0xc00094d850, 0x2, 0x2}, {0xc00094c790, 0x2, 0x2}, 0xc000108f20, 0xc000745800, ...)
	/home/runner/work/operating/operating/pkg/controllers/collaset/synccontrol/sync_control.go:358 +0x193
kusionstack.io/operating/pkg/controllers/collaset.(*CollaSetReconciler).doSync(0xc000324a40, 0x58?, 0x7f21f1aeb5b8?, {0xc00094c790, 0x2, 0x2}, 0x0?)
	/home/runner/work/operating/operating/pkg/controllers/collaset/collaset_controller.go:193 +0x1b5
kusionstack.io/operating/pkg/controllers/collaset.(*CollaSetReconciler).DoReconcile(0xc000339260?, 0x1b88c18?, 0xc000158a00?, {0xc00094c790?, 0x8?, 0xc000860350?}, 0x0?)
	/home/runner/work/operating/operating/pkg/controllers/collaset/collaset_controller.go:174 +0x2b
kusionstack.io/operating/pkg/controllers/collaset.(*CollaSetReconciler).Reconcile(0xc000324a40, {0x1b775b8, 0xc000652b10}, {{{0xc000846168?, 0x186e6a0?}, {0xc000860350?, 0x281b890?}}})
	/home/runner/work/operating/operating/pkg/controllers/collaset/collaset_controller.go:164 +0x63b
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile(0xc000000fa0, {0x1b775b8, 0xc000652ab0}, {{{0xc000846168?, 0x186e6a0?}, {0xc000860350?, 0xc000628380?}}})
	/home/runner/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:114 +0x22c
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler(0xc000000fa0, {0x1b77510, 0xc0003165c0}, {0x174b160?, 0xc00080fb20?})
	/home/runner/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:311 +0x2f2
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem(0xc000000fa0, {0x1b77510, 0xc0003165c0})
	/home/runner/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:266 +0x1d9
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2()
	/home/runner/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:227 +0x85
created by sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2
	/home/runner/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:223 +0x30c
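
The panic message says a slice with capacity 2 was re-sliced as `[:3]`. In Go, the upper bound of a slice expression on a slice may not exceed its capacity, so this fails at runtime. The sketch below shows one defensive pattern: clamp the bound to the slice length before slicing. `clampedPrefix` is a hypothetical helper for illustration; the real `decidePodToUpdateByPartition` is not shown in this issue, and which pods it is meant to select is not assumed here.

```go
package main

import "fmt"

// clampedPrefix is a hypothetical helper (not from the operating codebase).
// It returns items[:n] with n clamped to the range [0, len(items)], so a
// partition larger than the number of pods cannot cause a bounds panic.
func clampedPrefix(items []string, n int) []string {
	if n < 0 {
		n = 0
	}
	if n > len(items) {
		n = len(items)
	}
	return items[:n]
}

func main() {
	pods := []string{"pod-a", "pod-b"} // 2 pods after scaling down
	partition := 3                     // partition left at the old value

	// pods[:partition] would panic here: slice bounds out of range.
	fmt.Println(clampedPrefix(pods, partition))
}
```

Clamping keeps the controller alive, while validating the spec up front would surface the misconfiguration to the user; the two approaches are not mutually exclusive.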

What is your KusionStack components and its version?

operating version: latest
Kubernetes version: v1.27.3

@ColdsteelRail ColdsteelRail added the kind/bug Something isn't working label Dec 20, 2023
@ColdsteelRail ColdsteelRail changed the title Bug: Panic when Collaset partition is greater than Replicas, crashing kusion-ontroller-manager Bug: Panic when Collaset partition is greater than Replicas, crashing kusion-controller-manager Dec 20, 2023
@ColdsteelRail ColdsteelRail changed the title Bug: Panic when Collaset partition is greater than Replicas, crashing kusion-controller-manager Bug: Panic: Collaset mistakenly scaled down so that Replicas is less than partition, index out of range Dec 20, 2023
@wu8685 wu8685 added the help wanted Extra attention is needed label Dec 28, 2023
@ColdsteelRail ColdsteelRail changed the title Bug: Panic: Collaset误缩容Replicas值小于partition值,数组越界 Bug: index out range when collaset spec.replicas < partition Jan 2, 2024
@ColdsteelRail
Member Author

thanks for your responses, could you please assign this issue to @ColdsteelRail, appreciate

@wu8685
Collaborator

wu8685 commented Jan 2, 2024

thanks for your responses, could you please assign this issue to @ColdsteelRail, appreciate

Assigned. Thanks for your contribution.
