
Releases: libopenstorage/stork

v24.3.0

11 Sep 09:39

Enhancements

  • Stork now supports partial backups. If the backup fails for any of the PVCs, the successful backups of the other PVCs are still saved, and the status is displayed as partial success. #1716
    Note: A partial backup requires at least one successful PVC backup.
  • Updated golang, aws-iam-authenticator, google-cloud-cli, and google-cloud-sdk versions to resolve security vulnerabilities. #1804 #1807

Bug Fix

  • Issue: In a Synchronous DR setup, when you perform a failover operation using the storkctl perform failover command, the witness node might be deactivated instead of the source cluster.
    User Impact: After failover, the source cluster might remain in an active state, and the PX volumes can still be mounted and used from the source cluster.
    Resolution: The source cluster is now deactivated by default after failover, and the witness node remains unaffected. #1829

24.2.5

05 Aug 08:12

Bug Fix

  • Issue: Strong hyperconvergence for pods did not work when the stork.libopenstorage.org/preferLocalNodeOnly annotation was used.
    User Impact: Pods remained in a pending state.
    Resolution: When the stork.libopenstorage.org/preferLocalNodeOnly annotation is used, pods are now scheduled on the node where the volume replica resides, and strong hyperconvergence works as expected. #1818
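
For reference, a minimal sketch of applying this annotation, assuming it is set on the pod's metadata and that Stork is the pod's scheduler; the pod name, image, and PVC name are illustrative:

    apiVersion: v1
    kind: Pod
    metadata:
      name: pg-local                      # illustrative name
      annotations:
        # Ask Stork to schedule this pod only on a node that holds a
        # replica of its Portworx volume (strong hyperconvergence).
        stork.libopenstorage.org/preferLocalNodeOnly: "true"
    spec:
      schedulerName: stork                # Stork must be the pod's scheduler
      containers:
        - name: postgres
          image: postgres:15
          volumeMounts:
            - name: data
              mountPath: /var/lib/postgresql/data
      volumes:
        - name: data
          persistentVolumeClaim:
            claimName: pg-data-pvc        # assumes an existing Portworx-backed PVC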

24.2.4

17 Jul 01:07

Bug Fix

  • Issue: During an OCP upgrade in a 3-node cluster, the MutatingWebhookConfiguration stork-webhooks-cfg is deleted if the leader Stork pod is evicted.
    User Impact: Applications that require Stork as the scheduler will experience disruptions, and OCP upgrades will get stuck on a 3-node cluster.
    Resolution: The MutatingWebhookConfiguration is now created after the leader election, ensuring stork-webhooks-cfg is always available. #1810
    Affected Versions: All

24.2.3

02 Jul 23:58

Note: For users currently on Stork versions 24.2.0, 24.2.1, or 24.2.2, Portworx by Pure Storage recommends upgrading to Stork 24.2.3.

Bug Fix

  • Issue: If the VolumeSnapshotSchedule has more status entries than the retain policy limit, Stork may continue creating new VolumeSnapshots, ignoring the retain policy. This can happen if the retain limit was lowered or if there was an error during snapshot creation.
    User Impact: Users saw more VolumeSnapshots than their retain policy was configured to allow.
    Resolution: Upgrade to Stork version 24.2.3. #1800
    Note: This fix doesn’t clean up the snapshots that were created before the upgrade. If required, you need to delete the old snapshots manually.
    Affected Versions: 24.2.0, 24.2.1, and 24.2.2.
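
For context, a minimal sketch of a snapshot schedule whose retain policy caps the number of snapshots kept; the names, time, and retain count are illustrative:

    apiVersion: stork.libopenstorage.org/v1alpha1
    kind: SchedulePolicy
    metadata:
      name: daily-policy                  # illustrative name
    policy:
      daily:
        time: "10:00PM"
        retain: 3                         # keep at most 3 snapshots; the bug above could exceed this
    ---
    apiVersion: stork.libopenstorage.org/v1alpha1
    kind: VolumeSnapshotSchedule
    metadata:
      name: mysql-snap-schedule           # illustrative name
      namespace: mysql
    spec:
      schedulePolicyName: daily-policy    # references the policy above
      template:
        spec:
          persistentVolumeClaimName: mysql-data   # assumes an existing PVC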

24.2.2

14 Jun 04:32

Enhancement

  • Stork now uses the shared informer cache event handling mechanism instead of the watch API to reschedule unhealthy pods that are using Portworx volumes. #1795

24.2.1

06 Jun 16:13

Enhancement

  • Stork now supports the Azure China environment for Azure backup locations. For more information, see Add Azure backup location.

Bug Fixes

  • Issue: If you were running Portworx Backup version 2.6.0 and upgraded Stork to version 24.1.0, selecting the default VSC in the Create Backup window resulted in a VSC Not Found error.
    User Impact: Users experienced failures during backup operations.
    Resolution: You can now choose the default VSC in the Create Backup window and create successful backups. #1744

  • Issue: If you deployed Portworx Enterprise with PX-Security enabled, took a backup to an NFS backup location, and then restored it, the restore failed.
    User Impact: Users were unable to restore backups from the NFS backup location for PX-Security-enabled Portworx volumes.
    Resolution: This issue is now fixed. #1733

24.2.0

29 May 06:16

Enhancements

  • Enhanced Disaster Recovery User Experience

    This Stork release significantly improves the user experience of performing failover and failback operations, simplifying the process while ensuring efficiency and reliability.

    Now, you can perform a failover or failback operation using the following storkctl commands:

    • To perform a failover operation, use the following command:
      storkctl perform failover -m <migration-schedule> -n <migration-schedule-namespace>
    • To perform a failback operation, use the following command:
      storkctl perform failback -m <migration-schedule> -n <migration-schedule-namespace>

    For more information on the enhanced approach, refer to the Stork documentation.

  • The Portworx driver has been updated to optimize its API calls, reducing the time taken to schedule pods and to monitor pods that need rescheduling when Portworx is down on a node.

Bug Fixes

  • Issue: Migration schedules in the admin namespace were updated with true or false for the applicationActivated field when a namespace was activated or deactivated, even if they did not migrate that namespace.
    User Impact: Unrelated migration schedules were getting suspended.
    Resolution: Stork now updates the applicationActivated field only for migration schedules that are migrating at least one of the namespaces being activated or deactivated. #1718

  • Issue: Updating the VolumeSnapshotSchedule resulted in a version mismatch error from Kubernetes when the update was applied to a previous version of the resource.
    User Impact: When the number of VolumeSnapshotSchedules is high, Stork logs are flooded with these warning messages.
    Resolution: The VolumeSnapshotSchedule update now uses a patch to avoid the version mismatch error. #1665

  • Issue: Identical volume snapshot names were created when VolumeSnapshotSchedule frequencies matched and trimming produced identical substrings.
    User Impact: For one volume, a snapshot might not be taken but could still be marked as successful.
    Resolution: A four-digit random suffix is now added to names to avoid collisions between VolumeSnapshots resulting from different VolumeSnapshotSchedules. #1686

  • Issue: Stork relies on Kubernetes DNS to locate services, but it assumed the .svc.cluster.local domain for Kubernetes services.
    User Impact: Clusters with a modified Kubernetes DNS domain were not able to use Stork.
    Resolution: Stork now works on clusters with a modified Kubernetes DNS domain. #1629

  • Issue: Resource transformation for custom resources (CRs) was not supported.
    User Impact: This blocked some of the necessary transformations for resources that were required at the destination site.
    Resolution: Resource transformation for CRs is now supported. #1705

Known Issues

  • Issue: If you use the storkctl perform failover command to perform a failover operation, Stork might not be able to scale down the KubeVirt pods, which could cause the operation to fail.
    Workaround: Perform the failover operation by following the manual procedure in the Stork disaster recovery documentation.

24.1.0

20 May 08:12

Enhancements

  • Stork now supports KubeVirt VMs for Portworx backup and restore operations. You can initiate VM-specific backups by setting backupObjectType to VirtualMachine. Stork automatically includes associated resources, such as PVCs, Secrets, and ConfigMaps used as volumes or user data, in VM backups. Stork also applies default freeze/thaw rules during VM backup operations to ensure consistent filesystem backups.
  • Cloud Native backups will now automatically default to CSI or KDMP with LocalSnapshot, depending on the type of schedules they create.
  • Previously in Stork, for CSI backups, you were limited to selecting a single VSC from the dropdown under the CSISnapshotClassName field. Now you can select a VSC for each provisioner via the CSISnapshotClassMap (see the sketch after this list).
  • Now, the creation of a default VSC from Stork is optional.
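
A minimal sketch of per-provisioner VSC selection, assuming the mapping is expressed through a csiSnapshotClassMap field on the ApplicationBackup spec; the field name, provisioners, and class names here are illustrative assumptions:

    apiVersion: stork.libopenstorage.org/v1alpha1
    kind: ApplicationBackup
    metadata:
      name: multi-provisioner-backup      # illustrative name
      namespace: default
    spec:
      backupLocation: my-backup-location  # assumes an existing BackupLocation
      namespaces:
        - default
      # Assumed shape of the CSISnapshotClassMap: one VolumeSnapshotClass
      # per CSI provisioner, instead of a single CSISnapshotClassName.
      csiSnapshotClassMap:
        pxd.portworx.com: px-csi-snapclass
        ebs.csi.aws.com: ebs-csi-snapclass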

Bug Fixes

  • Issue: Canceling an ongoing backup initiated by PX-Backup halted the post-execution rule.
    User Impact: This interruption could stop the I/O processes on the application or prevent the post-execution rule from completing.
    Resolution: Stork now executes and removes the post-execution rule CR as part of the cleanup procedure for the application backup CR. #1602

  • Issue: Generic KDMP backup/restore pods became unresponsive in environments where Istio is enabled.
    User Impact: Generic KDMP backup and restore failed in Istio-enabled environments.
    Resolution: Relaxed the Istio webhook checks for the Stork-created KDMP generic backup/restore pods. Additionally, the underlying issue causing job pod freezes has been resolved in Kubernetes version 1.28 and Istio version 1.19. #1623

23.11.0

22 Jan 04:23

New Features

  • You can now create and delete schedule policies and migration schedules using the new storkctl CLI feature. This enables you to seamlessly create and delete SchedulePolicy and MigrationSchedule resources, enhancing the DR setup process.
    In addition to the existing support for clusterPairs, you can now manage all necessary DR resources through storkctl. This ensures a faster and simpler setup process with built-in validations; by eliminating manual YAML file edits, it significantly reduces the likelihood of errors and provides a more robust, user-friendly experience for managing DR resources in Kubernetes clusters.

  • The new StorageClass parameter preferRemoteNode enhances scheduling flexibility for SharedV4 service volumes. By setting this parameter to false, you can now disable anti-hyperconvergence during scheduling, giving you increased flexibility to tailor Stork's scheduling behavior to your specific application needs (see the sketch after this list).
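
A minimal sketch of a StorageClass using this parameter; the exact parameter key is an assumption based on the parameter name above, and the class name and replication settings are illustrative:

    apiVersion: storage.k8s.io/v1
    kind: StorageClass
    metadata:
      name: px-sharedv4-svc               # illustrative name
    provisioner: pxd.portworx.com
    parameters:
      repl: "2"
      sharedv4: "true"
      sharedv4_svc_type: "ClusterIP"      # SharedV4 service volume
      # Assumed key for the new parameter: set to "false" to disable
      # anti-hyperconvergence so pods may land on replica nodes.
      stork.libopenstorage.org/preferRemoteNode: "false"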

Enhancement

  • Updated golang and google-cloud-sdk versions to resolve security vulnerabilities. #1587, #1588

Bug Fixes

  • Issue: Excluding Kubernetes resources such as Deployments, StatefulSets, and so on during migration was not successful.
    User Impact: Using labels to exclude resources proved ineffective when the resource was managed by an operator that reset user-defined labels.
    Resolution: The new excludeResourceTypes feature lets users exclude certain types of resources from migration, providing a more effective solution than labels (see the sketch after this list). #1554

  • Issue: ApplicationRestores created using storkctl always restored to a namespace with the same name as the source, so users could not restore to a different namespace.
    User Impact: Users were unable to restore applications to any namespace other than one with the same name as the source.
    Resolution: storkctl now accepts a namespace mapping as a parameter, allowing users to restore to a different namespace as needed. #1545

  • Issue: The storkctl create clusterpair command was not functioning properly with HTTPS PX endpoints.
    User Impact: Migrations between clusters with SSL-enabled PX endpoints were not successful.
    Resolution: The issue has been addressed, and now both HTTPS and HTTP endpoints are accepted as source (src-ep) and destination (dest-ep) when using storkctl create clusterpair. #1537

  • Issue: The PostgreSQL operator generated an error about the pre-existence of the service account, role, and role bindings following a migration.
    User Impact: Users were unable to scale up a PostgreSQL application installed via OpenShift Operator Hub after completing the migration.
    Resolution: Service accounts, roles, and role bindings with an owner reference set are now excluded from migration, allowing PostgreSQL pods to come up successfully. #1560

  • Issue: A ResourceTransformation custom resource (RT CR) entered a failed state when a transform rule included either int or bool as a data type.
    User Impact: Migrations involving resource transformation would not succeed.
    Resolution: Resolved the parsing problem associated with int and bool types. #1532

  • Issue: Continuous crashes occurred in Stork pods when the cluster contained an RT CR with a rule type set as slice and the operation set as add.
    User Impact: The Stork service experienced ongoing disruptions.
    Resolution: Type assertion is now used to prevent the panic, and the problematic SetNestedField method was replaced with SetNestedStringSlice to avoid panics in such scenarios. You can also temporarily resolve the problem by removing the RT CR from the application cluster. #1530

  • Issue: Stork crashes when attempting to clone an application with CSI volumes using Portworx.
    User Impact: Users are unable to clone applications if PVCs in the namespaces utilize Portworx CSI volumes.
    Resolution: Now, a patch is included to manage CSI volumes with Portworx, which ensures the stability of application cloning functionality. #1591

  • Issue: When setting up a migration schedule in the admin namespace with pre/post-execution rules, these rules must be established in both the admin namespace and every namespace undergoing migration.
    User Impact: The user experience is less intuitive as it requires creating identical rules across multiple namespaces.
    Resolution: The process is now simplified as rules only require addition within the migration schedule's namespace. #1569

  • Issue: Stork was not honoring locator volume labels correctly when scheduling pods.
    User Impact: In cases where preferRemoteNodeOnly was initially set to true, pods sometimes failed to schedule. This issue was particularly noticeable when the Portworx volume setting preferRemoteNodeOnly was later changed to false, and there were no remote nodes available for scheduling.
    Resolution: Now, even in scenarios where remote nodes are not available for scheduling, pods can be successfully scheduled on a node that holds a replica of the volume. #1606
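
Referenced from the excludeResourceTypes fix above, a minimal sketch of excluding resource types from a migration; the placement of the field inside the MigrationSchedule template is an assumption, and all names are illustrative:

    apiVersion: stork.libopenstorage.org/v1alpha1
    kind: MigrationSchedule
    metadata:
      name: nightly-migration             # illustrative name
      namespace: migrationnamespace
    spec:
      schedulePolicyName: nightly-policy  # assumes an existing SchedulePolicy
      template:
        spec:
          clusterPair: remotecluster      # assumes an existing ClusterPair
          namespaces:
            - app-ns
          # Assumed usage: skip these resource types entirely during
          # migration, instead of relying on labels an operator may reset.
          excludeResourceTypes:
            - Deployment
            - StatefulSet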

Known Issues

  • Issue: In Portworx version 3.0.4, several migration tests fail in an auth-enabled environment.
    User Impact: You may experience failed migrations, which will impact data transfer and management processes.
    Resolution: The issue has been resolved in Portworx version 3.1.0. Users experiencing this problem are advised to upgrade to version 3.1.0 to ensure smooth migration operations and avoid permission-related errors during data migration processes.

  • Issue: When using the storkctl create clusterpair command, HTTPS endpoints for Portworx were not functioning properly.
    User Impact: This issue occurs when you attempt migrations between clusters where PX endpoints are secured with SSL; migrations could not be carried out successfully in environments using secure HTTPS connections.
    Resolution: In the upcoming Portworx 3.1.0 release, the storkctl create clusterpair command will be updated to accept both HTTP and HTTPS endpoints, allowing either src-ep or dest-ep to be specified with the appropriate scheme. This ensures successful cluster pairing and migration in environments with SSL-secured PX endpoints.

23.9.1

15 Dec 14:21

Bug Fixes

  • Issue: The generic backup of some PVCs in KDMP was failing due to the inclusion of certain read-only directories and files.
    User Impact: Difficulties restoring the snapshot, as restoring these read-only directories and files resulted in permission denied errors.
    Resolution: Introduced the --ignore-file option in KDMP backup, enabling you to specify a list of files and directories to exclude during snapshot creation. These excluded files and directories will not be restored during restoration (see the ConfigMap sketch after this list). #1572

    Format for adding the ignore file list:

    KDMP_EXCLUDE_FILE_LIST: |
        <storageClassName1>=<dir-list>,<file-list1>,....
        <storageClassName2>=<dir-list>,<file-list1>,....
    

    Sample for adding the ignore file list:

    KDMP_EXCLUDE_FILE_LIST: |
        px-db=dir1,file1,dir2
        mysql=dir1,file1,dir2
    
  • Issue: The backup process did not terminate when an invalid post-execution rule was applied, leading to occasional failures in updating the failure status on the application backup CR.
    User Impact: Backups with invalid post-execution rules were not failing as expected.
    Resolution: Implemented a thorough check to ensure that backups with invalid post-execution rules are appropriately marked as failed, accompanied by a clear error message. #1582
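
Referenced from the --ignore-file fix above, a minimal sketch of where the exclude list is configured, assuming the entry lives in the kdmp-config ConfigMap in the kube-system namespace; the storage class names and paths are illustrative:

    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: kdmp-config                   # assumed ConfigMap read by KDMP backup jobs
      namespace: kube-system
    data:
      # One line per storage class: exclude these directories/files
      # (relative to the volume root) from generic KDMP backups.
      KDMP_EXCLUDE_FILE_LIST: |
        px-db=dir1,file1,dir2
        mysql=dir1,file1,dir2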