Unable to enable cluster mode #212

Open
rymancl opened this issue Dec 12, 2023 · 5 comments
Labels
bug 🐛 An issue with the system

Comments

@rymancl

rymancl commented Dec 12, 2023

Describe the Bug

I am trying to switch an existing cluster with cluster_mode_enabled = false (default) to cluster_mode_enabled = true.
As part of this change, I have also made the following changes:

  • remove cluster_size
  • set automatic_failover_enabled = true
  • set cluster_mode_num_node_groups = 1
  • set cluster_mode_replicas_per_node_group = 1
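For reference, the changes listed above correspond roughly to the module inputs below. This is only a sketch: the module `source` is an assumption (the issue does not state it), and required inputs such as naming, networking, and instance type are omitted.

```hcl
module "shared-redis" {
  # Assumption: registry source inferred from the variable names, not stated in the issue.
  source  = "cloudposse/elasticache-redis/aws"
  version = "0.53.0"

  # ... naming, networking, and instance inputs omitted ...

  family = "redis7"

  # cluster_size has been removed from the inputs.
  cluster_mode_enabled                 = true # previously false (the default)
  automatic_failover_enabled           = true
  cluster_mode_num_node_groups         = 1
  cluster_mode_replicas_per_node_group = 1
}
```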

Proposed plan:
Screenshot 2023-12-12 at 12 22 56 PM

Apply error:
Screenshot 2023-12-12 at 12 26 38 PM

module.shared-redis.aws_elasticache_parameter_group.default[0]: Modifying... [id=<redacted>-uw2-shared]
╷
│ Error: modifying ElastiCache Parameter Group: InvalidParameterValue: The parameter cluster-enabled cannot be modified.
│       status code: 400, request id: <redacted>
│ 
│   with module.shared-redis.aws_elasticache_parameter_group.default[0],
│   on .terraform/modules/shared-redis/main.tf line 91, in resource "aws_elasticache_parameter_group" "default":
│   91: resource "aws_elasticache_parameter_group" "default" {
│ 
╵

According to AWS, this parameter cannot be modified:
Screenshot 2023-12-12 at 12 28 14 PM

Expected Behavior

Successfully enable cluster mode on Redis

Steps to Reproduce

See above

Screenshots

No response

Environment

  • OSX Sonoma 14.2
  • Terraform v1.6.5 on darwin_arm64
  • AWS provider version 5.30.0
  • Module version 0.53.0
  • Redis engine version 7.0.7

Additional Context

No response

@rymancl rymancl added the bug 🐛 An issue with the system label Dec 12, 2023
@kevcube
Contributor

kevcube commented Dec 12, 2023

@rymancl it looks like AWS disallows changing of this parameter. I suspect the only way to do this is to delete the cluster and restore from a snapshot with cluster mode enabled.

@kevcube
Contributor

kevcube commented Dec 12, 2023

@rymancl or at least you cannot change it manually. Maybe if you add another instance to the "cluster", AWS will change that parameter to true behind the scenes.

@rymancl
Author

rymancl commented Dec 12, 2023

> @rymancl it looks like AWS disallows changing of this parameter. I suspect the only way to do this is to delete the cluster and restore from a snapshot with cluster mode enabled.

That's true, but as of Redis 7.0, AWS allows an in-place move from cluster mode disabled to cluster mode enabled.

Docs here

> Beginning with Redis 7, ElastiCache for Redis supports switching between Redis (cluster mode disabled) and Redis (cluster mode enabled).

I believe the bug is that the module tries to set that parameter when it shouldn't. I assume AWS will set it for you.

@kevcube
Contributor

kevcube commented Dec 12, 2023

Ah I see. Try a targeted apply (assuming these are dev/sandbox resources) to module.shared-redis.aws_elasticache_replication_group.default[0] then see if AWS changes that parameter. Terraform should pick it up before the next plan and won't attempt to change it again.

Maybe the ideal solution would be for this module to not set it at all and assume that AWS manages it, depending on the outcome of the above.
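As a minimal sketch of that idea (not the module's actual code), a parameter group could pass through only user-supplied parameters and drop any `cluster-enabled` entry, leaving that setting to ElastiCache. The resource name, group name, and the `user_parameters` local are hypothetical.

```hcl
variable "parameter" {
  description = "User-supplied Redis parameters"
  type = list(object({
    name  = string
    value = string
  }))
  default = []
}

locals {
  # Hypothetical helper: never manage cluster-enabled from Terraform.
  user_parameters = [for p in var.parameter : p if p.name != "cluster-enabled"]
}

resource "aws_elasticache_parameter_group" "example" {
  name   = "example-redis7" # illustrative name
  family = "redis7"

  dynamic "parameter" {
    for_each = local.user_parameters
    content {
      name  = parameter.value.name
      value = tostring(parameter.value.value)
    }
  }
}
```

Whether AWS then flips the parameter on its own during a cluster mode change is exactly what this thread is trying to establish, so treat this as an experiment rather than a fix.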

@rymancl
Author

rymancl commented Dec 12, 2023

I tested by modifying the piece of module code that sets the parameter, so that it only passes through var.parameter:

  dynamic "parameter" {
    for_each = var.parameter
    content {
      name  = parameter.value.name
      value = tostring(parameter.value.value)
    }
  }

This allowed the apply to get past the parameter group, but it then failed with the following error:

module.shared-redis.aws_elasticache_replication_group.default[0]: Modifying... [id=<redacted>-uw2-shared]
module.shared-redis.aws_elasticache_replication_group.default[0]: Still modifying... [id=<redacted>-uw2-shared, 10s elapsed]
module.shared-redis.aws_elasticache_replication_group.default[0]: Still modifying... [id=<redacted>-uw2-shared, 20s elapsed]
module.shared-redis.aws_elasticache_replication_group.default[0]: Still modifying... [id=<redacted>-uw2-shared, 30s elapsed]
╷
│ Error: updating ElastiCache Replication Group (<redacted>-uw2-shared): InvalidCacheClusterState: Cache cluster '<redacted>-uw2-shared-002' is not in available state.
│       status code: 400, request id: <redacted>
│ 
│   with module.shared-redis.aws_elasticache_replication_group.default[0],
│   on .terraform/modules/shared-redis/main.tf line 115, in resource "aws_elasticache_replication_group" "default":
│  115: resource "aws_elasticache_replication_group" "default" {
│ 

I believe this error occurred because my existing cluster had only one node, so cluster mode could not be enabled until another node was brought up.
I had to wait for the new node (002) to become available.
Screenshot 2023-12-12 at 12 47 20 PM

After it was available, I applied again.

The aws_elasticache_replication_group changes succeeded, but the aws_route53_record update failed. My initial plan screenshot already showed why: the plan set the record value to an empty string, because the replication group did not have a configuration_endpoint_address yet.

Error: creating Route 53 Record: InvalidChangeBatch: [Invalid Resource Record: 'FATAL problem: DomainNameEmpty (Domain name is empty) encountered with ''', Unparseable CNAME encountered]
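One way to avoid handing Route 53 an empty value is to fall back to another endpoint when the configuration endpoint is not populated yet. The sketch below is an assumption about how that could be written, not how the module does it; the two variables stand in for the replication group's `configuration_endpoint_address` and `primary_endpoint_address` attributes.

```hcl
variable "configuration_endpoint_address" {
  type    = string
  default = "" # empty until cluster mode is actually enabled
}

variable "primary_endpoint_address" {
  type    = string
  default = ""
}

locals {
  # coalesce returns the first argument that is neither null nor "".
  # It raises an error if every argument is empty, so try() falls back to null.
  redis_endpoint = try(
    coalesce(var.configuration_endpoint_address, var.primary_endpoint_address),
    null
  )
}

output "redis_endpoint" {
  value = local.redis_endpoint
}
```

With real resource attributes these values can be unknown at plan time, so a record that is conditionally created from them may need additional handling.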

Regardless, the cluster is still showing cluster mode disabled, and has no configuration endpoint.

I tried setting cluster_mode_num_node_groups to 2 instead of 1 per the TF sample.
Screenshot 2023-12-12 at 1 07 07 PM

This apply errored with:

Error: modifying ElastiCache Replication Group (greenstreet-dev-gs-redis-d-uw2-shared) shard configuration: modifying ElastiCache Replication Group shard configuration: InvalidParameterValue: Operation is only applicable for cluster mode enabled replication groups.

This made me curious, so I tried via the console. I noticed that there is an intermediate step, called "compatible", that you must take before moving to cluster mode enabled.
image
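Expressed against the raw resource, the intermediate step might look like the sketch below. It assumes your AWS provider version exposes a `cluster_mode` argument on `aws_elasticache_replication_group` (the 5.30.0 provider used in this issue may not), and every name and size in it is illustrative.

```hcl
variable "cluster_mode" {
  description = "Apply once with \"compatible\", then again with \"enabled\""
  type        = string
  default     = "compatible"
}

resource "aws_elasticache_replication_group" "example" {
  replication_group_id = "example-shared" # illustrative
  description          = "Illustrative replication group"

  engine         = "redis"
  engine_version = "7.0"
  node_type      = "cache.t4g.small" # illustrative

  automatic_failover_enabled = true
  num_node_groups            = 1
  replicas_per_node_group    = 1

  # Assumption: the cluster-on parameter group is switched in the same step
  # as the move to "compatible"; verify against the AWS guide before applying.
  cluster_mode         = var.cluster_mode
  parameter_group_name = "default.redis7.cluster.on"
}
```

The two-stage variable keeps the transition explicit: the second apply only changes `cluster_mode` from "compatible" to "enabled".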

Interesting. Before trying this, I wanted to play with the parameter group more.

I went back to the TF docs and found this note:
image

Interesting.

I double-checked that I was setting a redis7 family, then looked at the default.redis7 parameter group. To my surprise, cluster-enabled is modifiable there.
image
So why isn't it modifiable in the one the module creates?

I undid my code changes to the parameter group and upgraded the module to v1.0.0 so that I could try setting the new create_parameter_group = false and parameter_group_name = "default.redis7.cluster.on".

The plan for this showed the custom parameter group being destroyed and the replication group being updated to use the parameter group with cluster mode on. (The empty DNS record issue was still present at this point.)

This apply errored because the parameter group was still in use: Terraform tries to delete it before updating the replication group.

Error: deleting ElastiCache Parameter Group (<redacted>-uw2-shared): InvalidCacheParameterGroupState: One or more cache clusters are still members of this parameter group <redacted>-uw2-shared, so the group cannot be deleted.
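For context, the v1.0.0 inputs that produced this plan would look roughly like the fragment below. As before, the module `source` is an assumption and unrelated inputs are omitted.

```hcl
module "shared-redis" {
  # Assumption: registry source is not stated in the issue.
  source  = "cloudposse/elasticache-redis/aws"
  version = "1.0.0"

  # ... naming, networking, and instance inputs omitted ...

  family               = "redis7"
  cluster_mode_enabled = true

  # Use the AWS-managed group instead of a module-created one.
  create_parameter_group = false
  parameter_group_name   = "default.redis7.cluster.on"
}
```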

At this point, I was unsure whether I could proceed without manual intervention, so I tried to update the parameter group manually. That failed with:

The parameter cluster-enabled has a different value in the requested parameter group than the current parameter group. This parameter value cannot be changed for a cache cluster.

Screenshot 2023-12-12 at 2 09 28 PM

At this point, I'm not sure how to proceed with the module.
I can try to set the mode to "compatible" manually, but I want to see if anyone has any other input before I go that route.

Other docs: https://docs.aws.amazon.com/AmazonElastiCache/latest/red-ug/modify-cluster-mode.html
