
Support IPv6 based clusters #6803

Closed
EKrol2 opened this issue Mar 8, 2023 · 7 comments
Labels
kind/bug Categorizes issue or PR as related to a bug.

Comments


EKrol2 commented Mar 8, 2023

I have 2 EKS clusters: Cluster 1 created with a simple eksctl create cluster, and Cluster 2 created with Terraform, which is quite a bit more involved (and therefore hard to reproduce).

Furthermore, I have a topic on an Apache Kafka cluster with TLS enabled.

I set up Knative Eventing on both clusters, created a KafkaSource connecting to a topic on the Kafka cluster, and an event display (a Knative Service) as a sink.
Note that the KafkaSource and the event display are deployed in the knative-eventing namespace, along with the secrets required for the TLS connection.

On both clusters all components (kafka-eventing, KafkaSource, routing, etc.) report Ready status.

Cluster 2 has additional resources, mostly the EKS add-ons:
Amazon EBS CSI driver
Amazon VPC CNI
CoreDNS
Kube-proxy

Finally, Cluster 1 uses the AL2_x86_64 AMI type, whereas Cluster 2 uses BOTTLEROCKET_x86_64.

Expected Behavior

Messages sent to the Kafka topic should be displayed by the event display on both clusters.

Actual Behavior

On Cluster 1, messages that I send to the Kafka topic are displayed as expected.
On Cluster 2 (despite kafka-eventing, the KafkaSource, routing, etc. reporting Ready), no messages arrive.

On Cluster 2, the kafka-source-dispatcher logs the following (cleaned-up) error:

... Failed to reconcile [onNewEgress] egress id=60506aa6-94ef-46ee-8cfd-8ed83b9f4260 consumerGroup=<my-consumergroup-name>destination=http://event-display-knative.knative-eventing.svc.cluster.local ... "stack_trace":"io.fabric8.kubernetes.client.KubernetesClientException: Operation: [get] for kind: [Secret] with name: [cacert] in namespace: [knative-eventing] failed.\n\tat io.fabric8.kubernetes.client.KubernetesClientException.launderThrowable(KubernetesClientException.java:130)\n\tat io.fabric8.kubernetes.client.dsl.internal.BaseOperation.getMandatory(BaseOperation.java:177)\n\tat io.fabric8.kubernetes.client.dsl.internal.BaseOperation.get(BaseOperation.java:139)\n\tat io.fabric8.kubernetes.client.dsl.internal.BaseOperation.get(BaseOperation.java:88)\n\tat dev.knative.eventing.kafka.broker.core.security.KubernetesAuthProvider.getSecretFromKubernetes(KubernetesAuthProvider.java:107)\n\tat dev.knative.eventing.kafka.broker.core.security.KubernetesAuthProvider.secretDataOf(KubernetesAuthProvider.java:79)\n\tat dev.knative.eventing.kafka.broker.core.security.KubernetesAuthProvider.lambda$getCredentials$1(KubernetesAuthProvider.java:59)\n\tat io.vertx.core.impl.ContextBase.lambda$null$0(ContextBase.java:137)\n\tat io.vertx.core.impl.ContextInternal.dispatch(ContextInternal.java:264)\n\tat io.vertx.core.impl.ContextBase.lambda$executeBlocking$1(ContextBase.java:135)\n\tat io.vertx.core.impl.TaskQueue.run(TaskQueue.java:76)\n\tat java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)\n\tat java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)\n\tat io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)\n\tat java.base/java.lang.Thread.run(Thread.java:833)\nCaused by: javax.net.ssl.SSLPeerUnverifiedException: Hostname fd72:7250:c9d1::1 not verified:\n certificate: sha256/tOaK6Qj0rkzsOCuvFoqfbsFJzoBcH3p9WQMcCVfWDJA=\n DN: CN=kube-apiserver\n subjectAltNames: 
[fd72:7250:c9d1:0:0:0:0:1,
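For context on the SSLPeerUnverifiedException above: the client connects to the API server at the compressed IPv6 literal fd72:7250:c9d1::1, while the certificate's SAN carries the expanded form fd72:7250:c9d1:0:0:0:0:1. These are the same address, but a verifier that compares the raw strings rejects the connection. A minimal Python illustration of the underlying mismatch (not the actual client code; the addresses are taken from the log):

```python
import ipaddress

# Values taken from the error log above.
host = "fd72:7250:c9d1::1"        # address the dispatcher connects to (compressed form)
san = "fd72:7250:c9d1:0:0:0:0:1"  # IP SAN as listed in the API server certificate

# Naive string comparison treats them as different hosts...
print(host == san)  # False

# ...but they parse to the same IPv6 address.
print(ipaddress.ip_address(host) == ipaddress.ip_address(san))  # True
```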

If someone could shed some light on this, it would be much appreciated; I've been stuck on it for a while now.

Additional Info

EKS v1.24
Knative Operator: 1.9.2
Knative Eventing: v1.9.2
Knative Serving: v1.9.1


pierDipi commented Mar 8, 2023

This is usually because the CA certs in the KafkaSource or in the referenced secret are missing or misconfigured. The old source implementation disabled certificate verification; the new one doesn't, for obvious reasons.


EKrol2 commented Mar 8, 2023

Hi @pierDipi, thank you for responding.

If the CA were wrong in some way, it would also fail on Cluster 1; the Knative setup and the secrets are identical in both clusters.


EKrol2 commented Mar 8, 2023

I tried creating another cluster in the same way as Cluster 1 (the one that worked) and added the EKS add-ons mentioned above. Again, no errors appeared. So I'm assuming the add-ons are not the cause of the error on Cluster 2.


EKrol2 commented Mar 9, 2023

I got it "working" by rolling out a new cluster via Terraform using IPv4 instead of IPv6. In the error message we can see that fabric8 is being used, together with okhttp3. If I understand things correctly, IPv6 only works properly with okhttp4.
(See fabric8io/kubernetes-client#2632)
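A verifier that handles IPv6 SANs correctly canonicalizes both sides before comparing, which is roughly the class of fix in the newer client stack. A hedged sketch of such a check in Python (a hypothetical helper for illustration, not the fabric8/okhttp implementation):

```python
import ipaddress

def ip_san_matches(connect_host: str, cert_ip_sans: list[str]) -> bool:
    """Hypothetical check: does connect_host match any IP SAN when both
    sides are parsed as addresses rather than compared as raw strings?"""
    try:
        # Strip the brackets used around IPv6 literals in URLs.
        target = ipaddress.ip_address(connect_host.strip("[]"))
    except ValueError:
        return False  # not an IP literal; DNS-name matching rules apply instead

    return any(target == ipaddress.ip_address(san) for san in cert_ip_sans)

# The exact mismatch from the stack trace: compressed vs. expanded form.
print(ip_san_matches("fd72:7250:c9d1::1", ["fd72:7250:c9d1:0:0:0:0:1"]))  # True
print("fd72:7250:c9d1::1" == "fd72:7250:c9d1:0:0:0:0:1")                  # False
```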

@pierDipi
Could it be that the KafkaSource EventSource uses an old library?


pierDipi commented Mar 10, 2023

Thanks, good catch on the issue!

@pierDipi pierDipi changed the title Kafka-source-dispatcher: Failed to reconcile [onNewEgress] egress Support IPv6 based clusters Mar 10, 2023
@pierDipi pierDipi added the kind/bug Categorizes issue or PR as related to a bug. label Mar 10, 2023
pierDipi commented

@EKrol2 for eventing-kafka issues, please open them at https://github.com/knative-sandbox/eventing-kafka-broker

pierDipi commented

Transferred issue to knative-extensions/eventing-kafka-broker#3005
