Headless service for Zookeeper or Kraftcontrollers are not reachable #290

Open
damees opened this issue Mar 10, 2024 · 0 comments

damees commented Mar 10, 2024

Hello,

I have been trying to run CFK for several days now. Whether I use KRaft controllers or ZooKeeper, the Kafka brokers never reach the ready state because the host name of the headless service (e.g. zookeeper.confluent.svc.cluster.local in the ZooKeeper case) cannot be resolved. Here are the steps I followed to deploy a single-node Confluent Platform. I have a local Kubernetes cluster (version 1.27.11) that I configured myself, with 1 master node and 2 worker nodes.

  1. Install the Confluent operator: helm upgrade --install confluent-operator confluentinc/confluent-for-kubernetes --namespace confluent --set namespaced=false (I created the confluent namespace beforehand)
  2. Create StorageClass
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: local-storage-class
provisioner: kubernetes.io/no-provisioner
volumeBindingMode: WaitForFirstConsumer
  3. Create PersistentVolumes for ZooKeeper and the broker. I created 3 PVs using the same config as below, except that I set 5G for the zookeeper data and log volumes:
apiVersion: v1
kind: PersistentVolume
metadata:
  name: broker-1
  labels:
    app: broker
spec:
  capacity:
    storage: 15Gi
  volumeMode: Filesystem
  accessModes:
  - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: local-storage-class
  local:
    path: /data/broker-1
  4. Deploy Confluent Platform based on https://github.com/confluentinc/confluent-kubernetes-examples/blob/master/quickstart-deploy/confluent-platform-singlenode.yaml. I kept only the zookeeper and kafka components, adjusting dataVolumeCapacity and logVolumeCapacity to 5Gi for zookeeper and setting the storage class accordingly:
apiVersion: platform.confluent.io/v1beta1
kind: Zookeeper
metadata:
  name: zookeeper
  namespace: confluent
spec:
  replicas: 1
  storageClass:
    name: local-storage-class
  image:
    application: confluentinc/cp-zookeeper:7.6.0
    init: confluentinc/confluent-init-container:2.8.0
  dataVolumeCapacity: 5Gi
  logVolumeCapacity: 5Gi
  podTemplate:
    resources:
      requests:
        cpu: 100m
        memory: 256Mi
    podSecurityContext:
      fsGroup: 1000
      runAsUser: 1000
      runAsNonRoot: true
---
apiVersion: platform.confluent.io/v1beta1
kind: Kafka
metadata:
  name: kafka
  namespace: confluent
spec:
  replicas: 1
  storageClass:
    name: local-storage-class
  image:
    application: confluentinc/cp-server:7.6.0
    init: confluentinc/confluent-init-container:2.8.0
  dataVolumeCapacity: 15Gi
  configOverrides:
    server:
      - "confluent.license.topic.replication.factor=1"
      - "confluent.metrics.reporter.topic.replicas=1"
      - "confluent.tier.metadata.replication.factor=1"
      - "confluent.metadata.topic.replication.factor=1"
      - "confluent.balancer.topic.replication.factor=1"
      - "confluent.security.event.logger.exporter.kafka.topic.replicas=1"
      - "event.logger.exporter.kafka.topic.replicas=1"
      - "offsets.topic.replication.factor=1"
      - "confluent.cluster.link.enable=true"
      - "password.encoder.secret=secret"
  podTemplate:
    resources:
      requests:
        cpu: 200m
        memory: 512Mi
    podSecurityContext:
      fsGroup: 1000
      runAsUser: 1000
      runAsNonRoot: true
  metricReporter:
    enabled: true
  5. Run kubectl apply -f manifest.yaml. After a few minutes (I also tried waiting longer), kubectl get pods returns:
NAME                                  READY   STATUS             RESTARTS        AGE
busybox                               1/1     Running            2 (5h37m ago)   16h
confluent-operator-84fcbd69dd-7vdpr   1/1     Running            3 (5h37m ago)   17h
dnsutils                              1/1     Running            1 (5h37m ago)   5h45m
kafka-0                               0/1     CrashLoopBackOff   21 (4m3s ago)   5h57m
zookeeper-0                           1/1     Running            1 (5h37m ago)   5h58m

As you can see, the broker is not healthy. I checked the zookeeper-0 pod logs and didn't spot any errors, but the kafka-0 pod logs show the following:

[ERROR] 2024-03-10 13:53:00,731 [main-SendThread(zookeeper.confluent.svc.cluster.local:2181)] org.apache.zookeeper.client.StaticHostProvider resolve - Unable to resolve address: zookeeper.confluent.svc.cluster.local:2181
java.net.UnknownHostException: zookeeper.confluent.svc.cluster.local
        at java.base/java.net.InetAddress$CachedAddresses.get(InetAddress.java:797)
        at java.base/java.net.InetAddress.getAllByName0(InetAddress.java:1533)
        at java.base/java.net.InetAddress.getAllByName(InetAddress.java:1386)
        at java.base/java.net.InetAddress.getAllByName(InetAddress.java:1307)
        at org.apache.zookeeper.client.StaticHostProvider$1.getAllByName(StaticHostProvider.java:88)
        at org.apache.zookeeper.client.StaticHostProvider.resolve(StaticHostProvider.java:141)
        at org.apache.zookeeper.client.StaticHostProvider.next(StaticHostProvider.java:368)
        at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1204)
[INFO] 2024-03-10 13:53:00,736 [main-SendThread(zookeeper.confluent.svc.cluster.local:2181)] org.apache.zookeeper.ClientCnxn logStartConnect - Opening socket connection to server zookeeper.confluent.svc.cluster.local:2181.
[ERROR] 2024-03-10 13:53:00,736 [main-SendThread(zookeeper.confluent.svc.cluster.local:2181)] org.apache.zookeeper.ClientCnxnSocketNIO connect - Unable to open socket to zookeeper.confluent.svc.cluster.local:2181
[WARN] 2024-03-10 13:53:00,737 [main-SendThread(zookeeper.confluent.svc.cluster.local:2181)] org.apache.zookeeper.ClientCnxn run - Session 0x0 for server zookeeper.confluent.svc.cluster.local:2181, Closing socket connection. Attempting reconnect except it is a SessionExpiredException.
java.nio.channels.UnresolvedAddressException
        at java.base/sun.nio.ch.Net.checkAddress(Net.java:131)
        at java.base/sun.nio.ch.SocketChannelImpl.connect(SocketChannelImpl.java:673)
        at org.apache.zookeeper.ClientCnxnSocketNIO.registerAndConnect(ClientCnxnSocketNIO.java:260)
        at org.apache.zookeeper.ClientCnxnSocketNIO.connect(ClientCnxnSocketNIO.java:270)
        at org.apache.zookeeper.ClientCnxn$SendThread.startConnect(ClientCnxn.java:1173)
        at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1207)

It just loops on this error until the pod is restarted.

  6. Check all available services in the cluster:
NAME                   TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)                                                          AGE
confluent-operator     ClusterIP   10.103.186.240   <none>        7778/TCP                                                         17h
kafka                  ClusterIP   None             <none>        9092/TCP,8090/TCP,9071/TCP,7203/TCP,7777/TCP,7778/TCP,9072/TCP   6h5m
kafka-0-internal       ClusterIP   10.97.74.196     <none>        9092/TCP,8090/TCP,9071/TCP,7203/TCP,7777/TCP,7778/TCP,9072/TCP   6h5m
zookeeper              ClusterIP   None             <none>        2181/TCP,7203/TCP,7777/TCP,3888/TCP,2888/TCP,7778/TCP            6h6m
zookeeper-0-internal   ClusterIP   10.111.139.102   <none>        2181/TCP,7203/TCP,7777/TCP,3888/TCP,2888/TCP,7778/TCP            6h6m
  7. Try to resolve the services from busybox.
    kubectl exec -it busybox -- nslookup zookeeper-0-internal.confluent.svc.cluster.local always returns:
Server:         10.96.0.10
Address:        10.96.0.10:53

Name:   zookeeper-0-internal.confluent.svc.cluster.local
Address: 10.111.139.102

And this is where I can't understand what is happening. When I run kubectl exec -it busybox -- nslookup zookeeper.confluent.svc.cluster.local, 9 out of 10 attempts return:

;; connection timed out; no servers could be reached
command terminated with exit code 1

Without changing anything, about 1 attempt out of 10 resolves successfully:

Server:         10.96.0.10
Address:        10.96.0.10:53

Name:   zookeeper.confluent.svc.cluster.local
Address: 192.168.180.63

All the non-headless services (confluent-operator, kafka-0-internal, zookeeper-0-internal) always resolve correctly, but the headless zookeeper service does not.
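To put a number on the "9 out of 10" observation instead of rerunning nslookup by hand, the lookup can be repeated in a loop and the outcomes tallied. The following is only a minimal sketch: the function name probe_dns and its parameters are hypothetical, it assumes a pod in the cluster that has Python available, and the injectable resolver argument exists purely so the loop can be exercised without a real DNS server.

```python
import socket
from typing import Callable, Dict, Iterable, Optional


def _system_resolver(hostname: str) -> Iterable[str]:
    # Ask the system resolver (inside a pod this goes through the cluster DNS).
    return {info[4][0] for info in socket.getaddrinfo(hostname, None)}


def probe_dns(
    hostname: str,
    attempts: int = 10,
    resolver: Optional[Callable[[str], Iterable[str]]] = None,
) -> Dict[str, object]:
    """Resolve `hostname` `attempts` times and tally successes and failures."""
    resolve = resolver if resolver is not None else _system_resolver
    ok = 0
    failed = 0
    addresses = set()
    for _ in range(attempts):
        try:
            addresses.update(resolve(hostname))
            ok += 1
        except OSError:
            # socket.gaierror (resolution failure) is a subclass of OSError.
            failed += 1
    return {"ok": ok, "failed": failed, "addresses": sorted(addresses)}
```

Calling probe_dns("zookeeper.confluent.svc.cluster.local") and then probe_dns("zookeeper-0-internal.confluent.svc.cluster.local") from the same pod would give comparable success counts for the headless and the ClusterIP name.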

From the above, I concluded that the broker fails to start because, most of the time, it cannot resolve the zookeeper headless service. I then checked the zookeeper pod to see if anything was wrong, but it looks healthy:

[INFO] 2024-03-10 13:20:39,686 [main] io.confluent.agent.monitoring.DiskUsage premain - DiskUsage Agent: config : /opt/confluentinc/etc/zookeeper/disk-usage-agent.properties
[INFO] 2024-03-10 13:20:40,740 [main] io.confluent.agent.monitoring.DiskUsage premain - DiskUsage Agent: Registering object :io.confluent.caas:type=VolumeMetrics, service=zookeeper, dir=data for class : io.confluent.agent.monitoring.Volume
[INFO] 2024-03-10 13:20:40,748 [main] io.confluent.agent.monitoring.DiskUsage premain - DiskUsage Agent: Ping Volume{store=/mnt/data/data (/dev/sda1), total=34340868096, used=273436672, available=34067431424, percentUsed=0.7962427485397485, percentAvailable=99.20375725146026, mountpoint='/mnt/data/data', deviceName='/dev/sda1'}
I> No access restrictor found, access to any MBean is allowed
Jolokia: Agent started with URL http://192.168.180.63:7777/jolokia/
[INFO] 2024-03-10 13:20:41,425 [main] org.apache.zookeeper.server.quorum.QuorumPeerConfig parse - Reading configuration from: /opt/confluentinc/etc/zookeeper/zookeeper.properties
[INFO] 2024-03-10 13:20:41,428 [main] org.apache.zookeeper.server.quorum.QuorumPeerConfig parseProperties - clientPortAddress is 0.0.0.0:2181
[INFO] 2024-03-10 13:20:41,428 [main] org.apache.zookeeper.server.quorum.QuorumPeerConfig parseProperties - secureClientPort is not set
[INFO] 2024-03-10 13:20:41,428 [main] org.apache.zookeeper.server.quorum.QuorumPeerConfig parseProperties - observerMasterPort is not set
[INFO] 2024-03-10 13:20:41,429 [main] org.apache.zookeeper.server.quorum.QuorumPeerConfig parseProperties - metricsProvider.className is org.apache.zookeeper.metrics.impl.DefaultMetricsProvider
[ERROR] 2024-03-10 13:20:41,444 [main] org.apache.zookeeper.server.quorum.QuorumPeerConfig parseDynamicConfig - Invalid configuration, only one server specified (ignoring)
[INFO] 2024-03-10 13:20:41,447 [main] org.apache.zookeeper.server.DatadirCleanupManager <init> - autopurge.snapRetainCount set to 3
[INFO] 2024-03-10 13:20:41,447 [main] org.apache.zookeeper.server.DatadirCleanupManager <init> - autopurge.purgeInterval set to 1
[WARN] 2024-03-10 13:20:41,448 [main] org.apache.zookeeper.server.quorum.QuorumPeerMain initializeAndRun - Either no config or no quorum defined in config, running in standalone mode
[INFO] 2024-03-10 13:20:41,448 [PurgeTask] org.apache.zookeeper.server.DatadirCleanupManager run - Purge task started.
[INFO] 2024-03-10 13:20:41,455 [main] org.apache.zookeeper.jmx.ManagedUtil isLog4jJmxEnabled - Log4j 1.2 jmx support not found; jmx disabled.
[INFO] 2024-03-10 13:20:41,460 [main] org.apache.zookeeper.server.quorum.QuorumPeerConfig parse - Reading configuration from: /opt/confluentinc/etc/zookeeper/zookeeper.properties
[INFO] 2024-03-10 13:20:41,461 [PurgeTask] org.apache.zookeeper.server.persistence.FileTxnSnapLog <init> - zookeeper.snapshot.trust.empty : false
[INFO] 2024-03-10 13:20:41,462 [main] org.apache.zookeeper.server.quorum.QuorumPeerConfig parseProperties - clientPortAddress is 0.0.0.0:2181
[INFO] 2024-03-10 13:20:41,463 [main] org.apache.zookeeper.server.quorum.QuorumPeerConfig parseProperties - secureClientPort is not set
[INFO] 2024-03-10 13:20:41,463 [main] org.apache.zookeeper.server.quorum.QuorumPeerConfig parseProperties - observerMasterPort is not set
[INFO] 2024-03-10 13:20:41,463 [main] org.apache.zookeeper.server.quorum.QuorumPeerConfig parseProperties - metricsProvider.className is org.apache.zookeeper.metrics.impl.DefaultMetricsProvider
[ERROR] 2024-03-10 13:20:41,464 [main] org.apache.zookeeper.server.quorum.QuorumPeerConfig parseDynamicConfig - Invalid configuration, only one server specified (ignoring)
[INFO] 2024-03-10 13:20:41,465 [main] org.apache.zookeeper.server.ZooKeeperServerMain runFromConfig - Starting server
[INFO] 2024-03-10 13:20:41,500 [PurgeTask] org.apache.zookeeper.server.persistence.SnapStream <clinit> - zookeeper.snapshot.compression.method = CHECKED
[INFO] 2024-03-10 13:20:41,502 [main] org.apache.zookeeper.server.ServerMetrics metricsProviderInitialized - ServerMetrics initialized with provider org.apache.zookeeper.metrics.impl.DefaultMetricsProvider@33f676f6
[INFO] 2024-03-10 13:20:41,511 [PurgeTask] org.apache.zookeeper.server.DatadirCleanupManager run - Purge task completed.
[INFO] 2024-03-10 13:20:41,511 [main] org.apache.zookeeper.server.auth.DigestAuthenticationProvider <clinit> - ACL digest algorithm is: SHA1
[INFO] 2024-03-10 13:20:41,512 [main] org.apache.zookeeper.server.auth.DigestAuthenticationProvider isEnabled - zookeeper.DigestAuthenticationProvider.enabled = true
[INFO] 2024-03-10 13:20:41,512 [main] org.apache.zookeeper.server.persistence.FileTxnSnapLog <init> - zookeeper.snapshot.trust.empty : false
[INFO] 2024-03-10 13:20:41,529 [main] org.apache.zookeeper.server.ZooKeeperServer printBanner -
[INFO] 2024-03-10 13:20:41,530 [main] org.apache.zookeeper.server.ZooKeeperServer printBanner -   ______                  _
[INFO] 2024-03-10 13:20:41,530 [main] org.apache.zookeeper.server.ZooKeeperServer printBanner -  |___  /                 | |
[INFO] 2024-03-10 13:20:41,530 [main] org.apache.zookeeper.server.ZooKeeperServer printBanner -     / /    ___     ___   | | __   ___    ___   _ __     ___   _ __
[INFO] 2024-03-10 13:20:41,530 [main] org.apache.zookeeper.server.ZooKeeperServer printBanner -    / /    / _ \   / _ \  | |/ /  / _ \  / _ \ | '_ \   / _ \ | '__|
[INFO] 2024-03-10 13:20:41,530 [main] org.apache.zookeeper.server.ZooKeeperServer printBanner -   / /__  | (_) | | (_) | |   <  |  __/ |  __/ | |_) | |  __/ | |
[INFO] 2024-03-10 13:20:41,530 [main] org.apache.zookeeper.server.ZooKeeperServer printBanner -  /_____|  \___/   \___/  |_|\_\  \___|  \___| | .__/   \___| |_|
[INFO] 2024-03-10 13:20:41,530 [main] org.apache.zookeeper.server.ZooKeeperServer printBanner -                                               | |
[INFO] 2024-03-10 13:20:41,530 [main] org.apache.zookeeper.server.ZooKeeperServer printBanner -                                               |_|
[INFO] 2024-03-10 13:20:41,530 [main] org.apache.zookeeper.server.ZooKeeperServer printBanner -
[INFO] 2024-03-10 13:20:41,536 [main] org.apache.zookeeper.server.ZooKeeperServer logEnv - Server environment:zookeeper.version=3.8.3-6ad6d364c7c0bcf0de452d54ebefa3058098ab56, built on 2023-10-05 10:34 UTC
[INFO] 2024-03-10 13:20:41,536 [main] org.apache.zookeeper.server.ZooKeeperServer logEnv - Server environment:host.name=zookeeper-0.zookeeper.confluent.svc.cluster.local
[INFO] 2024-03-10 13:20:41,536 [main] org.apache.zookeeper.server.ZooKeeperServer logEnv - Server environment:java.version=11.0.21
[INFO] 2024-03-10 13:20:41,536 [main] org.apache.zookeeper.server.ZooKeeperServer logEnv - Server environment:java.vendor=Azul Systems, Inc.
[INFO] 2024-03-10 13:20:41,536 [main] org.apache.zookeeper.server.ZooKeeperServer logEnv - Server environment:java.home=/usr/lib/jvm/java-11-zulu-openjdk-ca
[INFO] 2024-03-10 13:20:41,537 [main] org.apache.zookeeper.server.ZooKeeperServer logEnv - Server environment:java.class.path=/usr/bin/../share/java/kafka/commons-digester-2.1.jar:/usr/bin/../share/java/kafka/kafka-storage-api-7.6.0-ccs.jar:/usr/bin/../share/java/kafka/jetty-util-9.4.53.v20231009.jar:/usr/bin/../share/java/kafka/netty-codec-4.1.100.Final.jar:/usr/bin/../share/java/kafka/jakarta.activation-api-1.2.2.jar:/usr/bin/../share/java/kafka/jopt-simple-5.0.4.jar:/usr/bin/../share/java/kafka/javax.annotation-api-1.3.2.jar:/usr/bin/../share/java/kafka/scala-collection-compat_2.13-2.10.0.jar:/usr/bin/../share/java/kafka/connect-json-7.6.0-ccs.jar:/usr/bin/../share/java/kafka/jersey-container-servlet-core-2.39.1.jar:/usr/bin/../share/java/kafka/commons-lang3-3.8.1.jar:/usr/bin/../share/java/kafka/netty-transport-4.1.100.Final.jar:/usr/bin/../share/java/kafka/kafka-log4j-appender-7.6.0-ccs.jar:/usr/bin/../share/java/kafka/metrics-core-4.1.12.1.jar:/usr/bin/../share/java/kafka/snappy-java-1.1.10.5.jar:/usr/bin/../share/java/kafka/kafka-storage-7.6.0-ccs.jar:/usr/bin/../share/java/kafka/kafka-metadata-7.6.0-ccs.jar:/usr/bin/../share/java/kafka/swagger-annotations-2.2.8.jar:/usr/bin/../share/java/kafka/osgi-resource-locator-1.0.3.jar:/usr/bin/../share/java/kafka/zookeeper-3.8.3.jar:/usr/bin/../share/java/kafka/jersey-hk2-2.39.1.jar:/usr/bin/../share/java/kafka/netty-handler-4.1.100.Final.jar:/usr/bin/../share/java/kafka/jersey-common-2.39.1.jar:/usr/bin/../share/java/kafka/jetty-servlet-9.4.53.v20231009.jar:/usr/bin/../share/java/kafka/hk2-utils-2.6.1.jar:/usr/bin/../share/java/kafka/commons-cli-1.4.jar:/usr/bin/../share/java/kafka/audience-annotations-0.12.0.jar:/usr/bin/../share/java/kafka/jackson-module-jaxb-annotations-2.13.5.jar:/usr/bin/../share/java/kafka/commons-collections-3.2.2.jar:/usr/bin/../share/java/kafka/kafka-server-common-7.6.0-ccs.jar:/usr/bin/../share/java/kafka/scala-logging_2.13-3.9.4.jar:/usr/bin/../share/java/kafka/caffeine-2.9.3.jar:/usr/bin/
../share/java/kafka/jersey-client-2.39.1.jar:/usr/bin/../share/java/kafka/jose4j-0.9.3.jar:/usr/bin/../share/java/kafka/jetty-client-9.4.53.v20231009.jar:/usr/bin/../share/java/kafka/jaxb-api-2.3.1.jar:/usr/bin/../share/java/kafka/kafka-tools-api-7.6.0-ccs.jar:/usr/bin/../share/java/kafka/jackson-jaxrs-json-provider-2.13.5.jar:/usr/bin/../share/java/kafka/kafka-streams-scala_2.13-7.6.0-ccs.jar:/usr/bin/../share/java/kafka/jackson-datatype-jdk8-2.13.5.jar:/usr/bin/../share/java/kafka/connect-mirror-7.6.0-ccs.jar:/usr/bin/../share/java/kafka/javassist-3.29.2-GA.jar:/usr/bin/../share/java/kafka/jackson-databind-2.13.5.jar:/usr/bin/../share/java/kafka/connect-transforms-7.6.0-ccs.jar:/usr/bin/../share/java/kafka/reflections-0.10.2.jar:/usr/bin/../share/java/kafka/commons-beanutils-1.9.4.jar:/usr/bin/../share/java/kafka/hk2-locator-2.6.1.jar:/usr/bin/../share/java/kafka/jackson-annotations-2.13.5.jar:/usr/bin/../share/java/kafka/trogdor-7.6.0-ccs.jar:/usr/bin/../share/java/kafka/netty-resolver-4.1.100.Final.jar:/usr/bin/../share/java/kafka/scala-library-2.13.11.jar:/usr/bin/../share/java/kafka/kafka-raft-7.6.0-ccs.jar:/usr/bin/../share/java/kafka/kafka-streams-test-utils-7.6.0-ccs.jar:/usr/bin/../share/java/kafka/netty-transport-native-epoll-4.1.100.Final.jar:/usr/bin/../share/java/kafka/jackson-dataformat-csv-2.13.5.jar:/usr/bin/../share/java/kafka/jakarta.xml.bind-api-2.3.3.jar:/usr/bin/../share/java/kafka/jetty-http-9.4.53.v20231009.jar:/usr/bin/../share/java/kafka/zstd-jni-1.5.5-1.jar:/usr/bin/../share/java/kafka/jackson-core-2.13.5.jar:/usr/bin/../share/java/kafka/connect-runtime-7.6.0-ccs.jar:/usr/bin/../share/java/kafka/jakarta.inject-2.6.1.jar:/usr/bin/../share/java/kafka/pcollections-4.0.1.jar:/usr/bin/../share/java/kafka/kafka-tools-7.6.0-ccs.jar:/usr/bin/../share/java/kafka/error_prone_annotations-2.10.0.jar:/usr/bin/../share/java/kafka/kafka-clients-7.6.0-ccs.jar:/usr/bin/../share/java/kafka/netty-buffer-4.1.100.Final.jar:/usr/bin/../share/java/kafka/javax.ws
.rs-api-2.1.1.jar:/usr/bin/../share/java/kafka/jetty-server-9.4.53.v20231009.jar:/usr/bin/../share/java/kafka/plexus-utils-3.3.1.jar:/usr/bin/../share/java/kafka/jetty-util-ajax-9.4.53.v20231009.jar:/usr/bin/../share/java/kafka/javax.servlet-api-3.1.0.jar:/usr/bin/../share/java/kafka/scala-java8-compat_2.13-1.0.2.jar:/usr/bin/../share/java/kafka/kafka-streams-7.6.0-ccs.jar:/usr/bin/../share/java/kafka/commons-io-2.11.0.jar:/usr/bin/../share/java/kafka/zookeeper-jute-3.8.3.jar:/usr/bin/../share/java/kafka/hk2-api-2.6.1.jar:/usr/bin/../share/java/kafka/jackson-module-scala_2.13-2.13.5.jar:/usr/bin/../share/java/kafka/kafka-group-coordinator-7.6.0-ccs.jar:/usr/bin/../share/java/kafka/connect-mirror-client-7.6.0-ccs.jar:/usr/bin/../share/java/kafka/jsr305-3.0.2.jar:/usr/bin/../share/java/kafka/rocksdbjni-7.9.2.jar:/usr/bin/../share/java/kafka/netty-common-4.1.100.Final.jar:/usr/bin/../share/java/kafka/jackson-jaxrs-base-2.13.5.jar:/usr/bin/../share/java/kafka/lz4-java-1.8.0.jar:/usr/bin/../share/java/kafka/kafka-streams-examples-7.6.0-ccs.jar:/usr/bin/../share/java/kafka/maven-artifact-3.8.8.jar:/usr/bin/../share/java/kafka/metrics-core-2.2.0.jar:/usr/bin/../share/java/kafka/commons-validator-1.7.jar:/usr/bin/../share/java/kafka/jetty-continuation-9.4.53.v20231009.jar:/usr/bin/../share/java/kafka/jetty-security-9.4.53.v20231009.jar:/usr/bin/../share/java/kafka/aopalliance-repackaged-2.6.1.jar:/usr/bin/../share/java/kafka/jersey-container-servlet-2.39.1.jar:/usr/bin/../share/java/kafka/connect-basic-auth-extension-7.6.0-ccs.jar:/usr/bin/../share/java/kafka/kafka-shell-7.6.0-ccs.jar:/usr/bin/../share/java/kafka/slf4j-reload4j-1.7.36.jar:/usr/bin/../share/java/kafka/jakarta.annotation-api-1.3.5.jar:/usr/bin/../share/java/kafka/jakarta.ws.rs-api-2.1.6.jar:/usr/bin/../share/java/kafka/jline-3.22.0.jar:/usr/bin/../share/java/kafka/javax.activation-api-1.2.0.jar:/usr/bin/../share/java/kafka/scala-reflect-2.13.11.jar:/usr/bin/../share/java/kafka/jetty-servlets-9.4.53.v20231009.
jar:/usr/bin/../share/java/kafka/paranamer-2.8.jar:/usr/bin/../share/java/kafka/slf4j-api-1.7.36.jar:/usr/bin/../share/java/kafka/activation-1.1.1.jar:/usr/bin/../share/java/kafka/jetty-io-9.4.53.v20231009.jar:/usr/bin/../share/java/kafka/jersey-server-2.39.1.jar:/usr/bin/../share/java/kafka/netty-transport-classes-epoll-4.1.100.Final.jar:/usr/bin/../share/java/kafka/argparse4j-0.7.0.jar:/usr/bin/../share/java/kafka/kafka_2.13-7.6.0-ccs.jar:/usr/bin/../share/java/kafka/jakarta.validation-api-2.0.2.jar:/usr/bin/../share/java/kafka/netty-transport-native-unix-common-4.1.100.Final.jar:/usr/bin/../share/java/kafka/commons-logging-1.2.jar:/usr/bin/../share/java/kafka/connect-api-7.6.0-ccs.jar:/usr/bin/../share/java/kafka/checker-qual-3.19.0.jar:/usr/bin/../share/java/kafka/kafka.jar:/usr/bin/../share/java/kafka/reload4j-1.2.25.jar:/usr/bin/../share/java/confluent-telemetry/*
[INFO] 2024-03-10 13:20:41,537 [main] org.apache.zookeeper.server.ZooKeeperServer logEnv - Server environment:java.library.path=/usr/java/packages/lib:/usr/lib64:/lib64:/lib:/usr/lib
[INFO] 2024-03-10 13:20:41,537 [main] org.apache.zookeeper.server.ZooKeeperServer logEnv - Server environment:java.io.tmpdir=/tmp
[INFO] 2024-03-10 13:20:41,537 [main] org.apache.zookeeper.server.ZooKeeperServer logEnv - Server environment:java.compiler=<NA>
[INFO] 2024-03-10 13:20:41,537 [main] org.apache.zookeeper.server.ZooKeeperServer logEnv - Server environment:os.name=Linux
[INFO] 2024-03-10 13:20:41,537 [main] org.apache.zookeeper.server.ZooKeeperServer logEnv - Server environment:os.arch=amd64
[INFO] 2024-03-10 13:20:41,537 [main] org.apache.zookeeper.server.ZooKeeperServer logEnv - Server environment:os.version=6.2.0-1019-azure
[INFO] 2024-03-10 13:20:41,537 [main] org.apache.zookeeper.server.ZooKeeperServer logEnv - Server environment:user.name=appuser
[INFO] 2024-03-10 13:20:41,537 [main] org.apache.zookeeper.server.ZooKeeperServer logEnv - Server environment:user.home=/home/appuser
[INFO] 2024-03-10 13:20:41,538 [main] org.apache.zookeeper.server.ZooKeeperServer logEnv - Server environment:user.dir=/home/appuser
[INFO] 2024-03-10 13:20:41,538 [main] org.apache.zookeeper.server.ZooKeeperServer logEnv - Server environment:os.memory.free=241MB
[INFO] 2024-03-10 13:20:41,538 [main] org.apache.zookeeper.server.ZooKeeperServer logEnv - Server environment:os.memory.max=256MB
[INFO] 2024-03-10 13:20:41,538 [main] org.apache.zookeeper.server.ZooKeeperServer logEnv - Server environment:os.memory.total=256MB
[INFO] 2024-03-10 13:20:41,538 [main] org.apache.zookeeper.server.ZooKeeperServer <clinit> - zookeeper.enableEagerACLCheck = false
[INFO] 2024-03-10 13:20:41,538 [main] org.apache.zookeeper.server.ZooKeeperServer <clinit> - zookeeper.digest.enabled = true
[INFO] 2024-03-10 13:20:41,538 [main] org.apache.zookeeper.server.ZooKeeperServer <clinit> - zookeeper.closeSessionTxn.enabled = true
[INFO] 2024-03-10 13:20:41,538 [main] org.apache.zookeeper.server.ZooKeeperServer setFlushDelay - zookeeper.flushDelay = 0 ms
[INFO] 2024-03-10 13:20:41,538 [main] org.apache.zookeeper.server.ZooKeeperServer setMaxWriteQueuePollTime - zookeeper.maxWriteQueuePollTime = 0 ms
[INFO] 2024-03-10 13:20:41,538 [main] org.apache.zookeeper.server.ZooKeeperServer setMaxBatchSize - zookeeper.maxBatchSize=1000
[INFO] 2024-03-10 13:20:41,538 [main] org.apache.zookeeper.server.ZooKeeperServer <clinit> - zookeeper.intBufferStartingSizeBytes = 1024
[INFO] 2024-03-10 13:20:41,539 [main] org.apache.zookeeper.server.BlueThrottle logWeighedThrottlingSetting - Weighed connection throttling is disabled
[INFO] 2024-03-10 13:20:41,541 [main] org.apache.zookeeper.server.ZooKeeperServer setMinSessionTimeout - minSessionTimeout set to 6000 ms
[INFO] 2024-03-10 13:20:41,541 [main] org.apache.zookeeper.server.ZooKeeperServer setMaxSessionTimeout - maxSessionTimeout set to 60000 ms
[INFO] 2024-03-10 13:20:41,542 [main] org.apache.zookeeper.server.ResponseCache <init> - getData response cache size is initialized with value 400.
[INFO] 2024-03-10 13:20:41,543 [main] org.apache.zookeeper.server.ResponseCache <init> - getChildren response cache size is initialized with value 400.
[INFO] 2024-03-10 13:20:41,544 [main] org.apache.zookeeper.server.util.RequestPathMetricsCollector <init> - zookeeper.pathStats.slotCapacity = 60
[INFO] 2024-03-10 13:20:41,544 [main] org.apache.zookeeper.server.util.RequestPathMetricsCollector <init> - zookeeper.pathStats.slotDuration = 15
[INFO] 2024-03-10 13:20:41,545 [main] org.apache.zookeeper.server.util.RequestPathMetricsCollector <init> - zookeeper.pathStats.maxDepth = 6
[INFO] 2024-03-10 13:20:41,545 [main] org.apache.zookeeper.server.util.RequestPathMetricsCollector <init> - zookeeper.pathStats.initialDelay = 5
[INFO] 2024-03-10 13:20:41,545 [main] org.apache.zookeeper.server.util.RequestPathMetricsCollector <init> - zookeeper.pathStats.delay = 5
[INFO] 2024-03-10 13:20:41,545 [main] org.apache.zookeeper.server.util.RequestPathMetricsCollector <init> - zookeeper.pathStats.enabled = false
[INFO] 2024-03-10 13:20:41,551 [main] org.apache.zookeeper.server.ZooKeeperServer setLargeRequestMaxBytes - The max bytes for all large requests are set to 104857600
[INFO] 2024-03-10 13:20:41,552 [main] org.apache.zookeeper.server.ZooKeeperServer setLargeRequestThreshold - The large request threshold is set to -1
[INFO] 2024-03-10 13:20:41,552 [main] org.apache.zookeeper.server.AuthenticationHelper initConfigurations - zookeeper.enforce.auth.enabled = false
[INFO] 2024-03-10 13:20:41,552 [main] org.apache.zookeeper.server.AuthenticationHelper initConfigurations - zookeeper.enforce.auth.schemes = []
[INFO] 2024-03-10 13:20:41,552 [main] org.apache.zookeeper.server.ZooKeeperServer <init> - Created server with tickTime 3000 ms minSessionTimeout 6000 ms maxSessionTimeout 60000 ms clientPortListenBacklog -1 datadir /mnt/data/txnlog/version-2 snapdir /mnt/data/data/version-2
[INFO] 2024-03-10 13:20:41,632 [main] org.eclipse.jetty.util.log initialized - Logging initialized @5520ms to org.eclipse.jetty.util.log.Slf4jLog
[WARN] 2024-03-10 13:20:41,782 [main] org.eclipse.jetty.server.handler.ContextHandler setContextPath - o.e.j.s.ServletContextHandler@2bd08376{/,null,STOPPED} contextPath ends with /*
[WARN] 2024-03-10 13:20:41,783 [main] org.eclipse.jetty.server.handler.ContextHandler setContextPath - Empty contextPath
[INFO] 2024-03-10 13:20:41,824 [main] org.eclipse.jetty.server.Server doStart - jetty-9.4.53.v20231009; built: 2023-10-09T12:29:09.265Z; git: 27bde00a0b95a1d5bbee0eae7984f891d2d0f8c9; jvm 11.0.21+9-LTS
[INFO] 2024-03-10 13:20:41,877 [main] org.eclipse.jetty.server.session doStart - DefaultSessionIdManager workerName=node0
[INFO] 2024-03-10 13:20:41,877 [main] org.eclipse.jetty.server.session doStart - No SessionScavenger set, using defaults
[INFO] 2024-03-10 13:20:41,879 [main] org.eclipse.jetty.server.session startScavenging - node0 Scavenging every 660000ms
[WARN] 2024-03-10 13:20:41,881 [main] org.eclipse.jetty.security.SecurityHandler checkPathsWithUncoveredHttpMethods - ServletContext@o.e.j.s.ServletContextHandler@2bd08376{/,null,STARTING} has uncovered http methods for path: /*
[INFO] 2024-03-10 13:20:41,904 [main] org.eclipse.jetty.server.handler.ContextHandler doStart - Started o.e.j.s.ServletContextHandler@2bd08376{/,null,AVAILABLE}
[INFO] 2024-03-10 13:20:41,918 [main] org.eclipse.jetty.server.AbstractConnector doStart - Started ServerConnector@413f69cc{HTTP/1.1, (http/1.1)}{0.0.0.0:8080}
[INFO] 2024-03-10 13:20:41,919 [main] org.eclipse.jetty.server.Server doStart - Started @5807ms
[INFO] 2024-03-10 13:20:41,919 [main] org.apache.zookeeper.server.admin.JettyAdminServer start - Started AdminServer on address 0.0.0.0, port 8080 and command URL /commands
[INFO] 2024-03-10 13:20:41,929 [main] org.apache.zookeeper.server.ServerCnxnFactory createFactory - Using org.apache.zookeeper.server.NIOServerCnxnFactory as server connection factory
[WARN] 2024-03-10 13:20:41,930 [main] org.apache.zookeeper.server.ServerCnxnFactory initMaxCnxns - maxCnxns is not configured, using default value 0.
[INFO] 2024-03-10 13:20:41,931 [main] org.apache.zookeeper.server.NIOServerCnxnFactory configure - Configuring NIO connection handler with 10s sessionless connection timeout, 1 selector thread(s), 4 worker threads, and 64 kB direct buffers.
[INFO] 2024-03-10 13:20:41,933 [main] org.apache.zookeeper.server.NIOServerCnxnFactory configure - binding to port 0.0.0.0/0.0.0.0:2181
[INFO] 2024-03-10 13:20:41,962 [main] org.apache.zookeeper.server.watch.WatchManagerFactory createWatchManager - Using org.apache.zookeeper.server.watch.WatchManager as watch manager
[INFO] 2024-03-10 13:20:41,962 [main] org.apache.zookeeper.server.watch.WatchManagerFactory createWatchManager - Using org.apache.zookeeper.server.watch.WatchManager as watch manager
[INFO] 2024-03-10 13:20:41,963 [main] org.apache.zookeeper.server.ZKDatabase <init> - zookeeper.snapshotSizeFactor = 0.33
[INFO] 2024-03-10 13:20:41,963 [main] org.apache.zookeeper.server.ZKDatabase <init> - zookeeper.commitLogCount=500
[INFO] 2024-03-10 13:20:41,963 [main] org.apache.zookeeper.server.persistence.FileSnap deserialize - Reading snapshot /mnt/data/data/version-2/snapshot.0
[INFO] 2024-03-10 13:20:41,967 [main] org.apache.zookeeper.server.DataTree deserializeZxidDigest - The digest value is empty in snapshot
[INFO] 2024-03-10 13:20:41,969 [main] org.apache.zookeeper.server.ZKDatabase loadDataBase - Snapshot loaded in 6 ms, highest zxid is 0x0, digest is 1371985504
[INFO] 2024-03-10 13:20:41,975 [main] org.apache.zookeeper.server.persistence.FileTxnSnapLog save - Snapshotting: 0x0 to /mnt/data/data/version-2/snapshot.0
[INFO] 2024-03-10 13:20:41,986 [main] org.apache.zookeeper.server.ZooKeeperServer takeSnapshot - Snapshot taken in 11 ms
[INFO] 2024-03-10 13:20:41,995 [ProcessThread(sid:0 cport:2181):] org.apache.zookeeper.server.PrepRequestProcessor run - PrepRequestProcessor (sid:0) started, reconfigEnabled=false
[INFO] 2024-03-10 13:20:41,995 [main] org.apache.zookeeper.server.RequestThrottler <clinit> - zookeeper.request_throttler.shutdownTimeout = 10000 ms
[INFO] 2024-03-10 13:20:42,006 [main] org.apache.zookeeper.server.ContainerManager <init> - Using checkIntervalMs=60000 maxPerMinute=10000 maxNeverUsedIntervalMs=0
[INFO] 2024-03-10 13:20:42,007 [main] org.apache.zookeeper.audit.ZKAuditProvider <clinit> - ZooKeeper audit is disabled.
[INFO] 2024-03-10 14:20:41,449 [PurgeTask] org.apache.zookeeper.server.DatadirCleanupManager run - Purge task started.
[INFO] 2024-03-10 14:20:41,449 [PurgeTask] org.apache.zookeeper.server.persistence.FileTxnSnapLog <init> - zookeeper.snapshot.trust.empty : false
[INFO] 2024-03-10 14:20:41,450 [PurgeTask] org.apache.zookeeper.server.DatadirCleanupManager run - Purge task completed.

I tested different manifests, using replicas=3 for ZooKeeper or KRaft controllers, and the issue is always the same. This may be a Kubernetes issue rather than a CFK one; honestly, I don't know.
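One way to check whether the failure really is specific to headless services, whichever manifest is used, is to pick out every service whose CLUSTER-IP is None and test each of those names. The sketch below is an illustration only: the function name headless_fqdns is hypothetical, it assumes the default column layout of kubectl get svc as pasted earlier in this report, and it assumes the default cluster domain cluster.local.

```python
from typing import List


def headless_fqdns(table: str, namespace: str, domain: str = "cluster.local") -> List[str]:
    """Return in-cluster DNS names of services whose CLUSTER-IP is 'None'.

    `table` is the plain-text output of `kubectl get svc` (default columns).
    """
    rows = [line.split() for line in table.splitlines() if line.strip()]
    if not rows:
        return []
    header = rows[0]
    name_col = header.index("NAME")
    ip_col = header.index("CLUSTER-IP")
    return [
        f"{row[name_col]}.{namespace}.svc.{domain}"
        for row in rows[1:]
        # Headless services are listed with the literal value "None".
        if len(row) > ip_col and row[ip_col] == "None"
    ]
```

Given the service list from this report and the namespace confluent, this yields the kafka and zookeeper names, which are the two that would need to resolve reliably for the broker to start.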
