Kubeflow and Istio deployment - HTTP, gRPC endpoints not working #796

Closed
ghost opened this issue Aug 13, 2019 · 3 comments

ghost commented Aug 13, 2019

I am trying to access the HTTP and gRPC endpoints, without success. For HTTP I receive 503 errors, and I also see errors in the controller manager. I assumed this simple example would work out of the box.

How to reproduce

Docker 18.09
CentOS 7.6
Kubernetes 1.15

kfctl init kubeflow --config="https://raw.githubusercontent.com/kubeflow/kubeflow/master/bootstrap/config/kfctl_k8s_istio.yaml" -V

I changed the Seldon version to 0.3.2-SNAPSHOT.

cd kubeflow
kfctl generate all -V
kfctl apply all -V
kubectl label namespace kubeflow istio-injection=enabled --overwrite

I use this file: https://github.com/SeldonIO/seldon-core/blob/master/notebooks/resources/model.json
kubectl apply -f model.json -n kubeflow

I receive these errors (in seldon-operator-controller-manager-0):

{"level":"info","ts":1565738361.3888788,"logger":"seldon-controller","msg":"pSvcName","val":"seldon-model-test-deployment-example"}
{"level":"info","ts":1565738361.3893006,"logger":"seldon-controller","msg":"Creating default Ambassador config"}
{"level":"info","ts":1565738361.3900414,"logger":"seldon-controller","msg":"Found identical deployment","namespace":"kubeflow","name":"test-deployment-example-7cd068f","status":{"observedGeneration":1,"replicas":1,"updatedReplicas":1,"readyReplicas":1,"availableReplicas":1,"conditions":[{"type":"Available","status":"True","lastUpdateTime":"2019-08-13T23:14:32Z","lastTransitionTime":"2019-08-13T23:14:32Z","reason":"MinimumReplicasAvailable","message":"Deployment has minimum availability."},{"type":"Progressing","status":"True","lastUpdateTime":"2019-08-13T23:14:32Z","lastTransitionTime":"2019-08-13T23:14:07Z","reason":"NewReplicaSetAvailable","message":"ReplicaSet \"test-deployment-example-7cd068f-5bcbdd6f86\" has successfully progressed."}]}}
{"level":"error","ts":1565738361.391365,"logger":"kubebuilder.controller","msg":"Reconciler error","controller":"seldondeployment-controller","request":"kubeflow/seldon-model","error":"the server could not find the requested resource (put seldondeployments.machinelearning.seldon.io seldon-model)","stacktrace":"github.com/seldonio/seldon-operator/vendor/github.com/go-logr/zapr.(*zapLogger).Error\n\t/go/src/github.com/seldonio/seldon-operator/vendor/github.com/go-logr/zapr/zapr.go:128\ngithub.com/seldonio/seldon-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/go/src/github.com/seldonio/seldon-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:217\ngithub.com/seldonio/seldon-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func1\n\t/go/src/github.com/seldonio/seldon-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:158\ngithub.com/seldonio/seldon-operator/vendor/k8s.io/apimachinery/pkg/util/wait.JitterUntil.func1\n\t/go/src/github.com/seldonio/seldon-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:133\ngithub.com/seldonio/seldon-operator/vendor/k8s.io/apimachinery/pkg/util/wait.JitterUntil\n\t/go/src/github.com/seldonio/seldon-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:134\ngithub.com/seldonio/seldon-operator/vendor/k8s.io/apimachinery/pkg/util/wait.Until\n\t/go/src/github.com/seldonio/seldon-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:88"}

Request:

curl -v 127.0.0.1:31380/seldon/kubeflow/seldon-model/api/v0.1/predictions/
* About to connect() to 127.0.0.1 port 31380 (#0)
*   Trying 127.0.0.1...
* Connected to 127.0.0.1 (127.0.0.1) port 31380 (#0)
> GET /seldon/kubeflow/seldon-model/api/v0.1/predictions/ HTTP/1.1
> User-Agent: curl/7.29.0
> Host: 127.0.0.1:31380
> Accept: */*
>
< HTTP/1.1 503 Service Unavailable
< date: Tue, 13 Aug 2019 23:25:45 GMT
< server: istio-envoy
< content-length: 0
<
* Connection #0 to host 127.0.0.1 left intact
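
The curl call above sends a GET with no body. As far as I know, the Seldon prediction endpoint is normally called with a POST and a JSON body, so a closer test request might look like the sketch below. The helper `build_prediction_request` and the payload values are assumptions for illustration; only the host, port and path come from the curl call. The request is built but not sent, so the snippet runs without a cluster. The empty 503 comes from `istio-envoy`, so the request probably never reached the engine and the HTTP method alone is unlikely to explain it.

```python
import json
import urllib.request

def build_prediction_request(base_url, namespace, deployment, payload):
    """Build (but do not send) a POST request for the predictions path."""
    url = f"{base_url}/seldon/{namespace}/{deployment}/api/v0.1/predictions"
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

# The payload shape is an assumption; the feature values are made up.
req = build_prediction_request(
    "http://127.0.0.1:31380",
    "kubeflow",
    "seldon-model",
    {"data": {"ndarray": [[1.0, 2.0]]}},
)
print(req.get_method(), req.full_url)

# To actually send it against a reachable gateway:
# with urllib.request.urlopen(req, timeout=5) as resp:
#     print(resp.status, resp.read().decode("utf-8"))
```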

Additional information:

Virtual Service:

kubectl describe virtualservice test-deployment-seldon-model-http -n kubeflow
Name:         test-deployment-seldon-model-http
Namespace:    kubeflow
Labels:       <none>
Annotations:  <none>
API Version:  networking.istio.io/v1alpha3
Kind:         VirtualService
Metadata:
  Creation Timestamp:  2019-08-13T23:14:07Z
  Generation:          1
  Owner References:
    API Version:           machinelearning.seldon.io/v1alpha2
    Block Owner Deletion:  true
    Controller:            true
    Kind:                  SeldonDeployment
    Name:                  seldon-model
    UID:                   9e27239e-300e-465f-94e6-72bdbc170f9b
  Resource Version:        3501245
  Self Link:               /apis/networking.istio.io/v1alpha3/namespaces/kubeflow/virtualservices/test-deployment-seldon-model-http
  UID:                     70d451c6-947e-49ea-a3ff-81df8808dfe8
Spec:
  Gateways:
    kubeflow-gateway
  Hosts:
    *
  Http:
    Match:
      Uri:
        Prefix:  /seldon/kubeflow/seldon-model/
    Rewrite:
      Uri:  /
    Route:
      Destination:
        Host:  seldon-model-test-deployment-example
        Port:
          Number:  8000
        Subset:    example
      Weight:      0
Events:            <none>

Service:

kubectl get svc -n kubeflow | grep seldon
seldon-operator-controller-manager-service                        ClusterIP   10.102.197.231   <none>        443/TCP             30m
test-deployment-example-classifier-seldonio-mock-classifier-1-0   ClusterIP   10.106.212.176   <none>        9000/TCP            15m

kubectl describe svc test-deployment-example-classifier-seldonio-mock-classifier-1-0 -n kubeflow
Name:              test-deployment-example-classifier-seldonio-mock-classifier-1-0
Namespace:         kubeflow
Labels:            seldon-app-classifier=test-deployment-example-classifier-seldonio-mock-classifier-1-0
                   seldon-deployment-id=test-deployment
Annotations:       <none>
Selector:          seldon-app-classifier=test-deployment-example-classifier-seldonio-mock-classifier-1-0
Type:              ClusterIP
IP:                10.106.212.176
Port:              http  9000/TCP
TargetPort:        9000/TCP
Endpoints:         10.244.2.232:9000
Session Affinity:  None
Events:            <none>
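
One thing that can be checked from the two outputs above is whether the VirtualService destination host has a matching Service. A minimal sketch, with the values copied by hand from the outputs; the helper `missing_destinations` is hypothetical and compares only on the short service name:

```python
def missing_destinations(route_hosts, service_names):
    """Return route hosts that have no Service with the same short name."""
    known = set(service_names)
    # A destination host may be fully qualified; compare on the first DNS label.
    return [h for h in route_hosts if h.split(".")[0] not in known]

# Destination host from the VirtualService spec above.
route_hosts = ["seldon-model-test-deployment-example"]

# Service names from `kubectl get svc -n kubeflow | grep seldon` above.
service_names = [
    "seldon-operator-controller-manager-service",
    "test-deployment-example-classifier-seldonio-mock-classifier-1-0",
]

print(missing_destinations(route_hosts, service_names))
```

With these values the route host `seldon-model-test-deployment-example` is reported as missing: the listing shows no Service of that name, even though `grep seldon` would have matched it. Envoy commonly answers 503 when a route has no healthy upstream, so that may be the cause here. Whether it is connected to the reconciler error in the controller log is not confirmed anywhere in this thread.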

Pod:

Name:           test-deployment-example-7cd068f-5bcbdd6f86-l94z2
Namespace:      kubeflow
Priority:       0
Node:           jmiler-vm1.machine.com/10.91.118.28
Start Time:     Wed, 14 Aug 2019 01:14:07 +0200
Labels:         app=test-deployment-example-7cd068f
                fluentd=true
                pod-template-hash=5bcbdd6f86
                seldon-app=seldon-model-test-deployment-example
                seldon-app-classifier=test-deployment-example-classifier-seldonio-mock-classifier-1-0
                seldon-deployment-id=test-deployment-seldon-model
                version=v1
Annotations:    prometheus.io/path: prometheus
                prometheus.io/port: 8000
                prometheus.io/scrape: true
                sidecar.istio.io/status:
                  {"version":"5f3ae3613c7945ef767cb9fd594596bc001ff3ab915f12e4379c0cb5648d2729","initContainers":["istio-init"],"containers":["istio-proxy"]...
Status:         Running
IP:             10.244.2.232
Controlled By:  ReplicaSet/test-deployment-example-7cd068f-5bcbdd6f86
Init Containers:
  istio-init:
    Container ID:  docker://35eb992c8fd793dca559d880c58441106f616b072f4fe3e57fb35fa2ef588951
    Image:         docker.io/istio/proxy_init:1.1.6
    Image ID:      docker-pullable://istio/proxy_init@sha256:54d89fb2b3b0a2365f2d2b0a8862f1f8320a63ab6a09c637c60f13f6021c4609
    Port:          <none>
    Host Port:     <none>
    Args:
      -p
      15001
      -u
      1337
      -m
      REDIRECT
      -i
      *
      -x

      -b
      9000,8000,5001,8082,9090
      -d
      15020
    State:          Terminated
      Reason:       Completed
      Exit Code:    0
      Started:      Wed, 14 Aug 2019 01:14:09 +0200
      Finished:     Wed, 14 Aug 2019 01:14:10 +0200
    Ready:          True
    Restart Count:  0
    Limits:
      cpu:     100m
      memory:  50Mi
    Requests:
      cpu:        10m
      memory:     10Mi
    Environment:  <none>
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-t987c (ro)
Containers:
  classifier:
    Container ID:   docker://eb87a1874969724af95759171d0709d1a2c3de6e6042eb08e5abe50572aa9ac8
    Image:          seldonio/mock_classifier:1.0
    Image ID:       docker-pullable://seldonio/mock_classifier@sha256:af2d0eb695af1698bb6a13db29a808e855616bbeccb0f086bcf3235b4a3d07db
    Port:           9000/TCP
    Host Port:      0/TCP
    State:          Running
      Started:      Wed, 14 Aug 2019 01:14:11 +0200
    Ready:          True
    Restart Count:  0
    Requests:
      memory:   1Mi
    Liveness:   tcp-socket :http delay=60s timeout=1s period=5s #success=1 #failure=3
    Readiness:  tcp-socket :http delay=20s timeout=1s period=5s #success=1 #failure=3
    Environment:
      PREDICTIVE_UNIT_SERVICE_PORT:  9000
      PREDICTIVE_UNIT_ID:            classifier
      PREDICTOR_ID:                  example
      SELDON_DEPLOYMENT_ID:          seldon-model
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-t987c (ro)
  seldon-container-engine:
    Container ID:   docker://7f7aa9ad51beec055e2d558ccc7848588adee9b244319e80764c6cf055d9e6c7
    Image:          docker.io/seldonio/engine:0.3.2-SNAPSHOT
    Image ID:       docker-pullable://seldonio/engine@sha256:2f46913a1fea82b62fe5c2ff0bedb6eae56f8f5af8261ed61091e47638c92580
    Ports:          8000/TCP, 5001/TCP, 8082/TCP, 9090/TCP
    Host Ports:     0/TCP, 0/TCP, 0/TCP, 0/TCP
    State:          Running
      Started:      Wed, 14 Aug 2019 01:14:11 +0200
    Ready:          True
    Restart Count:  0
    Requests:
      cpu:      100m
    Liveness:   http-get http://:admin/live delay=20s timeout=2s period=5s #success=1 #failure=7
    Readiness:  http-get http://:admin/ready delay=20s timeout=2s period=1s #success=1 #failure=1
    Environment:
      ENGINE_PREDICTOR:                eyJuYW1lIjoiZXhhbXBsZSIsImdyYXBoIjp7Im5hbWUiOiJjbGFzc2lmaWVyIiwidHlwZSI6Ik1PREVMIiwiaW1wbGVtZW50YXRpb24iOiJVTktOT1dOX0lNUExFTUVOVEFUSU9OIiwiZW5kcG9pbnQiOnsic2VydmljZV9ob3N0IjoibG9jYWxob3N0Iiwic2VydmljZV9wb3J0Ijo5MDAwLCJ0eXBlIjoiUkVTVCJ9fSwiY29tcG9uZW50U3BlY3MiOlt7Im1ldGFkYXRhIjp7ImNyZWF0aW9uVGltZXN0YW1wIjpudWxsfSwic3BlYyI6eyJjb250YWluZXJzIjpbeyJuYW1lIjoiY2xhc3NpZmllciIsImltYWdlIjoic2VsZG9uaW8vbW9ja19jbGFzc2lmaWVyOjEuMCIsInBvcnRzIjpbeyJuYW1lIjoiaHR0cCIsImNvbnRhaW5lclBvcnQiOjkwMDAsInByb3RvY29sIjoiVENQIn1dLCJlbnYiOlt7Im5hbWUiOiJQUkVESUNUSVZFX1VOSVRfU0VSVklDRV9QT1JUIiwidmFsdWUiOiI5MDAwIn0seyJuYW1lIjoiUFJFRElDVElWRV9VTklUX0lEIiwidmFsdWUiOiJjbGFzc2lmaWVyIn0seyJuYW1lIjoiUFJFRElDVE9SX0lEIiwidmFsdWUiOiJleGFtcGxlIn0seyJuYW1lIjoiU0VMRE9OX0RFUExPWU1FTlRfSUQiLCJ2YWx1ZSI6InNlbGRvbi1tb2RlbCJ9XSwicmVzb3VyY2VzIjp7InJlcXVlc3RzIjp7Im1lbW9yeSI6IjFNaSJ9fSwibGl2ZW5lc3NQcm9iZSI6eyJ0Y3BTb2NrZXQiOnsicG9ydCI6Imh0dHAifSwiaW5pdGlhbERlbGF5U2Vjb25kcyI6NjAsInRpbWVvdXRTZWNvbmRzIjoxLCJwZXJpb2RTZWNvbmRzIjo1LCJzdWNjZXNzVGhyZXNob2xkIjoxLCJmYWlsdXJlVGhyZXNob2xkIjozfSwicmVhZGluZXNzUHJvYmUiOnsidGNwU29ja2V0Ijp7InBvcnQiOiJodHRwIn0sImluaXRpYWxEZWxheVNlY29uZHMiOjIwLCJ0aW1lb3V0U2Vjb25kcyI6MSwicGVyaW9kU2Vjb25kcyI6NSwic3VjY2Vzc1RocmVzaG9sZCI6MSwiZmFpbHVyZVRocmVzaG9sZCI6M30sImxpZmVjeWNsZSI6eyJwcmVTdG9wIjp7ImV4ZWMiOnsiY29tbWFuZCI6WyIvYmluL3NoIiwiLWMiLCIvYmluL3NsZWVwIDEwIl19fX0sInRlcm1pbmF0aW9uTWVzc2FnZVBhdGgiOiIvZGV2L3Rlcm1pbmF0aW9uLWxvZyIsInRlcm1pbmF0aW9uTWVzc2FnZVBvbGljeSI6IkZpbGUiLCJpbWFnZVB1bGxQb2xpY3kiOiJJZk5vdFByZXNlbnQifV0sInRlcm1pbmF0aW9uR3JhY2VQZXJpb2RTZWNvbmRzIjoxfX1dLCJyZXBsaWNhcyI6MSwiZW5naW5lUmVzb3VyY2VzIjp7fSwibGFiZWxzIjp7InZlcnNpb24iOiJ2MSJ9LCJzdmNPcmNoU3BlYyI6e30sImV4cGxhaW5lciI6eyJjb250YWluZXJTcGVjIjp7Im5hbWUiOiIiLCJyZXNvdXJjZXMiOnt9fX19
      DEPLOYMENT_NAME:                 test-deployment
      DEPLOYMENT_NAMESPACE:            kubeflow
      ENGINE_SERVER_PORT:              8000
      ENGINE_SERVER_GRPC_PORT:         5001
      JAVA_OPTS:                       -Dcom.sun.management.jmxremote.rmi.port=9090 -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.port=9090 -Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.local.only=false -Djava.rmi.server.hostname=127.0.0.1
      SELDON_LOG_MESSAGES_EXTERNALLY:  false
    Mounts:
      /etc/podinfo from podinfo (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-t987c (ro)
  istio-proxy:
    Container ID:  docker://c3020ad8480a8465860ff9ce4b77e25531c5889eec8f6e7d6e1d4308cb3c44de
    Image:         docker.io/istio/proxyv2:1.1.6
    Image ID:      docker-pullable://istio/proxyv2@sha256:e7ee1ad38bd5b556ad0527ac691a9f647b66835960417b154c5d28b2ed9219cb
    Port:          15090/TCP
    Host Port:     0/TCP
    Args:
      proxy
      sidecar
      --domain
      $(POD_NAMESPACE).svc.cluster.local
      --configPath
      /etc/istio/proxy
      --binaryPath
      /usr/local/bin/envoy
      --serviceCluster
      test-deployment-example-7cd068f.$(POD_NAMESPACE)
      --drainDuration
      45s
      --parentShutdownDuration
      1m0s
      --discoveryAddress
      istio-pilot.istio-system:15010
      --zipkinAddress
      zipkin.istio-system:9411
      --connectTimeout
      10s
      --proxyAdminPort
      15000
      --concurrency
      2
      --controlPlaneAuthPolicy
      NONE
      --statusPort
      15020
      --applicationPorts
      9000,8000,5001,8082,9090
    State:          Running
      Started:      Wed, 14 Aug 2019 01:14:12 +0200
    Ready:          True
    Restart Count:  0
    Limits:
      cpu:     2
      memory:  128Mi
    Requests:
      cpu:      10m
      memory:   40Mi
    Readiness:  http-get http://:15020/healthz/ready delay=1s timeout=1s period=2s #success=1 #failure=30
    Environment:
      POD_NAME:                      test-deployment-example-7cd068f-5bcbdd6f86-l94z2 (v1:metadata.name)
      POD_NAMESPACE:                 kubeflow (v1:metadata.namespace)
      INSTANCE_IP:                    (v1:status.podIP)
      ISTIO_META_POD_NAME:           test-deployment-example-7cd068f-5bcbdd6f86-l94z2 (v1:metadata.name)
      ISTIO_META_CONFIG_NAMESPACE:   kubeflow (v1:metadata.namespace)
      ISTIO_META_INTERCEPTION_MODE:  REDIRECT
      ISTIO_METAJSON_ANNOTATIONS:    {"prometheus.io/path":"prometheus","prometheus.io/port":"8000","prometheus.io/scrape":"true"}

      ISTIO_METAJSON_LABELS:         {"app":"test-deployment-example-7cd068f","fluentd":"true","pod-template-hash":"5bcbdd6f86","seldon-app":"seldon-model-test-deployment-example","seldon-app-classifier":"test-deployment-example-classifier-seldonio-mock-classifier-1-0","seldon-deployment-id":"test-deployment-seldon-model","version":"v1"}

    Mounts:
      /etc/certs/ from istio-certs (ro)
      /etc/istio/proxy from istio-envoy (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-t987c (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             True
  ContainersReady   True
  PodScheduled      True
Volumes:
  podinfo:
    Type:  DownwardAPI (a volume populated by information about the pod)
    Items:
      metadata.annotations -> annotations
  default-token-t987c:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-t987c
    Optional:    false
  istio-envoy:
    Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:     Memory
    SizeLimit:  <unset>
  istio-certs:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  istio.default
    Optional:    true
QoS Class:       Burstable
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:
  Type     Reason     Age   From                                   Message
  ----     ------     ----  ----                                   -------
  Normal   Scheduled  17m   default-scheduler                      Successfully assigned kubeflow/test-deployment-example-7cd068f-5bcbdd6f86-l94z2 to jmiler-vm2.machine.com
  Normal   Pulled     17m   kubelet, jmiler-vm2.machine.com  Container image "docker.io/istio/proxy_init:1.1.6" already present on machine
  Normal   Created    17m   kubelet, jmiler-vm2.machine.com  Created container istio-init
  Normal   Started    17m   kubelet, jmiler-vm2.machine.com  Started container istio-init
  Normal   Pulled     17m   kubelet, jmiler-vm2.machine.com  Container image "seldonio/mock_classifier:1.0" already present on machine
  Normal   Created    17m   kubelet, jmiler-vm2.machine.com  Created container classifier
  Normal   Pulled     17m   kubelet, jmiler-vm2.machine.com  Container image "docker.io/seldonio/engine:0.3.2-SNAPSHOT" already present on machine
  Normal   Started    17m   kubelet, jmiler-vm2.machine.com  Started container classifier
  Normal   Created    17m   kubelet, jmiler-vm2.machine.com  Created container seldon-container-engine
  Normal   Started    17m   kubelet, jmiler-vm2.machine.com  Started container seldon-container-engine
  Normal   Pulled     17m   kubelet, jmiler-vm2.machine.com  Container image "docker.io/istio/proxyv2:1.1.6" already present on machine
  Normal   Created    17m   kubelet, jmiler-vm2.machine.com  Created container istio-proxy
  Normal   Started    17m   kubelet, jmiler-vm2.machine.com  Started container istio-proxy
  Warning  Unhealthy  17m   kubelet, jmiler-vm2.machine.com  Readiness probe failed: HTTP probe failed with statuscode: 503
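
The `ENGINE_PREDICTOR` value in the pod description appears to be base64-encoded JSON (its first bytes decode to `{"name":"example","graph":{"name":"classifier","type":"MODEL",`). Decoding it shows the predictor spec the engine was actually given. A minimal sketch; the helper name is hypothetical, and the snippet round-trips a small stand-in value rather than the full string from the pod:

```python
import base64
import json

def decode_engine_predictor(value):
    """Decode a base64-encoded JSON string into a Python dict."""
    return json.loads(base64.b64decode(value).decode("utf-8"))

# Stand-in for the real value: a small spec with the same leading fields.
sample_spec = {"name": "example", "graph": {"name": "classifier", "type": "MODEL"}}
encoded = base64.b64encode(json.dumps(sample_spec).encode("utf-8")).decode("ascii")

decoded = decode_engine_predictor(encoded)
print(json.dumps(decoded, indent=2))
```

To inspect the real spec, pass the full `ENGINE_PREDICTOR` string from the pod description to `decode_engine_predictor`.
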
@ukclivecox
Contributor

To me this looks like an Istio network issue.
Could it be anything related to https://istio.io/docs/tasks/security/authz-http/ ?

@ukclivecox ukclivecox added this to the 1.0.x milestone Aug 24, 2019
@ukclivecox ukclivecox self-assigned this Sep 12, 2019
@ukclivecox
Contributor

Can you test with the master version of Seldon and run the notebook https://github.com/SeldonIO/seldon-core/blob/master/notebooks/istio_example.ipynb ?

@ukclivecox
Contributor

Closing. Please reopen if this is still an issue.
