gRPC requests fail for intermittent time after mapping creation #1587

Closed
ukclivecox opened this issue May 31, 2019 · 6 comments

Labels
stale Issue is stale and will be closed

Comments

@ukclivecox
Describe the bug
The gRPC endpoint fails for an intermittent period after the initial Mapping is created.

To Reproduce
Steps to reproduce the behavior:
Create a gRPC Mapping. The mapping we use is:

apiVersion: ambassador/v1
grpc: true
headers:
  namespace: seldon
  seldon: myabtest
kind: Mapping
name: seldon_seldon_myabtest_grpc_mapping
prefix: /seldon.protos.Seldon/
retry_policy:
  num_retries: 3
  retry_on: connect-failure
rewrite: /seldon.protos.Seldon/
service: myabtest-myabtest.seldon:5001
timeout_ms: 3000

While the gRPC requests fail, the debug logs show no cluster match for URL '/seldon.protos.Seldon/Predict'.

Expected Behaviour:

I would expect gRPC requests to succeed as soon as the Mapping is available, since the backend services are ready.

Versions (please complete the following information):

  • Ambassador: 0.70.1
  • Kubernetes environment: Minikube
  • Version: v1.14.1

Additional context

A bad request log is shown below:

[2019-05-31 09:43:53.564][163][debug][http] [source/common/http/conn_manager_impl.cc:243] [C3353] new stream
[2019-05-31 09:43:53.565][163][debug][http] [source/common/http/conn_manager_impl.cc:580] [C3353][S13422079470000671942] request headers complete (end_stream=false):
':scheme', 'http'
':method', 'POST'
':authority', 'localhost:8003'
':path', '/seldon.protos.Seldon/Predict'
'seldon', 'myabtest'
'namespace', 'seldon'
'te', 'trailers'
'content-type', 'application/grpc'
'user-agent', 'grpc-python/1.14.0 grpc-c/6.0.0 (manylinux; chttp2; gladiolus)'
'grpc-accept-encoding', 'identity,deflate,gzip'
'accept-encoding', 'identity,gzip'

[2019-05-31 09:43:53.566][163][debug][router] [source/common/router/router.cc:277] [C3353][S13422079470000671942] no cluster match for URL '/seldon.protos.Seldon/Predict'
[2019-05-31 09:43:53.566][163][debug][http] [source/common/http/conn_manager_impl.cc:1278] [C3353][S13422079470000671942] encoding headers via codec (end_stream=true):
':status', '200'
'content-type', 'application/grpc'
'grpc-status', '12'
'date', 'Fri, 31 May 2019 09:43:53 GMT'
'server', 'envoy'

[2019-05-31 09:43:53.566][163][debug][http2] [source/common/http/http2/codec_impl.cc:577] [C3353] stream closed: 0
ACCESS [2019-05-31T09:43:53.564Z] "POST /seldon.protos.Seldon/Predict HTTP/2" 200 NR 0 0 2 - "172.17.0.13" "grpc-python/1.14.0 grpc-c/6.0.0 (manylinux; chttp2; gladiolus)" "c7a97bce-3116-45bd-af3f-7497a8e8bfbf" "localhost:8003" "-"

After a variable time (secs, minutes) a successful request is possible:

[2019-05-31 09:46:59.330][162][debug][http] [source/common/http/conn_manager_impl.cc:243] [C3509] new stream
[2019-05-31 09:46:59.331][162][debug][http] [source/common/http/conn_manager_impl.cc:580] [C3509][S5557571008820326814] request headers complete (end_stream=false):
':scheme', 'http'
':method', 'POST'
':authority', 'localhost:8003'
':path', '/seldon.protos.Seldon/Predict'
'seldon', 'myabtest'
'namespace', 'seldon'
'te', 'trailers'
'content-type', 'application/grpc'
'user-agent', 'grpc-python/1.14.0 grpc-c/6.0.0 (manylinux; chttp2; gladiolus)'
'grpc-accept-encoding', 'identity,deflate,gzip'
'accept-encoding', 'identity,gzip'

[2019-05-31 09:46:59.331][162][debug][router] [source/common/router/router.cc:320] [C3509][S5557571008820326814] cluster 'cluster_myabtest_myabtest_seldon_5001' match for URL '/seldon.protos.Seldon/Predict'
[2019-05-31 09:46:59.331][162][debug][router] [source/common/router/router.cc:381] [C3509][S5557571008820326814] router decoding headers:
':scheme', 'http'
':method', 'POST'
':authority', 'localhost:8003'
':path', '/seldon.protos.Seldon/Predict'
'seldon', 'myabtest'
'namespace', 'seldon'
'te', 'trailers'
'content-type', 'application/grpc'
'user-agent', 'grpc-python/1.14.0 grpc-c/6.0.0 (manylinux; chttp2; gladiolus)'
'grpc-accept-encoding', 'identity,deflate,gzip'
'accept-encoding', 'identity,gzip'
'x-forwarded-for', '172.17.0.13'
'x-forwarded-proto', 'http'
'x-envoy-internal', 'true'
'x-request-id', '9cf13b44-7e60-4f4d-9f21-503b1032d49d'
'x-envoy-expected-rq-timeout-ms', '3000'
'x-envoy-original-path', '/seldon.protos.Seldon/Predict'

[2019-05-31 09:46:59.331][162][debug][client] [source/common/http/codec_client.cc:26] [C3512] connecting
[2019-05-31 09:46:59.331][162][debug][connection] [source/common/network/connection_impl.cc:644] [C3512] connecting to 10.104.54.0:5001
[2019-05-31 09:46:59.331][162][debug][connection] [source/common/network/connection_impl.cc:653] [C3512] connection in progress
[2019-05-31 09:46:59.331][162][debug][http2] [source/common/http/http2/codec_impl.cc:735] [C3512] setting stream-level initial window size to 268435456
[2019-05-31 09:46:59.331][162][debug][http2] [source/common/http/http2/codec_impl.cc:757] [C3512] updating connection-level initial window size to 268435456
[2019-05-31 09:46:59.332][162][debug][pool] [source/common/http/conn_pool_base.cc:20] queueing request due to no available connections
[2019-05-31 09:46:59.332][162][debug][http] [source/common/http/conn_manager_impl.cc:1037] [C3509][S5557571008820326814] request end stream
[2019-05-31 09:46:59.332][162][debug][connection] [source/common/network/connection_impl.cc:517] [C3512] connected
[2019-05-31 09:46:59.332][162][debug][client] [source/common/http/codec_client.cc:64] [C3512] connected
[2019-05-31 09:46:59.332][162][debug][pool] [source/common/http/http2/conn_pool.cc:96] [C3512] creating stream
[2019-05-31 09:46:59.332][162][debug][router] [source/common/router/router.cc:1165] [C3509][S5557571008820326814] pool ready
[2019-05-31 09:46:59.559][162][debug][router] [source/common/router/router.cc:717] [C3509][S5557571008820326814] upstream headers complete: end_stream=false
[2019-05-31 09:46:59.560][162][debug][http] [source/common/http/conn_manager_impl.cc:1278] [C3509][S5557571008820326814] encoding headers via codec (end_stream=false):
':status', '200'
'content-type', 'application/grpc'
'grpc-encoding', 'identity'
'grpc-accept-encoding', 'gzip'
'x-envoy-upstream-service-time', '227'
'date', 'Fri, 31 May 2019 09:46:59 GMT'
'server', 'envoy'

[2019-05-31 09:46:59.567][162][debug][client] [source/common/http/codec_client.cc:95] [C3512] response complete
[2019-05-31 09:46:59.567][162][debug][pool] [source/common/http/http2/conn_pool.cc:237] [C3512] destroying stream: 0 remaining
[2019-05-31 09:46:59.567][162][debug][http] [source/common/http/conn_manager_impl.cc:1446] [C3509][S5557571008820326814] encoding trailers via codec:
'grpc-status', '0'
@ryandawsonuk

Related to #504 (comment)

@ukclivecox
Author

ukclivecox commented May 31, 2019

Also of note, when making repeated requests:

First, gRPC requests fail with status.UNIMPLEMENTED.
Then a single request returns a GOAWAY.
Then requests start succeeding.
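Until this is fixed, one client-side workaround is to retry with backoff while these transient statuses are returned. Below is a minimal sketch; the helper name and the (status, response) convention are hypothetical, not part of Seldon or Ambassador, and it assumes the caller converts a grpc.RpcError into its numeric grpc-status code:

```python
import time

# grpc-status codes observed in this issue while the new Mapping propagates:
# 12 = UNIMPLEMENTED (Envoy answers locally with "no cluster match for URL"),
# 14 = UNAVAILABLE (covers the single GOAWAY seen just before recovery).
TRANSIENT_STATUS_CODES = {12, 14}

def call_with_retry(call, max_attempts=10, base_delay=0.5):
    """Invoke call() until it succeeds or a non-transient error occurs.

    call() must return a (grpc_status, response) pair; status 0 means OK.
    The delay doubles after each failed attempt (0.5s, 1s, 2s, ...).
    """
    delay = base_delay
    for attempt in range(1, max_attempts + 1):
        status, response = call()
        if status == 0:
            return response
        if status not in TRANSIENT_STATUS_CODES or attempt == max_attempts:
            raise RuntimeError(f"request failed with grpc-status {status}")
        time.sleep(delay)
        delay *= 2
```

In a real client you would wrap stub.Predict(request) so that call() returns (0, response) on success and, on grpc.RpcError, the numeric code from the exception (in grpc-python, e.code().value[0], if I recall correctly). This only masks the window where Envoy's route table lags the Mapping; it does not fix the underlying propagation delay.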

@stale

stale bot commented Jul 30, 2019

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

The stale bot added the stale label on Jul 30, 2019.
@ukclivecox
Author

Any update on whether others see this or not?

The stale bot removed the stale label on Jul 30, 2019.
@axsaucedo

+1

@stale

stale bot commented Oct 13, 2019

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

The stale bot added the stale label on Oct 13, 2019.
The stale bot closed this as completed on Oct 20, 2019.
alexgervais pushed a commit that referenced this issue Jul 10, 2020

3 participants