Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request. Searching for pre-existing feature requests helps us consolidate datapoints for identical requirements into a single place, thank you!
Please do not leave "+1" or other comments that do not add relevant new information or questions, they generate extra noise for issue followers and do not help prioritize the request.
If you are interested in working on this issue or have submitted a pull request, please leave a comment.
Overview of the Issue
When the endpoints-controller is in use and many pods are deployed or restarted at once, it receives multiple events for the same endpoint as the pods come and go.
While the first event is being processed, the client.ACL().TokenList(nil) call returns tokens that were created during that processing (because the other pods are starting in the meantime). Since the list in endpointPods is not up to date, the reconciliation deletes the ACL tokens of those new pods, which causes their init container to fail (it eventually succeeds after retrying the login).
// map is me dumping the subset.Addresses
map[
{10.195.109.89 0xc0009dcd70 &ObjectReference{Kind:Pod,Namespace:spartacux,Name:site-spartacux-fr-b2c-58b4d579-wd4sw,UID:53880add-5a21-437c-ba1a-969aa579ea0d,APIVersion:,ResourceVersion:143265599,FieldPath:,}}:passing
{10.195.110.58 0xc0009dcd80 &ObjectReference{Kind:Pod,Namespace:spartacux,Name:site-spartacux-fr-b2c-58b4d579-pb6gh,UID:7745d69f-a9e1-4a06-8676-255ea061a072,APIVersion:,ResourceVersion:143265109,FieldPath:,}}:passing
{10.195.74.93 0xc0009dcda0 &ObjectReference{Kind:Pod,Namespace:spartacux,Name:site-spartacux-fr-b2c-58b4d579-p5t72,UID:0a0ca60f-ea47-48d5-b675-b678827360b5,APIVersion:,ResourceVersion:143266873,FieldPath:,}}:critical]
// 2021-08-11 at 14:19:54 UTC
{"level":"info","ts":1628691594.6090574,"logger":"controller.endpoints","msg":"deleting ACL token for pod","name":"site-spartacux-fr-b2c-58b4d579-6q9s7"}
{"level":"info","ts":1628691595.269407,"logger":"controller.endpoints","msg":"deleting ACL token for pod","name":"site-spartacux-fr-b2c-58b4d579-b86nv"}
{"level":"info","ts":1628691595.2737951,"logger":"controller.endpoints","msg":"deleting ACL token for pod","name":"site-spartacux-fr-b2c-58b4d579-bg5wc"}
{"level":"info","ts":1628691595.2784,"logger":"controller.endpoints","msg":"deleting ACL token for pod","name":"site-spartacux-fr-b2c-58b4d579-j5rmt"}
{"level":"info","ts":1628691595.3030953,"logger":"controller.endpoints","msg":"retrieved","name":"site-spartacux-fr-b2c","ns":"spartacux"}
{"level":"info","ts":1628691595.303154,"logger":"controller.endpoints","msg":"adding target to endpointsPods list","target":"site-spartacux-fr-b2c-58b4d579-wd4sw"}
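To make the ordering problem concrete, here is a minimal sketch of the cleanup as described above. It is not the actual consul-k8s code: the `token` type, the `staleTokens` helper and the accessor IDs are hypothetical, and only the pod names come from the logs above. The point is that the pod snapshot is taken before the token list is fetched, so a pod that logs in between the two steps looks orphaned.

```go
package main

import "fmt"

// token is a hypothetical stand-in for an ACL token created by a pod's login.
type token struct {
	AccessorID string
	PodName    string
}

// staleTokens returns the tokens whose pod is missing from the endpointPods
// snapshot. It mirrors the cleanup described in this issue, not the real
// controller implementation.
func staleTokens(endpointPods map[string]bool, tokens []token) []token {
	var out []token
	for _, t := range tokens {
		if !endpointPods[t.PodName] {
			out = append(out, t)
		}
	}
	return out
}

func main() {
	// 1. The event being processed only knows about this pod.
	endpointPods := map[string]bool{"site-spartacux-fr-b2c-58b4d579-wd4sw": true}

	// 2. Meanwhile another pod starts and logs in.
	// 3. The token list is fetched afterwards, so it already contains the
	//    new pod's token (accessor IDs are placeholders).
	tokens := []token{
		{AccessorID: "a1", PodName: "site-spartacux-fr-b2c-58b4d579-wd4sw"},
		{AccessorID: "a2", PodName: "site-spartacux-fr-b2c-58b4d579-6q9s7"},
	}

	for _, t := range staleTokens(endpointPods, tokens) {
		fmt.Println("deleting ACL token for pod", t.PodName)
	}
}
```

With this ordering, the token of the freshly started pod is selected for deletion even though the pod is alive.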
Reproduction Steps
Scale any deployment to 100 pods or more
Check the logs of the init container:
❯ kubectl logs site-spartacux-fr-b2c-58b4d579-6q9s7 -n spartacux -c consul-connect-init
Wed Aug 11 14:19:54 UTC 2021
{"@level":"info","@message":"Consul login complete","@timestamp":"2021-08-11T14:19:54.121527Z"}
{"@level":"error","@message":"Unable to get Agent services","@timestamp":"2021-08-11T14:19:54.122403Z","error":"Unexpected response code: 403 (ACL not found)"}
{"@level":"error","@message":"Unable to get Agent services","@timestamp":"2021-08-11T14:19:55.123105Z","error":"Unexpected response code: 403 (ACL not found)"}
{"@level":"error","@message":"Unable to get Agent services","@timestamp":"2021-08-11T14:19:56.123835Z","error":"Unexpected response code: 403 (ACL not found)"}
{"@level":"error","@message":"Unable to get Agent services","@timestamp":"2021-08-11T14:19:57.124566Z","error":"Unexpected response code: 403 (ACL not found)"}
{"@level":"error","@message":"Unable to get Agent services","@timestamp":"2021-08-11T14:19:58.125249Z","error":"Unexpected response code: 403 (ACL not found)"}
Compare the timestamp of the "deleting ACL token for pod" message with the timestamp of the "Consul login complete" message: the token is deleted just after being created.
Logs
Expected behavior
An ACL token should not be deleted during a reconciliation loop while its pod is still present.
Environment details
consul-k8s version: latest master
Additional Context
Thank you for your help, let me know if you need more information!