DNS services are not resolving properly #5613

Merged 1 commit on Nov 3, 2015

smarterclayton
Contributor

The change on Sept 15th changed how services resolved in the absence
of search paths, which resulted in very long times to resolve DNS in
some cases.

This reduces search times to a few ms.

Fixes #5154

@liggitt

@smarterclayton
Contributor Author

[test]

@smarterclayton
Contributor Author

Fixes #5154

@liggitt
Contributor

liggitt commented Nov 2, 2015

Add tests to the end-to-end suite to make sure short names still resolve? We currently have this:

MASTER_SERVICE_IP="$(dig @${API_HOST} "kubernetes.default.svc.cluster.local." +short A | head -n 1)"

Also test these?

kubernetes.default.svc
kubernetes.default
kubernetes

@liggitt
Contributor

liggitt commented Nov 2, 2015

Actually, I'm not sure short names will work unless we have search paths set up.
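
To illustrate why short names depend on search paths, here is a minimal Go sketch of how a stub resolver could expand a short name into fully qualified candidates. The helper `candidateNames` is hypothetical, and the search list shown is the conventional one for a pod in the `default` namespace, which is an assumption and is not stated in this thread. The sketch also ignores `ndots` ordering for brevity.

```go
package main

import (
	"fmt"
	"strings"
)

// candidateNames lists the fully qualified names a stub resolver could try
// for name. Hypothetical helper; it ignores ndots ordering for brevity.
func candidateNames(name string, search []string) []string {
	// A trailing dot marks the name as already fully qualified.
	if strings.HasSuffix(name, ".") {
		return []string{name}
	}
	out := make([]string, 0, len(search)+1)
	for _, domain := range search {
		out = append(out, name+"."+strings.TrimSuffix(domain, ".")+".")
	}
	// With an empty search list, the bare name is the only candidate.
	out = append(out, name+".")
	return out
}

func main() {
	// Assumed search list for a pod in the "default" namespace.
	search := []string{"default.svc.cluster.local", "svc.cluster.local", "cluster.local"}
	for _, short := range []string{"kubernetes", "kubernetes.default", "kubernetes.default.svc"} {
		fmt.Println(short, "->", candidateNames(short, search))
	}
	// Without search paths, only the bare name is queried.
	fmt.Println("kubernetes", "->", candidateNames("kubernetes", nil))
}
```

Under that assumed search list, each of the three short names has `kubernetes.default.svc.cluster.local.` among its candidates. With no search list, only the bare name is tried, which matches the concern raised here.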

@smarterclayton
Contributor Author

dns_test is pretty good. It's also very tolerant. I'd like to wait until
we have the SCC stuff to enable the upstream e2e.


@smarterclayton
Contributor Author

v1.0.6:

$ dig @localhost _endpoints.kubernetes.default.svc.cluster.local

; <<>> DiG 9.9.6-P1-RedHat-9.9.6-8.P1.fc21 <<>> @localhost _endpoints.kubernetes.default.svc.cluster.local
; (2 servers found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 14752
;; flags: qr aa rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 0

;; QUESTION SECTION:
;_endpoints.kubernetes.default.svc.cluster.local. IN A

;; ANSWER SECTION:
_endpoints.kubernetes.default.svc.cluster.local. 30 IN A 10.0.2.15

;; Query time: 86 msec
;; SERVER: 127.0.0.1#53(127.0.0.1)
;; WHEN: Tue Nov 03 02:11:01 UTC 2015
;; MSG SIZE  rcvd: 81

[vagrant@openshiftdev origin]$ dig @localhost _endpoints.kubernetes.default.svc.cluster.local SRV

; <<>> DiG 9.9.6-P1-RedHat-9.9.6-8.P1.fc21 <<>> @localhost _endpoints.kubernetes.default.svc.cluster.local SRV
; (2 servers found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 39167
;; flags: qr aa rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1

;; QUESTION SECTION:
;_endpoints.kubernetes.default.svc.cluster.local. IN SRV

;; ANSWER SECTION:
_endpoints.kubernetes.default.svc.cluster.local. 30 IN SRV 10 50 8443 unknown-port-8443.e1.kubernetes.

;; ADDITIONAL SECTION:
unknown-port-8443.e1.kubernetes. 30 IN  A   10.0.2.15

;; Query time: 253 msec
;; SERVER: 127.0.0.1#53(127.0.0.1)
;; WHEN: Tue Nov 03 02:11:08 UTC 2015
;; MSG SIZE  rcvd: 163

[vagrant@openshiftdev origin]$ dig @localhost kubernetes.default.svc.cluster.local

; <<>> DiG 9.9.6-P1-RedHat-9.9.6-8.P1.fc21 <<>> @localhost kubernetes.default.svc.cluster.local
; (2 servers found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 27725
;; flags: qr aa rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 0

;; QUESTION SECTION:
;kubernetes.default.svc.cluster.local. IN A

;; ANSWER SECTION:
kubernetes.default.svc.cluster.local. 30 IN A   172.30.0.1

;; Query time: 133 msec
;; SERVER: 127.0.0.1#53(127.0.0.1)
;; WHEN: Tue Nov 03 02:12:26 UTC 2015
;; MSG SIZE  rcvd: 70

[vagrant@openshiftdev origin]$ dig @localhost kubernetes.default.svc.cluster.local SRV

; <<>> DiG 9.9.6-P1-RedHat-9.9.6-8.P1.fc21 <<>> @localhost kubernetes.default.svc.cluster.local SRV
; (2 servers found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 56804
;; flags: qr aa rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1

;; QUESTION SECTION:
;kubernetes.default.svc.cluster.local. IN SRV

;; ANSWER SECTION:
kubernetes.default.svc.cluster.local. 30 IN SRV 10 50 443 unknown-port-443.portal.kubernetes.

;; ADDITIONAL SECTION:
unknown-port-443.portal.kubernetes. 30 IN A 172.30.0.1

;; Query time: 113 msec
;; SERVER: 127.0.0.1#53(127.0.0.1)
;; WHEN: Tue Nov 03 02:12:29 UTC 2015
;; MSG SIZE  rcvd: 158

@@ -72,8 +76,30 @@ func (b *ServiceResolver) Records(name string, exact bool) ([]msg.Service, error
if len(segments) == 0 {
return nil, nil
}
glog.V(4).Infof("Answering query %s:%t", dnsName, exact)
switch base := segments[0]; base {
case "pod":

Contributor

where did this come from?

Contributor Author

kube added it

Contributor

Can you point to their code so we can be sure we're consistent?

Contributor Author

 Port: 2346,
 },
 {
-Target: "other.e1.headless2.",
+Target: headless2IPHash + "._other._tcp.headless2.default.svc.cluster.local.",

Contributor

citation needed

The change on Sept 15th changed how services resolved in the absence
of search paths, which resulted in very long times to resolve DNS in
some cases.

Change the SRV record style to match upstream Kube (hash of the pod IP).

Add support for the "pod" range: <IP>.namespace.pod.cluster.local
resolves to <IP>.

Update tests.
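
As a rough illustration of the "pod" range described in the commit message, the following Go sketch maps a pod name back to its IP. The function `podIPFromName` is hypothetical and is not the code in this PR. It assumes the `<IP>` label is written with dashes in place of dots (for example `10-0-2-15`), following the upstream Kubernetes convention. The commit message itself only says `<IP>`, so that encoding is an assumption, as is the IPv4-only restriction.

```go
package main

import (
	"fmt"
	"net"
	"strings"
)

// podSuffix is the zone for pod records, taken from the commit message.
const podSuffix = ".pod.cluster.local."

// podIPFromName extracts the IP from <IP>.<namespace>.pod.cluster.local.
// Hypothetical helper: it assumes the IP label uses dashes instead of dots
// and accepts IPv4 only.
func podIPFromName(name string) (net.IP, bool) {
	if !strings.HasSuffix(name, ".") {
		name += "." // normalize to a fully qualified name
	}
	if !strings.HasSuffix(name, podSuffix) {
		return nil, false
	}
	labels := strings.Split(strings.TrimSuffix(name, podSuffix), ".")
	// Expect exactly two labels: the encoded IP and a non-empty namespace.
	if len(labels) != 2 || labels[1] == "" {
		return nil, false
	}
	ip := net.ParseIP(strings.ReplaceAll(labels[0], "-", "."))
	if ip == nil || ip.To4() == nil {
		return nil, false
	}
	return ip, true
}

func main() {
	for _, name := range []string{
		"10-0-2-15.default.pod.cluster.local",
		"kubernetes.default.svc.cluster.local.",
	} {
		ip, ok := podIPFromName(name)
		fmt.Println(name, ip, ok)
	}
}
```

Because the answer is derived from the name alone, a lookup of this shape needs no search over stored records.
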
@openshift-bot
Contributor

Evaluated for origin test up to eb5c744

@openshift-bot
Contributor

continuous-integration/openshift-jenkins/test SUCCESS (https://ci.openshift.redhat.com/jenkins/job/test_pull_requests_origin/6696/)

@smarterclayton
Contributor Author

Any other comments?

@smarterclayton added the approved label on Nov 3, 2015
@smarterclayton
Contributor Author

SRV records changed from _port._proto.SVCNAME to _port._proto.SVCNAME.NAMESPACE.svc.cluster.local (should resolve the same, but faster)

@smarterclayton
Contributor Author

Endpoint SRV records change from _port.EP<NUM>.SVCNAME to ENDPOINTIPHASH._port._proto.SVCNAME.NAMESPACE.svc.cluster.local
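
The two new name formats can be sketched in Go as follows. The helpers `serviceSRVName`, `endpointSRVTarget` and `ipHash` are hypothetical names, not the functions in this PR. The thread only says the endpoint prefix is a hash of the pod IP, so the 32-bit FNV-1a hash rendered as hex is an assumption used purely for illustration.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// serviceSRVName builds _port._proto.SVCNAME.NAMESPACE.svc.cluster.local.
// with a trailing dot, matching the test fixture in the diff.
func serviceSRVName(port, proto, service, namespace string) string {
	return fmt.Sprintf("_%s._%s.%s.%s.svc.cluster.local.", port, proto, service, namespace)
}

// ipHash returns a short stable token for an endpoint IP.
// ASSUMPTION: FNV-1a (32-bit) in hex; the thread does not name the hash.
func ipHash(ip string) string {
	h := fnv.New32a()
	h.Write([]byte(ip)) // Write on a hash.Hash never returns an error
	return fmt.Sprintf("%x", h.Sum32())
}

// endpointSRVTarget builds
// ENDPOINTIPHASH._port._proto.SVCNAME.NAMESPACE.svc.cluster.local.
func endpointSRVTarget(ip, port, proto, service, namespace string) string {
	return ipHash(ip) + "." + serviceSRVName(port, proto, service, namespace)
}

func main() {
	fmt.Println(serviceSRVName("other", "tcp", "headless2", "default"))
	fmt.Println(endpointSRVTarget("10.0.2.15", "other", "tcp", "headless2", "default"))
}
```

Both names are fully qualified, so a client can look them up directly without walking a search path.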

@smarterclayton
Contributor Author

[merge]

@openshift-bot
Contributor

continuous-integration/openshift-jenkins/merge SUCCESS (https://ci.openshift.redhat.com/jenkins/job/test_pull_requests_origin/6696/) (Image: devenv-rhel7_2637)

@openshift-bot
Contributor

Evaluated for origin merge up to eb5c744

openshift-bot pushed a commit that referenced this pull request Nov 3, 2015
@openshift-bot openshift-bot merged commit 1245d42 into openshift:master Nov 3, 2015
@smarterclayton smarterclayton added this to the 1.1.0 milestone Nov 8, 2015
Labels
approved, area/reliability, priority/P1
Development

Successfully merging this pull request may close these issues.

Repeated skydns log: incomplete CNAME chain: rcode is not equal to success
3 participants