-
Notifications
You must be signed in to change notification settings - Fork 1.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Return ErrBuf when records processed < actual records send by upstream #1493
Conversation
…nt by upstream (due to buff size)
Hmm, I'm not sure about this. It seems like we explicitly handle the provided count value being too large so this may be a breaking change. See Lines 663 to 666 in 0d504a6
What behaviour do other DNS servers do in this case? Ultimately isn't the bug really that the server hasn't set the TC bit even though the message is truncated? |
@tmthrgd - Thank you for taking a look at this request. We can easily repro this issue with a simple test case in latest coredns. Simple test case to repro - https://github.com/SriHarsha001/fixcorednsudpbug/blob/34ee289d240f0dfa7d0998b0bfe6da2efae157b7/plugin/pkg/proxy/proxy_test.go#L133 A bit complex test cases to repro - Below writeup is referring this func unpackRRslice
When the domain name (example123.org.) has 15 characters - Similarly when the domain name (example1234567.org.) has 19 characters - So in these cases, if upstream has sent more records and no TC flag is set, records are truncated and TC flag is not set (in setHdr). This is the failing scenario. As we can see from the below table, this problem can be seen for default UDP size when number of characters in domain name is 02, 15, 19, 36 (when off is 512). We can also repro this issue for all other numbers by setting Edns0 in the request. No of Char Question Each loop Sequence of off I hope I was able to explain in a way that can be understood clearly. |
@tmthrgd - Just a gentle follow up. Requesting a review when you get few minutes. |
I don't fully understand what being said here. You start with
so, I send a UDP message without EDNS to an upstream using UDP and it sends a reply exceeding 512B as if it assumes my buf is 4096B or something? That is a clear violation of protocol and there will be no work around added for that here. Please complain to the upstream. And then disregarding above what would then be buggy here? How would a simple unit test look like in this library that fails now and will be fixed with this PR? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this needs a unit test that demonstrates the need
Just for clarity. The problem which @SriHarsha001 is trying to solve here is sporadic "legitimization" of malformed DNS responses by dns library under some condition, i.e. the library doesn't return error, but instead it silently returns (possibly truncated or "fixed") DNS message. The type of malformed message being considered here is when the number of resource records declared in the message header does not correspond to the actual number of resource records existing in DNS response message. As a special case, this may happen when dns library allocates smaller UDP buffer than the size of DNS response coming from (non-complying) DNS server. The condition when the error is not reported by dns library is if the last octet of DNS message (i.e. the last byte of the buffer) corresponds to last octet of one of resource records. |
We can repro the issue by just changing the domain name from "example." to "example123.org." and requesting for TypeA in this test case - TestServingLargeResponses Line 627 in a16092f
Change line number - 628 and 629 Change line number - 639 |
Yes, I have added unit tests to test the edge cases. I kindly request you to review when you get few minutes. |
Thank you very much @rdrozhdzh |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the unit test.
I'll probably merge this pr first: #1507
which may or may not introduce a conflict here (FYI)
Thank you for your review. That PR has significant changes. Just FYI... I did a quick test on unpack_cryptobyte branch and was able to repro the issue. |
Yes this is entirely expected. I'm not trying to change behaviour very much in that PR, merely the implementation. Future PRs can address this and other potential issues. Having now worked on #1507, I better understand the issue you've observed (#1492) and agree a fix makes sense. It might be a breaking for a very specific corner case, but given other errors that are more likely to occur for truncated messages I think it's fine. As part of the changes for #1507, the fix is actually much simpler than what you've proposed here. I'll open a new PR once that's been merged to address this for you. |
Sure sure, thank you very much. I will look out for updates in this lib. |
@tmthrgd - Just a gentle follow up on this request. Tracking this #1507, I see the PR is not merged. I am guessing the work is in progress. |
@@ -852,16 +852,25 @@ func (dns *Msg) unpack(dh Header, msg []byte, off int) (err error) { | |||
} | |||
|
|||
dns.Answer, off, err = unpackRRslice(int(dh.Ancount), msg, off) | |||
if err == nil && int(dh.Ancount) > len(dns.Answer) { | |||
return ErrBuf | |||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now we return ErrBuf and the code below is only executed when the count is smaller which is weird. So this def. feels wrong. And can't go in as such
Again: an upstream sends back a message that is too large without TC? That is a protocol violation, unless I still misunderstand
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you very much for the reply - @miekg. Sure sure I understand. Yes, this is a scenario seen due to non-complying DNS server.
@tmthrgd mentioned the fix is actually much simpler than what I proposed here #1493 (comment). I am kindly requesting Tom's help here.
@rdrozhdzh has very nicely summarized the issue I am trying to explain - #1493 (comment).
Quote - Sporadic "legitimization" of malformed DNS responses by dns library under some condition, i.e. the library doesn't return error, but instead it silently returns (possibly truncated or "fixed") DNS message.
The type of malformed message being considered here is when the number of resource records declared in the message header does not correspond to the actual number of resource records existing in DNS response message. As a special case, this may happen when dns library allocates smaller UDP buffer than the size of DNS response coming from (non-complying) DNS server.
The condition when the error is not reported by dns library is if the last octet of DNS message (i.e. the last byte of the buffer) corresponds to last octet of one of resource records.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@tmthrgd - Earlier you had mentioned about a simpler fix for this issue, would you be able to help here please ?
came back to this to see if I could make sense of it for the 3rd time. The answers is no. First off all the test is completely flawed. But I thought I might be able to fix it, i did and then ran the test again and it still failed. because this is not all all how this package works. the changed test: package dns
import (
"fmt"
"net"
"testing"
)
func TestUnPackForErrBuf(t *testing.T) {
name := "example123.org."
i := 17
expectErr := true
m := new(Msg)
m.SetQuestion(name, TypeA)
m.Authoritative = true
for j := 0; j < i; j++ {
a := &A{Hdr: RR_Header{Name: name, Rrtype: TypeA, Class: ClassINET}, A: net.IPv4(127, 0, 0, byte(j+1)).To4()}
m.Answer = append(m.Answer, a)
}
fmt.Printf("%d\n%s", m.Len(), m)
packedmsg, err := m.Pack()
println("LEN PACKED", len(packedmsg))
if len(packedmsg) > MinMsgSize {
packedmsg = packedmsg[:MinMsgSize]
}
m1 := new(Msg)
err = m1.Unpack(packedmsg)
println(m1.String())
if expectErr {
println(err)
if err != ErrBuf {
t.Error("Expected ErrBuf due to truncation")
}
}
} when running this prints === RUN TestUnPackForErrBuf
542
;; opcode: QUERY, status: NOERROR, id: 15960
;; flags: aa rd; QUERY: 1, ANSWER: 17, AUTHORITY: 0, ADDITIONAL: 0
;; QUESTION SECTION:
;example123.org. IN A
;; ANSWER SECTION:
example123.org. 0 IN A 127.0.0.1
example123.org. 0 IN A 127.0.0.2
example123.org. 0 IN A 127.0.0.3
example123.org. 0 IN A 127.0.0.4
example123.org. 0 IN A 127.0.0.5
example123.org. 0 IN A 127.0.0.6
example123.org. 0 IN A 127.0.0.7
example123.org. 0 IN A 127.0.0.8
example123.org. 0 IN A 127.0.0.9
example123.org. 0 IN A 127.0.0.10
example123.org. 0 IN A 127.0.0.11
example123.org. 0 IN A 127.0.0.12
example123.org. 0 IN A 127.0.0.13
example123.org. 0 IN A 127.0.0.14
example123.org. 0 IN A 127.0.0.15
example123.org. 0 IN A 127.0.0.16
example123.org. 0 IN A 127.0.0.17
LEN PACKED 542
;; opcode: QUERY, status: NOERROR, id: 15960
;; flags: aa rd; QUERY: 1, ANSWER: 16, AUTHORITY: 0, ADDITIONAL: 0
;; QUESTION SECTION:
;example123.org. IN A
;; ANSWER SECTION:
example123.org. 0 IN A 127.0.0.1
example123.org. 0 IN A 127.0.0.2
example123.org. 0 IN A 127.0.0.3
example123.org. 0 IN A 127.0.0.4
example123.org. 0 IN A 127.0.0.5
example123.org. 0 IN A 127.0.0.6
example123.org. 0 IN A 127.0.0.7
example123.org. 0 IN A 127.0.0.8
example123.org. 0 IN A 127.0.0.9
example123.org. 0 IN A 127.0.0.10
example123.org. 0 IN A 127.0.0.11
example123.org. 0 IN A 127.0.0.12
example123.org. 0 IN A 127.0.0.13
example123.org. 0 IN A 127.0.0.14
example123.org. 0 IN A 127.0.0.15
example123.org. 0 IN A 127.0.0.16
(0x0,0x0)
dns_truncated_test.go:35: Expected ErrBuf due to truncation
--- FAIL: TestUnPackForErrBuf (0.00s)
FAIL
exit status 1
FAIL github.com/miekg/dns 0.002s which is the result of you have a message with 17 As and the FORCEFULLY SET THE BUFFER to 512 and then unpack, and you get back a PERFECTLY FINE message with 16 As. I don't know why you think something should trigger as if the thing should know that we should get 17 RRs? The All the code around unpack is also completely clear about not trusting counters: // Qdcount, Ancount, Nscount, Arcount can't be trusted, as they are attacker controlled.
// This means we can't use them to pre-allocate slices. // The header counts might have been wrong so we need to update it
dh.Arcount = uint16(len(dns.Extra)) so not only is this not a bug, any code that does that thing that you think is OK, is in fact not OK and completely wrong |
This proposed PR is to address the bug outlined here - #1492
Additional reference - coredns/coredns#6371 (comment)
Issue Description
If there is a misbehaving upstream server which does not set TC bit and sends back a response length greater than pc.c.UDPSize or 512 default size, then there is a corner case where offset is equal to len(msg) as calculated in unpackHeader defined here github.com/miekg/msg_helpers.go. In this case, there is no Overflow/ErrBuff error returned, so the response will be truncated without TC bit set.
For example -
Similarly,