test.test_xml_etree*.XMLPullParserTest.test_simple_xml fails with (system) expat 2.6.0 #115133
Comments
I've run into the exact same issue when trying to build/package Python 3.11.8 in a clean chroot on Archlinux with testing repos enabled (expat 2.6.0 is currently only available from the |
Note that tests are passed for I wonder whether it was an intentional change (and what was the reason) or a bug in expat. |
@serhiy-storchaka: I think this might be the result of the fix for CVE-2023-52425: Not sure if it's intentional, though. |
Just a quick note that libexpat upstream is aware by now. I'll get back to you for more soon-ish. |
I have created candidate pull request #115138 now for review. It fixes the test suite in CPython for Expat >=2.6.0. |
I would make the Expat buffering more context depending. For example, when the parser waits for the end of the long attribute, |
Expat previously took quadratic time when parsing tokens that required multiple buffer fills (a.k.a. "feeds" in the Python code). Expat 2.6.0 introduces a mitigation for this that does exponential backoff when repeatedly failing to parse the same token due to not having enough data. This is why it doesn't affect the non-chunked case. It will also not affect the vast majority of use cases, since tokens do not usually require multiple feeds. While I don't believe Expat ever guaranteed immediate events, I recognize that it is a change in Expat's behavior in practice. However, I expect that the vast majority of apps will not depend on getting immediate feedback (as they won't know what their input is) -- and this was the least intrusive way I could think of to fix this DoS.
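To make the effect described above concrete, here is a minimal sketch that feeds a small document to `XMLPullParser` one character at a time and records when events become available. The helper name `feed_in_chunks` and the sample document are made up for illustration. How many events show up before `close()` depends on the Expat version Python is linked against, so the sketch only prints those counts and makes no claim about them.

```python
import xml.etree.ElementTree as ET
from xml.parsers import expat


def feed_in_chunks(data, chunk_size):
    """Feed data piecewise and record the events readable after each feed.

    Hypothetical helper for illustration; not part of CPython's test suite.
    """
    parser = ET.XMLPullParser(events=("start", "end"))
    per_feed = []
    for i in range(0, len(data), chunk_size):
        parser.feed(data[i:i + chunk_size])
        per_feed.append([(ev, el.tag) for ev, el in parser.read_events()])
    # close() tells the parser that no more data is coming, so anything
    # still held back is parsed now.
    parser.close()
    tail = [(ev, el.tag) for ev, el in parser.read_events()]
    return per_feed, tail


DOC = "<root><item key='a fairly long attribute value'>text</item></root>"
per_feed, tail = feed_in_chunks(DOC, chunk_size=1)
all_events = [event for batch in per_feed for event in batch] + tail

print("Expat:", expat.EXPAT_VERSION)
print("events readable before close():", sum(len(b) for b in per_feed))
print("events readable only after close():", len(tail))
```

Whatever the timing, the complete event sequence is the same once `close()` has been called; only the moment at which each event becomes readable can shift.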
@hartwork and I have discussed something along those lines, but concluded that it would be unnecessarily complex, and possibly difficult to get 100% right. For example, if we're looking for the end of If this behavior is absolutely unacceptable, it is possible (but not recommended) to disable the mitigation using |
Does it need to restart parsing from Could you at least only block reparsing for large enough data? I mean that there is a difference between reparsing 100 bytes and 1000000 bytes. Reducing the factor from 2 to 1.5 or 1.8 could help in some use cases. For example, if the stream consists of messages of approximately equal size N, and we feed the parser by chunks of approximately N bytes, then there is a large chance to block three messages in a row (the first chunk is slightly smaller than the first message, and the second chunk is slightly smaller than the first chunk), but if the factor is less than 2, most likely only one message will be blocked. Or maybe combine both approaches: |
Yes, that's what it does.
It's possible, but won't that just make the behavior harder to grasp for a user of the library? Also, since the cost depends on the feed size, it's hard to set a definite threshold. Even a 1000-byte token, fed into Expat 1 byte at a time, will incur an amplification of 500x.
Yes, a smaller factor could help that case. But I'm not sure it's a good idea to try to hide this behavior from apps -- for those that don't really need the behavior, it's unnecessary, and for those that do need it, it's not a 100% guarantee. Do note that the whole first chunk would need to consist of a single token (essentially one big tag or comment) for this example to trigger the deferral logic, since consuming any bytes will reset the heuristic. |
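The numbers in this exchange can be checked with a toy model. The sketch below is not Expat's code: it assumes one token, fixed-size feeds, a full rescan of the buffer on every parse attempt, and a deferral rule of "skip the attempt unless the buffer has grown to at least `factor` times its size at the last failed attempt". The function name `simulate` and all parameters are hypothetical.

```python
def simulate(token_size, feed_size, factor=2.0):
    """Toy model of reparse deferral for a single token.

    Returns (parse_attempts, bytes_scanned, buffered_when_recognized).
    An assumption-laden illustration, not Expat's actual heuristic.
    """
    buffered = 0      # bytes fed so far
    last_failed = 0   # buffer size at the last failed parse attempt
    attempts = 0
    scanned = 0
    while True:
        buffered += feed_size
        if buffered < factor * last_failed:
            continue  # deferred: wait for more data before reparsing
        attempts += 1
        scanned += buffered  # each attempt rescans from the token start
        if buffered >= token_size:
            return attempts, scanned, buffered
        last_failed = buffered


# factor=1.0 never defers, which models the pre-2.6.0 behavior.
no_deferral = simulate(1000, 1, factor=1.0)
with_deferral = simulate(1000, 1, factor=2.0)

print("no deferral:  ", no_deferral, "amplification:", no_deferral[1] / 1000)
print("with deferral:", with_deferral, "amplification:", with_deferral[1] / 1000)
```

In this model the undeferred case scans 1 + 2 + ... + 1000 = 500500 bytes for a 1000-byte token, which is the roughly 500x amplification mentioned above. With a factor of 2 the work drops to 2047 bytes over 11 attempts, but the token is only recognized once 1024 bytes have been buffered, 24 bytes after it was actually complete. That delay is the behavior change the failing test observes.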
Followup to 93480b9. Bug: python/cpython#115133 Signed-off-by: Sam James <[email protected]>
Feeding the parser by too small chunks defers parsing to prevent CVE-2023-52425. Future versions of Expat may be more reactive.
#115164 keeps testing with small chunks, so we can test that our code does not introduce additional buffering, but makes some failures tolerable with Expat 2.6.0. Am I correct that the test with chunk_size=8 passes with Expat 2.6.0? |
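One way to keep testing with small chunks while tolerating late events is to require only that the events seen so far form a prefix of the expected sequence, and that the full sequence is present after `close()`. The following is a rough sketch of that idea, not the helper used in the CPython pull requests; `check_tolerant`, `DOC` and `EXPECTED` are names invented here.

```python
import xml.etree.ElementTree as ET


def check_tolerant(data, expected, chunk_size):
    """Return True if events arrive in order and are complete after close().

    Events may arrive later than the feed that completed them, but never
    out of order and never with extras. Hypothetical helper for illustration.
    """
    parser = ET.XMLPullParser(events=("start", "end"))
    seen = []
    for i in range(0, len(data), chunk_size):
        parser.feed(data[i:i + chunk_size])
        seen += [(ev, el.tag) for ev, el in parser.read_events()]
        if seen != expected[:len(seen)]:
            return False  # wrong order or unexpected event
    parser.close()
    seen += [(ev, el.tag) for ev, el in parser.read_events()]
    return seen == expected


DOC = "<root><element key='value'>text</element><empty/></root>"
EXPECTED = [
    ("start", "root"),
    ("start", "element"),
    ("end", "element"),
    ("start", "empty"),
    ("end", "empty"),
    ("end", "root"),
]

results = {size: check_tolerant(DOC, EXPECTED, size) for size in (1, 8, 1000)}
print(results)
```

A check like this no longer detects extra buffering introduced on the Python side, so it trades some strictness for independence from the Expat version.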
…GH-115164) Feeding the parser by too small chunks defers parsing to prevent CVE-2023-52425. Future versions of Expat may be more reactive. (cherry picked from commit 4a08e7b) Co-authored-by: Serhiy Storchaka <[email protected]>
Thank you all for discussion. |
…5164) (GH-115288) Feeding the parser by too small chunks defers parsing to prevent CVE-2023-52425. Future versions of Expat may be more reactive. (cherry picked from commit 4a08e7b) Co-authored-by: Serhiy Storchaka <[email protected]>
…5164) (GH-115289) Feeding the parser by too small chunks defers parsing to prevent CVE-2023-52425. Future versions of Expat may be more reactive. (cherry picked from commit 4a08e7b) Co-authored-by: Serhiy Storchaka <[email protected]>
Feeding the parser by too small chunks defers parsing to prevent CVE-2023-52425. Future versions of Expat may be more reactive. (cherry picked from commit 4a08e7b) Co-authored-by: Serhiy Storchaka <[email protected]>
Feeding the parser by too small chunks defers parsing to prevent CVE-2023-52425. Future versions of Expat may be more reactive. Heavily inspired by python/cpython@4a08e7b We cannot use a @fails_with_expat_2_6_0 decorator because the test passes in ETreePullTestCase. See python/cpython#115133 See GHSA-gh68-jm46-84rf Co-authored-by: Serhiy Storchaka <[email protected]>
Feeding the parser by too small chunks defers parsing to prevent CVE-2023-52425. Future versions of Expat may be more reactive. (cherry picked from commit 4a08e7b)
Fix etree XMLPullParser tests for Expat >=2.6.0 with reparse deferral Fixes: gh#python#115133 From-PR: gh#python/cpython!115138 Patch: expat-260-test_xml_etree-reparse-deferral.patch
Fix etree XMLPullParser tests for Expat >=2.6.0 with reparse deferral Combined with gh#python/cpython!31453 bpo-46811: Make test suite support Expat >=2.4.5 (pythonGH-31453) Curly brackets were never allowed in namespace URIs according to RFC 3986, and so-called namespace-validating XML parsers have the right to reject them as invalid URIs. libexpat >=2.4.5 has become stricter in that regard due to related security issues; with ET.XML instantiating a namespace-aware parser under the hood, this test has no future in CPython. References: - https://datatracker.ietf.org/doc/html/rfc3968 - https://www.w3.org/TR/xml-names/ Also, test_minidom.py: Support Expat >=2.4.5 (cherry picked from commit 2cae938) Co-authored-by: Sebastian Pipping <[email protected]> Fixes: gh#python#115133 From-PR: gh#python/cpython!115138 Patch: CVE-2023-52425-libexpat-2.6.0-backport-15.6.patch
Combined with gh#python/cpython!31453 bpo-46811: Make test suite support Expat >=2.4.5 (pythonGH-31453) Curly brackets were never allowed in namespace URIs according to RFC 3986, and so-called namespace-validating XML parsers have the right to reject them as invalid URIs. libexpat >=2.4.5 has become stricter in that regard due to related security issues; with ET.XML instantiating a namespace-aware parser under the hood, this test has no future in CPython. References: - https://datatracker.ietf.org/doc/html/rfc3968 - https://www.w3.org/TR/xml-names/ Also, test_minidom.py: Support Expat >=2.4.5 (cherry picked from commit 2cae938) Co-authored-by: Sebastian Pipping <[email protected]> Fixes: gh#python#115133 From-PR: gh#python/cpython!115138 Patch: CVE-2023-52425-libexpat-2.6.0-backport.patch
…ythonGH-115164) (pythonGH-115288) Feeding the parser by too small chunks defers parsing to prevent CVE-2023-52425. Future versions of Expat may be more reactive. (cherry picked from commit 4a08e7b) Co-authored-by: Serhiy Storchaka <[email protected]>
Bug report
Bug description:
Expat 2.6.0 was released yesterday, with CVE fixes. After upgrading the system library and building CPython --with-system-expat, I'm getting the following test failures:
I have reproduced with 3.11.8, 3.12.8 and main as of 2afc718, both using Gentoo ebuild and raw git repository. I've tested the latter like this:
CC @hartwork
CPython versions tested on:
3.11, 3.12, CPython main branch
Operating systems tested on:
Linux
Linked PRs