Quadratic entity decoding DoS vector #104

Open
bucklereed opened this issue Jun 8, 2017 · 2 comments

@bucklereed

In #103 I wrote that character entities are safe, which is sort of true: compared to the exponential 'billion laughs' attack, they are! But a malicious document can still consume memory quadratic in its input size by defining a large entity and then referencing it many times:

Prelude Text.XML> parseText def "<!DOCTYPE foo [<!ENTITY A \"aaaaaaaaaa\" >]><foo>&A;&A;&A;&A;&A;&A;&A;&A;&A;&A;</foo>"
Right (Document {documentPrologue = Prologue {prologueBefore = [], prologueDoctype = Just (Doctype {doctypeName = "foo", doctypeID = Nothing}), prologueAfter = []}, documentRoot = Element {elementName = Name {nameLocalName = "foo", nameNamespace = Nothing, namePrefix = Nothing}, elementAttributes = fromList [], elementNodes = [NodeContent "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa"]}, documentEpilogue = []})

(You may wish to read the result as screaming at the shambling horror in the dark corners of DTDs.)
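To make the growth rate concrete, here is a minimal sketch that generates documents of the same shape as the one above and computes how much text the expansion yields. The names `quadraticDoc` and `expandedSize` are made up for illustration and are not part of xml-conduit; the code uses only the Prelude.

```haskell
-- Build a document that defines one entity of k characters
-- and references it m times, mirroring the example above.
quadraticDoc :: Int -> Int -> String
quadraticDoc k m =
  "<!DOCTYPE foo [<!ENTITY A \"" ++ replicate k 'a' ++ "\" >]><foo>"
    ++ concat (replicate m "&A;")
    ++ "</foo>"

-- Number of characters of text content once every &A; is expanded.
expandedSize :: Int -> Int -> Int
expandedSize k m = k * m
```

`quadraticDoc 10 10` is exactly the input from the session above. The input grows as k + 4m plus a constant, while the expanded text grows as k * m, so with k = m = 1000 a document of roughly 5,000 characters expands to 1,000,000 characters of content.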

There is already a defence against this: refuse to do any entity substitution, either by setting psDecodeEntities appropriately or by using Text.XML.Unresolved. If this is declared to be Good Enough, Text.XML's parsing functions will need warnings about this DoS vector, stating that they should not be used on untrusted input without setting psDecodeEntities.

@bucklereed
Author

The above comment's wrong: handling internal entities is done entirely in the parser. Even using Text.XML.Unresolved, or even the streaming interface directly, there's currently no way around this.

Or, in other words, currently xml-conduit really shouldn't be used for parsing untrusted data unless you like having all your memory eaten...

@jgm
Contributor

jgm commented Feb 26, 2021

I believe this is mitigated by #161, which allows you to set a limit on the size of an entity expansion.
(You can set this to a small number if you really want to avoid this kind of thing.)
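The general idea of a size limit can be sketched without the library. The following is an illustration only, not the code from #161: `Token`, `expandWithLimit`, and the error strings are hypothetical names, and the assumption here is that the limit caps the total number of characters produced by entity expansion in one document.

```haskell
-- A token is either literal text or a reference to a named entity.
data Token = Literal String | EntityRef String

-- Expand tokens left to right, counting only characters that come from
-- entity expansion, and fail as soon as that count would exceed the limit.
expandWithLimit :: Int -> [(String, String)] -> [Token] -> Either String String
expandWithLimit limit entities = go 0 []
  where
    go _ acc [] = Right (concat (reverse acc))
    go used acc (Literal s : ts) = go used (s : acc) ts
    go used acc (EntityRef name : ts) =
      case lookup name entities of
        Nothing -> Left ("undefined entity: " ++ name)
        Just s
          | used' > limit -> Left "entity expansion size limit exceeded"
          | otherwise -> go used' (s : acc) ts
          where
            used' = used + length s
```

With the ten-character entity from the original example, `expandWithLimit 100 [("A", replicate 10 'a')] (replicate 10 (EntityRef "A"))` succeeds, while a limit of 99 rejects the same input at the tenth reference instead of building the full string.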
