Memory leak when using Nokogiri::XML::Builder with XML namespaces in nested elements #1810
Comments
The script above is flawed, as the elements don't actually reference a namespace. Use this script instead, which makes use of the default namespace and shows the same memory leak:
Output on my machine:
I'm currently running it again on Valgrind. Will post the results as soon as it's finished. |
Here are the few largest
It looks like memory is being leaked at two locations. |
Hi, thanks for reporting this, and apologies for the delay in replying. I've been getting false positives from valgrind in nokogiri's CI pipeline and needed to figure out how to suppress those before digging into this report. I'll take a look this weekend. |
Also - thank you for using the issue-reporting template, and for providing such clear information about the leak. You've helped a lot! |
No worries. I've actually done some heap profiling with Massif:
Here's the peak snapshot (every other snapshot looks similar):
As far as I can see, there seem to be two issues:
|
OK, I think I've nailed the root cause down. Running your script before the fix:
and with a patch applied
and valgrind no longer dumps out any lost memory associated with allocations having Going to clean the code up a bit before committing, should be able to cut a release this weekend. |
Just checked and saw the commits. Thanks for the effort! Does this also fix the problem of excessively defined namespaces due to the inability to find existing definitions? Or was that a non-issue maybe? |
@paddor the "excessively defined namespaces" is an artifact of how the builder is implemented: each node is created (with namespaces) before it is added to the document tree, so there's no way to know (without a code design change) where to search in the document for a relevant existing namespace definition. As a result, the Builder (actually it's I'm open to a pull request that addresses the unnecessary creation of namespace definitions, but at this point it's an optimization and is thus unlikely to get my attention anytime soon. One further note: once the node is added to a document, Nokogiri will search the ancestors for a relevant pre-existing nsDef and if one exists will remove the duplicate from the reparented node. (More exactly, this node is parked in an "unlinked nodes" hash owned by the Document rather than being immediately freed because it's possible for references to exist. (This is pre-existing code.)) The changeset in #1815 introduces better behavior, to make sure that all references to the removed nsDef are repointed to the correct nsDef hanging off the ancestor element. I may play around with whether this means it's not possible to have any hanging references to it, in which case we may be able to free the nsDef up immediately. Need to think about this and try to break it. |
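The reparenting behavior described above can be illustrated with a toy model in plain Ruby. This is only a sketch: `ToyDocument`, `ToyNode`, `NsDef`, `find_ns_def`, and `repoint` are hypothetical names invented for illustration and are not Nokogiri's or libxml2's actual code. It assumes a node always gets its own namespace definition at creation time (it has no parent yet, so there is nowhere to search), and that the duplicate is removed, parked rather than freed, and its references repointed once the node is attached.

```ruby
# Toy model only: hypothetical classes, not Nokogiri's or libxml2's implementation.
NsDef = Struct.new(:prefix, :href)

class ToyDocument
  attr_reader :unlinked

  def initialize
    @unlinked = [] # removed nsDefs are parked here instead of being freed
  end
end

class ToyNode
  attr_reader :name, :ns_defs, :children, :document
  attr_accessor :parent, :namespace

  def initialize(document, name, prefix, href)
    @document = document
    @name = name
    @parent = nil
    @children = []
    # No parent yet, so there is nowhere to search: always define a fresh nsDef.
    @ns_defs = [NsDef.new(prefix, href)]
    @namespace = @ns_defs.first
  end

  # Walk from this node up through its ancestors looking for an equivalent definition.
  def find_ns_def(prefix, href)
    node = self
    while node
      found = node.ns_defs.find { |d| d.prefix == prefix && d.href == href }
      return found if found
      node = node.parent
    end
    nil
  end

  # Repoint this node and its descendants from the removed nsDef to the surviving one.
  def repoint(old_def, new_def)
    @namespace = new_def if @namespace.equal?(old_def)
    @children.each { |c| c.repoint(old_def, new_def) }
  end

  # Attach a child, then drop any nsDef on it that an ancestor already provides.
  def add_child(child)
    child.parent = self
    @children << child
    child.ns_defs.dup.each do |dup_def|
      existing = find_ns_def(dup_def.prefix, dup_def.href)
      next unless existing
      child.ns_defs.delete(dup_def)
      child.repoint(dup_def, existing)
      document.unlinked << dup_def # parked, since other references might still exist
    end
    child
  end
end

doc = ToyDocument.new
root = ToyNode.new(doc, "root", "foo", "http://example.com/foo")
child = ToyNode.new(doc, "child", "foo", "http://example.com/foo")
root.add_child(child)

puts child.ns_defs.length                        # 0: duplicate definition removed
puts child.namespace.equal?(root.ns_defs.first)  # true: reference repointed to the ancestor's nsDef
puts doc.unlinked.length                         # 1: removed nsDef is parked, not freed
```

In this model, a leak would correspond to a removed definition that is neither parked nor freed, and a dangling reference would correspond to skipping the `repoint` step.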
Interesting to know that every new node is created without an associated document! (Not an issue though, as long as no memory is leaked.) So, in the example script, as soon as the new node is integrated into the document (reparented), the duplicate namespace definition is removed? Pretty cool. 👍 |
@paddor Nodes are always created with an associated document, but they're created without an associated parent. Because there's no parent, there's nowhere to look for an existing matching namespace. I explored freeing this nsDef but determined it's not safe, and added a test demonstrating why it's not safe in 38a28fe. |
…ns-memory-leak address #1810 builder namespace memory leak
The fix in #1815 has been merged into master. Will be in the next release, hopefully in the next few days. |
Thanks! Looking forward to the next release. |
Are there any plans for when the next release will be out? |
Release is coming together now. Ideally soon after libxml 2.9.9 drops next week. Watch this milestone for updates: https://github.com/sparklemotion/nokogiri/milestone/16 |
What problems are you experiencing?
Maybe related or similar to #1771. Memory leak when using Nokogiri::XML::Builder and namespaces. Only namespace definitions beginning with xmlns: cause the leak (because Nokogiri::XML::Document#create_element is implemented that way), and nesting seems to be important too. I guess Nokogiri::XML::Node#add_namespace_definition doesn't free that definition when the node is garbage collected.
What's the output from nokogiri -v?
Can you provide a self-contained script that reproduces what you're seeing?
When I run this script, its RSS grows continually from ~12 MB up to 60 MB before it exits.
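RSS growth of this kind can be watched from inside a Ruby process with a small sampling helper. The sketch below is not the reporter's script: `current_rss_kb` and `sample_rss` are hypothetical helpers, the workload in the block is a placeholder, and the code assumes either a Linux-style /proc/self/status or a `ps` binary that accepts `-o rss=`.

```ruby
# Hypothetical helper: returns this process's resident set size in kilobytes.
def current_rss_kb
  status = "/proc/self/status"
  if File.readable?(status)
    match = File.read(status).match(/^VmRSS:\s+(\d+)\s+kB/)
    return match[1].to_i if match
  end
  # Fallback for systems without /proc (assumes `ps` is installed).
  `ps -o rss= -p #{Process.pid}`.strip.to_i
end

# Runs the block `iterations` times and records RSS every `sample_every` iterations.
def sample_rss(iterations:, sample_every:)
  samples = [current_rss_kb]
  iterations.times do |i|
    yield
    next unless ((i + 1) % sample_every).zero?
    GC.start # run a collection first so collectable garbage inflates the reading less
    samples << current_rss_kb
  end
  samples
end

samples = sample_rss(iterations: 1_000, sample_every: 250) do
  # Placeholder workload; substitute the document-building code under test.
  Array.new(100) { |n| "node-#{n}" }
end
puts samples.map { |kb| "#{kb} kB" }.join(", ")
```

Readings that keep climbing across samples despite the forced collections suggest memory that Ruby's GC cannot reclaim, which is the point at which a tool such as Valgrind becomes useful for locating the allocation sites.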