Line number capped at 65535 but libxml2 already returns long. #1617
The act of casting this return value from a long to an int is not the cause of line number overflow. First, let's take a look at the relevant libxml2 struct:
You can see that internally, libxml2 stores line number as an unsigned short. Second, the size of
outputs
But thanks for thinking of us! And thanks for using Nokogiri.
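To make the point above concrete, here is a minimal sketch in C. It does not use libxml2 at all: `demo_node`, `demo_set_line`, `demo_get_line`, and `demo_line_as_int` are hypothetical stand-ins, and the saturating store is an assumption based on the "capped at 65535" wording in this thread. The only detail taken from the discussion is that the line number lives in an `unsigned short` field, gets widened to `long` by the accessor, and is then narrowed to `int` by the binding.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical stand-in for a parsed node; NOT the real libxml2 struct.
   The one detail taken from the thread is the unsigned short field. */
struct demo_node {
    unsigned short line;
};

/* Store a line number in the 16-bit field. Saturating at USHRT_MAX is an
   assumption here, chosen to match the "capped" behaviour described above. */
void demo_set_line(struct demo_node *node, long actual_line)
{
    assert(actual_line >= 0);
    node->line = actual_line > USHRT_MAX ? USHRT_MAX
                                         : (unsigned short)actual_line;
}

/* Mimics an accessor that widens the stored value to long. */
long demo_get_line(const struct demo_node *node)
{
    return (long)node->line;
}

/* Mimics a binding that narrows the long to int before returning it. */
int demo_line_as_int(const struct demo_node *node)
{
    return (int)demo_get_line(node);
}
```

In this sketch, storing line 100000 yields `USHRT_MAX` from both `demo_get_line` and `demo_line_as_int`: the information is already lost at the store, so the later `long` to `int` narrowing changes nothing.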
Mmm, I see. And changing that in libxml2 breaks ABI. Too bad. I guess they return long already, although they have to cast from ushort, to prepare the interface for when they can change the struct. In this case nokogiri should also be already prepared and respect the return value of xmlGetLineNo, IMHO (unless it has performance issues). So when you update to a libxml2 with the fixed line, you won't have to remember to change this line in nokogiri and it will be fixed automatically, don't you think?
I love your enthusiasm, @pakore, but there is in fact no bug here. If you can write a failing test, I'll reconsider my opinion.
I can, but I would have to mock The bug is as simple as this: You are casting a You can get away with it so far because you know the I may try to write a unit test if I can get around mocking that function to return something higher than MAX_INT. I understand this is not urgent, but since the fix is immediate you can just include it in the next release. Anyway, thanks for reading my messages and replying, appreciated!
Anyway, if there is an XML document with more than 2147483648 lines... you deserve a bug :).
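The range argument in the last two comments can be checked with a small sketch in plain C. The helper names (`long_is_wider_than_int`, `fits_in_int`, `narrow_checked`) are hypothetical and exist only for illustration; nothing here comes from nokogiri or libxml2. It assumes nothing beyond `<limits.h>`: a `long` to `int` narrowing can only lose information when the value falls outside `[INT_MIN, INT_MAX]`, which cannot happen on platforms where the two types have the same width.

```c
#include <assert.h>
#include <limits.h>

/* True when long can hold values that int cannot (e.g. typical LP64
   platforms), false when both types share the same range. */
int long_is_wider_than_int(void)
{
    return LONG_MAX > INT_MAX;
}

/* True when narrowing v to int would preserve its value. */
int fits_in_int(long v)
{
    return v >= INT_MIN && v <= INT_MAX;
}

/* Narrow with an explicit precondition instead of a silent cast. */
int narrow_checked(long v)
{
    assert(fits_in_int(v));
    return (int)v;
}
```

A value such as 65535 always passes `fits_in_int`, so narrowing it is lossless. Only a line number above `INT_MAX` would fail the check, and only on a platform where `long` is wider than `int`.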
I guess we actually should use UINT2NUM().
@knu @pakore I apologize for saying so, but given the underlying limitations of libxml2, there's no bug here, and libxml2 maintainers are on the record saying that they won't change this behavior until 2.0 because of the required ABI change. @knu, if you want to change this, you're within your rights as a committer. But we can't write a failing test for this, so I honestly don't know why we're still talking about it.
feat(cruby): support line numbers larger than a short

---

**What problem is this PR intended to solve?**

As noted in #1493, #1617, #1505, #1003, and #533, libxml2 has not historically supported line numbers greater than a `short int`. Starting in libxml2 v2.9.0, setting the parse option `BIG_LINES` allows tracking line numbers in longer documents.

Specifically, this PR makes the following changes:

- set the `BIG_LINES` parse option by default, which will allow `Node#line` to return large integers
- allow `Node#line=` to set large line numbers on text nodes

Fixes #1764

**Have you included adequate test coverage?**

Yes!

**Does this change affect the behavior of either the C or the Java implementations?**

JRuby's Xerces-based implementation did not suffer from this particular shortcoming, although its line number functionality is questionable in other ways (see #2177 / b32c875).
This will be fixed in v1.13.
`ext/nokogiri/xml_node.c:1263` contains the line `return INT2NUM(xmlGetLineNo(node));`, but according to libxml2 2.8.0 (tree.c:4527), `xmlGetLineNo` returns a `long`. That `long` is passed to a macro that accepts an `int`, so the `long` is converted to an `int`, and hence capped to 65535.

The solution would be to change `ext/nokogiri/xml_node.c:1263` to `return LONG2NUM(xmlGetLineNo(node));`, and thus a problem that's been around for some years would be fixed.