Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

support ulong values for Json #1118

Merged
merged 1 commit into from
Jun 3, 2015
Merged

Conversation

IgorStepanov
Copy link
Contributor

Hello, Vibe.D!

I've written about my problem on forum: http://forum.rejectedsoftware.com/groups/rejectedsoftware.vibed/thread/25482/

I use vibe.d on small commercial product, and two days ago I got this trouble.
One REST server (api.shodan.io) sent me a JSON like {..., serial: 17559991181826658461, ...}
I read the official JSON standard: http://json.org/
Unfortunately, it doesn't define bounds for a number value, thus 99999999999999999999999999 is a correct JSON value.
However the most real systems uses 64-bit (as maximum) signed and unsigned integers for describe integral values.
Thus, supporting of ulong is very needed for users.

I've added a new Type value: uint_, which represents ulong values.
I prefer int_ type instead of uint_ when it possible.
If the integral value x <= long.max, I write is to Json as int_ even if it hasn't the sign.

If this way is unacceptable, please suggest another way to parse ulong values.
We may convert it to long or to string.
Get an exception during parsing of correct JSON answer of a REST server is very, because I can't get the REST answer as string (using vibe.web.rest) and parse it manually.
Thanks.

@IgorStepanov IgorStepanov force-pushed the ulong-json branch 2 times, most recently from 0f12262 to abfdf9a Compare May 21, 2015 18:57
@Geod24
Copy link
Contributor

Geod24 commented May 21, 2015

Using ulong just pushes the limit, it doesn't really fix the problem. Me think supporting BigInt could definitely fix this. @s-ludwig ?

@@ -108,6 +110,7 @@ struct Json {
Null = null_, /// Compatibility alias - will be deprecated soon
Bool = bool_, /// Compatibility alias - will be deprecated soon
Int = int_, /// Compatibility alias - will be deprecated soon
Uint = uint_, /// Compatibility alias - will be deprecated soon
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I doubt we need to add a compatibility alias for a new feature.

@etcimon
Copy link
Contributor

etcimon commented May 21, 2015

I also vote for a BigInt here

@s-ludwig
Copy link
Member

BigInt would also have the advantage of better forward compatibility with std_data_json. So I'd also be in favor of that solution.

@IgorStepanov
Copy link
Contributor Author

So, you are all prefer BigInt? Ok. Should it replace the long value, or there should be both types?
In the second case, how should work parsing of int values?

@s-ludwig
Copy link
Member

It should work similar to the current ulong solution. So all integral numbers fitting in long should still be stored as a long and anything bigger as a BigInt. The only issue is that the naive implementation of this (convert string to BigInt and the BigInt to long if possible) will probably have a non-trivial performance impact. But for cases where that matters, we could introduce a version (VibeJsonDisableBigInt) that uses the old string -> long parsing mode.

@IgorStepanov
Copy link
Contributor Author

Ok, I'll do it.

@IgorStepanov IgorStepanov force-pushed the ulong-json branch 3 times, most recently from 759fdcb to a73efec Compare May 21, 2015 23:52
@IgorStepanov
Copy link
Contributor Author

@s-ludwig done

@@ -92,28 +117,59 @@ struct Json {
}
}

version (VibeUseJsonBigInt) {
long bigIntToLong() inout
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing private

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

@s-ludwig
Copy link
Member

Thanks @IgorStepanov! I'm leaning towards making the version only apply to the parsing stage, but to keep general support for BigInt anyway. The current approach just seems to make it too likely to break foreign code (such as a final switch (json.type)). Thoughts?

Apart from my comments above, seems to look fine. I'll do a more thorough review later.

@IgorStepanov
Copy link
Contributor Author

I'm leaning towards making the version only apply to the parsing stage, but to keep general support for BigInt anyway. The current approach just seems to make it too likely to break foreign code (such as a final switch (json.type)). Thoughts?

Ok, I've changed the code. See the second commit.
However, If we talk only about parsing, we may simply check the number length in skipNumber, and construct BigInt only if it needed.
Note, JSON doesn't support non-trivial integer literals like octal, or hex thus parsing of JSON ints is so easy.
We may simply compute result while parsing a number (num *= 10; num += (s[idx]) = '0');) and check the overflow at each step. And no special version.

@marcioapm
Copy link
Contributor

Sorry for being late to the party, but what about supporting cent/ucent, instead? :)
At first sight it seems like a more natural extension than BigInt.

I'm not sure what the state of this is in DMD but I remember it was worked on recently for LDC.

Seems like a bad spot to be in for performance sensitive scenarios - deciding between not supporting values >= 2^63 and < 2^64 or using BigInts which require extra heap storage currently, whereas the cent would fit in the Json value's buffer - converting cent back to ulong is also pretty fast.

@IgorStepanov
Copy link
Contributor Author

@marcioapm cent isn't supported in dmd and it doesn't solve the general problem: json standard allows an unlimited numbers.
For the floating point numbers a trouble is less essential: all float types allow to represent any real number (-inf, +inf) with some error.
For an integer number json, current parser may reject a correct value which greater then long.max. I is not good.

@IgorStepanov
Copy link
Contributor Author

@s-ludwig I've add an overflow check to skipNumber, and removed version (VibeJsonDisableBigInt):
now I build a BigInt value only if number is greater then long.max.

@@ -108,12 +128,12 @@ struct Json {
Null = null_, /// Compatibility alias - will be deprecated soon
Bool = bool_, /// Compatibility alias - will be deprecated soon
Int = int_, /// Compatibility alias - will be deprecated soon
BigInt = bigint, /// Compatibility alias - will be deprecated soon
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is unnecessary.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, removed.

@IgorStepanov IgorStepanov force-pushed the ulong-json branch 2 times, most recently from 19d39e8 to 6496323 Compare May 25, 2015 21:46
@Geod24
Copy link
Contributor

Geod24 commented May 25, 2015

Could you squash your commits into a single one ?

@@ -17,4 +17,12 @@ void main()
Json parent = obj;
parent.remove("item1");
foreach (i; obj) writeln(i);

version (VibeJsonDisableBigInt) {
} else {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seem to be the only occurence of the version left.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My bad. I'll remove it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@IgorStepanov
Copy link
Contributor Author

Could you squash your commits into a single one ?

All commits represents different approaches. If the my last approach will be considered best, I squash commits. Otherwice, it will be much easy to rollback to previous version.

@@ -181,6 +199,8 @@ struct Json {
/// ditto
long opAssign(long v) { m_type = Type.int_; m_int = v; return v; }
/// ditto
BigInt opAssign(BigInt v) { m_type = Type.bigint; initBigInt(); m_bigint = v; return v; }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IDK how BigInt are implemented, but I can see the cycle: 'blitting the default value' -> 'assigning data' -> 'Do it again' as potentially problematic: if BigInt use malloced memory, it's more likely that on assignement, either malloc or free (or both) happen.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IDK how BigInt are implemented, but I can see the cycle

Default value may be only static value, because BigInt is a struct, and BigInt.init should be known at compile time.
Thus everytime, a, b and c in the following example has a identical (bitwise) value. Thus I simply write BigInt.init to the m_data.

    BigInt a;
    BigInt b;
    BigInt c;

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What I was refering to was, if someone calls opAssign with a BigInt value multiple time, the finalizers of all but the last 'instance' will not be run.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This case is fixed.
initBigInt() is called only if old m_type in assign is not Type.bigint
However there are another situation:
Json j = BigInt(...);
j = 5;
I may add a special destructor for it, however BigInt doesn't have ~this, I've checked id.
What do you think?

@Geod24
Copy link
Contributor

Geod24 commented May 26, 2015

All commits represents different approaches. If the my last approach will be considered best, I squash commits. Otherwice, it will be much easy to rollback to previous version.

That request is just to make the history easier to understand. I often find myself blameing a file, to see when the code was introduced, and why. Having to go through various modifications for the same P.R. just eats a lot of time (same goes for history).

As per the rollback, with git, you can easily checkout a branch that points to your current commit, and squash them together. git never destroys old objects - unless nothing points to it, and they are gc'ed, which is ~ 2 weeks later, but won't happen with another branch -.

@IgorStepanov IgorStepanov force-pushed the ulong-json branch 3 times, most recently from e9d66b6 to 02de141 Compare May 26, 2015 00:55
BigInt opAssign(BigInt v)
{
if (m_type != Type.bigint)
initBigInt();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIUC, in the following code:

Json bi;
bi = BigInt("42");
bi = 42L;
bi = BigInt("42");

The bug is still present, isn't it ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I've fixed it. (see the below description)

@s-ludwig
Copy link
Member

@IgorStepanov WRT union vs void*[...], the issues I had were memory leaks, but also it broke using std.algorithm.swap (and thus std.sort etc.). void* needs to be used instead of ubyte to tell the GC to look for pointers within the array. If we'd store for example a Json[] within an ubyte[] array, it may simply get collected even if it's still referenced.

@s-ludwig
Copy link
Member

@Geod24 The destructor issue is actually a good point. In fact, the same issue is in std_data_json/JSONNumber and nobody has noticed it yet (probably because it looks so natural with the union syntax). Maybe it would be worth to at least add a warning when a type with destructor is used within a union and that union is stored in a struct without destructors (same for postblit and possibly opAssign).

Also, I'll have to find the tagged union implementation for Phobos again that I had started some time ago and submit a PR. Algebraic is (not so) nice for many things, but often it's simply desirable to work with a type enum instead of TypeInfos and one shouldn't have to mess with this kind of low level stuff in regular code.

@IgorStepanov
Copy link
Contributor Author

@s-ludwig @Geod24 I've fixed BigInt destruct issue.
Now when we try to call opAssign for BigInt objects, and save non-BigInt to it, finiBigInt is called.
finiBigInt simply creates an automatic BigInt init_ variable, and swaps it with the stored BigInt value.
When finiBigInt is finished, old BigInt value will be destroyed with init_, and m_bigint now contains a BigInt.init value, which can be rewritten with any non-BigInt value without any possible leaks.

@s-ludwig
Copy link
Member

s-ludwig commented Jun 2, 2015

Looks good to me now. I'll just make some minor, mostly cosmetic, adjustments after merge. Does anyone else still see any issue?

@Geod24
Copy link
Contributor

Geod24 commented Jun 2, 2015

Well... I look forward to std.data.json :D

@s-ludwig
Copy link
Member

s-ludwig commented Jun 3, 2015

Good thing is that this change improves compatibility with std.data.json.

Merging now. Thanks Igor for your work!

s-ludwig added a commit that referenced this pull request Jun 3, 2015
@s-ludwig s-ludwig merged commit 309ea82 into vibe-d:master Jun 3, 2015
s-ludwig added a commit that referenced this pull request Jun 3, 2015
- Json.Type.bigint -> "bigInt" (to match "BigInt")
- Move private methods to the bottom of struct Json
- Move type check into finiBigInt and rename to runDestructors
@s-ludwig
Copy link
Member

s-ludwig commented Jun 3, 2015

Okay, got the changes in. The only interface change has been to rename Type.bigint to Type.bigInt.

@IgorStepanov
Copy link
Contributor Author

Okay, got the changes in. The only interface change has been to rename Type.bigint to Type.bigInt.

Thanks!
BTW, when do you plan to release the next vibe.d version (with this patch)? I mean, that when I'll able to fetch it with dub.
I use vibe.d in a commercial product, and if the next release will not be soon, I'll write temporary code to handle this issue until the next version will be released.

@s-ludwig
Copy link
Member

s-ludwig commented Jun 3, 2015

I could tag an alpha/beta release right away.

@s-ludwig
Copy link
Member

s-ludwig commented Jun 3, 2015

Okay 0.7.24-beta.1 is tagged now.

@IgorStepanov
Copy link
Contributor Author

Okay 0.7.24-beta.1 is tagged now.

Thanks a lot! :)

@Geod24 Geod24 mentioned this pull request Jun 8, 2015
@MartinNowak
Copy link
Contributor

???
Javascript doesn't have integer types at all, it uses double-precision (64-bit) float for any Number.
math - What is JavaScript's highest integer value that a Number can go to without losing precision? - Stack Overflow
Try alert(01234567890123456789);.

@s-ludwig
Copy link
Member

I don't think that this really matters, since this is JSON and not JavaScript. JSON doesn't specify anything about the numerical representation, so the implementation is basically free to choose whatever fits. Since large integers are used in practice, it would be quite arbitrary to not support them.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants