Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Change mod for Int #161

Closed
Thimoteus opened this issue Feb 17, 2018 · 21 comments
Closed

Change mod for Int #161

Thimoteus opened this issue Feb 17, 2018 · 21 comments

Comments

@Thimoteus
Copy link
Contributor

For example, -3 mod 9 should be 6 (since 6 - 9 = -3) but in PS we have mod (-3) 9 == (-3), which ultimately stems from JS's -3 % 9 === -3

Some people suggest using the following FFI:

function mod(n, m) = {
  return ((n % m) + m) % m;
}

References:
http://www.seqmedia.com/2011/03/16/javascript-negative-numbers-modulus-bug/
https://stackoverflow.com/questions/4467539/javascript-modulo-gives-a-negative-result-for-negative-numbers

@garyb
Copy link
Member

garyb commented Feb 17, 2018

Hmm interesting, as doing this would break our current Quotient/remainder law:

a = -3
b = 9
q = a / b = 0
r = a `mod` b = -3
a' = q * b + r = -3 = a

vs

r = a `mod` b = 6
a' = q * b + r = 6 /= a

@hdgarrood
Copy link
Contributor

We can have this version of mod and still abide by the quotient/remainder law by changing div appropriately. In fact the suggested version of mod is a lot more common in mathematics.

We’d just have to change div so that eg (-3) / 9 == (-9). Which is perhaps a bit weird, but then again so is the current mod. The way to think of it is that div a b gives you the largest multiple of b less than a.

@hdgarrood
Copy link
Contributor

See https://en.m.wikipedia.org/wiki/Modulo_operation. The suggested version is what is called “Euclidean division” on that page. Arguably that’s the most senate option given that the class is called EuclideanRing.

@hdgarrood
Copy link
Contributor

*sensible

@garyb
Copy link
Member

garyb commented Feb 17, 2018

I agree the proposed mod seems more sensible, I was just pointing out it's problematic for the rules it's supposed to be abiding by 🙂

The revised div seems extremely surprising to me, I'd probably rather we reconsidered the laws yet again than have that as the behaviour as I doubt it's what most people would assume for the term of "division" (a good example of why "friendly" names aren't great 😉).

@hdgarrood
Copy link
Contributor

hdgarrood commented Feb 17, 2018

Oops, sorry, I made a mistake - we’d actually have the revised div return the number you need to multiply the divisor by to get the largest multiple less than the dividend. So you’d have (-3) / 9 == (-1), so that ((-3) / 9) * 9 + ((-3) `mod` 9) = (-1) * 9 + 6 = (-3).

I disagree, I think this is an entirely sensible notion of division. It’s only that most programming languages do it the silly way, in that x mod n can take up to 2n-1 different values, and people are used to this.

@hdgarrood
Copy link
Contributor

I recommend reading http://a-guide-to-the-purescript-numeric-hierarchy.readthedocs.io/en/latest/, specifically the chapters on the Euclidean algorithm, on polynomials, and on Euclidean rings to see how this is a very natural and powerful way of generalising the idea of integer division.

@garyb
Copy link
Member

garyb commented Feb 17, 2018

You don't have to convince me that doing the lawful thing is right, I just worried about assigning not-most-layman-obvious behaviours to operations that everyone thinks they understand already. We have very few of those, just the arithmetic ones really. mod already gets us in trouble there a bit given than it's 0 for every Number - but that's a JS expectation, whereas I feel like just about everyone "knows" what division means.

Having said all that, -1 is a much less surprising result 😛, so I probably am fine with this. I assume 3 / 9 is 0 still?

@hdgarrood
Copy link
Contributor

Yep, 3/9 would still be 0. I’m pretty sure everything would be exactly the same when both arguments are nonnegative.

@hdgarrood
Copy link
Contributor

It's worth noting that once you have chosen either of div or mod, you don't have any choice in the other; say you've chosen your div - then, the laws force you to define

a `mod` b = a - (a `div` b) * b

(or equivalent). Alternatively, suppose you've chosen your mod, then you have to define a `div` b so that

(a `div` b) * b = a - a `mod` b

holds, and since we are working in an integral domain (this is a prerequisite to have a EuclideanRing instance), there can only be one choice of a `div` b which satisfies the above equation, by the cancellation law (which says that if xy=xz and x is not zero, then y=z).


There's one more alternative we should probably consider, which is where division rounds towards negative infinity. This has the property that a `mod` b takes the sign of b, which means that the function (_ `mod` b) can only take abs b different values, which I think gets you most of the way there, although it's possibly not quite as useful as guaranteeing that the result of mod is always nonnegative.

That leaves us with:

  • "T-definition", truncated division -- this is what we currently have:

    • div rounds towards zero
    • a `mod` b takes the sign of a
  • "F-definition", floored division -- this is what div and mod in Haskell do:

    • div rounds towards negative infinity
    • a `mod` b takes the sign of b
  • "E-definition", Euclidean division:

    • a `div` b rounds towards negative infinity when b is positive, and towards positive infinity when b is negative
    • a `mod` b is always nonnegative

where I'm using the same names as Boute does in his paper which discusses this exact topic [1]. (I accessed it through my university; I'm not sure how to access it if you're not a student, unfortunately.)

The first thing to note is that all three definitions are equivalent when both divisor and dividend are nonnegative (if I write a `div` b, then a is the dividend and b is the divisor). Also, the E- and F- definitions are equivalent when the divisor is positive.

The only thing going for the T-definition is that it has the property that (-a) `div` b = -(a `div` b); I'm not sure how useful this is, but it does at least make division with negative dividends easier to explain. With the E- and F-definitions, you have a slightly more complicated property; the above property only holds when a goes into b exactly, and otherwise the two quantities will have a difference of 1.

Both the E- and F-definitions have the useful property that a `mod` m == b `mod` m if and only if a and b are actually congruent modulo m, whereas the T-definition doesn't have this. The E- definition, however, goes a bit further: it is the only one which guarantees that the result of mod is nonnegative.

For me personally, this means that (imo) we should only be considering the E- and F-definitions for the default EuclideanRing Int instance, and we should provide separate functions and possibly a newtype wrapper for anyone wanting the T-definition (Haskell does this, by having quot and rem provide the T-definition).

The F-definition has one advantage over the E-definition in that plenty of existing programming languages implement it already; for example, Haskell (div/mod), Lua, Python, and Ruby. However there are also some which use the E-definition, e.g. Dart and the Z3 theorem prover.

I would be happy with either the E- or F- definitions. The suggested FFI code yields the F-definition. I think my preference would be the E-definition, but I'll mull it over.

[1]: "The Euclidean definition of the functions div and mod", Raymond T. Boute, ACM Transactions on Programming Languages and Systems, Vol 14, No. 2, April 1992, pages 127-144.

@Thimoteus
Copy link
Contributor Author

Thanks @hdgarrood for this amazing research. I'm also leaning towards E based on what you've presented, and agree that E and F are both better than T.

@garyb
Copy link
Member

garyb commented Feb 23, 2018

E appeals to me too, I wonder why F is apparently more common.

There is one slight problem with whatever we do here: we'll need to encode/update the behaviour in the inliner too I think.

@garyb
Copy link
Member

garyb commented Apr 13, 2018

I guess we should consider doing this now? I'm not too sure how to deal with it on the compiler side of things, since it will be a bit weird if using the 0.12 compiler with Prelude v3.x behaves differently than when using 0.11.

@LiamGoodacre
Copy link
Member

I think in this case all we can do document the behaviour change. If people are expecting things to just work across major versions without checking what's changed, they're likely in for a bad time. Not sure there's anything better?

@garyb
Copy link
Member

garyb commented Apr 14, 2018

That works for me, hopefully most people will be updating library dependencies along with the compiler, and I guess in some instances that'll be required even. Plus this is somewhat undefined behaviour right now, and only affects mod with negative operands so probably will be low impact.

@garyb
Copy link
Member

garyb commented Apr 14, 2018

Looking at the various options again, I think the E definition is probably the easiest to implement too, since div behaves that way already with the JS we're using for integers ((x / y) | 0):

(2 / 3) | 0 == 0
(2 / -3) | 0 == 0

(That's right for E - it'd be 2 / -3 = -1 for F, right?)

edit: moved sign to RHS

@hdgarrood
Copy link
Contributor

Not quite - the JS version is the T-definition, and T happens to coincide with E when the dividend is positive. However with negative dividends they won’t match. For example, you’ll have -2/3 being -1 for the E-definition but 0 for the T-definition.

@garyb
Copy link
Member

garyb commented Apr 15, 2018

Hmm, I see 🤔

@hdgarrood
Copy link
Contributor

Would you mind if I take what you've done so far and build on it in a separate PR? I think this change is probably important enough that it deserves its own PR.

@garyb
Copy link
Member

garyb commented Apr 15, 2018

Certainly, please do! I'm sure you have a better understanding of this than I do, just thought I'd get something moving with it.

hdgarrood added a commit that referenced this issue Apr 15, 2018
Also provide `quot` and `rem`, like Haskell does, for users who do want
truncating division - the one which matches what JS does.

I've temporarily exported `intDiv` and `intMod` so that I can use those
in the tests and the compiler won't 'inline' different definitions of
them; we'll want to modify the compiler to change this before merging.
garyb added a commit that referenced this issue Apr 22, 2018
Switch to Euclidean division for Int, resolves #161
@garyb
Copy link
Member

garyb commented Apr 30, 2018

Closing this as it's resolved in the 0.12 branch.

@garyb garyb closed this as completed Apr 30, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants