Tests for `nqp::coerce_si` #759

nwc10 · 2022-01-29T13:40:44Z

coerce_si is not sufficiently tested. This draft PR asks what we should test (and hence specify)

Currently with MoarVM master we see this:

Iteration past end of grapheme iterator
   at -e:1  (<ephemeral file>:<mainline>)
 from gen/moar/stage2/NQPHLL.nqp:1949  (/home/nick/Perl/nqp/NQPHLL.moarvm:eval)
 from gen/moar/stage2/NQPHLL.nqp:2059  (/home/nick/Perl/nqp/NQPHLL.moarvm:)
 from gen/moar/stage2/NQPHLL.nqp:2114  (/home/nick/Perl/nqp/NQPHLL.moarvm:command_eval)
 from gen/moar/stage2/NQPHLL.nqp:2039  (/home/nick/Perl/nqp/NQPHLL.moarvm:command_line)
 from gen/moar/stage2/NQP.nqp:4111  (nqp.moarvm:MAIN)
 from gen/moar/stage2/NQP.nqp:1  (nqp.moarvm:<mainline>)
 from <unknown>:1  (nqp.moarvm:<main>)
 from <unknown>:1  (nqp.moarvm:<entry>)

which is a change in behaviour from all releases. This is caused by MoarVM/MoarVM@b80996f (and will soon be fixed)

However, it's unclear what the desired behaviour is for corner cases. The tests here all pass once MoarVM is fixed. However, they would fail on released nqp

MoarVM used to use strtoll, which returns 0 on parsing failures. However, the call MVM_string_ascii_encode_any would throw an exception for any character outside of 0-127, hence the tests here fail like this:

ok 205 - coerce_si(chr 127) is 0
Error encoding ASCII string: could not encode codepoint 128
   at /home/nick/test/109-coercions.t:106  (<ephemeral file>:<mainline>)
 from gen/moar/stage2/NQPHLL.nqp:1949  (/home/nick/Perl/nqp/NQPHLL.moarvm:eval)
 from gen/moar/stage2/NQPHLL.nqp:2154  (/home/nick/Perl/nqp/NQPHLL.moarvm:evalfiles)
 from gen/moar/stage2/NQPHLL.nqp:2114  (/home/nick/Perl/nqp/NQPHLL.moarvm:command_eval)
 from gen/moar/stage2/NQPHLL.nqp:2039  (/home/nick/Perl/nqp/NQPHLL.moarvm:command_line)
 from gen/moar/stage2/NQP.nqp:4111  (nqp.moarvm:MAIN)
 from gen/moar/stage2/NQP.nqp:1  (nqp.moarvm:<mainline>)
 from <unknown>:1  (nqp.moarvm:<main>)
 from <unknown>:1  (nqp.moarvm:<entry>)

JVM is much stricter, failing at the first new test:

ok 28 - int32 to str conversion
java.lang.NumberFormatException: For input string: "    "
  in <mainline> (/home/nick/test/109-coercions.t:109)
  in eval (gen/jvm/stage2/NQPHLL.nqp:1217)
  in evalfiles (gen/jvm/stage2/NQPHLL.nqp:1461)
  in command_eval (gen/jvm/stage2/NQPHLL.nqp:1351)
  in command_line (gen/jvm/stage2/NQPHLL.nqp:1310)
  in MAIN (gen/jvm/stage2/NQP.nqp:4191)
  in <mainline> (gen/jvm/stage2/NQP.nqp:4187)
  in <anon> (gen/jvm/stage2/NQP.nqp)

@MasterDuke17 reports that the relevant conversion API we use is very exacting:

The characters in the string must all be decimal digits except that the first character may be an ASCII minus sign '-' (\u002D') to indicate a negative value or an ASCII plus sign '+' ('\u002B') to indicate a positive value.

Hence any spaces are right out:

$ ./nqp-j -e 'nqp::say(nqp::coerce_si(" 42"));'
java.lang.NumberFormatException: For input string: " 42"
  in <mainline> (-e:1)
  in eval (gen/jvm/stage2/NQPHLL.nqp:1217)
  in <anon> (gen/jvm/stage2/NQPHLL.nqp:1329)
  in command_eval (gen/jvm/stage2/NQPHLL.nqp:1326)
  in command_line (gen/jvm/stage2/NQPHLL.nqp:1310)
  in MAIN (gen/jvm/stage2/NQP.nqp:4191)
  in <mainline> (gen/jvm/stage2/NQP.nqp:4187)
  in <anon> (gen/jvm/stage2/NQP.nqp)

Hence the question is - what should the spec be?

I'm happy to recode either MoarVM or the Java code (or both) to be consistent, but I'd like to have a more tightly defined spec of how much slop we're allowed in the number, and for which cases we throw an exception, which cases we return 0, which leading characters we can ignore, and which cases we just ignore "trailing garbage"

BloomingAzaleas · 2022-02-01T19:31:32Z

"Hence the question is - what should the spec be?

I'm happy to recode either MoarVM or the Java code (or both) to be consistent, but I'd like to have a more tightly defined spec of how much slop we're allowed in the number, and for which cases we throw an exception, which cases we return 0, which leading characters we can ignore, and which cases we just ignore 'trailing garbage'"

I suggest a way to think about this should be in terms of layers based on the observation that the lower the layer, the more difficult and hidden it is to diagnose the sloppiness of a spec and the more widespread and opaque the symptoms. The lowest layer here should have the tightest spec, but part of that spec should be a statement or contract expectation on the next layer(s) up the expected usage call stack. I have only a vague grasp of the stack here but it seems to be approximately raku on nqp on moar/jvm, the latter being the bytecode engines at the lowest level. In this case of string to numeric conversion, likely a hot code path for almost everything, I suggest the bytecode engines should only ever have to deal with 'well-formed' string input with the goal of absolutely minimizing code complexity (meaning detecting and dealing with featuritus cases) in the bytecode engines. NQP (or higher?), the next level up in abstraction, would then offer coerce_possibly_sloppy_numeric_strings_xx () using code at a higher level of abstration (and thus presumably more transparent and diagnosable) to deal with any featuritus complexity you might want to offer.

In this model, you deliberately consider complexity elements and trade-off allocate them to appropriate layers in the stack. Your complete solution would then be (possibly) mods to the bytecode engines paired with (definitely) contract mods to NQP.

Regards.

Tests for nqp::coerce_si

1feccc9

nwc10 mentioned this pull request Jan 29, 2022

Inline ascii ctype MoarVM/MoarVM#1657

Merged

coke changed the base branch from master to main April 19, 2023 13:39

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Tests for `nqp::coerce_si` #759

Tests for `nqp::coerce_si` #759

nwc10 commented Jan 29, 2022

BloomingAzaleas commented Feb 1, 2022

Tests for nqp::coerce_si #759

Are you sure you want to change the base?

Tests for nqp::coerce_si #759

Conversation

nwc10 commented Jan 29, 2022

BloomingAzaleas commented Feb 1, 2022

Tests for `nqp::coerce_si` #759

Tests for `nqp::coerce_si` #759