This repository has been archived by the owner on Aug 31, 2023. It is now read-only.

Normalize numeric literals #2442

Closed
Tracked by #2403
MichaReiser opened this issue Apr 14, 2022 · 6 comments
Labels: A-Formatter (Area: formatter), I-Normal (Implementation: normal understanding of the tool and awareness)

Comments

@MichaReiser
Contributor

MichaReiser commented Apr 14, 2022

Prettier normalizes numeric literals, e.g. by removing unnecessary trailing zeros, but Rome doesn't. Playground

Input

// Add 0
.1
// Remove .
1.

// B -> b, O -> o, X -> x
0B1;
0O1;
0X1;

// X -> x, HEX digits to lowercase
0X123abcdef456ABCDEF

// E -> e
1.1_0_1E1;
// Remove +
1e+1;

// Remove .
1.e1;
// To 0.1e1
.1e1;

// Remove leading 0
1.1e0010
// Remove +, add leading 0
.1e+0010
// Add leading 0, remove unnecessary leading 0
.1e-0010

// Simplify to 0.5
0.5e0;
0.5e00;
0.5e+0;
0.5e+00;
0.5e-0;
0.5e-00;

// Trim trailing zeros
1.00500;

// Add 0
.1_1;
// A 
0xa_1;
 // X -> x, A -> a
0XA_1;

Prettier

// Add 0
0.1;
// Remove .
1;

// B -> b, O -> o, X -> x
0b1;
0o1;
0x1;

// X -> x, HEX digits to lowercase
0x123abcdef456abcdef;

// E -> e
1.1_0_1e1;
// Remove +
1e1;

// Remove .
1e1;
// To 0.1e1
0.1e1;

// Remove leading 0
1.1e10;
// Remove +, add leading 0
0.1e10;
// Add leading 0, remove unnecessary leading 0
0.1e-10;

// Simplify to 0.5
0.5;
0.5;
0.5;
0.5;
0.5;
0.5;

// Trim trailing zeros
1.005;

// Add 0
0.1_1;
// A
0xa_1;
// X -> x, A -> a
0xa_1;

Rome

// Add 0
.1;
// Remove .
1.;

// B -> b, O -> o, X -> x
0B1;
0O1;
0X1;

// X -> x, HEX digits to lowercase
0X123abcdef456ABCDEF;

// E -> e
1.1_0_1E1;
// Remove +
1e+1;

// Remove .
1.e1;
// To 0.1e1
.1e1;

// Remove leading 0
1.1e0010;
// Remove +, add leading 0
.1e+0010;
// Add leading 0, remove unnecessary leading 0
.1e-0010;

// Simplify to 0.5
0.5e0;
0.5e00;
0.5e+0;
0.5e+00;
0.5e-0;
0.5e-00;

// Trim trailing zeros
1.00500;

// Add 0
.1_1;
// A
0xa_1;
// X -> x, A -> a
0XA_1;

Expected

Rome should normalize binary, octal, and hexadecimal literals, as well as exponent notation, the same way Prettier does.

@Boshen
Contributor

Boshen commented Apr 14, 2022

This normalization is actually number-to-string from the spec.

I think we can use https://crates.io/crates/ryu-js, it's from https://github.com/boa-dev/boa

Ryū-js is a fork of the ryu crate adjusted to comply to the ECMAScript number-to-string algorithm.

@MichaReiser
Contributor Author

> This normalization is actually number-to-string from the spec.
>
> I think we can use https://crates.io/crates/ryu-js, it's from https://github.com/boa-dev/boa
>
> Ryū-js is a fork of the ryu crate adjusted to comply to the ECMAScript number-to-string algorithm.

This library accepts a float and converts it to a string. The formatter, on the other hand, operates on string input. The library also looks rather complicated; I was hoping we could get away with something less sophisticated. For example, all Prettier does is run a few regular expressions:

  return (
    rawNumber
      .toLowerCase()
      // Remove unnecessary plus and zeroes from scientific notation.
      .replace(/^([+-]?[\d.]+e)(?:\+|(-))?0*(\d)/, "$1$2$3")
      // Remove unnecessary scientific notation (1e0).
      .replace(/^([+-]?[\d.]+)e[+-]?0+$/, "$1")
      // Make sure numbers always start with a digit.
      .replace(/^([+-])?\./, "$10.")
      // Remove extraneous trailing decimal zeroes.
      .replace(/(\.\d+?)0+(?=e|$)/, "$1")
      // Remove trailing dot.
      .replace(/\.(?=e|$)/, "")
  );

@ematipico
Contributor

Do you think we could achieve that with only string manipulation? We can't use regex unfortunately (it's better not to)

@Boshen
Contributor

Boshen commented Apr 14, 2022

Hmm... number parsing hasn't been used anywhere yet; maybe this is the time? I'm afraid that any ad hoc algorithm will eventually end up similar to the number-to-string algorithm.

impl JsNumberLiteralExpression {
    pub fn as_number(&self) -> Option<f64> {
        parse_js_number(self.value_token().unwrap().text())
    }
}

@MichaReiser
Contributor Author

> Do you think we could achieve that with only string manipulation? We can't use regex unfortunately (it's better not to)

Sure. A RegEx is just a state machine. The only question is how much code it requires. But I haven't looked into the implementation yet. I'm only concerned with finding all discrepancies.
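
To give a rough sense of how much code a regex-free version might need, here is a minimal sketch that follows the Prettier regular expressions quoted earlier in the thread using plain string operations. The function names are hypothetical and are not part of Rome's API. It assumes the input is already a valid JavaScript numeric literal token, and literals containing numeric separators may come out slightly different from what Prettier prints.

```rust
/// Hypothetical helper: normalizes the text of a JavaScript numeric literal
/// without regular expressions. Assumes `raw` is a valid literal token.
fn normalize_number_literal(raw: &str) -> String {
    // Lowercases prefixes (0X -> 0x), hex digits, and the exponent marker.
    let text = raw.to_ascii_lowercase();

    // Binary, octal, and hex literals only need the lowercase pass.
    if text.starts_with("0b") || text.starts_with("0o") || text.starts_with("0x") {
        return text;
    }

    // Split a decimal literal into mantissa and optional exponent.
    let (mantissa, exponent) = match text.split_once('e') {
        Some((m, e)) => (m, Some(e)),
        None => (text.as_str(), None),
    };

    let mut out = normalize_mantissa(mantissa);
    if let Some(exp) = exponent.and_then(normalize_exponent) {
        out.push('e');
        out.push_str(&exp);
    }
    out
}

/// Adds a leading zero, trims trailing fraction zeros, drops a trailing dot.
fn normalize_mantissa(mantissa: &str) -> String {
    let (int_part, mut frac) = match mantissa.split_once('.') {
        Some(parts) => parts,
        None => return mantissa.to_string(),
    };
    // Trim a trailing zero only if a digit precedes it, so that one fraction
    // digit always remains and a numeric separator is never left dangling.
    while frac.len() > 1
        && frac.ends_with('0')
        && frac.as_bytes()[frac.len() - 2].is_ascii_digit()
    {
        frac = &frac[..frac.len() - 1];
    }
    let int_part = if int_part.is_empty() { "0" } else { int_part };
    if frac.is_empty() {
        int_part.to_string()
    } else {
        format!("{}.{}", int_part, frac)
    }
}

/// Removes `+` and leading zeros; returns `None` when the exponent is zero.
fn normalize_exponent(exp: &str) -> Option<String> {
    let (sign, mut digits) = match exp.strip_prefix('-') {
        Some(rest) => ("-", rest),
        None => ("", exp.strip_prefix('+').unwrap_or(exp)),
    };
    // Drop a leading zero only while another digit follows it.
    while digits.len() > 1
        && digits.starts_with('0')
        && digits.as_bytes()[1].is_ascii_digit()
    {
        digits = &digits[1..];
    }
    if digits == "0" {
        None
    } else {
        Some(format!("{}{}", sign, digits))
    }
}
```

With this sketch, `normalize_number_literal(".1e+0010")` yields `0.1e10` and `normalize_number_literal("0XA_1")` yields `0xa_1`, matching the Prettier output listed in the issue description.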

@Boshen I'm not sure if this is implementing the toString algorithm. For example:

0x01.toString() 

Returns 1 but Prettier prints it as 0x01 because that's closest to what the user wrote.
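
The same loss of information can be illustrated in Rust. The sketch below uses a hypothetical helper (not anything from Rome) that evaluates a hex literal to a number and prints it back, which is roughly what a parse-then-number-to-string approach would do. The `0x` prefix and the leading zero do not survive the round trip.

```rust
/// Hypothetical helper: parses a hex literal such as `0x01` into a number
/// and formats the value back as a decimal string.
fn round_trip_hex(raw: &str) -> Option<String> {
    let digits = raw
        .strip_prefix("0x")
        .or_else(|| raw.strip_prefix("0X"))?;
    // Only the numeric value is kept; the original spelling is discarded.
    let value = u64::from_str_radix(digits, 16).ok()?;
    Some(value.to_string())
}
```

Here `round_trip_hex("0x01")` returns `Some("1")`, whereas a purely textual normalization would keep the literal as `0x01`.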

@xunilrj xunilrj self-assigned this Apr 23, 2022
@ematipico ematipico added the I-Normal Implementation: normal understanding of the tool and awareness label May 5, 2022
@MichaReiser
Contributor Author

MichaReiser commented Oct 11, 2022

Duplicate of #3294

@MichaReiser MichaReiser marked this as a duplicate of #3308 Oct 11, 2022
@MichaReiser MichaReiser moved this to Done in Rome 2022 Oct 11, 2022
@MichaReiser MichaReiser marked this as a duplicate of #3294 Oct 18, 2022