Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

datatype.number does not always generate numbers with expected precision #331

Closed
pkuczynski opened this issue Jan 28, 2022 · 9 comments · Fixed by #2581
Closed

datatype.number does not always generate numbers with expected precision #331

pkuczynski opened this issue Jan 28, 2022 · 9 comments · Fixed by #2581
Labels
c: bug Something isn't working help wanted Extra attention is needed m: datatype Something is referring to the datatype module p: 1-normal Nothing urgent s: accepted Accepted feature / Confirmed bug
Milestone

Comments

@pkuczynski
Copy link
Member

Describe the bug

Due to the limitations of float arithmetics the precision param is not always well reflected in the generated output.

Reproduction

> faker.datatype.number({ precision: 0.01 })
64246.18
> faker.datatype.number({ precision: 0.001 })
75598.72
> faker.datatype.number({ precision: 0.0001 })
15719.824
> faker.datatype.number({ precision: 0.00001 })
57268.07910000001

Additional Info

Suggested solution would be to use a better math library offering arbitrary-precision decimal arithmetic, eg. https://www.npmjs.com/package/big.js

@pkuczynski pkuczynski added s: pending triage Pending Triage c: bug Something isn't working labels Jan 28, 2022
@github-actions github-actions bot removed the s: pending triage Pending Triage label Jan 28, 2022
@Shinigami92 Shinigami92 added this to the v6.1 - First bugfixes milestone Jan 28, 2022
@luciferreeves
Copy link
Contributor

luciferreeves commented Jan 28, 2022

After I had some look into this, I noticed this only happens with precision of 0.0001. Then I came up with some simple code to just truncate the extra digits from the output. This way, we won't have to use Math.round() or some library and the results will work with seed too.

The code is in javascript, and probably not perfect and certainly a hacky approach, but it does cover all edge cases:

function getDecimals(a) {
  if (!isFinite(a)) return 0;
  let e = 1, p = 0;
  while (Math.round(a * e) / e !== a) { e *= 10; p++; }
  return p;
}

function toFixed(x) {
  if (Math.abs(x) < 1.0) {
    const e = parseInt(x.toString().split('e-')[1]);
    if (e) {
        x *= Math.pow(10,e-1);
        x = '0.' + (new Array(e)).join('0') + x.toString().substring(2);
    }
  } else {
    const e = parseInt(x.toString().split('+')[1]);
    if (e > 20) {
        e -= 20;
        x /= Math.pow(10,e);
        x += (new Array(e+1)).join('0');
    }
  }
  return x;
}

function toFixedTrunc(x, n) {
  x = toFixed(x) 
  const v = (typeof x === 'string' ? x : x.toString()).split('.');
  if (n <= 0) return v[0];
  let f = v[1] || '';
  if (f.length > n) return `${v[0]}.${f.substr(0,n)}`;
  while (f.length < n) f += '0';
  return `${v[0]}.${f}`;
}

const { faker } = require('@faker-js/faker');
const precision = 0.00001;
const decimals = getDecimals(precision);

function generate() {
  const generatedNumber = faker.datatype.number({ precision });
  return getDecimals(generatedNumber) <= decimals ? generatedNumber : toFixedTrunc(generatedNumber, decimals)
}

let counter = 0;
while(counter < 10) {
  console.log(generate())
  counter++;
}


/*
Output: 

32320.83572
37150.50227
74120.61501
97321.44276
86947.56555
98595.50214
24725.63345
40594.42884
921.41321
41778.05577
*/

repl here: https://replit.com/@luciferreeves/FrozenNiftyCategories#index.js

@pkuczynski
Copy link
Member Author

The issue is not only on 0.0001. Another example:

faker.datatype.number({ precision: 0.000000001 })
> 43583.241103992004

12 precision points, when 9 was requested. Big.js is not a large library and I think it does not make sense for us to battle with float arithmetics.

@luciferreeves
Copy link
Contributor

luciferreeves commented Jan 28, 2022

Even though the problem is with { precision: 0.000000001 } too, the code above should still work for all precision values. That's why I provided the repl, try changing the precision there. Here are the results with { precision: 0.000000001 }:

45543.57002466
54939.625723624
50.909680645
1796.946656181
43156.280539308
37206.544535025
77680.240670616
50875.845546187
61613.707753117
94161.122681139

Update:

I modified the generate method in the code above to take a precision:

function generate(precision) {
  const generatedNumber = faker.datatype.number({ precision });

  // This code from here goes into the faker.datatype.number implementation:
  const decimals = getDecimals(precision);
  return getDecimals(generatedNumber) <= decimals ? generatedNumber : toFixedTrunc(generatedNumber, decimals)
}

then, I added a simple test, generated 10000 numbers of the same precision and I did it for all precisions (upto 16 decimal places) and none of them fail:

// Here we test the precisions
let precision = 1;
let precisionArray = [];
let ctr = 0;
while(ctr !== 17) {
  precisionArray.push(precision);
  // this is equivalent of doing precision = precision/10 but with accurate results.
  precision = parseFloat(`0.${'0'.repeat(ctr)}1`);
  ctr += 1;
}

precisionArray.forEach(p => {
  process.stdout.write(`Precision: ${p.toExponential()} \t`)
  const fail = false;
  let counter = 1

  while(counter <= 10000) {
    const num = generate(precision);
    if (getDecimals(num) > getDecimals(precision)) {
      console.log(`Test ${counter} failed for precision ${precision}, generated number: ${num}.`)
      fail = true;
      break;
    }
    counter++;
  }

  if(!fail) {
    process.stdout.write(`PASS \n`)
  } else {
    process.stdout.write(`FAIL \n`)
  }
});

Output:

Precision: 1e+0     PASS 
Precision: 1e-1     PASS 
Precision: 1e-2     PASS 
Precision: 1e-3     PASS 
Precision: 1e-4     PASS 
Precision: 1e-5     PASS 
Precision: 1e-6     PASS 
Precision: 1e-7     PASS 
Precision: 1e-8     PASS 
Precision: 1e-9     PASS 
Precision: 1e-10    PASS 
Precision: 1e-11    PASS 
Precision: 1e-12    PASS 
Precision: 1e-13    PASS 
Precision: 1e-14    PASS 
Precision: 1e-15    PASS 
Precision: 1e-16    PASS 

I also updated the repl, so you can try this out yourself. I think this method fixes the bug.

@ST-DDT
Copy link
Member

ST-DDT commented Apr 5, 2022

How many digits of precision should we support?
And how thoroughly do we support these (e.g. precision such as 0.0003)

@Shinigami92
Copy link
Member

https://en.wikipedia.org/wiki/Double-precision_floating-point_format

IMO we should support a max precision of 15 digits, definitely 14 digits. Everything more precise I would highly suggest to use a different library that is more targeted to the use-case.

@ST-DDT
Copy link
Member

ST-DDT commented Apr 5, 2022

If we would limit ourselves to 12 digits, then a simple regex replace would do the job for us.

@pkuczynski
Copy link
Member Author

Any number would do for me as long as it works ;)

@import-brain import-brain added the s: needs decision Needs team/maintainer decision label Apr 7, 2022
@xDivisionByZerox xDivisionByZerox added the m: datatype Something is referring to the datatype module label Jul 30, 2022
@ST-DDT ST-DDT added help wanted Extra attention is needed s: accepted Accepted feature / Confirmed bug and removed s: needs decision Needs team/maintainer decision labels Sep 8, 2022
@clocke3
Copy link

clocke3 commented Jan 19, 2024

Heads up here, but I made this ticket a few weeks ago on the same matter. Are we expected to see this change in the next update?

@ST-DDT
Copy link
Member

ST-DDT commented Jan 19, 2024

Fixed by #2581 at least for precisions in the form of 10^-n.

This will be in the next release (v8.4).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
c: bug Something isn't working help wanted Extra attention is needed m: datatype Something is referring to the datatype module p: 1-normal Nothing urgent s: accepted Accepted feature / Confirmed bug
Projects
No open projects
Status: Todo
7 participants