Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow alternative space characters as group separator when parsing numbers #1007

Merged
merged 2 commits into from
Apr 22, 2024

Conversation

ronnix
Copy link
Contributor

@ronnix ronnix commented Jun 8, 2023

Context

The French group separator is "\u202f" (narrow non-breaking space), but when parsing numbers in the real world, you will most often encounter either a regular space character (" ") or a non-breaking space character ("\xa0").

The issue was partially adressed earlier in #637, but only to allow regular spaces instead of non-breaking spaces "\xa0" in parse_decimal.

Contents

This PR goes further by changing both parse_number and parse_decimal to allow any other space character (using the \s character class of regular expressions) any of those 3 space characters when the expected group symbol is itself one such space character, but is not present in the string to parse.

Unit tests are included.

Copy link
Member

@akx akx left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the contribution! Some perf-related comments within.

babel/numbers.py Outdated Show resolved Hide resolved
babel/numbers.py Outdated Show resolved Hide resolved
@ronnix ronnix force-pushed the allow-other-spaces-as-group-separator branch 2 times, most recently from cab7135 to 7fc29f6 Compare November 9, 2023 11:08
@akx
Copy link
Member

akx commented Apr 16, 2024

@ronnix Sorry for dropping the ball on my part here... Could you rebase this? :)

@ronnix ronnix force-pushed the allow-other-spaces-as-group-separator branch from 7fc29f6 to b213b51 Compare April 16, 2024 12:57
@ronnix
Copy link
Contributor Author

ronnix commented Apr 17, 2024

@akx done!

…mbers

The French group separator is `"\u202f"` (narrow non-breaking space),
but when parsing numbers in the real world, you will most often encounter
either a regular space character (`" "`) or a non-breaking space character
(`"\xa0"`).

The issue was partially adressed earlier in python-babel#637,
but only to allow regular spaces instead of non-breaking spaces `"\xa0"` in
`parse_decimal`.

This commit goes further by changing both `parse_number` and `parse_decimal`
to allow any other space character (using the `\s` character class of
regular expressions) when the group character is itself a space character,
but is not present in the string to parse.

Unit tests are included.
… as group symbols

The `\s` character class (or the `string.isspace()` method) could match
characters like new lines that we probably don’t want to consider as potential
group symbols in numbers.
@ronnix ronnix force-pushed the allow-other-spaces-as-group-separator branch from b213b51 to 181d701 Compare April 18, 2024 08:00
@ronnix
Copy link
Contributor Author

ronnix commented Apr 18, 2024

@akx I fixed the linter error

Copy link

codecov bot commented Apr 22, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 90.69%. Comparing base (e0d1018) to head (181d701).
Report is 1 commits behind head on master.

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #1007      +/-   ##
==========================================
- Coverage   90.99%   90.69%   -0.31%     
==========================================
  Files          26       26              
  Lines        4444     4449       +5     
==========================================
- Hits         4044     4035       -9     
- Misses        400      414      +14     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Copy link
Member

@akx akx left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, LGTM! 😃

@akx akx merged commit c0fb56e into python-babel:master Apr 22, 2024
22 of 23 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants