Parse negative numbers in Norwegian (and 59 other languages) (#290)
The DecimalFormatSymbols for Norwegian and 59 other languages use the
minus-sign (unicode 8722) instead of the hyphen-minus sign (ascii 45).

While technically correct, Gherkin is written on regular keyboards and
there is no practical way to type a minus-sign. By patching the
`DecimalFormatSymbols` with a regular hyphen-minus sign we solve this problem.

Additionally, for the same reason, the non-breaking space (unicode 160)
and right single quotation mark (unicode 8217) for thousands separators
are also patched with either a period or comma.
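
The code points mentioned above can be checked per locale. The following is a
minimal sketch and not part of this commit; `SymbolInspector` and `describe`
are hypothetical names, and the values printed depend on the JDK version and
its locale data.

```java
import java.text.DecimalFormatSymbols;
import java.util.Locale;

// Hypothetical helper for illustration; it does not exist in cucumber-expressions.
final class SymbolInspector {

    // Renders a character as its Unicode code point in hex and decimal.
    static String describe(char c) {
        return String.format(Locale.ROOT, "U+%04X (%d)", (int) c, (int) c);
    }

    public static void main(String[] args) {
        // The language tags are arbitrary examples.
        for (String tag : new String[] {"en", "no", "fr-CA", "de-CH"}) {
            DecimalFormatSymbols symbols = DecimalFormatSymbols.getInstance(Locale.forLanguageTag(tag));
            System.out.println(tag
                    + " minus=" + describe(symbols.getMinusSign())
                    + " decimal=" + describe(symbols.getDecimalSeparator())
                    + " grouping=" + describe(symbols.getGroupingSeparator()));
        }
    }
}
```

A locale whose minus sign prints as U+2212 is one that would not accept a
hyphen-minus typed on a regular keyboard without the patch.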

Fixes: #287
mpkorstanje authored Mar 21, 2024
1 parent b3f0892 commit 3f535a2
Showing 8 changed files with 220 additions and 15 deletions.
4 changes: 4 additions & 0 deletions CHANGELOG.md
@@ -6,7 +6,11 @@ The format is based on [Keep a Changelog](http://keepachangelog.com/)
and this project adheres to [Semantic Versioning](http://semver.org/).

## [Unreleased]
### Added
- [Java] Assume numbers use either a comma or period for the thousands separator instead of non-breaking spaces. ([#290](https://github.com/cucumber/cucumber-expressions/pull/290))

### Fixed
- [Java] Parse negative numbers in Norwegian (and 59 other languages) ([#290](https://github.com/cucumber/cucumber-expressions/pull/290))
- [Python] Remove support for Python 3.7 and extend support to 3.12 ([#280](https://github.com/cucumber/cucumber-expressions/pull/280))
- [Python] The `ParameterType` constructor's `transformer` should be optional ([#288](https://github.com/cucumber/cucumber-expressions/pull/288))

25 changes: 23 additions & 2 deletions README.md
@@ -65,14 +65,35 @@ the following built-in parameter types:
| `{short}` | Matches the same as `{int}`, but converts to a 16 bit signed integer if the platform supports it. |
| `{long}` | Matches the same as `{int}`, but converts to a 64 bit signed integer if the platform supports it. |

### Cucumber-JVM
### Java

### The Anonymous Parameter

The *anonymous* parameter type will be converted to the parameter type of the step definition using an object mapper.
Cucumber comes with a built-in object mapper that can handle all numeric types as well as `Enum`.

To automatically convert to other types it is recommended to install an object mapper. See [configuration](https://cucumber.io/docs/cucumber/configuration)
To automatically convert to other types it is recommended to install an object mapper. See [cucumber-java - Default Transformers](https://github.com/cucumber/cucumber-jvm/tree/main/cucumber-java#default-transformers)
to learn how.

### Number formats

Java supports parsing localised numbers. For example, in an English feature file you
can write one thousand and one tenth as '1,000.1', while in French you would write it
as '1.000,1'.

Parsing is facilitated by Java's [`DecimalFormat`](https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/text/DecimalFormat.html)
and includes support for scientific notation. Unfortunately, the default
localisations include symbols that cannot be easily written on a regular
keyboard. So a few substitutions are made:

* The minus sign is always hyphen-minus - (ascii 45).
* If the decimal separator is a period (. ascii 46) the thousands separator is a comma (, ascii 44).
So '1 000.1' and '1’000.1' should always be written as '1,000.1'.
* If the decimal separator is a comma (, ascii 44) the thousands separator is a period (. ascii 46).
So '1 000,1' or '1’000,1' should always be written as '1.000,1'.
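
The substitutions above can be sketched outside the library as follows. This is
an illustration under assumptions, not the library's API: `KeyboardFriendlyParsing`
and its `parse` method are hypothetical names, and the pattern `#,##0.###` is
chosen only to keep the example self-contained.

```java
import java.math.BigDecimal;
import java.text.DecimalFormat;
import java.text.DecimalFormatSymbols;
import java.text.ParseException;
import java.util.Locale;

// Hypothetical stand-alone sketch; the library's own classes are package-private.
final class KeyboardFriendlyParsing {

    static BigDecimal parse(Locale locale, String text) {
        DecimalFormatSymbols symbols = DecimalFormatSymbols.getInstance(locale);

        // Swap the Unicode minus sign for the hyphen-minus found on keyboards.
        if (symbols.getMinusSign() == '\u2212') {
            symbols.setMinusSign('-');
        }
        // Pair a period decimal separator with a comma for thousands, and vice versa.
        if (symbols.getDecimalSeparator() == '.') {
            symbols.setGroupingSeparator(',');
        } else if (symbols.getDecimalSeparator() == ',') {
            symbols.setGroupingSeparator('.');
        }

        // Assumed pattern; the format is built after the symbols were adjusted.
        DecimalFormat format = new DecimalFormat("#,##0.###", symbols);
        format.setParseBigDecimal(true);
        try {
            return (BigDecimal) format.parse(text);
        } catch (ParseException e) {
            throw new IllegalArgumentException("Not a number for " + locale + ": " + text, e);
        }
    }
}
```

With this sketch, `parse(Locale.forLanguageTag("no"), "-1.042,2")` and
`parse(Locale.ENGLISH, "-1,042.2")` are both expected to yield the same value.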

If support for your preferred language could be improved, please create an issue!

### Custom Parameter types

Cucumber Expressions can be extended so they automatically convert
@@ -0,0 +1,34 @@
package io.cucumber.cucumberexpressions;

import java.text.DecimalFormatSymbols;
import java.util.Locale;

/**
 * A set of localized decimal symbols that can be written on a regular keyboard.
 * <p>
 * Not quite complete, feel free to make a suggestion.
 */
class KeyboardFriendlyDecimalFormatSymbols {

    static DecimalFormatSymbols getInstance(Locale locale) {
        DecimalFormatSymbols symbols = DecimalFormatSymbols.getInstance(locale);

        // Replace the minus sign with hyphen-minus as available on most keyboards.
        if (symbols.getMinusSign() == '\u2212') {
            symbols.setMinusSign('-');
        }

        if (symbols.getDecimalSeparator() == '.') {
            // For locales that use the period as the decimal separator
            // always use the comma for thousands. The alternatives are
            // not available on a keyboard
            symbols.setGroupingSeparator(',');
        } else if (symbols.getDecimalSeparator() == ',') {
            // For locales that use the comma as the decimal separator
            // always use the period for thousands. The alternatives are
            // not available on a keyboard
            symbols.setGroupingSeparator('.');
        }
        return symbols;
    }
}
@@ -2,6 +2,7 @@

import java.math.BigDecimal;
import java.text.DecimalFormat;
import java.text.DecimalFormatSymbols;
import java.text.NumberFormat;
import java.text.ParseException;
import java.util.Locale;
@@ -14,6 +15,8 @@ final class NumberParser {
if (numberFormat instanceof DecimalFormat) {
DecimalFormat decimalFormat = (DecimalFormat) numberFormat;
decimalFormat.setParseBigDecimal(true);
DecimalFormatSymbols symbols = KeyboardFriendlyDecimalFormatSymbols.getInstance(locale);
decimalFormat.setDecimalFormatSymbols(symbols);
}
}

@@ -58,7 +58,7 @@ private ParameterTypeRegistry(ParameterByTypeTransformer defaultParameterTransfo
this.internalParameterTransformer = defaultParameterTransformer;
this.defaultParameterTransformer = defaultParameterTransformer;

DecimalFormatSymbols numberFormat = DecimalFormatSymbols.getInstance(locale);
DecimalFormatSymbols numberFormat = KeyboardFriendlyDecimalFormatSymbols.getInstance(locale);

List<String> localizedFloatRegexp = singletonList(FLOAT_REGEXPS
.replace("{decimal}", "" + numberFormat.getDecimalSeparator())
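
The registry builds its number regular expressions from the locale's symbols,
which is why it has to use the same keyboard-friendly symbols as the parser.
The content of `FLOAT_REGEXPS` is not shown in this diff, so the sketch below
uses a hypothetical, much simplified template; `LocalizedNumberPattern`,
`TEMPLATE` and the `{group}` placeholder are assumptions made for illustration.

```java
import java.util.regex.Pattern;

// Hypothetical sketch; not the library's actual regular expression.
final class LocalizedNumberPattern {

    // Simplified template: optional hyphen-minus, grouped integer part, optional fraction.
    private static final String TEMPLATE = "-?\\d{1,3}(?:{group}\\d{3})*(?:{decimal}\\d+)?";

    static Pattern forSeparators(char decimal, char group) {
        // Quote the separators so that a period is matched literally.
        String regex = TEMPLATE
                .replace("{decimal}", Pattern.quote(String.valueOf(decimal)))
                .replace("{group}", Pattern.quote(String.valueOf(group)));
        return Pattern.compile(regex);
    }
}
```

If the expression were built from the unpatched symbols while the parser used
the patched ones, text such as '1.000,1' could be accepted by one and rejected
by the other.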
@@ -0,0 +1,98 @@
package io.cucumber.cucumberexpressions;

import org.junit.jupiter.api.Test;

import java.text.DecimalFormatSymbols;
import java.util.AbstractMap.SimpleEntry;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.function.Function;
import java.util.stream.Stream;

import static java.util.Comparator.comparing;
import static java.util.stream.Collectors.groupingBy;
import static java.util.stream.Collectors.toList;

class KeyboardFriendlyDecimalFormatSymbolsTest {

    @Test
    void listMinusSigns() {
        System.out.println("Original minus signs:");
        listMinusSigns(DecimalFormatSymbols::getInstance);
        System.out.println();
        System.out.println("Friendly minus signs:");
        listMinusSigns(KeyboardFriendlyDecimalFormatSymbols::getInstance);
        System.out.println();
    }

    private static void listMinusSigns(Function<Locale, DecimalFormatSymbols> supplier) {
        getAvailableLocalesAsStream()
                .collect(groupingBy(locale -> supplier.apply(locale).getMinusSign()))
                .forEach((c, locales) -> System.out.println(render(c) + " " + render(locales)));
    }

    @Test
    void listDecimalAndGroupingSeparators() {
        System.out.println("Original decimal and group separators:");
        listDecimalAndGroupingSeparators(DecimalFormatSymbols::getInstance);
        System.out.println();
        System.out.println("Friendly decimal and group separators:");
        listDecimalAndGroupingSeparators(KeyboardFriendlyDecimalFormatSymbols::getInstance);
        System.out.println();
    }

    private static void listDecimalAndGroupingSeparators(Function<Locale, DecimalFormatSymbols> supplier) {
        getAvailableLocalesAsStream()
                .collect(groupingBy(locale -> {
                    DecimalFormatSymbols symbols = supplier.apply(locale);
                    return new SimpleEntry<>(symbols.getDecimalSeparator(), symbols.getGroupingSeparator());
                }))
                .entrySet()
                .stream()
                .sorted(comparing(entry -> entry.getKey().getKey()))
                .forEach((entry) -> {
                    SimpleEntry<Character, Character> characters = entry.getKey();
                    List<Locale> locales = entry.getValue();
                    System.out.println(render(characters.getKey()) + " " + render(characters.getValue()) + " " + render(locales));
                });
    }

    @Test
    void listExponentSigns() {
        System.out.println("Original exponent signs:");
        listExponentSigns(DecimalFormatSymbols::getInstance);
        System.out.println();
        System.out.println("Friendly exponent signs:");
        listExponentSigns(KeyboardFriendlyDecimalFormatSymbols::getInstance);
        System.out.println();
    }

    private static void listExponentSigns(Function<Locale, DecimalFormatSymbols> supplier) {
        getAvailableLocalesAsStream()
                .collect(groupingBy(locale -> supplier.apply(locale).getExponentSeparator()))
                .forEach((s, locales) -> {
                    if (s.length() == 1) {
                        System.out.println(render(s.charAt(0)) + " " + render(locales));
                    } else {
                        System.out.println(s + " " + render(locales));
                    }
                });
    }

    private static Stream<Locale> getAvailableLocalesAsStream() {
        return Arrays.stream(DecimalFormatSymbols.getAvailableLocales());
    }

    private static String render(Character character) {
        return character + " (" + (int) character + ")";
    }

    private static String render(List<Locale> locales) {
        return locales.size() + ": " + locales.stream()
                .sorted(comparing(Locale::getDisplayName))
                .map(Locale::getDisplayName)
                .collect(toList());
    }

}
@@ -5,36 +5,70 @@
import java.math.BigDecimal;
import java.util.Locale;

import static java.util.Locale.forLanguageTag;
import static org.junit.jupiter.api.Assertions.assertEquals;

public class NumberParserTest {
class NumberParserTest {

private final NumberParser english = new NumberParser(Locale.ENGLISH);
private final NumberParser german = new NumberParser(Locale.GERMAN);
private final NumberParser canadianFrench = new NumberParser(Locale.CANADA_FRENCH);
private final NumberParser norwegian = new NumberParser(forLanguageTag("no"));
private final NumberParser canadian = new NumberParser(Locale.CANADA);

@Test
public void can_parse_float() {
void can_parse_float() {
assertEquals(1042.2f, english.parseFloat("1,042.2"), 0);
assertEquals(1042.2f, german.parseFloat( "1.042,2"), 0);
assertEquals(1042.2f, canadianFrench.parseFloat( "1\u00A0042,2"), 0);
assertEquals(1042.2f, canadian.parseFloat("1,042.2"), 0);

assertEquals(1042.2f, german.parseFloat("1.042,2"), 0);
assertEquals(1042.2f, canadianFrench.parseFloat("1.042,2"), 0);
assertEquals(1042.2f, norwegian.parseFloat("1.042,2"), 0);
}

@Test
public void can_parse_double() {
void can_parse_double() {
assertEquals(1042.000000000000002, english.parseDouble("1,042.000000000000002"), 0);
assertEquals(1042.000000000000002, german.parseDouble( "1.042,000000000000002"), 0);
assertEquals(1042.000000000000002, canadianFrench.parseDouble( "1\u00A0042,000000000000002"), 0);
assertEquals(1042.000000000000002, canadian.parseDouble("1,042.000000000000002"), 0);

assertEquals(1042.000000000000002, german.parseDouble("1.042,000000000000002"), 0);
assertEquals(1042.000000000000002, canadianFrench.parseDouble("1.042,000000000000002"), 0);
assertEquals(1042.000000000000002, norwegian.parseDouble("1.042,000000000000002"), 0);
}

@Test
public void can_parse_big_decimals() {
void can_parse_big_decimals() {
assertEquals(new BigDecimal("1042.0000000000000000000002"), english.parseBigDecimal("1,042.0000000000000000000002"));
assertEquals(new BigDecimal("1042.0000000000000000000002"), german.parseBigDecimal( "1.042,0000000000000000000002"));
assertEquals(new BigDecimal("1042.0000000000000000000002"), canadianFrench.parseBigDecimal( "1\u00A0042,0000000000000000000002"));
assertEquals(new BigDecimal("1042.0000000000000000000002"), canadian.parseBigDecimal("1,042.0000000000000000000002"));

assertEquals(new BigDecimal("1042.0000000000000000000002"), german.parseBigDecimal("1.042,0000000000000000000002"));
assertEquals(new BigDecimal("1042.0000000000000000000002"), canadianFrench.parseBigDecimal("1.042,0000000000000000000002"));
assertEquals(new BigDecimal("1042.0000000000000000000002"), norwegian.parseBigDecimal("1.042,0000000000000000000002"));
}

@Test
void can_parse_negative() {
assertEquals(-1042.2f, english.parseFloat("-1,042.2"), 0);
assertEquals(-1042.2f, canadian.parseFloat("-1,042.2"), 0);

assertEquals(-1042.2f, german.parseFloat("-1.042,2"), 0);
assertEquals(-1042.2f, canadianFrench.parseFloat("-1.042,2"), 0);
assertEquals(-1042.2f, norwegian.parseFloat("-1.042,2"), 0);
}

@Test
void can_parse_exponents() {
assertEquals(new BigDecimal("100"), english.parseBigDecimal("1.00E2"));
assertEquals(new BigDecimal("100"), canadian.parseBigDecimal("1.00e2"));
assertEquals(new BigDecimal("100"), german.parseBigDecimal("1,00E2"));
assertEquals(new BigDecimal("100"), canadianFrench.parseBigDecimal("1,00E2"));
assertEquals(new BigDecimal("100"), norwegian.parseBigDecimal("1,00E2"));

assertEquals(new BigDecimal("0.01"), english.parseBigDecimal("1E-2"));
assertEquals(new BigDecimal("0.01"), canadian.parseBigDecimal("1e-2"));
assertEquals(new BigDecimal("0.01"), german.parseBigDecimal("1E-2"));
assertEquals(new BigDecimal("0.01"), canadianFrench.parseBigDecimal("1E-2"));
assertEquals(new BigDecimal("0.01"), norwegian.parseBigDecimal("1E-2"));
}

}
@@ -171,8 +171,19 @@ public void parse_decimal_numbers_in_canadian_french() {
ExpressionFactory factory = new ExpressionFactory(new ParameterTypeRegistry(Locale.CANADA_FRENCH));
Expression expression = factory.createExpression("{bigdecimal}");

assertThat(expression.match("1\u00A0000,1").get(0).getValue(), is(new BigDecimal("1000.1")));
assertThat(expression.match("1\u00A0000\u00A0000,1").get(0).getValue(), is(new BigDecimal("1000000.1")));
assertThat(expression.match("1.000,1").get(0).getValue(), is(new BigDecimal("1000.1")));
assertThat(expression.match("1.000.000,1").get(0).getValue(), is(new BigDecimal("1000000.1")));
assertThat(expression.match("-1,1").get(0).getValue(), is(new BigDecimal("-1.1")));
assertThat(expression.match("-,1E1").get(0).getValue(), is(new BigDecimal("-1")));
}

@Test
public void parse_decimal_numbers_in_norwegian() {
ExpressionFactory factory = new ExpressionFactory(new ParameterTypeRegistry(Locale.forLanguageTag("no")));
Expression expression = factory.createExpression("{bigdecimal}");

assertThat(expression.match("1.000,1").get(0).getValue(), is(new BigDecimal("1000.1")));
assertThat(expression.match("1.000.000,1").get(0).getValue(), is(new BigDecimal("1000000.1")));
assertThat(expression.match("-1,1").get(0).getValue(), is(new BigDecimal("-1.1")));
assertThat(expression.match("-,1E1").get(0).getValue(), is(new BigDecimal("-1")));
}