-
Notifications
You must be signed in to change notification settings - Fork 24.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Script: ulong via fields API #76519
Script: ulong via fields API #76519
Conversation
Exposes unsigned long via the fields API. Unsigned longs default to java signed longs. That means the upper range appears negative. Consumers should use `Long.compareUnsigned(long, long)` `Long.divideUnsigned(long, long)` and `Long.remainderUnsigned(long, long)` to correctly work with values known to be unsigned long. Alternatively, users may treat the unsigned long type as `BigInteger` using the field API, `field('ul').as(Field.BigInteger).getValue(BigInteger.ZERO)`. ``` field('ul').as(Field.BigInteger).getValue(BigInteger.valueOf(1000)) field('ul').getValue(1000L) ``` This change also implements the beginning of the converters for the fields API. The following conversions have been added: ``` ulong <-> BigInteger long <-> BigInteger double -> BigInteger String (parsed as long or double) -> BigInteger double -> long String (parsed as long or double) -> long Date (epoch milliseconds) -> long Nano Date (epoch nanoseconds) -> long boolean (1L for true, 0L for false) -> long ``` Fixes: elastic#64361
…/69742-06-ulong-as
Pinging @elastic/es-core-infra (Team:Core/Infra) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Walked through this fully with @stu-elastic and this looks great! Thanks for this change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good! Most of my comments are small nits.
* Converts between one scripting {@link Field} type and another, {@code CF}, with a different underlying | ||
* value type, {@code CT}. | ||
*/ | ||
public interface Converter<CT, CF extends Field<CT>> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: maybe these should be TC
for target class and FC
for Field class?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed.
// No instances, please | ||
private Converters() {} | ||
|
||
public static BigIntegerField LongToBigInteger(LongField sourceField) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These can be private (or package private if we want tests for these), since they are only called in this file right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed to package private.
* {@link Converters} for scripting fields. These constants are exposed as static fields on {@link Field} to | ||
* allow a user to convert via {@link Field#as(Converter)}. | ||
*/ | ||
public class Converters { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this can be package private since it is only accessed by Field?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's used by by UnsignedLongField
in x-pack, so we have to leave it public.
} | ||
|
||
@Override | ||
public java.math.BigInteger getNonPrimitiveValue() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: why are these fully qualified?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed package prefixes.
return new BigIntegerField(sourceField.getName(), new DelegatingFieldValues<BigInteger, String>(fv) { | ||
protected long parseNumber(String str) { | ||
try { | ||
return Long.parseLong(str); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will fail on values larger than Long.MAX_VALUE, and then lose precision (or fail again) when trying to parse as double.
I think for string, since the value can be arbitrarily large, we should completely use BigInteger and BigDecimal. So something like this:
try {
return new BigInteger(str);
} catch (NumberFormatException e) {
return new BigDecimal(str).toBigInteger();
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ahh nice, changed.
public double getDoubleValue() { | ||
String str = values.getNonPrimitiveValue(); | ||
try { | ||
return Double.parseDouble(str); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the reason for first trying double? The underlying data could likewise be a BigInteger or BigDecimal that isn't parsable as long. The user asked to convert to long, so if the data can't fit it, it seems like getting a NumberFormatException is appropriate?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense, changed to using getLongValue()
, which will attempt to parse the long.
/** | ||
* Convert this {@code Field} into another {@code Field} type using a {@link Converter} of the target type. | ||
* | ||
* As this is called from user scripts, {@code as} may be called to convert a field into it's same type, if |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: it's -> its
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed.
* Convert this {@code Field} into another {@code Field} type using a {@link Converter} of the target type. | ||
* | ||
* As this is called from user scripts, {@code as} may be called to convert a field into it's same type, if | ||
* so {@code this} is cast via that converters {@link Converter#getFieldClass()}. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: I think "that" can be removed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed.
// UnsignedLongFields must define their own conversions as they are in x-pack | ||
@Override | ||
public <CT, CF extends Field<CT>> Field<CT> as(Converter<CT, CF> converter) { | ||
if (converter.getFieldClass().isInstance(this)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's unfortunate we have to repeat this. I wonder if we could instead have as
be final on Field, and have an implementation method (eg convert
) that is protected which can be overridden here, so that the identity check can always be first.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added convert
which is what subclasses should override.
return converter.getFieldClass().cast(bigIntegerField); | ||
} | ||
|
||
return super.as(converter); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should there be a special case for converting to Long, since the getLongValue() doesn't return the right long in this case? ie if the value is under Long.MAX_VALUE then long is returned, otherwise an error should be raised, not a negative number returned? String would be similar, otherwise large values would show up as negative numbers?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
UnsignedLongScriptDocValues
/UnsignedLongField
returns long
s with exactly this property, values may be negative.
I expect diligent users will perform "confirming" type conversions and never take the original type from field('fieldname')
, in this case the a user who wants to handle possible negative values themselves would only be able to do a field('ul').as(Field.Long)
confirming type conversion.
If we want to allow field('ul').as(Field.UnsignedLong)
, we'll need to implement Field augmentation for painless.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I understand the complication with Long type since that is the underlying native type returned here, but I don’t think conversion to string will work correctly? It should not show negative values, yet I think it will since it would just stringify the raw long value. A minimal test for this conversion would be good.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added minimal test.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you point at the test? I can't find it. I was specifically talking about conversion to string, to ensure that unsigned long values do not show up as negative values with this conversion.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This PR does not implement conversion to String fields, only from String fields. To BigInteger and to Long are implemented here.
One general comment I forgot to make is I think we should have tests for every method/conversion we have exposed. I realize that is a lot, it doesn’t have to all come in this PR, but at least some minimal verification that this all works as we think it does. |
💔 Backport failed
To backport manually run |
Exposes unsigned long via the fields API. Unsigned longs default to java signed longs. That means the upper range appears negative. Consumers should use `Long.compareUnsigned(long, long)` `Long.divideUnsigned(long, long)` and `Long.remainderUnsigned(long, long)` to correctly work with values known to be unsigned long. Alternatively, users may treat the unsigned long type as `BigInteger` using the field API, `field('ul').as(Field.BigInteger).getValue(BigInteger.ZERO)`. ``` field('ul').as(Field.BigInteger).getValue(BigInteger.valueOf(1000)) field('ul').getValue(1000L) ``` This change also implements the beginning of the converters for the fields API. The following conversions have been added: ``` ulong <-> BigInteger long <-> BigInteger double -> BigInteger String (parsed as long or double) -> BigInteger double -> long String (parsed as long or double) -> long Date (epoch milliseconds) -> long Nano Date (epoch nanoseconds) -> long boolean (1L for true, 0L for false) -> long ``` Backport: aea8bff
Exposes unsigned long via the fields API. Unsigned longs default to java signed longs. That means the upper range appears negative. Consumers should use `Long.compareUnsigned(long, long)` `Long.divideUnsigned(long, long)` and `Long.remainderUnsigned(long, long)` to correctly work with values known to be unsigned long. Alternatively, users may treat the unsigned long type as `BigInteger` using the field API, `field('ul').as(Field.BigInteger).getValue(BigInteger.ZERO)`. ``` field('ul').as(Field.BigInteger).getValue(BigInteger.valueOf(1000)) field('ul').getValue(1000L) ``` This change also implements the beginning of the converters for the fields API. The following conversions have been added: ``` ulong <-> BigInteger long <-> BigInteger double -> BigInteger String (parsed as long or double) -> BigInteger double -> long String (parsed as long or double) -> long Date (epoch milliseconds) -> long Nano Date (epoch nanoseconds) -> long boolean (1L for true, 0L for false) -> long ``` Backport: aea8bff
We introduced script values support for unsigned_long from 7.15 in elastic/elasticsearch#76519. But forgot add this document for 7.x. This just backport documentation from elastic/elasticsearch#64422.
Exposes unsigned long via the fields API.
Unsigned longs default to java signed longs. That means the upper range
appears negative. Consumers should use
Long.compareUnsigned(long, long)
Long.divideUnsigned(long, long)
andLong.remainderUnsigned(long, long)
to correctly work with values known to be unsigned long.
Alternatively, users may treat the unsigned long type as
BigInteger
usingthe field API,
field('ul').as(Field.BigInteger).getValue(BigInteger.ZERO)
.This change also implements the beginning of the converters for the fields
API. The following conversions have been added:
Fixes: #64361