Adds support for geo-bounds filtering in geogrid aggregations
It is fairly common to filter the geo point candidates in
geohash_grid and geotile_grid aggregations according to some
viewable bounding box. This change introduces the option of
specifying this filter directly in the tiling aggregation.
talevy committed Dec 20, 2019
1 parent e47711f commit af29e5e
Showing 24 changed files with 476 additions and 67 deletions.
Original file line number Diff line number Diff line change
Expand Up @@ -191,6 +191,62 @@ var bbox = geohash.decode_bbox('u17');
--------------------------------------------------
// NOTCONSOLE

==== Requests with additional bounding box filtering

The `geohash_grid` aggregation supports an optional `bounds` parameter
that restricts the points considered to those that fall within the
provided bounds. The `bounds` parameter accepts a bounding box in
any of the <<query-dsl-geo-bounding-box-query-accepted-formats,accepted formats>> of the
bounds specified in the Geo Bounding Box Query. This bounding box can be used with or
without an additional `geo_bounding_box` query that filters the points before aggregation.
It is an independent bounding box that can intersect with, be equal to, or be disjoint
from any additional `geo_bounding_box` queries defined in the context of the aggregation.

[source,console,id=geohashgrid-aggregation-with-bounds]
--------------------------------------------------
POST /museums/_search?size=0
{
"aggregations" : {
"tiles-in-bounds" : {
"geohash_grid" : {
"field" : "location",
"precision" : 8,
"bounds": {
"top_left" : "53.4375, 4.21875",
"bottom_right" : "52.03125, 5.625"
}
}
}
}
}
--------------------------------------------------
// TEST[continued]

[source,console-result]
--------------------------------------------------
{
...
"aggregations" : {
"tiles-in-bounds" : {
"buckets" : [
{
"key" : "u173zy3j",
"doc_count" : 1
},
{
"key" : "u173zvfz",
"doc_count" : 1
},
{
"key" : "u173zt90",
"doc_count" : 1
}
]
}
}
}
--------------------------------------------------
// TESTRESPONSE[s/\.\.\./"took": $body.took,"_shards": $body._shards,"hits":$body.hits,"timed_out":false,/]
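
The string corners in the request above use the `"lat, lon"` order, which is easy to mix up with the `lon, lat` order used elsewhere. The following is a minimal Java sketch of how such a box could be read and applied to a single point. The class `BoundsStringSketch`, its methods, and the sample point are hypothetical and are not part of Elasticsearch; the sketch also assumes the box does not cross the date-line.

```java
// Illustrative sketch only: not Elasticsearch code.
final class BoundsStringSketch {

    /** Parses a "lat, lon" string into a two-element array {lat, lon}. */
    static double[] parseLatLon(String s) {
        String[] parts = s.split(",");
        if (parts.length != 2) {
            throw new IllegalArgumentException("expected \"lat, lon\" but got: " + s);
        }
        return new double[] {
            Double.parseDouble(parts[0].trim()),
            Double.parseDouble(parts[1].trim())
        };
    }

    /** Checks a point against a box given as "lat, lon" corner strings (no date-line crossing). */
    static boolean inBounds(String topLeft, String bottomRight, double lat, double lon) {
        double[] tl = parseLatLon(topLeft);      // {top latitude, left longitude}
        double[] br = parseLatLon(bottomRight);  // {bottom latitude, right longitude}
        return lat <= tl[0] && lat >= br[0] && lon >= tl[1] && lon <= br[1];
    }

    public static void main(String[] args) {
        String topLeft = "53.4375, 4.21875";
        String bottomRight = "52.03125, 5.625";
        // Hypothetical sample points, chosen only to show one hit and one miss.
        System.out.println(inBounds(topLeft, bottomRight, 52.37, 4.9));  // true
        System.out.println(inBounds(topLeft, bottomRight, 48.86, 2.33)); // false
    }
}
```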

==== Cell dimensions at the equator
The table below shows the metric dimensions for cells covered by various string lengths of geohash.
Expand Down Expand Up @@ -230,6 +286,8 @@ precision:: Optional. The string length of the geohashes used to define
to precision levels higher than the supported 12 levels,
(e.g. for distances <5.6cm) the value is rejected.

bounds:: Optional. The bounding box to filter the points in the bucket.

size:: Optional. The maximum number of geohash buckets to return
(defaults to 10,000). When results are trimmed, buckets are
prioritised based on the volumes of documents they contain.
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -162,6 +162,62 @@ POST /museums/_search?size=0
--------------------------------------------------
// TESTRESPONSE[s/\.\.\./"took": $body.took,"_shards": $body._shards,"hits":$body.hits,"timed_out":false,/]

==== Requests with additional bounding box filtering

The `geotile_grid` aggregation supports an optional `bounds` parameter
that restricts the points considered to those that fall within the
provided bounds. The `bounds` parameter accepts a bounding box in
any of the <<query-dsl-geo-bounding-box-query-accepted-formats,accepted formats>> of the
bounds specified in the Geo Bounding Box Query. This bounding box can be used with or
without an additional `geo_bounding_box` query that filters the points before aggregation.
It is an independent bounding box that can intersect with, be equal to, or be disjoint
from any additional `geo_bounding_box` queries defined in the context of the aggregation.

[source,console,id=geotilegrid-aggregation-with-bounds]
--------------------------------------------------
POST /museums/_search?size=0
{
"aggregations" : {
"tiles-in-bounds" : {
"geotile_grid" : {
"field" : "location",
"precision" : 22,
"bounds": {
"top_left" : "52.4, 4.9",
"bottom_right" : "52.3, 5.0"
}
}
}
}
}
--------------------------------------------------
// TEST[continued]

[source,console-result]
--------------------------------------------------
{
...
"aggregations" : {
"tiles-in-bounds" : {
"buckets" : [
{
"key" : "22/2154412/1378379",
"doc_count" : 1
},
{
"key" : "22/2154385/1378332",
"doc_count" : 1
},
{
"key" : "22/2154259/1378425",
"doc_count" : 1
}
]
}
}
}
--------------------------------------------------
// TESTRESPONSE[s/\.\.\./"took": $body.took,"_shards": $body._shards,"hits":$body.hits,"timed_out":false,/]

==== Options

Expand All @@ -172,6 +228,8 @@ precision:: Optional. The integer zoom of the key used to define
cells/buckets in the results. Defaults to 7.
Values outside of [0,29] will be rejected.

bounds:: Optional. The bounding box to filter the points in the bucket.

size:: Optional. The maximum number of geohash buckets to return
(defaults to 10,000). When results are trimmed, buckets are
prioritised based on the volumes of documents they contain.
Expand Down
1 change: 1 addition & 0 deletions docs/reference/query-dsl/geo-bounding-box-query.asciidoc
Original file line number Diff line number Diff line change
Expand Up @@ -84,6 +84,7 @@ be executed in memory or indexed. See <<geo-bbox-type,Type>> below for further d
Default is `memory`.
|=======================================================================

[[query-dsl-geo-bounding-box-query-accepted-formats]]
[float]
==== Accepted Formats

Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -68,6 +68,11 @@ public GeoBoundingBox(StreamInput input) throws IOException {
this.bottomRight = input.readGeoPoint();
}

public boolean isUnbounded() {
return Double.isNaN(topLeft.lon()) || Double.isNaN(topLeft.lat())
|| Double.isNaN(bottomRight.lon()) || Double.isNaN(bottomRight.lat());
}

public GeoPoint topLeft() {
return topLeft;
}
Expand Down Expand Up @@ -120,6 +125,26 @@ public XContentBuilder toXContentFragment(XContentBuilder builder, boolean build
return builder;
}

/**
* If the bounding box crosses the date-line (left greater than right), the
* longitude of the point need only be greater than or equal to the left or
* less than or equal to the right. Otherwise, it must satisfy both conditions.
*
* @param lon the longitude of the point
* @param lat the latitude of the point
* @return whether the point (lon, lat) is in the specified bounding box
*/
public boolean pointInBounds(double lon, double lat) {
if (lat >= bottom() && lat <= top()) {
if (left() <= right()) {
return lon >= left() && lon <= right();
} else {
return lon >= left() || lon <= right();
}
}
return false;
}

@Override
public void writeTo(StreamOutput out) throws IOException {
out.writeGeoPoint(topLeft);
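
The two additions in this hunk can be tried in isolation. The following is a minimal, self-contained sketch that mirrors the logic of `isUnbounded` and `pointInBounds` on plain doubles; the class `BoundsSketch` and its fields are hypothetical stand-ins for `GeoBoundingBox` and do not reproduce its real constructor or accessors.

```java
// Illustrative sketch only: a plain-Java stand-in for the bounding-box test.
final class BoundsSketch {
    final double top, left, bottom, right;

    BoundsSketch(double top, double left, double bottom, double right) {
        this.top = top;
        this.left = left;
        this.bottom = bottom;
        this.right = right;
    }

    /** Same convention as the diff: any NaN corner coordinate means "no bounds given". */
    boolean isUnbounded() {
        return Double.isNaN(top) || Double.isNaN(left)
            || Double.isNaN(bottom) || Double.isNaN(right);
    }

    /** Latitude must always be in range; the longitude test depends on date-line crossing. */
    boolean pointInBounds(double lon, double lat) {
        if (lat >= bottom && lat <= top) {
            if (left <= right) {
                return lon >= left && lon <= right;   // ordinary box: both conditions
            }
            return lon >= left || lon <= right;       // box wraps the date-line: either one
        }
        return false;
    }

    public static void main(String[] args) {
        BoundsSketch regular = new BoundsSketch(10, -20, -10, 20);
        BoundsSketch wrapping = new BoundsSketch(10, 170, -10, -170);
        System.out.println(regular.pointInBounds(0, 0));      // true
        System.out.println(regular.pointInBounds(30, 0));     // false
        System.out.println(wrapping.pointInBounds(175, 0));   // true
        System.out.println(wrapping.pointInBounds(-175, 0));  // true
        System.out.println(wrapping.pointInBounds(0, 0));     // false
        System.out.println(wrapping.pointInBounds(175, 20));  // false
    }
}
```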
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -93,6 +93,7 @@ public static boolean isValidLongitude(double longitude) {
return true;
}


/**
* Calculate the width (in meters) of geohash cells at a specific level
* @param level geohash level must be greater or equal to zero
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -19,7 +19,10 @@

package org.elasticsearch.search.aggregations.bucket.composite;

import org.elasticsearch.Version;
import org.elasticsearch.common.ParseField;
import org.elasticsearch.common.geo.GeoBoundingBox;
import org.elasticsearch.common.geo.GeoPoint;
import org.elasticsearch.common.io.stream.StreamInput;
import org.elasticsearch.common.io.stream.StreamOutput;
import org.elasticsearch.common.xcontent.ObjectParser;
Expand All @@ -45,6 +48,8 @@ public class GeoTileGridValuesSourceBuilder extends CompositeValuesSourceBuilder
static {
PARSER = new ObjectParser<>(GeoTileGridValuesSourceBuilder.TYPE);
PARSER.declareInt(GeoTileGridValuesSourceBuilder::precision, new ParseField("precision"));
PARSER.declareField(((p, builder, context) -> builder.geoBoundingBox(GeoBoundingBox.parseBoundingBox(p))),
GeoBoundingBox.BOUNDS_FIELD, ObjectParser.ValueType.OBJECT);
CompositeValuesSourceParserHelper.declareValuesSourceFields(PARSER, ValueType.NUMERIC);
}

Expand All @@ -53,6 +58,7 @@ static GeoTileGridValuesSourceBuilder parse(String name, XContentParser parser)
}

private int precision = GeoTileGridAggregationBuilder.DEFAULT_PRECISION;
private GeoBoundingBox geoBoundingBox = new GeoBoundingBox(new GeoPoint(Double.NaN, Double.NaN), new GeoPoint(Double.NaN, Double.NaN));

GeoTileGridValuesSourceBuilder(String name) {
super(name);
Expand All @@ -61,13 +67,21 @@ static GeoTileGridValuesSourceBuilder parse(String name, XContentParser parser)
GeoTileGridValuesSourceBuilder(StreamInput in) throws IOException {
super(in);
this.precision = in.readInt();
if (in.getVersion().onOrAfter(Version.V_8_0_0)) {
this.geoBoundingBox = new GeoBoundingBox(in);
}
}

public GeoTileGridValuesSourceBuilder precision(int precision) {
this.precision = GeoTileUtils.checkPrecisionRange(precision);
return this;
}

public GeoTileGridValuesSourceBuilder geoBoundingBox(GeoBoundingBox geoBoundingBox) {
this.geoBoundingBox = geoBoundingBox;
return this;
}

@Override
public GeoTileGridValuesSourceBuilder format(String format) {
throw new IllegalArgumentException("[format] is not supported for [" + TYPE + "]");
Expand All @@ -76,11 +90,17 @@ public GeoTileGridValuesSourceBuilder format(String format) {
@Override
protected void innerWriteTo(StreamOutput out) throws IOException {
out.writeInt(precision);
if (out.getVersion().onOrAfter(Version.V_8_0_0)) {
geoBoundingBox.writeTo(out);
}
}

@Override
protected void doXContentBody(XContentBuilder builder, Params params) throws IOException {
builder.field("precision", precision);
if (geoBoundingBox.isUnbounded() == false) {
geoBoundingBox.toXContent(builder, params);
}
}

@Override
Expand All @@ -90,7 +110,7 @@ String type() {

@Override
public int hashCode() {
return Objects.hash(super.hashCode(), precision);
return Objects.hash(super.hashCode(), precision, geoBoundingBox);
}

@Override
Expand All @@ -99,7 +119,8 @@ public boolean equals(Object obj) {
if (obj == null || getClass() != obj.getClass()) return false;
if (super.equals(obj) == false) return false;
GeoTileGridValuesSourceBuilder other = (GeoTileGridValuesSourceBuilder) obj;
return precision == other.precision;
return Objects.equals(precision, other.precision)
&& Objects.equals(geoBoundingBox, other.geoBoundingBox);
}

@Override
Expand All @@ -112,7 +133,7 @@ protected CompositeValuesSourceConfig innerBuild(QueryShardContext queryShardCon
ValuesSource.GeoPoint geoPoint = (ValuesSource.GeoPoint) orig;
// is specified in the builder.
final MappedFieldType fieldType = config.fieldContext() != null ? config.fieldContext().fieldType() : null;
CellIdSource cellIdSource = new CellIdSource(geoPoint, precision, GeoTileUtils::longEncode);
CellIdSource cellIdSource = new CellIdSource(geoPoint, precision, geoBoundingBox, GeoTileUtils::longEncode);
return new CompositeValuesSourceConfig(name, fieldType, cellIdSource, DocValueFormat.GEOTILE, order(),
missingBucket(), script() != null);
} else {
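
The version checks in the stream constructor and in `innerWriteTo` above follow a common wire-compatibility pattern: the new field is written and read only when the peer is new enough, and the reader otherwise keeps its NaN default. The following is a rough sketch of that pattern using `java.nio.ByteBuffer` in place of `StreamInput` and `StreamOutput`. The class name, the integer version number, and the four-double layout are assumptions made for illustration and do not describe the actual Elasticsearch wire format.

```java
import java.nio.ByteBuffer;

// Illustrative sketch only: ByteBuffer stands in for the real stream classes.
final class VersionGatedWireSketch {
    /** Hypothetical version in which the bounds field was introduced. */
    static final int BOUNDS_ADDED_IN = 8;

    /** Writes precision always, and the four bounds values only for new-enough peers. */
    static byte[] write(int precision, double[] bounds, int peerVersion) {
        boolean withBounds = peerVersion >= BOUNDS_ADDED_IN;
        ByteBuffer out = ByteBuffer.allocate(Integer.BYTES + (withBounds ? 4 * Double.BYTES : 0));
        out.putInt(precision);
        if (withBounds) {
            for (int i = 0; i < 4; i++) {
                out.putDouble(bounds[i]);
            }
        }
        return out.array();
    }

    /** Reads the bounds if the peer sent them; otherwise returns the NaN "unbounded" default. */
    static double[] readBounds(byte[] payload, int peerVersion) {
        ByteBuffer in = ByteBuffer.wrap(payload);
        in.getInt(); // precision is always present
        double[] bounds = {Double.NaN, Double.NaN, Double.NaN, Double.NaN};
        if (peerVersion >= BOUNDS_ADDED_IN) {
            for (int i = 0; i < 4; i++) {
                bounds[i] = in.getDouble();
            }
        }
        return bounds;
    }
}
```

The writer and reader must apply the same version test, otherwise the reader would consume bytes that were never written or leave unread bytes in the stream.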
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -20,6 +20,7 @@

import org.apache.lucene.index.LeafReaderContext;
import org.apache.lucene.index.SortedNumericDocValues;
import org.elasticsearch.common.geo.GeoBoundingBox;
import org.elasticsearch.index.fielddata.AbstractSortingNumericDocValues;
import org.elasticsearch.index.fielddata.MultiGeoPointValues;
import org.elasticsearch.index.fielddata.SortedBinaryDocValues;
Expand All @@ -36,11 +37,13 @@ public class CellIdSource extends ValuesSource.Numeric {
private final ValuesSource.GeoPoint valuesSource;
private final int precision;
private final GeoPointLongEncoder encoder;
private final GeoBoundingBox geoBoundingBox;

public CellIdSource(GeoPoint valuesSource, int precision, GeoPointLongEncoder encoder) {
public CellIdSource(GeoPoint valuesSource, int precision, GeoBoundingBox geoBoundingBox, GeoPointLongEncoder encoder) {
this.valuesSource = valuesSource;
//different GeoPoints could map to the same or different hashing cells.
this.precision = precision;
this.geoBoundingBox = geoBoundingBox;
this.encoder = encoder;
}

Expand All @@ -55,7 +58,7 @@ public boolean isFloatingPoint() {

@Override
public SortedNumericDocValues longValues(LeafReaderContext ctx) {
return new CellValues(valuesSource.geoPointValues(ctx), precision, encoder);
return new CellValues(valuesSource.geoPointValues(ctx), precision, geoBoundingBox, encoder);
}

@Override
Expand All @@ -81,21 +84,28 @@ private static class CellValues extends AbstractSortingNumericDocValues {
private MultiGeoPointValues geoValues;
private int precision;
private GeoPointLongEncoder encoder;
private GeoBoundingBox geoBoundingBox;

protected CellValues(MultiGeoPointValues geoValues, int precision, GeoPointLongEncoder encoder) {
protected CellValues(MultiGeoPointValues geoValues, int precision, GeoBoundingBox geoBoundingBox, GeoPointLongEncoder encoder) {
this.geoValues = geoValues;
this.precision = precision;
this.encoder = encoder;
this.geoBoundingBox = geoBoundingBox;
}

@Override
public boolean advanceExact(int docId) throws IOException {
if (geoValues.advanceExact(docId)) {
resize(geoValues.docValueCount());
for (int i = 0; i < docValueCount(); ++i) {
int docValueCount = geoValues.docValueCount();
resize(docValueCount);
int j = 0;
for (int i = 0; i < docValueCount; i++) {
org.elasticsearch.common.geo.GeoPoint target = geoValues.nextValue();
values[i] = encoder.encode(target.getLon(), target.getLat(), precision);
if (geoBoundingBox.isUnbounded() || geoBoundingBox.pointInBounds(target.getLon(), target.getLat())) {
values[j++] = encoder.encode(target.getLon(), target.getLat(), precision);
}
}
resize(j);
sort();
return true;
} else {
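
The reworked `advanceExact` loop uses a separate write index so that rejected points leave no gaps in `values`, then shrinks the array to the number of points kept before sorting. The following is a minimal sketch of that filter-and-compact idiom on plain arrays. The class, the toy `encode` function, and the sample coordinates are hypothetical; `encode` is not a real geohash or geotile encoding, and the bounds test here ignores date-line crossing.

```java
import java.util.Arrays;

// Illustrative sketch only: plain arrays stand in for the doc-values machinery.
final class CompactingFilterSketch {

    /** Toy cell id for illustration; NOT a real geohash/geotile encoding. */
    static long encode(double lon, double lat) {
        return (long) Math.floor(lat) * 360 + (long) Math.floor(lon);
    }

    /** Encodes only the {lon, lat} pairs inside the box, compacted and sorted. */
    static long[] encodeInBounds(double[][] lonLats, double top, double left,
                                 double bottom, double right) {
        long[] values = new long[lonLats.length]; // sized for the worst case: every point kept
        int j = 0;                                // write index: number of points kept so far
        for (int i = 0; i < lonLats.length; i++) {
            double lon = lonLats[i][0];
            double lat = lonLats[i][1];
            boolean inBounds = lat >= bottom && lat <= top && lon >= left && lon <= right;
            if (inBounds) {
                values[j++] = encode(lon, lat);
            }
        }
        long[] kept = Arrays.copyOf(values, j);   // shrink to the points actually kept
        Arrays.sort(kept);
        return kept;
    }

    public static void main(String[] args) {
        double[][] points = { {5.5, 53.2}, {2.3, 48.86}, {4.9, 52.37}, {4.4, 51.2} };
        long[] cells = encodeInBounds(points, 54, 3, 52, 6);
        System.out.println(Arrays.toString(cells)); // [18724, 19085]
    }
}
```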
Expand Down