-
Notifications
You must be signed in to change notification settings - Fork 5.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add functions to support Google polylines #23999
base: master
Are you sure you want to change the base?
Conversation
|
Thanks! A couple of formatting nits in the release note entry.
|
presto-main/src/main/java/com/facebook/presto/geospatial/GeoFunctions.java
Show resolved
Hide resolved
@SqlType("array(" + GEOMETRY_TYPE_NAME + ")") | ||
public static Block gMapsPolylineDecode(@SqlType(StandardTypes.VARCHAR) Slice polyline) throws PrestoException | ||
{ | ||
final double precision = Math.pow(10.0, (double) DEFAULT_POLYLINE_PRECISION_EXPONENT); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this presume the underlying polyline has this precision? i.e. if a polyline was encoded using higher precision, would this result in information loss?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. The standard and almost all implementations and use cases use 1e5. There are a couple of cases out there that use a precision of 6.
Ordinarily, an optional argument overriding the precision in the function might be worthwhile (this is how the python library does it).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed, to me this seems like approx_percentile
, which has two forms: one in which you can specify accuracy, and one in which we use a default accuracy and you don't specify it. I think we can take the same approach here.
presto-main/src/main/java/com/facebook/presto/geospatial/GeoFunctions.java
Outdated
Show resolved
Hide resolved
@@ -1371,4 +1371,13 @@ private void assertInvalidGeometryJson(String json) | |||
{ | |||
assertInvalidFunction("geometry_from_geojson('" + json + "')", "Invalid GeoJSON:.*"); | |||
} | |||
|
|||
@Test | |||
public void testGMapsPolyline() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Two test cases seems sparse. Are you sure this is sufficient test coverage? If we look at polyline libraries, how extensive are their test cases and is there anything we can learn from them?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tend to agree. Ordinarily I would just grab some actual trip data and use that to form test cases, but on the public internet that reveals PII.
Instead I've created a case manually - a case with
(1) precision in the input floats higher than supported
(2) duplicate points in a row
(3) line crosses back over itself (not very significant, but hey)
(4) line terminates at the origin (non consecutive duplicate points)
Also, re naming of this function, strictly speaking this function is not a google maps only tool, but "Google" and "Google Maps" is a registered trade mark in the United States and elsewhere. Please advise. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just some nits.
Is there any plan to backport this to Velox? If so, I think it would be worthwhile to work on that in parallel with this PR so we can use the fuzzer to work out any potential bugs. (Presto does not have a fuzzer.)
presto-main/src/main/java/com/facebook/presto/geospatial/GeoFunctions.java
Outdated
Show resolved
Hide resolved
presto-main/src/main/java/com/facebook/presto/geospatial/GeoFunctions.java
Outdated
Show resolved
Hide resolved
presto-main/src/main/java/com/facebook/presto/geospatial/GeoFunctions.java
Show resolved
Hide resolved
I don't have any specific response. @tdcmeehan? |
I think this falls under fair use. We already have functions and reference Bing tiles, this seems like the same sort of thing, we can use this name because it is necessary to describe the format. |
@tdcmeehan what is the policy for presto c++ support? When should we start enforcing required c++ implementation? |
There's no need to enforce that they be consistent. What often happens is, because Velox has a sophisticated fuzzer, when the C++ version is developed, the fuzzer will surface bugs in the original Java implementation. (Even what seems like simple functions have very hard to catch edge cases that are difficult to catch with example based testing.) So if there is an intention to add a Velox counterpart, I'd recommend we add it now so we can use the fuzzer test to work out the bugs now. |
Okay got it thanks! This change is legacy and has already been merged internally. Currently, there are no plans for a C++ version but it will be needed as we migrate cc: @jquirke , but moving forward, we aim to enforce C++ implementations for Uber's internal UDFs. Wanted to check the Presto policy, as users often refer to the open-source documentation and use UDFs without knowing if they are compatible with both C++ and Java or just Java. |
Great, I renamed it: s/gmaps/google/ |
@jquirke LGTM, please squash commits |
* Support the popular and efficient Google polyline format by adding GeoFunctions - "google_decode_polyline" that decodes a Google polyline string into an array of ST_Points, specifying an optional precision - "google_encode_polyline" that encodes an array of ST_Points into a Google polyline string, specifying an optional precision Implements prestodb#15916
2f6a7df
to
7e7e97f
Compare
Done. |
Support the popular and efficient Google polyline format by adding a function "gmaps_decode_polyline" that decodes a polyline string into an array of ST_Points, and a "gmaps_encode_polyline" that does the reverse.
Implements #23998
Description
Motivation and Context
This is an increasingly popular format [1], [2], [3] for storing complex maps routes efficiently.
Implementing functionality to convert to the native Presto ST_Geometry types would be incredibly useful for working with this compressed route data.
For example routes of Uber trips are efficiently stored using this format.
[1] https://github.com/frederickjansen/polyline/stargazers
[2] https://github.com/urschrei/pypolyline/stargazers
[3] https://stackoverflow.com/search?q=google+polyline
Impact
Test Plan
Tests to cover successful decode and corrupted polyline string.
Contributor checklist
Release Notes