Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add functions to support Google polylines #23999

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

jquirke
Copy link

@jquirke jquirke commented Nov 11, 2024

Support the popular and efficient Google polyline format by adding a function "gmaps_decode_polyline" that decodes a polyline string into an array of ST_Points, and a "gmaps_encode_polyline" that does the reverse.

Implements #23998

Description

  • Adds google_decode_polyline function (VARCHAR -> ARRAY of ST_Point), with optional precision argument
  • Adds google_encode_polyline function (ARRAY of ST_Point -> VARCHAR), with optional precision argument

Motivation and Context

This is an increasingly popular format [1], [2], [3] for storing complex maps routes efficiently.
Implementing functionality to convert to the native Presto ST_Geometry types would be incredibly useful for working with this compressed route data.

For example routes of Uber trips are efficiently stored using this format.

[1] https://github.com/frederickjansen/polyline/stargazers
[2] https://github.com/urschrei/pypolyline/stargazers
[3] https://stackoverflow.com/search?q=google+polyline

Impact

Test Plan

Tests to cover successful decode and corrupted polyline string.

mvnw test

Contributor checklist

  • Please make sure your submission complies with our development, formatting, commit message, and attribution guidelines.
  • PR description addresses the issue accurately and concisely. If the change is non-trivial, a GitHub Issue is referenced.
  • Documented new properties (with its default value), SQL syntax, functions, or other functionality.
  • If release notes are required, they follow the release notes guidelines.
  • Adequate tests were added if applicable.
  • CI passed.

Release Notes

== RELEASE NOTES ==

General Changes
* Add :func:`google_polyline_decode` function to convert Google polyline to Presto ST_Geometry types. :pr:`23999`
* Add :func:`google_polyline_encode` function to convert Presto ST_Geometry to Google polyline types. :pr:`23999`

@jquirke jquirke requested a review from a team as a code owner November 11, 2024 01:39
Copy link

linux-foundation-easycla bot commented Nov 11, 2024

CLA Signed

The committers listed above are authorized under a signed CLA.

  • ✅ login: jquirke / name: Jeremy Quirke (7e7e97f)

@jquirke jquirke changed the title Add function to map Google polyline to ST_Geometry [Proposal] Add function to map Google polyline to ST_Geometry Nov 11, 2024
@tdcmeehan tdcmeehan self-assigned this Nov 11, 2024
@steveburnett
Copy link
Contributor

Thanks! A couple of formatting nits in the release note entry.

== RELEASE NOTES ==

General Changes
* Add :func:`gmaps_polyline_decode` function to convert Google polyline to Presto ST_Geometry types. :pr:`23999`

@SqlType("array(" + GEOMETRY_TYPE_NAME + ")")
public static Block gMapsPolylineDecode(@SqlType(StandardTypes.VARCHAR) Slice polyline) throws PrestoException
{
final double precision = Math.pow(10.0, (double) DEFAULT_POLYLINE_PRECISION_EXPONENT);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this presume the underlying polyline has this precision? i.e. if a polyline was encoded using higher precision, would this result in information loss?

Copy link
Author

@jquirke jquirke Nov 11, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. The standard and almost all implementations and use cases use 1e5. There are a couple of cases out there that use a precision of 6.

Ordinarily, an optional argument overriding the precision in the function might be worthwhile (this is how the python library does it).

Copy link
Contributor

@tdcmeehan tdcmeehan Nov 12, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed, to me this seems like approx_percentile, which has two forms: one in which you can specify accuracy, and one in which we use a default accuracy and you don't specify it. I think we can take the same approach here.

@@ -1371,4 +1371,13 @@ private void assertInvalidGeometryJson(String json)
{
assertInvalidFunction("geometry_from_geojson('" + json + "')", "Invalid GeoJSON:.*");
}

@Test
public void testGMapsPolyline()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Two test cases seems sparse. Are you sure this is sufficient test coverage? If we look at polyline libraries, how extensive are their test cases and is there anything we can learn from them?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tend to agree. Ordinarily I would just grab some actual trip data and use that to form test cases, but on the public internet that reveals PII.

Instead I've created a case manually - a case with

(1) precision in the input floats higher than supported
(2) duplicate points in a row
(3) line crosses back over itself (not very significant, but hey)
(4) line terminates at the origin (non consecutive duplicate points)

@jquirke
Copy link
Author

jquirke commented Nov 11, 2024

Thanks! A couple of formatting nits in the release note entry.

== RELEASE NOTES ==

General Changes
* Add :func:`gmaps_polyline_decode` function to convert Google polyline to Presto ST_Geometry types. :pr:`23999`

Also, re naming of this function, strictly speaking this function is not a google maps only tool, but "Google" and "Google Maps" is a registered trade mark in the United States and elsewhere. Please advise.

@jquirke jquirke changed the title [Proposal] Add function to map Google polyline to ST_Geometry [Proposal] Add functions to support Google polylines Nov 12, 2024
Copy link
Contributor

@tdcmeehan tdcmeehan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just some nits.

Is there any plan to backport this to Velox? If so, I think it would be worthwhile to work on that in parallel with this PR so we can use the fuzzer to work out any potential bugs. (Presto does not have a fuzzer.)

@steveburnett
Copy link
Contributor

Thanks! A couple of formatting nits in the release note entry.

== RELEASE NOTES ==

General Changes
* Add :func:`gmaps_polyline_decode` function to convert Google polyline to Presto ST_Geometry types. :pr:`23999`

Also, re naming of this function, strictly speaking this function is not a google maps only tool, but "Google" and "Google Maps" is a registered trade mark in the United States and elsewhere. Please advise.

I don't have any specific response. @tdcmeehan?

@tdcmeehan
Copy link
Contributor

tdcmeehan commented Nov 12, 2024

Thanks! A couple of formatting nits in the release note entry.

== RELEASE NOTES ==

General Changes
* Add :func:`gmaps_polyline_decode` function to convert Google polyline to Presto ST_Geometry types. :pr:`23999`

Also, re naming of this function, strictly speaking this function is not a google maps only tool, but "Google" and "Google Maps" is a registered trade mark in the United States and elsewhere. Please advise.

I don't have any specific response. @tdcmeehan?

I think this falls under fair use. We already have functions and reference Bing tiles, this seems like the same sort of thing, we can use this name because it is necessary to describe the format.

@jaystarshot
Copy link
Member

@tdcmeehan what is the policy for presto c++ support? When should we start enforcing required c++ implementation?

@tdcmeehan
Copy link
Contributor

There's no need to enforce that they be consistent.

What often happens is, because Velox has a sophisticated fuzzer, when the C++ version is developed, the fuzzer will surface bugs in the original Java implementation. (Even what seems like simple functions have very hard to catch edge cases that are difficult to catch with example based testing.) So if there is an intention to add a Velox counterpart, I'd recommend we add it now so we can use the fuzzer test to work out the bugs now.

@jaystarshot
Copy link
Member

jaystarshot commented Nov 12, 2024

Okay got it thanks! This change is legacy and has already been merged internally. Currently, there are no plans for a C++ version but it will be needed as we migrate cc: @jquirke , but moving forward, we aim to enforce C++ implementations for Uber's internal UDFs. Wanted to check the Presto policy, as users often refer to the open-source documentation and use UDFs without knowing if they are compatible with both C++ and Java or just Java.

@jquirke
Copy link
Author

jquirke commented Nov 13, 2024

Thanks! A couple of formatting nits in the release note entry.

== RELEASE NOTES ==

General Changes
* Add :func:`gmaps_polyline_decode` function to convert Google polyline to Presto ST_Geometry types. :pr:`23999`

Also, re naming of this function, strictly speaking this function is not a google maps only tool, but "Google" and "Google Maps" is a registered trade mark in the United States and elsewhere. Please advise.

I don't have any specific response. @tdcmeehan?

I think this falls under fair use. We already have functions and reference Bing tiles, this seems like the same sort of thing, we can use this name because it is necessary to describe the format.

Great, I renamed it: s/gmaps/google/

@jquirke jquirke changed the title [Proposal] Add functions to support Google polylines Add functions to support Google polylines Nov 13, 2024
@tdcmeehan
Copy link
Contributor

@jquirke LGTM, please squash commits

* Support the popular and efficient Google polyline format by adding
GeoFunctions
 - "google_decode_polyline" that decodes a Google polyline string into
an array of ST_Points, specifying an optional precision
 - "google_encode_polyline" that encodes an array of ST_Points into
a Google polyline string, specifying an optional precision

Implements prestodb#15916
@jquirke
Copy link
Author

jquirke commented Nov 13, 2024

@jquirke LGTM, please squash commits

Done.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants