Property merging or calculated values #2671

Closed
redspider opened this issue Jun 5, 2016 · 12 comments

@redspider

redspider commented Jun 5, 2016

I am rendering a heat-map, consisting of NZ meshblock shapes stored in vector tiles, and coloured based on the currently selected subset of a multi-dimension dataset.

I am currently forced to do this by creating multiple layers from the tile source, each with a specific colour fill (low, medium, high etc), and then using setFilter to hide or show that particular meshblock in a given layer depending on the calculated value.

This has a number of downsides:

  1. It's slow: there are a large number of meshblocks, and building a setFilter IN expression that contains most of them for the lowest colour, on every change in the query parameters and for every colour layer, is not quick (it works, but it's not that smooth).
  2. I am limited to a small number of different colours, subtle differences in shade aren't an option as it would require many layers.
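
For readers unfamiliar with the workaround described above, it can be sketched roughly as follows. This is an illustration rather than the code from the issue: the bucket thresholds, the layer ids (`meshblocks-low` and so on) and the `meshblock` property name are all assumptions, and the filters use the legacy `['in', key, ...values]` form.

```javascript
// Hypothetical buckets: a layer id plus the exclusive upper bound of its value range.
const BUCKETS = [
  { layer: 'meshblocks-low', max: 10 },
  { layer: 'meshblocks-medium', max: 50 },
  { layer: 'meshblocks-high', max: Infinity }
];

// Build one legacy "in" filter per colour layer from a { meshblockId: value } lookup.
function buildBucketFilters(valuesById, buckets = BUCKETS) {
  const filters = {};
  for (const b of buckets) filters[b.layer] = ['in', 'meshblock'];
  for (const [id, value] of Object.entries(valuesById)) {
    // The first bucket whose upper bound exceeds the value wins.
    const bucket = buckets.find((b) => value < b.max);
    if (bucket) filters[bucket.layer].push(id);
  }
  return filters;
}

// Usage against a live map (assumes the three layers already exist):
// const filters = buildBucketFilters(currentValues);
// for (const [layer, filter] of Object.entries(filters)) map.setFilter(layer, filter);
```

Every change to the query rebuilds every filter, which is the cost the first point complains about.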

With the advent of fill data styling I had hoped to be able to avoid at least number 2, but it doesn't seem to be the case. The problem is that with the geometry coming from vector tiles, but the data being loaded in separately, there appears to be no way to put the data into the features as a property, and thus no way to style it.

I can see this being solved in one of (at least) two ways:

  1. Permit me to push a dataset into the workers and a function to calculate a property based on this dataset. This would be ideal as the dataset is not huge (so duplicate copies in workers is not an issue), but it isn't practical to put every permutation into the tiles.
  2. Permit me to provide a set of properties to merge with a source or layer, based on an existing property - i.e. if the property meshblock = '1029392' then add mergeset['1029392'] or similar.

I am uncertain whether this is on the roadmap or whether there's an existing solution. It seems possible that the merge could be done by using querySourceFeatures, merging the properties then doing setData on a separate geoJSON source, but it seems likely that would result in needless rebuilds when zooming out.
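
A minimal sketch of that querySourceFeatures plus setData idea, assuming the features carry a `meshblock` property and that `mergeset` maps meshblock ids to extra properties. The source and layer names in the usage comment are placeholders.

```javascript
// Copy extra properties onto GeoJSON features, matched on a shared key property.
function mergeProperties(features, mergeset, key = 'meshblock') {
  const merged = features.map((f) => ({
    type: 'Feature',
    geometry: f.geometry,
    // Properties from the mergeset override same-named tile properties.
    properties: { ...f.properties, ...(mergeset[f.properties[key]] || {}) }
  }));
  return { type: 'FeatureCollection', features: merged };
}

// Usage against a live map (source ids and layer name are placeholders):
// const tiles = map.querySourceFeatures('meshblocks', { sourceLayer: 'meshblocks' });
// map.getSource('meshblocks-joined').setData(mergeProperties(tiles, mergeset));
```

Note that querySourceFeatures only returns features from currently loaded tiles, and features that cross tile boundaries may come back split or duplicated, so the merged source would need rebuilding as the view changes.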

@mollymerp
Contributor

Thank you for submitting this issue. I'm not 100% sure what you're asking, but hopefully this will be helpful. Right now we don't support merging property data with vector tiles on the fly, but it is possible we will add this feature in the future and implementation will be less complicated after #2667 ships. You can do this data processing beforehand, however, using any of a number of geo-data processing tools like QGIS or the mapshaper command line tool.

If what you're looking for is computed property functions, this has also been discussed among the core developers and we hope to add this feature in the future. For example, that feature would allow styling based on population density like this:

{
  property: ['/', 'population', 'area'],
  stops: [[0, 'red'], [100, 'green']]
}

Closing because I'm not seeing anything directly actionable.

@andrewharvey
Collaborator

@mollymerp The way I understood this is that if you don't want to, or can't, attach attributes to the vector tiles on the server side, then currently you can't attach them client side in GL JS either, and that would be a desirable feature to have in GL JS.

It could be that you're integrating data from a 3rd party API and can't bake it into the vector tiles, or that there is a potentially huge number of attributes and baking them all in would make the vector tiles too large. For performance reasons it would then make sense for the vector tiles to contain just the geometry plus an id, and to bring in the attributes for those features via a different channel client side.

I've run into this before when trying to make census thematic maps. You might have tens of thousands of individual attributes, so baking them into the vector tiles on the server side isn't as performant as doing it client side.

@redspider
Author

Sorry I've got a cold at the moment and maybe didn't do a good job of explaining the issue. @andrewharvey has it covered fortunately.

In this case the dataset itself is not big (363,000 rows), but the number of different cross-sections the user would like to look at is. In simple terms, I'd need an attribute for every one of 18 different months (including subsets) multiplied by 6 different crime types and subsets. My math is hazy but I think that's around 16 million attributes, for every one of around 40,000 meshblock features. That's (sadly) not practical. I could trim it a bit by using contiguous subsets only for the months and trying to avoid combinations where there's no value, but it's still a lot of attributes and results in low-zoom tiles being impractical.
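
The "around 16 million" estimate checks out if "including subsets" is read as every possible subset of the 18 months combined with every possible subset of the 6 crime types. That reading is an assumption.

```javascript
// Every subset of 18 months combined with every subset of 6 crime types.
const monthSubsets = 2 ** 18;     // 262144
const crimeTypeSubsets = 2 ** 6;  // 64
const attributesPerFeature = monthSubsets * crimeTypeSubsets;
console.log(attributesPerFeature); // 16777216
```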

This feels like something mapbox-gl should offer a solution for. Raising an issue for it made sense since many other things seem to be raised like this and then are left to wait with dependencies on other related issues until a solution is viable. However, I respect the decision to close the issue and appreciate both of you taking the time to read!

@lucaswoj
Contributor

lucaswoj commented Jun 7, 2016

To be clear, this is a feature we've spitballed and put on the roadmap. I just opened a ticket in the central GL repo to track it: mapbox/DEPRECATED-mapbox-gl#15. Please give that ticket a read and let us know how that would / would not fit into your use case!

@andrewharvey
Collaborator

andrewharvey commented Jun 8, 2016

@lucaswoj I'm unsure if that's the best approach.

I think it might be simpler to add an interface to update the properties for your source. Then style functions (mapbox/DEPRECATED-mapbox-gl#15) wouldn't care whether a property was originally baked into the vector tile or came from some other source. Otherwise your style functions become unnecessarily complex, won't they?

source.updateData(function (properties) {
   // this can both add new data and update existing data
   properties.population = myDataSource[properties.id];
});

Thoughts?

Although this ticket is about client generated slices of multidimensional datasets (where the properties aren't baked into the vector tiles), it's exactly the same issue as styling using realtime/sensor data. E.g. if you have a fixed road network in your vector tile source but realtime data for current traffic speeds, you won't be streaming out the whole geometry plus current traffic data in realtime. You'll have the road network geometry in your vector tiles with an id property, stream in just the speed properties (possibly as geometryless vector tiles?), and join them client side to create a map styled by current traffic speed.

@redspider
Author

In a discussion I had with @mourner on Twitter, he noted that the problem with this pattern is that the workers cannot execute the function provided to updateData because they exist in a different context (i.e. they'd need to serialize and then eval both the function and any data it required).

I believe this could possibly be dealt with by providing the data context to the workers separately, i.e:

source.setWorkerData('myData', aSimpleObjectOfSomeKind);
source.setMergePropertiesFunction('mergeMyData', 'properties.population = myData[properties.id]');

This would enable the workers to serialize and transfer the data only as it changed (and with new worker data move support this would become very quick), and to run the merge properties function once per feature on the fly, caching the result if desired, until the data changes again.

I recognise Mapbox's strong desire, to date, not to hand eval'd code to the workers, and perhaps for this use case there could be a structured alternative to the eval I showed above. Either way, I think this would work and perform well for both my use case and the real-time data case outlined by @andrewharvey.

The only downside is that we inevitably end up with N copies of the dataset (one per rendering worker). This may be temporary though, as the worker interface in browsers improves.
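
One way a "structured thing instead of the eval" might look is sketched below. It is purely hypothetical: neither the spec shape (`target`, `dataset`, `key`) nor `applyMerge` exists in mapbox-gl-js. The point is only that a plain JSON description can be posted to a worker without serializing a function.

```javascript
// Hypothetical declarative merge spec: plain JSON, so it can be posted to a worker as-is.
const mergeSpec = { target: 'population', dataset: 'myData', key: 'id' };

// What a worker might do per feature; workerData holds the datasets sent earlier.
function applyMerge(properties, spec, workerData) {
  const table = workerData[spec.dataset] || {};
  const value = table[properties[spec.key]];
  // Leave the feature untouched when the dataset has no entry for it.
  return value === undefined ? properties : { ...properties, [spec.target]: value };
}

// Equivalent to 'properties.population = myData[properties.id]' from the eval version:
// applyMerge({ id: '7' }, mergeSpec, { myData: { '7': 120 } });
```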

@bmenant

bmenant commented Jul 31, 2016

I can’t agree more with @andrewharvey

Although this ticket is about client generated slices of multidimensional datasets (where the properties aren't baked into the vector tiles), it's exactly the same issue of styling using realtime/sensor data

For instance, in my case, data properties make up 5% of the whole GeoJSON file. Geometries change once a year, while metadata changes every day. Reprocessing a new vector tile source because a few properties changed is therefore totally inefficient, not to mention the extra work and skills needed to deploy and automate a backend stack able to process a reasonable amount of geo-data.

Like @redspider, I’m looking forward to seeing a viable solution from Mapbox’s talented team.

@aratcliffe

I started to update an existing app to use Mapbox GL, but the data needs to be styled based on calculated properties that are modified through user interaction on the client, so I have had to park the update. Something like the feature proposed by @andrewharvey would be ideal!

@andrewharvey
Collaborator

It looks like #2812 will provide the groundwork for this calculated values use case by allowing properties to be updated client side for vector tile sources, at least judging by the demo in #2812 (comment).

@aratcliffe

That certainly looks promising @andrewharvey !

@andrewharvey
Collaborator

The new demo https://www.mapbox.com/mapbox-gl-js/example/data-join/ from @ryanbaumann might be a helpful workaround for some people running into this issue.

The demo uses a categorical data-driven styling (DDS) function to map the id of each feature to a colour.
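
A rough sketch of that approach, not the demo's actual code: build one categorical stop per feature id from a client-side lookup. The white-to-red ramp, the `meshblock` property name and the layer id in the usage comment are assumptions.

```javascript
// Map t in [0, 1] to a colour between white (0) and red (1).
function ramp(t) {
  const g = Math.round(255 * (1 - t));
  return `rgb(255, ${g}, ${g})`;
}

// Build a categorical property function with one [id, colour] stop per feature.
function buildCategoricalFill(valuesById, property) {
  const entries = Object.entries(valuesById);
  const max = Math.max(...entries.map(([, v]) => v));
  const stops = entries.map(([id, v]) => [id, ramp(max > 0 ? v / max : 0)]);
  return { property, type: 'categorical', stops };
}

// Usage against a live map (layer id is a placeholder). Ids here are strings,
// so the tile property must also be a string for the stops to match:
// map.setPaintProperty('meshblocks-fill', 'fill-color',
//   buildCategoricalFill(currentValues, 'meshblock'));
```

This removes the limit on the number of shades, but the stops array still grows with the number of features and has to be rebuilt whenever the data changes.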

@andrewharvey
Collaborator

The calculated values feature requested here is being implemented at #4777

6 participants