Skip to content
This repository has been archived by the owner on Jan 12, 2024. It is now read-only.

Certain selector combinations negatively affect compile time #470

Open
amandasaurus opened this issue Mar 24, 2017 · 19 comments
Open

Certain selector combinations negatively affect compile time #470

amandasaurus opened this issue Mar 24, 2017 · 19 comments

Comments

@amandasaurus
Copy link

I have ported the openstreetmap-carto style to vector tiles, and am using tessera, which is based on tilelive, to serve the raster tiles (which are created from vectortiles). However it takes a very long time to start up (~4 minutes) and that's from carto's processing of the cartocss files. I initially reported this on the tessera issue tracker, see more background there

Using simple print statement debugging, I can see that some of the longest parts are inside the rules = inheritDefinitions(matching, env); call. For some osm-carto layers, there are thousands of items in the matching variable. (e.g. test-polyg-low-zoom has 6,005. bridges has 1,099, test-poly as 6,064 etc). It takes about 1 second (on my machine) for every 1,000 items. so some of these layers take about 5sec to process. There are lots of layers, so it's about 3½ minutes to parse the whole file.

A weird thing is that I can use the same node_modules to parse the upstream project.mml into mapnik XML in 24sec. This is only a problem when parsing a vector tiles file.

I have patched tilelive-tm{style,source} to use the latest carto (0.17.2).

You can test this yourself by checking out the openstreetmap-carto-vector-tiles project and doing make tessera (which will serve the tiles after a while).

@nebulon42
Copy link
Collaborator

I'm undecided if this is a bug or not, but this would need improvement. Thanks for the investigation so far. openstreetmap-carto has slow layers also in the Mapnik XML case but not that slow. I know not enough about the vector workflow to say what could be different there. Needs more investigation.

@amandasaurus
Copy link
Author

Some more "print statement debugging" tells me that for the upstream project, the layers have much less items in the matching vartiable. text-poly-low-zoom only has 1,016 layers, bridges has 951, text-poly has 1.054. This is probably why the non-vector-tile version is much faster (30sec vs 3½min).

I know openstreetmap-carto is complicated and slow, but it's the difference which is the bug for me. If it took about the same time, that would be one thing.

To me, this is a bug, since it negatively affects what should be a valid work flow.

@nebulon42 nebulon42 added bug and removed enhancement labels Mar 24, 2017
@nebulon42
Copy link
Collaborator

Ok, let's call it a bug then.

@nebulon42
Copy link
Collaborator

Doesn't run for me. make tessera failed. I have singled out the command below.

$ ./node_modules/.bin/tessera -c tessera-serve-vector-tiles.json
/dir/openstreetmap-carto-vector-tiles/node_modules/tilelive-cache/index.js:232
    uri.query = uri.query || {};
                   ^

TypeError: Cannot read property 'query' of undefined
    at Object.<anonymous> (/dir/openstreetmap-carto-vector-tiles/node_modules/tilelive-cache/index.js:232:20)
    at Object.load (/dir/openstreetmap-carto-vector-tiles/node_modules/locking-cache/index.js:85:17)
    at /dir/openstreetmap-carto-vector-tiles/node_modules/tessera/server.js:100:16
    at Array.forEach (native)
    at module.exports (/dir/openstreetmap-carto-vector-tiles/node_modules/tessera/server.js:88:25)
    at Object.<anonymous> (/dir/openstreetmap-carto-vector-tiles/node_modules/tessera/bin/tessera.js:82:30)
    at Module._compile (module.js:570:32)
    at Object.Module._extensions..js (module.js:579:10)
    at Module.load (module.js:487:32)
    at tryModuleLoad (module.js:446:12)

Have you seen that before?

@amandasaurus
Copy link
Author

I've seen that. You can work around it by running the tmsource on port 8081 in one tessera process/terminatl window, and the tmstyle in another window. I'll post full details on Monday

@amandasaurus
Copy link
Author

amandasaurus commented Mar 25, 2017

I have fixed the my vector tiles project, so make tessera will work now. Please pull from the current master.

@nebulon42
Copy link
Collaborator

nebulon42 commented Mar 26, 2017

Some random notes not necessarily related to the behaviour, primarily for myself:

  • Is there a way to easily change datasource parameters (db, user, password) and not having to change them in osm-carto.tm2source/data.yml?
  • tilelive-tmsource/tilelive-tmstyle uses an outdated version of carto (0.14.x) as also noted above
  • the slow rendering happens in tilelive-tmstyle, not in tilelive-tmsource
  • not any really siginificant differences in terms of Stylesheet object as input for the renderer when comparing vanilla osm-carto and osm-carto-vector-tiles
  • Kosmtik seems to have the same problem, since I'm much more familiar with Kosmtik I will concentrate on this.
  • I suspect it might have to do something with the layers being referenced by their class attribute in the style files.
  • in the matching variable there are an identical number of references to amenity-points.mss for the vector and non-vector case
  • in the same variable the number of "elements": [ is almost 6x higher for the vector case, the overall number of lines deviates about the factor 6-7
  • classIndex is not correctly set for the vector case, because the layer classes are not correctly set in the vector case
  • adding the respective class to project.yml e.g. - {id: text-poly-low-zoom, class: text-low-zoom} did not change the behaviour, but now layer classes are correctly set

@nebulon42
Copy link
Collaborator

I have turned on benchmarking mode for carto, which you can achieve if you call new carto.Renderer({ benchmark: true }, {}).render(opts) in tilelive-tmsource. It reveals that the following layers are quite slow:

processing layer: text-poly-low-zoom
Inheritance time: 43857ms
...
processing layer: text-poly
Inheritance time: 54167ms
...
processing layer: text-point
Inheritance time: 53358ms

These three layers alone take nearly 3 minutes on my machine.

For comparison compiling osm-carto with the benchmark option on the command line gives:

processing layer: text-poly-low-zoom
Inheritance time: 1411ms
...
processing layer: text-poly
Inheritance time: 2062ms
...
processing layer: text-point
Inheritance time: 2044ms

@nebulon42 nebulon42 added this to the 0.18 milestone Mar 26, 2017
@nebulon42 nebulon42 added question and removed bug labels Mar 26, 2017
@nebulon42
Copy link
Collaborator

nebulon42 commented Mar 26, 2017

The reason seems to be that in the vector case the layer classes that osm-carto uses are not correctly set and thus not passed on to the renderer. This may lead to inheritance with way too many matching rules. Likely not a carto bug, but something that has to be done differently in osm-carto.tm2/project.yml.

The downstream version of osm-carto does not use classes, so this statement was invalid.

EDIT: changed project.yml e.g. - {id: text-poly-low-zoom, class: text-low-zoom}, but no effect, strange

@amandasaurus
Copy link
Author

Is there a way to easily change datasource parameters (db, user, password) and not having to change them in osm-carto.tm2source/data.yml?

If you change the project.mml in the root of the directory and re-run the make.

tilelive-tmsource/tilelive-tmstyle uses an outdated version of carto (0.14.x)

Yes. I have changed tilelive-tmsource & -tmstyle to use "latest", and the same problem occurs.

@amandasaurus
Copy link
Author

Further investigation implies that this isn't a bug in carto, per se, but that some of the changes to support vector tiles made the CartoCSS much more complicated, with more permutations, and hence it takes carto much longer to process and parse the file. You can see this if you open up the style files for upstream osm-carto and the vector tiles in a diff viewer (e.g. meld), and copy some of the changes to the original osm-carto style. You'll see the carto command takes much longer, and see that there are more matching values to process¹

Specifically look at the commits/patches to support way_pixels, or to remove all carto classes and use ids instead.

Which implies there's almost no way to fix this problem, short of making carto much faster....


¹ I have put this line in after L43 of lib/carto/renderer.js and it will print out the number of matching values.

console.warn("Matching l.name="+String(l.name)+" has "+matching.length+" values");

@nebulon42
Copy link
Collaborator

Ah, you removed the classes. I have not seen that. That means that I was on the wrong track. Until now I thought there have not been a lot of changes in the style. But now that means I cannot compare the two variants of the style anymore.

@nebulon42
Copy link
Collaborator

I think geofabrik/openstreetmap-carto-vector-tiles@fb863ad might be (one of) the reason/s for the rule explosion in the text layers.

@nebulon42
Copy link
Collaborator

nebulon42 commented Mar 26, 2017

There are some shortcomings in carto that require one to be careful what kind of CartoCSS is written. See e.g. #206 or #20. This needs improvement, but is hard.

@amandasaurus
Copy link
Author

The following simple patch on the current HEAD of osm-carto (8be18f047) makes the text-poly-low-zoom have 4,760 (from ~1,000), and makes the whole carto process take 2½ minutes.

diff --git a/amenity-points.mss b/amenity-points.mss
index 0cba35b..888835c 100644
--- a/amenity-points.mss
+++ b/amenity-points.mss
@@ -1597,18 +1597,22 @@
   [feature = 'leisure_track'],
   [feature = 'leisure_dog_park'],
   [feature = 'leisure_pitch'] {
-    [zoom >= 10][way_pixels > 3000][is_building = 'no'],
+    [zoom >= 10][way_pixels > 3000][is_building = 'no'], [is_building = 'no'][zoom=14][way_pixels > 3000], [is_building = 'no'][zoom=15][way_pixels > 750], [is_building = 'no'][zoom=16][way_pixels > 187],
     [zoom >= 17][is_building = 'no'],
-    [zoom >= 10][way_pixels > 3000][feature = 'shop_mall'],
+    [zoom >= 10][way_pixels > 3000][feature = 'shop_mall'], [feature = 'shop_mall'][zoom=14][way_pixels > 3000], [feature = 'shop_mall'][zoom=15][way_pixels > 750], [feature = 'shop_mall'][zoom=16][way_pixels > 187],
     [zoom >= 17][feature = 'shop_mall'] {
       text-name: "[name]";
       text-size: @landcover-font-size;
       text-wrap-width: @landcover-wrap-width-size;
-      [way_pixels > 12000] {
+      [way_pixels > 12000],
+      [zoom=14][way_pixels > 12000], [zoom=15][way_pixels > 3000], [zoom=16][way_pixels > 750], [zoom=17][way_pixels > 187], [zoom=18][way_pixels > 46], [zoom>=19][way_pixels > 11]
+      {
         text-size: @landcover-font-size-big;
         text-wrap-width: @landcover-wrap-width-size-big;
       }
-      [way_pixels > 48000] {
+      [way_pixels > 48000],
+      [zoom=14][way_pixels > 48000], [zoom=15][way_pixels > 12000], [zoom=16][way_pixels > 3000], [zoom=17][way_pixels > 750], [zoom=18][way_pixels > 187], [zoom>=19][way_pixels > 46]
+      {
         text-size: @landcover-font-size-bigger;
         text-wrap-width: @landcover-wrap-width-size-bigger;
       }

@nebulon42
Copy link
Collaborator

nebulon42 commented Mar 26, 2017

Yes, I have also tried reverting geofabrik/openstreetmap-carto-vector-tiles@fb863ad and the performance is back to "normal" for the text layers. Of course reverting is no option, but until #206, #20 or other problems are fixed, there have to be measures taken in the style to keep the performance in reasonable bounds. Help with improvements is always welcome. I myself am also quite late to the carto party and do not have insight into everything.

@nebulon42 nebulon42 removed this from the 0.18 milestone Mar 26, 2017
@nebulon42
Copy link
Collaborator

@rory Side note: You could also try magnacarto a Go implementation of CartoCSS, which I have made available to Node.js. It works with Kosmtik by installing the kosmtik-magnacarto plugin and specifiying --renderer magnacarto.

While it works well with the raster approach I just tried the vector approach but could not get it to work because the parser is called twice, once with Stylesheet property and once without. While carto works fine without this property magnacarto does not like that. You could also try plugging it into tessera if you like that better. magnacarto parses generally faster and I have not seen any longer parsing times with your modifications to osm-carto.

@amandasaurus
Copy link
Author

That's an interesting suggestion. However my nodejs knowledge is current not good enough for that. :)

@amandasaurus
Copy link
Author

I can probably improve my CartoCSS selectors. In this example I have one nested zoom selectors, which is probably causing a combinatorial expansion in terms, and hence causing the slow down.

@nebulon42 nebulon42 changed the title Incredibly slow when trying to parse large file (but only when it's a vector tiles file) Certain selector combinations negatively affect parse time Mar 27, 2017
@nebulon42 nebulon42 changed the title Certain selector combinations negatively affect parse time Certain selector combinations negatively affect compile time Mar 27, 2017
amandasaurus pushed a commit to geofabrik/openstreetmap-carto-vector-tiles that referenced this issue Mar 28, 2017
…sible

Previously, this style had attempted to replicate the upstream osm-carto
exactly. This meant the using the `way_pixels` and zoom trick to get the
same results. But that results in a combinatorial explosion for
amenity-points, which means carto can go from 30 seconds to parse the
file, to 3½minutes. By removing this inner selectors, the carto
performance will be back to about 30 seconds.

This affects the start up time of tessera/tilelive.

cf.:

 * mapbox/carto#470
 * mapbox/carto#20
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

2 participants