Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Release v4.25.0 #4009

Closed
jeisenbe opened this issue Jan 25, 2020 · 22 comments
Closed

Release v4.25.0 #4009

jeisenbe opened this issue Jan 25, 2020 · 22 comments
Labels

Comments

@jeisenbe
Copy link
Collaborator

It has been nearly 3 months since the last release, and we have 6 PRs merged. I believe we can plan a new release at the end of the coming week, by January 31st.

Perhaps one or two small PRs could be merged by then, for example #3950, #3947, #3933, and #3921

@jeisenbe
Copy link
Collaborator Author

It would also be nice if #4010 or #3952 could be reviewed in time, but that's not required.

@pnorman
Copy link
Collaborator

pnorman commented Jan 28, 2020

I went through and reviewed and merged a bunch. I think the remaining open PRs are either WIP, have changes requested, are tagged with consensus needed, or more than I want to take on tonight.

@jeisenbe
Copy link
Collaborator Author

jeisenbe commented Feb 1, 2020

Release v4.25.0 is now done

@jeisenbe jeisenbe closed this as completed Feb 1, 2020
@mmd-osm
Copy link

mmd-osm commented Feb 5, 2020

FYI: This release has caused serious performance regressions in production and is about to be rolled back.

@imagico
Copy link
Collaborator

imagico commented Feb 5, 2020

Thanks.

At a quick look i could not see any obvious reason for that::

v4.24.1...v4.25.0

@mmd-osm
Copy link

mmd-osm commented Feb 5, 2020

St_pointonsurface turned out to be a massive cpu hog. See the respective issue on chef.

@imagico
Copy link
Collaborator

imagico commented Feb 5, 2020

For reference: openstreetmap/chef#264

Makes sense. Number of uses of ST_PointOnSurface() increased from 8 to 12 this release - which means this was likely already a problem before to some extent.

See also #1644.

@imagico
Copy link
Collaborator

imagico commented Feb 5, 2020

Note the actual performance killer is likely text-poly-low-zoom and this could be fixed temporarily by simply replacing

https://github.com/gravitystorm/openstreetmap-carto/blob/v4.25.0/project.mml#L1965

with way again and having Mapnik place the labels.

Also in general introducing a zoom level dependent upper way area limit to that query (because we have an upper way_pixels limit in MSS code for all polygon labels) could reduce the number of big and expensive polygons to be processed at higher zoom levels.

@mmd-osm
Copy link

mmd-osm commented Feb 5, 2020

If you believe this could be fixed on short notice and released as v4.25.1, maybe add a comment on the chef issue so that everyone is on the same page.

@imagico
Copy link
Collaborator

imagico commented Feb 5, 2020

Not really, since the mentioned hack would not be what we actually want to do, it would be a temporary workaround. I am also not completely sure with this being the actual performance killer because that is a z=0-9 only layer. So rolling back and testing if a PostGIS update will solve the problem seems the safest approach. If that does not turn out to be a solution or not viable in the short term we need to re-evaluate #1644.

@mmd-osm
Copy link

mmd-osm commented Feb 5, 2020

As Tom mentioned, upgrading to a newer PostGIS version would be part of moving to a newer PostgreSQL version and reloading the database (also see this issue: openstreetmap/chef#211 (comment)).

I don't know the exact timelines here, i.e. running those tests on a newer PostGIS version might take a while. If you happen to have a local planet database for testing, this might be a good alternative.

Could you add a comment on the chef issue that you recommend a roll back? I don't want to give any recommendations which could be seen as speaking on behalf of the carto project.

@pnorman
Copy link
Collaborator

pnorman commented Feb 5, 2020

I'm really skeptical that a z0-z9 layer is the cause of any problems because that would show up at https://munin.openstreetmap.org/openstreetmap.org/rhaegal.openstreetmap.org/renderd_zoom.html

In fact, looking at the throughput the peak throughput remains the same, so I don't see that we have any style issues.

image

@mmd-osm
Copy link

mmd-osm commented Feb 5, 2020

I mentioned on the chef repo that Odin has maxed out on cpu and people were complaining about gray tiles since a few days.

https://munin.openstreetmap.org/openstreetmap.org/odin.openstreetmap.org/cpu.html

@imagico
Copy link
Collaborator

imagico commented Feb 5, 2020

Yes, text-poly-low-zoom is unlikely to be the only culprit - and as mentioned on openstreetmap/chef#264 (comment) it is a bit weird that we should have major problems with ST_PointOnSurface() now while most of the major uses of it were already added in previous releases.

@pnorman
Copy link
Collaborator

pnorman commented Feb 5, 2020

I mentioned on the chef repo that Odin has maxed out on cpu and people were complaining about gray tiles since a few days.

sure, but none of that indicates a stylesheet problem. I expect CPU usage to spike when the server is given a bunch more tiles to render.

On a CPU-constrained server the throughput staying the same is pretty conclusive to me.

@mmd-osm
Copy link

mmd-osm commented Feb 5, 2020

Besides the increased CPU util across most tile servers, I also found the increased memory consumption quite odd: https://munin.openstreetmap.org/openstreetmap.org/rhaegal.openstreetmap.org/memory.html

I think it's reasonable to assume that a new style would trigger some re-rendering, and some extra CPU load. However, for some reason, previous style updates weren't nearly as visible on munin as this one.

@jeisenbe
Copy link
Collaborator Author

jeisenbe commented Feb 6, 2020

@pnorman what is your comment here openstreetmap/chef#264 (comment) about planning a new major release soon (including the schema_changes, new index, and suggesting upgrade to postgis 3.0+), vs releasing a minor update v4.25.1 to try to solve the performance problems?

@pnorman
Copy link
Collaborator

pnorman commented Feb 6, 2020

I see no need for a v4.25.1 because I see nothing wrong with v4.25.0. I think we're getting close on landing schema_changes.

@mmd-osm
Copy link

mmd-osm commented Feb 6, 2020

I think the increased number of dropped tiles is something users are experiencing (and complaining about) right now.

@StyXman
Copy link
Contributor

StyXman commented Feb 6, 2020

I wonder if the point_on_surface() value couldn't be calculated at import time. Does it even make sense?

@jeisenbe
Copy link
Collaborator Author

I wonder if the point_on_surface() value could be calculated at import time?

That is an interesting idea. I am not aware of a way that this could be done in osm2pgsql or the lua transformations.

@pnorman - any idea about this?

@pnorman
Copy link
Collaborator

pnorman commented Feb 16, 2020

See #4030 for something like that

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

5 participants