Read shapefiles in UTF-8 mode #137

Merged
merged 4 commits into from
Mar 17, 2022


wipfli
Contributor

@wipfli wipfli commented Mar 16, 2022

Currently, if a shapefile has attributes containing non-ASCII letters such as ä, ö, or ü, the resulting tiles show garbled characters in their place. This pull request fixes the issue by hard-coding the shapefile encoding to UTF-8.
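
For readers unfamiliar with this kind of bug, here is a minimal sketch of how the garbling can arise. It uses only the Java standard library and is not the code changed in this pull request. The class name, the helper `decodeAttribute`, and the sample value are hypothetical. The idea that the reader previously fell back to ISO-8859-1 is an assumption made for illustration.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class ShapefileCharsetDemo {

  /** Decodes raw attribute bytes with the given charset, the way a reader would. */
  public static String decodeAttribute(byte[] raw, Charset charset) {
    return new String(raw, charset);
  }

  public static void main(String[] args) {
    // Hypothetical attribute value containing an a-umlaut (U+00E4).
    String original = "S\u00e4ntis";

    // The bytes as a UTF-8 encoded attribute table would store them.
    byte[] stored = original.getBytes(StandardCharsets.UTF_8);

    // Assumed legacy behaviour: a single-byte charset maps every byte to one
    // character, so the two-byte UTF-8 sequence becomes two wrong characters.
    String garbled = decodeAttribute(stored, StandardCharsets.ISO_8859_1);

    // Decoding with the charset the data was written in restores the text.
    String fixed = decodeAttribute(stored, StandardCharsets.UTF_8);

    System.out.println("decoded as ISO-8859-1: " + garbled);
    System.out.println("decoded as UTF-8:      " + fixed);
    System.out.println("round trip intact:     " + fixed.equals(original));
  }
}
```

Hard-coding UTF-8 trades flexibility for predictability: a shapefile that really is stored in another encoding would then be misread in the opposite direction.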



@wipfli
Contributor Author

wipfli commented Mar 16, 2022

Current (bad):

image

With this patch (good):

image

This example rendered the following shapefile: https://data.geo.admin.ch/ch.bafu.schutzgebiete-aulav_jagdbanngebiete/

@github-actions

github-actions bot commented Mar 16, 2022

Base 0a06479 (first log below) | This Branch f367bf3 (second log below)
0:02:02 DEB [mbtiles] - Tile stats:
0:02:02 DEB [mbtiles] - z0 avg:7.9k max:7.9k
0:02:02 DEB [mbtiles] - z1 avg:4k max:4k
0:02:02 DEB [mbtiles] - z2 avg:9.4k max:9.4k
0:02:02 DEB [mbtiles] - z3 avg:3.9k max:6.4k
0:02:02 DEB [mbtiles] - z4 avg:1.6k max:4.6k
0:02:02 DEB [mbtiles] - z5 avg:1.4k max:8.1k
0:02:02 DEB [mbtiles] - z6 avg:1.4k max:24k
0:02:02 DEB [mbtiles] - z7 avg:898 max:33k
0:02:02 DEB [mbtiles] - z8 avg:366 max:48k
0:02:02 DEB [mbtiles] - z9 avg:296 max:278k
0:02:02 DEB [mbtiles] - z10 avg:164 max:232k
0:02:02 DEB [mbtiles] - z11 avg:107 max:131k
0:02:02 DEB [mbtiles] - z12 avg:85 max:118k
0:02:02 DEB [mbtiles] - z13 avg:72 max:109k
0:02:02 DEB [mbtiles] - z14 avg:68 max:256k
0:02:02 DEB [mbtiles] - all avg:70 max:0
0:02:02 DEB [mbtiles] -  # features: 5,276,529
0:02:02 DEB [mbtiles] -     # tiles: 4,115,418
0:02:02 INF [mbtiles] - Finished in 30s cpu:54s avg:1.8
0:02:02 INF [mbtiles] -   read    1x(8% 2s wait:24s)
0:02:02 INF [mbtiles] -   encode  2x(45% 14s wait:8s)
0:02:02 INF [mbtiles] -   write   1x(36% 11s sys:1s wait:16s)
0:02:02 INF - Finished in 2m2s cpu:3m36s gc:4s avg:1.8
0:02:02 INF - FINISHED!
0:02:02 INF - 
0:02:02 INF - ----------------------------------------
0:02:02 INF - 	overall          2m2s cpu:3m36s gc:4s avg:1.8
0:02:02 INF - 	lake_centerlines 1s cpu:2s avg:1.8
0:02:02 INF - 	  read     1x(52% 0.6s)
0:02:02 INF - 	  process  2x(11% 0.1s)
0:02:02 INF - 	  write    1x(0% 0s)
0:02:02 INF - 	water_polygons   28s cpu:49s gc:2s avg:1.7
0:02:02 INF - 	  read     1x(59% 17s sys:1s wait:3s)
0:02:02 INF - 	  process  2x(27% 8s wait:14s)
0:02:02 INF - 	  write    1x(4% 1s wait:27s)
0:02:02 INF - 	natural_earth    11s cpu:18s avg:1.6
0:02:02 INF - 	  read     1x(89% 10s sys:2s)
0:02:02 INF - 	  process  2x(19% 2s wait:10s)
0:02:02 INF - 	  write    1x(0% 0s wait:10s)
0:02:02 INF - 	osm_pass1        3s cpu:6s avg:1.8
0:02:02 INF - 	  read     1x(3% 0.1s wait:3s)
0:02:02 INF - 	  parse    1x(74% 2s)
0:02:02 INF - 	  process  1x(37% 1s wait:2s)
0:02:02 INF - 	osm_pass2        35s cpu:1m9s avg:2
0:02:02 INF - 	  read     1x(0% 0.1s wait:4s done:31s)
0:02:02 INF - 	  process  2x(79% 27s)
0:02:02 INF - 	  write    1x(1% 0.4s wait:34s)
0:02:02 INF - 	boundaries       0s cpu:0.1s avg:1.5
0:02:02 INF - 	sort             5s cpu:7s avg:1.6
0:02:02 INF - 	  worker  2x(43% 2s done:2s)
0:02:02 INF - 	mbtiles          30s cpu:54s avg:1.8
0:02:02 INF - 	  read    1x(8% 2s wait:24s)
0:02:02 INF - 	  encode  2x(45% 14s wait:8s)
0:02:02 INF - 	  write   1x(36% 11s sys:1s wait:16s)
0:02:02 INF - ----------------------------------------
0:02:02 INF - 	features	268MB
0:02:02 INF - 	mbtiles	514MB
-rw-r--r-- 1 runner docker 55M Mar 17 07:49 run.jar
0:02:04 DEB [mbtiles] - Tile stats:
0:02:04 DEB [mbtiles] - z0 avg:7.9k max:7.9k
0:02:04 DEB [mbtiles] - z1 avg:4k max:4k
0:02:04 DEB [mbtiles] - z2 avg:9.4k max:9.4k
0:02:04 DEB [mbtiles] - z3 avg:3.9k max:6.4k
0:02:04 DEB [mbtiles] - z4 avg:1.6k max:4.6k
0:02:04 DEB [mbtiles] - z5 avg:1.4k max:8.1k
0:02:04 DEB [mbtiles] - z6 avg:1.4k max:24k
0:02:04 DEB [mbtiles] - z7 avg:898 max:33k
0:02:04 DEB [mbtiles] - z8 avg:366 max:48k
0:02:04 DEB [mbtiles] - z9 avg:296 max:278k
0:02:04 DEB [mbtiles] - z10 avg:164 max:232k
0:02:04 DEB [mbtiles] - z11 avg:107 max:131k
0:02:04 DEB [mbtiles] - z12 avg:85 max:118k
0:02:04 DEB [mbtiles] - z13 avg:72 max:109k
0:02:04 DEB [mbtiles] - z14 avg:68 max:256k
0:02:04 DEB [mbtiles] - all avg:70 max:0
0:02:04 DEB [mbtiles] -  # features: 5,276,529
0:02:04 DEB [mbtiles] -     # tiles: 4,115,418
0:02:04 INF [mbtiles] - Finished in 29s cpu:53s avg:1.8
0:02:04 INF [mbtiles] -   read    1x(8% 2s wait:24s)
0:02:04 INF [mbtiles] -   encode  2x(45% 13s wait:7s)
0:02:04 INF [mbtiles] -   write   1x(37% 11s sys:1s wait:16s)
0:02:04 INF - Finished in 2m5s cpu:3m37s gc:4s avg:1.7
0:02:04 INF - FINISHED!
0:02:04 INF - 
0:02:04 INF - ----------------------------------------
0:02:04 INF - 	overall          2m5s cpu:3m37s gc:4s avg:1.7
0:02:04 INF - 	lake_centerlines 1s cpu:2s avg:1.8
0:02:04 INF - 	  read     1x(56% 0.6s)
0:02:04 INF - 	  process  2x(12% 0.1s)
0:02:04 INF - 	  write    1x(0% 0s wait:1s)
0:02:04 INF - 	water_polygons   27s cpu:48s gc:2s avg:1.8
0:02:04 INF - 	  read     1x(58% 16s wait:3s)
0:02:04 INF - 	  process  2x(28% 8s wait:12s)
0:02:04 INF - 	  write    1x(4% 1s wait:26s)
0:02:04 INF - 	natural_earth    11s cpu:17s avg:1.6
0:02:04 INF - 	  read     1x(89% 10s sys:1s)
0:02:04 INF - 	  process  2x(19% 2s wait:10s)
0:02:04 INF - 	  write    1x(0% 0s wait:10s)
0:02:04 INF - 	osm_pass1        3s cpu:6s avg:1.9
0:02:04 INF - 	  read     1x(4% 0.1s wait:3s)
0:02:04 INF - 	  process  1x(33% 1s wait:2s)
0:02:04 INF - 	  parse    1x(74% 3s)
0:02:04 INF - 	osm_pass2        36s cpu:1m12s avg:2
0:02:04 INF - 	  read     1x(0% 0s wait:4s done:33s)
0:02:04 INF - 	  process  2x(76% 28s)
0:02:04 INF - 	  write    1x(1% 0.4s wait:36s)
0:02:04 INF - 	boundaries       0s cpu:0.1s avg:1.3
0:02:04 INF - 	sort             4s cpu:6s avg:1.2
0:02:04 INF - 	  worker  2x(41% 2s done:2s)
0:02:04 INF - 	mbtiles          29s cpu:53s avg:1.8
0:02:04 INF - 	  read    1x(8% 2s wait:24s)
0:02:04 INF - 	  encode  2x(45% 13s wait:7s)
0:02:04 INF - 	  write   1x(37% 11s sys:1s wait:16s)
0:02:04 INF - ----------------------------------------
0:02:04 INF - 	features	268MB
0:02:04 INF - 	mbtiles	514MB
-rw-r--r-- 1 runner docker 55M Mar 17 07:47 run.jar

https://github.com/onthegomap/planetiler/actions/runs/1997273067

ℹ️ Base Logs 0a06479
0:00:00 DEB - argument: config=null (path to config file)
0:00:00 DEB - argument: area=rhode island (name of the extract to download if osm_url/osm_path not specified (i.e. 'monaco' 'rhode island' 'australia' or 'planet'))
0:00:00 INF - Using in-memory stats
0:00:00 INF [overall] - 
0:00:00 INF [overall] - Starting...
0:00:00 DEB - argument: bounds=Env[-74.07 : -17.84, 21.34 : 43.55] (bounds)
0:00:00 DEB - argument: threads=2 (num threads)
0:00:00 DEB - argument: loginterval=10 seconds (time between logs)
0:00:00 DEB - argument: minzoom=0 (minimum zoom level)
0:00:00 DEB - argument: maxzoom=14 (maximum zoom level (limit 14))
0:00:00 DEB - argument: defer_mbtiles_index_creation=false (skip adding index to mbtiles file)
0:00:00 DEB - argument: optimize_db=false (optimize mbtiles after writing)
0:00:00 DEB - argument: emit_tiles_in_order=true (emit tiles in index order)
0:00:00 DEB - argument: force=false (overwriting output file and ignore disk/RAM warnings)
0:00:00 DEB - argument: gzip_temp=false (gzip temporary feature storage (uses more CPU, but less disk space))
0:00:00 DEB - argument: sort_max_readers=6 (maximum number of concurrent read threads to use when sorting chunks)
0:00:00 DEB - argument: sort_max_writers=6 (maximum number of concurrent write threads to use when sorting chunks)
0:00:00 DEB - argument: nodemap_type=sortedtable (type of node location map: noop, sortedtable, or sparsearray)
0:00:00 DEB - argument: nodemap_storage=mmap (storage for location map: mmap or ram)
0:00:00 DEB - argument: nodemap_madvise=false (use linux madvise(random) to improve memory-mapped read performance)
0:00:00 DEB - argument: http_user_agent=Planetiler downloader (https://github.com/onthegomap/planetiler) (User-Agent header to set when downloading files over HTTP)
0:00:00 DEB - argument: http_timeout=30 seconds (Timeout to use when downloading files over HTTP)
0:00:00 DEB - argument: http_retries=1 (Retries to use when downloading files over HTTP)
0:00:00 DEB - argument: download_chunk_size_mb=100 (Size of file chunks to download in parallel in megabytes)
0:00:00 DEB - argument: download_threads=1 (Number of parallel threads to use when downloading each file)
0:00:00 DEB - argument: min_feature_size_at_max_zoom=0.0625 (Default value for the minimum size in tile pixels of features to emit at the maximum zoom level to allow for overzooming)
0:00:00 DEB - argument: min_feature_size=1.0 (Default value for the minimum size in tile pixels of features to emit below the maximum zoom level)
0:00:00 DEB - argument: simplify_tolerance_at_max_zoom=0.0625 (Default value for the tile pixel tolerance to use when simplifying features at the maximum zoom level to allow for overzooming)
0:00:00 DEB - argument: simplify_tolerance=0.1 (Default value for the tile pixel tolerance to use when simplifying features below the maximum zoom level)
0:00:00 DEB - argument: osm_lazy_reads=false (Read OSM blocks from disk in worker threads)
0:00:00 DEB - argument: tmpdir=data/tmp (temp directory)
0:00:00 DEB - argument: only_download=false (download source data then exit)
0:00:00 DEB - argument: download=false (download sources)
0:00:00 DEB - argument: only_fetch_wikidata=false (fetch wikidata translations then quit)
0:00:00 DEB - argument: fetch_wikidata=false (fetch wikidata translations then continue)
0:00:00 DEB - argument: use_wikidata=true (use wikidata translations)
0:00:00 DEB - argument: wikidata_cache=data/sources/wikidata_names.json (wikidata cache file)
0:00:00 DEB - argument: lake_centerlines_path=data/sources/lake_centerline.shp.zip (lake_centerlines shapefile path)
0:00:00 DEB - argument: free_lake_centerlines_after_read=false (delete lake_centerlines input file after reading to make space for output (reduces peak disk usage))
0:00:00 DEB - argument: water_polygons_path=data/sources/water-polygons-split-3857.zip (water_polygons shapefile path)
0:00:00 DEB - argument: free_water_polygons_after_read=false (delete water_polygons input file after reading to make space for output (reduces peak disk usage))
0:00:00 DEB - argument: natural_earth_path=data/sources/natural_earth_vector.sqlite.zip (natural_earth sqlite db path)
0:00:00 DEB - argument: free_natural_earth_after_read=false (delete natural_earth input file after reading to make space for output (reduces peak disk usage))
0:00:00 DEB - argument: osm_path=data/sources/rhode_island.osm.pbf (osm OSM input file path)
0:00:00 DEB - argument: free_osm_after_read=false (delete osm input file after reading to make space for output (reduces peak disk usage))
0:00:00 DEB - argument: mbtiles=data/out.mbtiles (mbtiles output file)
0:00:00 DEB - argument: transliterate=true (attempt to transliterate latin names)
0:00:00 DEB - argument: languages=am,ar,az,be,bg,br,bs,ca,co,cs,cy,da,de,el,en,eo,es,et,eu,fi,fr,fy,ga,gd,he,hi,hr,hu,hy,id,is,it,ja,ja_kana,ja_rm,ja-Latn,ja-Hira,ka,kk,kn,ko,ko-Latn,ku,la,lb,lt,lv,mk,mt,ml,nl,no,oc,pl,pt,rm,ro,ru,sk,sl,sq,sr,sr-Latn,sv,ta,te,th,tr,uk,zh (languages to use)
0:00:00 DEB - argument: only_layers= (Include only certain layers)
0:00:00 DEB - argument: exclude_layers= (Exclude certain layers)
0:00:00 DEB - argument: boundary_country_names=true (boundary layer: add left/right codes of neighboring countries)
0:00:00 DEB - argument: transportation_z13_paths=false (transportation(_name) layer: show all paths on z13)
0:00:00 DEB - argument: building_merge_z13=true (building layer: merge nearby buildings at z13)
0:00:00 DEB - argument: transportation_name_brunnel=false (transportation_name layer: set to false to omit brunnel and help merge long highways)
0:00:00 DEB - argument: transportation_name_size_for_shield=false (transportation_name layer: allow road names on shorter segments (ie. they will have a shield))
0:00:00 DEB - argument: transportation_name_limit_merge=false (transportation_name layer: limit merge so we don't combine different relations to help merge long highways)
0:00:00 DEB - argument: mbtiles_name=OpenMapTiles ('name' attribute for mbtiles metadata)
0:00:00 DEB - argument: mbtiles_description=A tileset showcasing all layers in OpenMapTiles. https://openmaptiles.org ('description' attribute for mbtiles metadata)
0:00:00 DEB - argument: mbtiles_attribution=<a href="https://www.openmaptiles.org/" target="_blank">&copy; OpenMapTiles</a> <a href="https://www.openstreetmap.org/copyright" target="_blank">&copy; OpenStreetMap contributors</a> ('attribution' attribute for mbtiles metadata)
0:00:00 DEB - argument: mbtiles_version=3.13.0 ('version' attribute for mbtiles metadata)
0:00:00 DEB - argument: mbtiles_type=baselayer ('type' attribute for mbtiles metadata)
0:00:00 DEB - argument: help=false (show arguments then exit)
0:00:00 INF - Building BasemapProfile profile into data/out.mbtiles in these phases:
0:00:00 INF -   lake_centerlines: Process features in data/sources/lake_centerline.shp.zip
0:00:00 INF -   water_polygons: Process features in data/sources/water-polygons-split-3857.zip
0:00:00 INF -   natural_earth: Process features in data/sources/natural_earth_vector.sqlite.zip
0:00:00 INF -   osm_pass1: Pre-process OpenStreetMap input (store node locations then relation members)
0:00:00 INF -   osm_pass2: Process OpenStreetMap nodes, ways, then relations
0:00:00 INF -   sort: Sort rendered features by tile ID
0:00:00 INF -   mbtiles: Encode each tile and write to data/out.mbtiles
0:00:00 INF - error loading /home/runner/work/planetiler/planetiler/data/sources/wikidata_names.json: java.nio.file.NoSuchFileException: data/sources/wikidata_names.json
0:00:00 INF - Using merge sort feature map, chunk size=1000mb workers=2
0:00:02 INF - dataFileCache open start
0:00:02 INF [lake_centerlines] - 
0:00:02 INF [lake_centerlines] - Starting...
0:00:03 INF [lake_centerlines] -  read: [  29k 100%  29k/s ] write: [    0    0/s ] 0    
    cpus: 1.8 gc:  0% mem: 218M/4.2G direct: 499k postGC: 45M
    read( -%) ->    (0/1k) -> process( -%  -%) ->   (0/53k) -> write( -%)
0:00:03 INF [lake_centerlines] -  read: [  29k 100%    0/s ] write: [    0    0/s ] 0    
    cpus: 1.7 gc: 48% mem: 28M/4.2G direct: 24k postGC: 74M
    read( -%) ->    (0/1k) -> process( -%  -%) ->   (0/53k) -> write( -%)
0:00:03 INF [lake_centerlines] - Finished in 1s cpu:2s avg:1.8
0:00:03 INF [lake_centerlines] -   read     1x(52% 0.6s)
0:00:03 INF [lake_centerlines] -   process  2x(11% 0.1s)
0:00:03 INF [lake_centerlines] -   write    1x(0% 0s)
0:00:03 INF [water_polygons] - 
0:00:03 INF [water_polygons] - Starting...
0:00:13 INF [water_polygons] -  read: [ 1.9k  13%  190/s ] write: [  41k   4k/s ] 2.4M 
    cpus: 1.9 gc: 12% mem: 1.1G/4.2G direct: 52M postGC: 1.1G
    read(56%) ->   (35/1k) -> process(39% 11%) -> (886/53k) -> write( 0%)
0:00:23 INF [water_polygons] -  read: [ 4.5k  32%  267/s ] write: [ 233k  19k/s ] 12M  
    cpus: 1.7 gc:  5% mem: 1.7G/4.2G direct: 52M postGC: 1.4G
    read(67%) ->    (0/1k) -> process(29% 29%) ->  (1k/53k) -> write( 1%)
0:00:31 INF [water_polygons] -  read: [  14k 100% 1.2k/s ] write: [ 4.3M 499k/s ] 186M 
    cpus: 1.5 gc:  5% mem: 1.7G/4.2G direct: 52M postGC: 1.6G
    read( -%) ->    (0/1k) -> process( -%  -%) ->   (0/53k) -> write( -%)
0:00:31 INF [water_polygons] -  read: [  14k 100%    0/s ] write: [ 4.3M    0/s ] 186M 
    cpus: 0 gc:  0% mem: 1.7G/4.2G direct: 52M postGC: 1.6G
    read( -%) ->    (0/1k) -> process( -%  -%) ->   (0/53k) -> write( -%)
0:00:31 INF [water_polygons] - Finished in 28s cpu:49s gc:2s avg:1.7
0:00:31 INF [water_polygons] -   read     1x(59% 17s sys:1s wait:3s)
0:00:31 INF [water_polygons] -   process  2x(27% 8s wait:14s)
0:00:31 INF [water_polygons] -   write    1x(4% 1s wait:27s)
0:00:31 INF [natural_earth] - unzipping /home/runner/work/planetiler/planetiler/data/sources/natural_earth_vector.sqlite.zip to data/tmp/natearth.sqlite
0:00:37 INF [natural_earth] - 
0:00:37 INF [natural_earth] - Starting...
0:00:48 INF [natural_earth] -  read: [ 339k  97%  33k/s ] write: [    0    0/s ] 186M 
    cpus: 1.6 gc:  0% mem: 1.5G/4.2G direct: 52M postGC: 1.6G
    read(96%) ->   (14/1k) -> process(19% 21%) -> (138/53k) -> write( 0%)
0:00:48 INF [natural_earth] -  read: [ 349k 100%  25k/s ] write: [  181  470/s ] 186M 
    cpus: 1.8 gc:  0% mem: 1.6G/4.2G direct: 52M postGC: 1.6G
    read( -%) ->    (0/1k) -> process( -%  -%) ->   (0/53k) -> write( -%)
0:00:48 INF [natural_earth] -  read: [ 349k 100%    0/s ] write: [  181    0/s ] 186M 
    cpus: 3.3 gc:  0% mem: 1.6G/4.2G direct: 52M postGC: 1.6G
    read( -%) ->    (0/1k) -> process( -%  -%) ->   (0/53k) -> write( -%)
0:00:48 INF [natural_earth] - Finished in 11s cpu:18s avg:1.6
0:00:48 INF [natural_earth] -   read     1x(89% 10s sys:2s)
0:00:48 INF [natural_earth] -   process  2x(19% 2s wait:10s)
0:00:48 INF [natural_earth] -   write    1x(0% 0s wait:10s)
0:00:49 INF [osm_pass1] - 
0:00:49 INF [osm_pass1] - Starting...
0:00:52 INF [osm_pass1] -  nodes: [ 4.5M 1.3M/s ] 354M  ways: [ 326k  98k/s ] rels: [ 7.5k 2.2k/s ] blocks: [  614  184/s ]
    cpus: 1.8 gc:  0% mem: 2G/4.2G direct: 52M postGC: 670M hppc: 880k
    read( -%) ->     (0/4) -> parse( -%) ->     (0/4) -> process( -%)
0:00:52 DEB [osm_pass1] - processed blocks:614 nodes:4,574,716 ways:326,532 relations:7,511
0:00:52 INF [osm_pass1] - Finished in 3s cpu:6s avg:1.8
0:00:52 INF [osm_pass1] -   read     1x(3% 0.1s wait:3s)
0:00:52 INF [osm_pass1] -   parse    1x(74% 2s)
0:00:52 INF [osm_pass1] -   process  1x(37% 1s wait:2s)
0:00:52 INF [osm_pass2] - 
0:00:52 INF [osm_pass2] - Starting...
0:00:56 DEB [osm_pass2:process] - Sorting long long multimap...
0:00:56 DEB [osm_pass2:process] - Sorted long long multimap 0s cpu:0s avg:1.5
0:00:56 WAR [osm_pass2:process] - No GB polygon for inferring route network types
0:01:02 INF [osm_pass2] -  nodes: [ 4.5M 100% 457k/s ] 354M  ways: [  72k  22% 7.1k/s ] rels: [    0   0%    0/s ] features: [ 4.7M  39k/s ] 217M  blocks: [  581  95%   58/s ]
    cpus: 2 gc:  1% mem: 2G/4.2G direct: 52M postGC: 811M hppc: 1.3M
    read( -%) ->  (31/103) -> process(64% 67%) -> (1.2k/53k) -> write( 2%)
0:01:12 INF [osm_pass2] -  nodes: [ 4.5M 100%    0/s ] 354M  ways: [ 248k  76%  17k/s ] rels: [    0   0%    0/s ] features: [ 5.1M  39k/s ] 243M  blocks: [  603  98%    2/s ]
    cpus: 2 gc:  1% mem: 2.5G/4.2G direct: 52M postGC: 823M hppc:  20M
    read( -%) ->   (9/103) -> process(81% 82%) -> (1.7k/53k) -> write( 2%)
0:01:22 INF [osm_pass2] -  nodes: [ 4.5M 100%    0/s ] 354M  ways: [ 326k 100% 7.8k/s ] rels: [ 4.2k  56%  420/s ] features: [ 5.2M  15k/s ] 261M  blocks: [  613 100%   <1/s ]
    cpus: 2 gc:  1% mem: 2G/4.2G direct: 52M postGC: 824M hppc:  29M
    read( -%) ->   (0/103) -> process(82% 82%) -> (867/53k) -> write( 1%)
0:01:27 INF [osm_pass2] -  nodes: [ 4.5M 100%    0/s ] 354M  ways: [ 326k 100%    0/s ] rels: [ 7.5k 100%  700/s ] features: [ 5.2M 3.2k/s ] 268M  blocks: [  614 100%   <1/s ]
    cpus: 2 gc:  0% mem: 3.1G/4.2G direct: 52M postGC: 824M hppc:  29M
    read( -%) ->   (0/103) -> process( -%  -%) ->   (0/53k) -> write( -%)
0:01:27 INF [osm_pass2] -  nodes: [ 4.5M 100%    0/s ] 354M  ways: [ 326k 100%    0/s ] rels: [ 7.5k 100%    0/s ] features: [ 5.2M    0/s ] 268M  blocks: [  614 100%    0/s ]
    cpus: 0 gc:  0% mem: 3.1G/4.2G direct: 52M postGC: 824M hppc:  29M
    read( -%) ->   (0/103) -> process( -%  -%) ->   (0/53k) -> write( -%)
0:01:27 DEB [osm_pass2] - processed blocks:614 nodes:4,574,716 ways:326,532 relations:7,511
0:01:27 INF [osm_pass2] - Finished in 35s cpu:1m9s avg:2
0:01:27 INF [osm_pass2] -   read     1x(0% 0.1s wait:4s done:31s)
0:01:27 INF [osm_pass2] -   process  2x(79% 27s)
0:01:27 INF [osm_pass2] -   write    1x(1% 0.4s wait:34s)
0:01:27 INF [boundaries] - 
0:01:27 INF [boundaries] - Starting...
0:01:27 INF [boundaries] - Creating polygons for 1 boundaries
0:01:27 WAR [boundaries] - Unable to form closed polygon for OSM relation 148838 (likely missing edges)
0:01:27 INF [boundaries] - Finished creating 0 country polygons
0:01:27 INF [boundaries] - Finished in 0s cpu:0.1s avg:1.5
0:01:27 INF - Deleting node.db to make room for output file
0:01:27 INF [sort] - 
0:01:27 INF [sort] - Starting...
0:01:31 INF [sort] -  chunks: [   1 /   1 100% ] 268M 
    cpus: 1.3 gc:  3% mem: 605M/4.2G direct: 52M postGC: 356M
    ->     (0/4) -> worker( -%  -%)
0:01:31 INF [sort] -  chunks: [   1 /   1 100% ] 268M 
    cpus: 2.2 gc:  0% mem: 605M/4.2G direct: 52M postGC: 356M
    ->     (0/4) -> worker( -%  -%)
0:01:32 INF [sort] - Finished in 5s cpu:7s avg:1.6
0:01:32 INF [sort] -   worker  2x(43% 2s done:2s)
0:01:32 INF - read:1s write:1s sort:1s
0:01:32 INF [mbtiles] - 
0:01:32 INF [mbtiles] - Starting...
0:01:32 DEB [mbtiles:write] - Execute mbtiles: create table metadata (name text, value text);
0:01:32 DEB [mbtiles:write] - Execute mbtiles: create unique index name on metadata (name);
0:01:32 DEB [mbtiles:write] - Execute mbtiles: create table tiles (zoom_level integer, tile_column integer, tile_row, tile_data blob);
0:01:32 DEB [mbtiles:write] - Execute mbtiles: create unique index tile_index on tiles (zoom_level, tile_column, tile_row)
0:01:32 DEB [mbtiles:write] - Set mbtiles metadata: name=OpenMapTiles
0:01:32 DEB [mbtiles:write] - Set mbtiles metadata: format=pbf
0:01:32 DEB [mbtiles:write] - Set mbtiles metadata: description=A tileset showcasing all layers in OpenMapTiles. https://openmaptiles.org
0:01:32 DEB [mbtiles:write] - Set mbtiles metadata: attribution=<a href="https://www.openmaptiles.org/" target="_blank">&copy; OpenMapTiles</a> <a href="https://www.openstreetmap.org/copyright" target="_blank">&copy; OpenStreetMap contributors</a>
0:01:32 DEB [mbtiles:write] - Set mbtiles metadata: version=3.13.0
0:01:32 DEB [mbtiles:write] - Set mbtiles metadata: type=baselayer
0:01:32 DEB [mbtiles:write] - Set mbtiles metadata: bounds=-74.07,21.34,-17.84,43.55
0:01:32 DEB [mbtiles:write] - Set mbtiles metadata: center=-45.955,32.445,3
0:01:32 DEB [mbtiles:write] - Set mbtiles metadata: minzoom=0
0:01:32 DEB [mbtiles:write] - Set mbtiles metadata: maxzoom=14
0:01:32 DEB [mbtiles:write] - Set mbtiles metadata: json={"vector_layers":[{"id":"aerodrome_label","fields":{"name_int":"String","iata":"String","ele_ft":"Number","name_de":"String","name":"String","icao":"String","name:en":"String","class":"String","ele":"Number","name_en":"String","name:latin":"String"},"minzoom":10,"maxzoom":14},{"id":"aeroway","fields":{"ref":"String","class":"String"},"minzoom":10,"maxzoom":14},{"id":"boundary","fields":{"disputed":"Number","admin_level":"Number","maritime":"Number","disputed_name":"String"},"minzoom":0,"maxzoom":14},{"id":"building","fields":{"colour":"String","render_height":"Number","render_min_height":"Number"},"minzoom":13,"maxzoom":14},{"id":"housenumber","fields":{"housenumber":"String"},"minzoom":14,"maxzoom":14},{"id":"landcover","fields":{"subclass":"String","class":"String","_numpoints":"Number"},"minzoom":8,"maxzoom":14},{"id":"landuse","fields":{"class":"String"},"minzoom":4,"maxzoom":14},{"id":"mountain_peak","fields":{"name_int":"String","customary_ft":"Number","ele_ft":"Number","name_de":"String","name":"String","rank":"Number","class":"String","name_en":"String","name:latin":"String","ele":"Number"},"minzoom":7,"maxzoom":14},{"id":"park","fields":{"name_int":"String","name_de":"String","name":"String","name:en":"String","class":"String","name_en":"String","name:latin":"String"},"minzoom":6,"maxzoom":14},{"id":"place","fields":{"name:fy":"String","name_int":"String","capital":"Number","name:uk":"String","name:pl":"String","name:nl":"String","name:be":"String","name:ru":"String","name:ko":"String","name_de":"String","name":"String","rank":"Number","name:en":"String","name:eo":"String","class":"String","name:hu":"String","name:ta":"String","name:zh":"String","name_en":"String","name:latin":"String"},"minzoom":2,"maxzoom":14},{"id":"poi","fields":{"name_int":"String","level":"Number","name:nonlatin":"String","layer":"Number","name_de":"String","name":"String","subclass":"String","indoor":"Number","name:en":"String","class":"String","name:zh":"String","name_en":"String","name:latin":"String"},"minzoom":12,"maxzoom":14},{"id":"transportation","fields":{"access":"String","brunnel":"String","expressway":"Number","surface":"String","bicycle":"String","level":"Number","ramp":"Number","mtb_scale":"String","toll":"Number","oneway":"Number","layer":"Number","network":"String","horse":"String","service":"String","subclass":"String","class":"String","foot":"String"},"minzoom":4,"maxzoom":14},{"id":"transportation_name","fields":{"name_int":"String","name:nonlatin":"String","route_4":"String","route_3":"String","route_2":"String","route_1":"String","layer":"Number","network":"String","ref":"String","name_de":"String","name":"String","subclass":"String","ref_length":"Number","class":"String","name_en":"String","name:latin":"String"},"minzoom":6,"maxzoom":14},{"id":"water","fields":{"intermittent":"Number","class":"String"},"minzoom":0,"maxzoom":14},{"id":"water_name","fields":{"name_int":"String","name:nonlatin":"String","name_de":"String","name":"String","intermittent":"Number","class":"String","name_en":"String","name:latin":"String"},"minzoom":9,"maxzoom":14},{"id":"waterway","fields":{"name_int":"String","brunnel":"String","name_de":"String","_relid":"Number","intermittent":"Number","name":"String","class":"String","name:latin":"String","name_en":"String"},"minzoom":4,"maxzoom":14}]}
0:01:32 INF [mbtiles:write] - Starting z0
0:01:32 INF [mbtiles:write] - Finished z0 in 0s cpu:0s avg:0, now starting z1
0:01:32 INF [mbtiles:write] - Finished z1 in 0s cpu:0s avg:0, now starting z2
0:01:32 INF [mbtiles:write] - Finished z2 in 0s cpu:0s avg:0, now starting z3
0:01:32 INF [mbtiles:write] - Finished z3 in 0s cpu:0s avg:0, now starting z4
0:01:32 INF [mbtiles:write] - Finished z4 in 0s cpu:0s avg:0, now starting z5
0:01:32 INF [mbtiles:write] - Finished z5 in 0s cpu:0s avg:0, now starting z6
0:01:32 INF [mbtiles:write] - Finished z6 in 0s cpu:0s avg:2.3, now starting z7
0:01:32 INF [mbtiles:write] - Finished z7 in 0s cpu:0s avg:3.1, now starting z8
0:01:34 INF [mbtiles:write] - Finished z8 in 1s cpu:2s avg:1.9, now starting z9
0:01:35 INF [mbtiles:write] - Finished z9 in 1s cpu:3s avg:2, now starting z10
0:01:35 INF [mbtiles:write] - Finished z10 in 0.1s cpu:0.2s avg:2.1, now starting z11
0:01:36 INF [mbtiles:write] - Finished z11 in 0.9s cpu:2s avg:2, now starting z12
0:01:39 INF [mbtiles:write] - Finished z12 in 3s cpu:5s avg:2, now starting z13
0:01:42 INF [mbtiles] -  features: [ 645k  12%  64k/s ] 268M  tiles: [ 290k  29k/s ] 46M  
    cpus: 2 gc:  4% mem: 978M/4.2G direct: 52M postGC: 861M
    read( 4%) -> (214/217) -> encode(56% 56%) -> (215/216) -> write( 9%)
    last tile: 13/2469/3048 (z13 4%) https://www.openstreetmap.org/#map=13/41.77131/-71.49902
0:01:52 INF [mbtiles] -  features: [ 1.7M  34% 113k/s ] 268M  tiles: [   1M  70k/s ] 131M 
    cpus: 1.9 gc:  1% mem: 1.2G/4.2G direct: 52M postGC: 918M
    read( 4%) ->  (58/217) -> encode(55% 56%) -> (205/216) -> write(18%)
    last tile: 13/3640/3336 (z13 96%) https://www.openstreetmap.org/#map=13/31.65338/-20.03906
0:01:52 INF [mbtiles:write] - Finished z13 in 14s cpu:26s avg:2, now starting z14
0:02:02 INF [mbtiles:write] - Finished z14 in 10s cpu:14s avg:1.5
0:02:02 INF [mbtiles] -  features: [ 5.2M 100% 353k/s ] 268M  tiles: [ 4.1M 314k/s ] 514M 
    cpus: 1.5 gc:  1% mem: 1.8G/4.2G direct: 52M postGC: 916M
    read( -%) ->   (0/217) -> encode( -%  -%) ->   (0/216) -> write( -%)
    last tile: 14/7380/5985 (z14 100%) https://www.openstreetmap.org/#map=14/43.56447/-17.84180
0:02:02 DEB [mbtiles] - Tile stats:
0:02:02 DEB [mbtiles] - z0 avg:7.9k max:7.9k
0:02:02 DEB [mbtiles] - z1 avg:4k max:4k
0:02:02 DEB [mbtiles] - z2 avg:9.4k max:9.4k
0:02:02 DEB [mbtiles] - z3 avg:3.9k max:6.4k
0:02:02 DEB [mbtiles] - z4 avg:1.6k max:4.6k
0:02:02 DEB [mbtiles] - z5 avg:1.4k max:8.1k
0:02:02 DEB [mbtiles] - z6 avg:1.4k max:24k
0:02:02 DEB [mbtiles] - z7 avg:898 max:33k
0:02:02 DEB [mbtiles] - z8 avg:366 max:48k
0:02:02 DEB [mbtiles] - z9 avg:296 max:278k
0:02:02 DEB [mbtiles] - z10 avg:164 max:232k
0:02:02 DEB [mbtiles] - z11 avg:107 max:131k
0:02:02 DEB [mbtiles] - z12 avg:85 max:118k
0:02:02 DEB [mbtiles] - z13 avg:72 max:109k
0:02:02 DEB [mbtiles] - z14 avg:68 max:256k
0:02:02 DEB [mbtiles] - all avg:70 max:0
0:02:02 DEB [mbtiles] -  # features: 5,276,529
0:02:02 DEB [mbtiles] -     # tiles: 4,115,418
0:02:02 INF [mbtiles] - Finished in 30s cpu:54s avg:1.8
0:02:02 INF [mbtiles] -   read    1x(8% 2s wait:24s)
0:02:02 INF [mbtiles] -   encode  2x(45% 14s wait:8s)
0:02:02 INF [mbtiles] -   write   1x(36% 11s sys:1s wait:16s)
0:02:02 INF - Finished in 2m2s cpu:3m36s gc:4s avg:1.8
0:02:02 INF - FINISHED!
0:02:02 INF - 
0:02:02 INF - ----------------------------------------
0:02:02 INF - 	overall          2m2s cpu:3m36s gc:4s avg:1.8
0:02:02 INF - 	lake_centerlines 1s cpu:2s avg:1.8
0:02:02 INF - 	  read     1x(52% 0.6s)
0:02:02 INF - 	  process  2x(11% 0.1s)
0:02:02 INF - 	  write    1x(0% 0s)
0:02:02 INF - 	water_polygons   28s cpu:49s gc:2s avg:1.7
0:02:02 INF - 	  read     1x(59% 17s sys:1s wait:3s)
0:02:02 INF - 	  process  2x(27% 8s wait:14s)
0:02:02 INF - 	  write    1x(4% 1s wait:27s)
0:02:02 INF - 	natural_earth    11s cpu:18s avg:1.6
0:02:02 INF - 	  read     1x(89% 10s sys:2s)
0:02:02 INF - 	  process  2x(19% 2s wait:10s)
0:02:02 INF - 	  write    1x(0% 0s wait:10s)
0:02:02 INF - 	osm_pass1        3s cpu:6s avg:1.8
0:02:02 INF - 	  read     1x(3% 0.1s wait:3s)
0:02:02 INF - 	  parse    1x(74% 2s)
0:02:02 INF - 	  process  1x(37% 1s wait:2s)
0:02:02 INF - 	osm_pass2        35s cpu:1m9s avg:2
0:02:02 INF - 	  read     1x(0% 0.1s wait:4s done:31s)
0:02:02 INF - 	  process  2x(79% 27s)
0:02:02 INF - 	  write    1x(1% 0.4s wait:34s)
0:02:02 INF - 	boundaries       0s cpu:0.1s avg:1.5
0:02:02 INF - 	sort             5s cpu:7s avg:1.6
0:02:02 INF - 	  worker  2x(43% 2s done:2s)
0:02:02 INF - 	mbtiles          30s cpu:54s avg:1.8
0:02:02 INF - 	  read    1x(8% 2s wait:24s)
0:02:02 INF - 	  encode  2x(45% 14s wait:8s)
0:02:02 INF - 	  write   1x(36% 11s sys:1s wait:16s)
0:02:02 INF - ----------------------------------------
0:02:02 INF - 	features	268MB
0:02:02 INF - 	mbtiles	514MB
-rw-r--r-- 1 runner docker 55M Mar 17 07:49 run.jar
ℹ️ This Branch Logs f367bf3
0:00:00 DEB - argument: config=null (path to config file)
0:00:00 DEB - argument: area=rhode island (name of the extract to download if osm_url/osm_path not specified (i.e. 'monaco' 'rhode island' 'australia' or 'planet'))
0:00:00 INF - Using in-memory stats
0:00:00 INF [overall] - 
0:00:00 INF [overall] - Starting...
0:00:00 DEB - argument: bounds=Env[-74.07 : -17.84, 21.34 : 43.55] (bounds)
0:00:00 DEB - argument: threads=2 (num threads)
0:00:00 DEB - argument: loginterval=10 seconds (time between logs)
0:00:00 DEB - argument: minzoom=0 (minimum zoom level)
0:00:00 DEB - argument: maxzoom=14 (maximum zoom level (limit 14))
0:00:00 DEB - argument: defer_mbtiles_index_creation=false (skip adding index to mbtiles file)
0:00:00 DEB - argument: optimize_db=false (optimize mbtiles after writing)
0:00:00 DEB - argument: emit_tiles_in_order=true (emit tiles in index order)
0:00:00 DEB - argument: force=false (overwriting output file and ignore disk/RAM warnings)
0:00:00 DEB - argument: gzip_temp=false (gzip temporary feature storage (uses more CPU, but less disk space))
0:00:00 DEB - argument: sort_max_readers=6 (maximum number of concurrent read threads to use when sorting chunks)
0:00:00 DEB - argument: sort_max_writers=6 (maximum number of concurrent write threads to use when sorting chunks)
0:00:00 DEB - argument: nodemap_type=sortedtable (type of node location map: noop, sortedtable, or sparsearray)
0:00:00 DEB - argument: nodemap_storage=mmap (storage for location map: mmap or ram)
0:00:00 DEB - argument: nodemap_madvise=false (use linux madvise(random) to improve memory-mapped read performance)
0:00:00 DEB - argument: http_user_agent=Planetiler downloader (https://github.com/onthegomap/planetiler) (User-Agent header to set when downloading files over HTTP)
0:00:00 DEB - argument: http_timeout=30 seconds (Timeout to use when downloading files over HTTP)
0:00:00 DEB - argument: http_retries=1 (Retries to use when downloading files over HTTP)
0:00:00 DEB - argument: download_chunk_size_mb=100 (Size of file chunks to download in parallel in megabytes)
0:00:00 DEB - argument: download_threads=1 (Number of parallel threads to use when downloading each file)
0:00:00 DEB - argument: min_feature_size_at_max_zoom=0.0625 (Default value for the minimum size in tile pixels of features to emit at the maximum zoom level to allow for overzooming)
0:00:00 DEB - argument: min_feature_size=1.0 (Default value for the minimum size in tile pixels of features to emit below the maximum zoom level)
0:00:00 DEB - argument: simplify_tolerance_at_max_zoom=0.0625 (Default value for the tile pixel tolerance to use when simplifying features at the maximum zoom level to allow for overzooming)
0:00:00 DEB - argument: simplify_tolerance=0.1 (Default value for the tile pixel tolerance to use when simplifying features below the maximum zoom level)
0:00:00 DEB - argument: osm_lazy_reads=false (Read OSM blocks from disk in worker threads)
0:00:00 DEB - argument: tmpdir=data/tmp (temp directory)
0:00:00 DEB - argument: only_download=false (download source data then exit)
0:00:00 DEB - argument: download=false (download sources)
0:00:00 DEB - argument: only_fetch_wikidata=false (fetch wikidata translations then quit)
0:00:00 DEB - argument: fetch_wikidata=false (fetch wikidata translations then continue)
0:00:00 DEB - argument: use_wikidata=true (use wikidata translations)
0:00:00 DEB - argument: wikidata_cache=data/sources/wikidata_names.json (wikidata cache file)
0:00:00 DEB - argument: lake_centerlines_path=data/sources/lake_centerline.shp.zip (lake_centerlines shapefile path)
0:00:00 DEB - argument: free_lake_centerlines_after_read=false (delete lake_centerlines input file after reading to make space for output (reduces peak disk usage))
0:00:00 DEB - argument: water_polygons_path=data/sources/water-polygons-split-3857.zip (water_polygons shapefile path)
0:00:00 DEB - argument: free_water_polygons_after_read=false (delete water_polygons input file after reading to make space for output (reduces peak disk usage))
0:00:00 DEB - argument: natural_earth_path=data/sources/natural_earth_vector.sqlite.zip (natural_earth sqlite db path)
0:00:00 DEB - argument: free_natural_earth_after_read=false (delete natural_earth input file after reading to make space for output (reduces peak disk usage))
0:00:00 DEB - argument: osm_path=data/sources/rhode_island.osm.pbf (osm OSM input file path)
0:00:00 DEB - argument: free_osm_after_read=false (delete osm input file after reading to make space for output (reduces peak disk usage))
0:00:00 DEB - argument: mbtiles=data/out.mbtiles (mbtiles output file)
0:00:00 DEB - argument: transliterate=true (attempt to transliterate latin names)
0:00:00 DEB - argument: languages=am,ar,az,be,bg,br,bs,ca,co,cs,cy,da,de,el,en,eo,es,et,eu,fi,fr,fy,ga,gd,he,hi,hr,hu,hy,id,is,it,ja,ja_kana,ja_rm,ja-Latn,ja-Hira,ka,kk,kn,ko,ko-Latn,ku,la,lb,lt,lv,mk,mt,ml,nl,no,oc,pl,pt,rm,ro,ru,sk,sl,sq,sr,sr-Latn,sv,ta,te,th,tr,uk,zh (languages to use)
0:00:00 DEB - argument: only_layers= (Include only certain layers)
0:00:00 DEB - argument: exclude_layers= (Exclude certain layers)
0:00:00 DEB - argument: boundary_country_names=true (boundary layer: add left/right codes of neighboring countries)
0:00:00 DEB - argument: transportation_z13_paths=false (transportation(_name) layer: show all paths on z13)
0:00:00 DEB - argument: building_merge_z13=true (building layer: merge nearby buildings at z13)
0:00:00 DEB - argument: transportation_name_brunnel=false (transportation_name layer: set to false to omit brunnel and help merge long highways)
0:00:00 DEB - argument: transportation_name_size_for_shield=false (transportation_name layer: allow road names on shorter segments (ie. they will have a shield))
0:00:00 DEB - argument: transportation_name_limit_merge=false (transportation_name layer: limit merge so we don't combine different relations to help merge long highways)
0:00:00 DEB - argument: mbtiles_name=OpenMapTiles ('name' attribute for mbtiles metadata)
0:00:00 DEB - argument: mbtiles_description=A tileset showcasing all layers in OpenMapTiles. https://openmaptiles.org ('description' attribute for mbtiles metadata)
0:00:00 DEB - argument: mbtiles_attribution=<a href="https://www.openmaptiles.org/" target="_blank">&copy; OpenMapTiles</a> <a href="https://www.openstreetmap.org/copyright" target="_blank">&copy; OpenStreetMap contributors</a> ('attribution' attribute for mbtiles metadata)
0:00:00 DEB - argument: mbtiles_version=3.13.0 ('version' attribute for mbtiles metadata)
0:00:00 DEB - argument: mbtiles_type=baselayer ('type' attribute for mbtiles metadata)
0:00:00 DEB - argument: help=false (show arguments then exit)
0:00:00 INF - Building BasemapProfile profile into data/out.mbtiles in these phases:
0:00:00 INF -   lake_centerlines: Process features in data/sources/lake_centerline.shp.zip
0:00:00 INF -   water_polygons: Process features in data/sources/water-polygons-split-3857.zip
0:00:00 INF -   natural_earth: Process features in data/sources/natural_earth_vector.sqlite.zip
0:00:00 INF -   osm_pass1: Pre-process OpenStreetMap input (store node locations then relation members)
0:00:00 INF -   osm_pass2: Process OpenStreetMap nodes, ways, then relations
0:00:00 INF -   sort: Sort rendered features by tile ID
0:00:00 INF -   mbtiles: Encode each tile and write to data/out.mbtiles
0:00:00 INF - error loading /home/runner/work/planetiler/planetiler/data/sources/wikidata_names.json: java.nio.file.NoSuchFileException: data/sources/wikidata_names.json
0:00:00 INF - Using merge sort feature map, chunk size=1000mb workers=2
0:00:02 INF - dataFileCache open start
0:00:02 INF [lake_centerlines] - 
0:00:02 INF [lake_centerlines] - Starting...
0:00:03 INF [lake_centerlines] -  read: [  29k 100%  28k/s ] write: [    0    0/s ] 0    
    cpus: 1.8 gc:  0% mem: 219M/4.2G direct: 499k postGC: 46M
    read( -%) ->    (0/1k) -> process( -%  -%) ->   (0/53k) -> write( -%)
0:00:03 INF [lake_centerlines] -  read: [  29k 100%    0/s ] write: [    0    0/s ] 0    
    cpus: 2.1 gc:  0% mem: 219M/4.2G direct: 499k postGC: 46M
    read( -%) ->    (0/1k) -> process( -%  -%) ->   (0/53k) -> write( -%)
0:00:03 INF [lake_centerlines] - Finished in 1s cpu:2s avg:1.8
0:00:03 INF [lake_centerlines] -   read     1x(56% 0.6s)
0:00:03 INF [lake_centerlines] -   process  2x(12% 0.1s)
0:00:03 INF [lake_centerlines] -   write    1x(0% 0s wait:1s)
0:00:03 INF [water_polygons] - 
0:00:03 INF [water_polygons] - Starting...
0:00:13 INF [water_polygons] -  read: [ 2.1k  15%  216/s ] write: [  46k 4.5k/s ] 3.2M 
    cpus: 2 gc:  9% mem: 2.8G/4.2G direct: 52M postGC: 837M
    read(60%) ->    (0/1k) -> process(16% 41%) -> (1.3k/53k) -> write( 0%)
0:00:23 INF [water_polygons] -  read: [ 4.8k  33%  265/s ] write: [ 299k  25k/s ] 15M  
    cpus: 1.7 gc: 11% mem: 2.3G/4.2G direct: 52M postGC: 1.4G
    read(62%) ->    (0/1k) -> process(22% 31%) -> (1.2k/53k) -> write( 1%)
0:00:30 INF [water_polygons] -  read: [  14k 100% 1.3k/s ] write: [ 4.3M 571k/s ] 186M 
    cpus: 1.6 gc:  5% mem: 2.5G/4.2G direct: 52M postGC: 1.6G
    read( -%) ->    (0/1k) -> process( -%  -%) ->   (0/53k) -> write( -%)
0:00:30 INF [water_polygons] -  read: [  14k 100%    0/s ] write: [ 4.3M    0/s ] 186M 
    cpus: 3 gc:  0% mem: 2.5G/4.2G direct: 52M postGC: 1.6G
    read( -%) ->    (0/1k) -> process( -%  -%) ->   (0/53k) -> write( -%)
0:00:30 INF [water_polygons] - Finished in 27s cpu:48s gc:2s avg:1.8
0:00:30 INF [water_polygons] -   read     1x(58% 16s wait:3s)
0:00:30 INF [water_polygons] -   process  2x(28% 8s wait:12s)
0:00:30 INF [water_polygons] -   write    1x(4% 1s wait:26s)
0:00:30 INF [natural_earth] - unzipping /home/runner/work/planetiler/planetiler/data/sources/natural_earth_vector.sqlite.zip to data/tmp/natearth.sqlite
0:00:38 INF [natural_earth] - 
0:00:38 INF [natural_earth] - Starting...
0:00:49 INF [natural_earth] -  read: [ 340k  98%  34k/s ] write: [    0    0/s ] 186M 
    cpus: 1.6 gc:  0% mem: 3.6G/4.2G direct: 52M postGC: 1.6G
    read(96%) ->   (78/1k) -> process(19% 21%) -> (138/53k) -> write( 0%)
0:00:49 INF [natural_earth] -  read: [ 349k 100%  23k/s ] write: [  181  508/s ] 186M 
    cpus: 1.6 gc:  1% mem: 1.6G/4.2G direct: 52M postGC: 1.6G
    read( -%) ->    (0/1k) -> process( -%  -%) ->   (0/53k) -> write( -%)
0:00:49 INF [natural_earth] -  read: [ 349k 100%    0/s ] write: [  181    0/s ] 186M 
    cpus: 3.1 gc:  0% mem: 1.6G/4.2G direct: 52M postGC: 1.6G
    read( -%) ->    (0/1k) -> process( -%  -%) ->   (0/53k) -> write( -%)
0:00:49 INF [natural_earth] - Finished in 11s cpu:17s avg:1.6
0:00:49 INF [natural_earth] -   read     1x(89% 10s sys:1s)
0:00:49 INF [natural_earth] -   process  2x(19% 2s wait:10s)
0:00:49 INF [natural_earth] -   write    1x(0% 0s wait:10s)
0:00:50 INF [osm_pass1] - 
0:00:50 INF [osm_pass1] - Starting...
0:00:53 INF [osm_pass1] -  nodes: [ 4.5M 1.3M/s ] 354M  ways: [ 326k  95k/s ] rels: [ 7.5k 2.1k/s ] blocks: [  614  178/s ]
    cpus: 1.9 gc:  0% mem: 738M/4.2G direct: 52M postGC: 763M hppc: 880k
    read( -%) ->     (0/4) -> parse( -%) ->     (0/4) -> process( -%)
0:00:53 DEB [osm_pass1] - processed blocks:614 nodes:4,574,716 ways:326,532 relations:7,511
0:00:53 INF [osm_pass1] - Finished in 3s cpu:6s avg:1.9
0:00:53 INF [osm_pass1] -   read     1x(4% 0.1s wait:3s)
0:00:53 INF [osm_pass1] -   process  1x(33% 1s wait:2s)
0:00:53 INF [osm_pass1] -   parse    1x(74% 3s)
0:00:53 INF [osm_pass2] - 
0:00:53 INF [osm_pass2] - Starting...
0:00:57 DEB [osm_pass2:process] - Sorting long long multimap...
0:00:57 DEB [osm_pass2:process] - Sorted long long multimap 0s cpu:0s avg:1.7
0:00:57 WAR [osm_pass2:process] - No GB polygon for inferring route network types
0:01:03 INF [osm_pass2] -  nodes: [ 4.5M 100% 457k/s ] 354M  ways: [  64k  20% 6.3k/s ] rels: [    0   0%    0/s ] features: [ 4.6M  35k/s ] 215M  blocks: [  580  94%   57/s ]
    cpus: 2 gc:  1% mem: 2.6G/4.2G direct: 52M postGC: 795M hppc: 1.3M
    read( -%) ->  (32/103) -> process(65% 66%) -> (1.7k/53k) -> write( 2%)
0:01:13 INF [osm_pass2] -  nodes: [ 4.5M 100%    0/s ] 354M  ways: [ 248k  76%  18k/s ] rels: [    0   0%    0/s ] features: [   5M  41k/s ] 241M  blocks: [  603  98%    2/s ]
    cpus: 2 gc:  1% mem: 2.7G/4.2G direct: 52M postGC: 801M hppc:  13M
    read( -%) ->   (9/103) -> process(77% 77%) -> (1.5k/53k) -> write( 2%)
0:01:23 INF [osm_pass2] -  nodes: [ 4.5M 100%    0/s ] 354M  ways: [ 326k 100% 7.8k/s ] rels: [ 3.6k  49%  366/s ] features: [ 5.2M  16k/s ] 259M  blocks: [  613 100%   <1/s ]
    cpus: 2 gc:  1% mem: 2.2G/4.2G direct: 52M postGC: 803M hppc:  29M
    read( -%) ->   (0/103) -> process(82% 80%) -> (1.7k/53k) -> write( 1%)
0:01:30 INF [osm_pass2] -  nodes: [ 4.5M 100%    0/s ] 354M  ways: [ 326k 100%    0/s ] rels: [ 7.5k 100%  600/s ] features: [ 5.2M   3k/s ] 268M  blocks: [  614 100%   <1/s ]
    cpus: 2 gc:  1% mem: 1.5G/4.2G direct: 52M postGC: 796M hppc:  29M
    read( -%) ->   (0/103) -> process( -%  -%) ->   (0/53k) -> write( -%)
0:01:30 INF [osm_pass2] -  nodes: [ 4.5M 100%    0/s ] 354M  ways: [ 326k 100%    0/s ] rels: [ 7.5k 100%    0/s ] features: [ 5.2M    0/s ] 268M  blocks: [  614 100%    0/s ]
    cpus: 0 gc:  0% mem: 1.5G/4.2G direct: 52M postGC: 796M hppc:  29M
    read( -%) ->   (0/103) -> process( -%  -%) ->   (0/53k) -> write( -%)
0:01:30 DEB [osm_pass2] - processed blocks:614 nodes:4,574,716 ways:326,532 relations:7,511
0:01:30 INF [osm_pass2] - Finished in 36s cpu:1m12s avg:2
0:01:30 INF [osm_pass2] -   read     1x(0% 0s wait:4s done:33s)
0:01:30 INF [osm_pass2] -   process  2x(76% 28s)
0:01:30 INF [osm_pass2] -   write    1x(1% 0.4s wait:36s)
0:01:30 INF [boundaries] - 
0:01:30 INF [boundaries] - Starting...
0:01:30 INF [boundaries] - Creating polygons for 1 boundaries
0:01:30 WAR [boundaries] - Unable to form closed polygon for OSM relation 148838 (likely missing edges)
0:01:30 INF [boundaries] - Finished creating 0 country polygons
0:01:30 INF [boundaries] - Finished in 0s cpu:0.1s avg:1.3
0:01:30 INF - Deleting node.db to make room for output file
0:01:30 INF [sort] - 
0:01:30 INF [sort] - Starting...
0:01:34 INF [sort] -  chunks: [   1 /   1 100% ] 268M 
    cpus: 1.2 gc:  0% mem: 2G/4.2G direct: 52M postGC: 796M
    ->     (0/4) -> worker( -%  -%)
0:01:34 INF [sort] -  chunks: [   1 /   1 100% ] 268M 
    cpus: 1.6 gc:  0% mem: 2G/4.2G direct: 52M postGC: 796M
    ->     (0/4) -> worker( -%  -%)
0:01:34 INF [sort] - Finished in 4s cpu:6s avg:1.2
0:01:34 INF [sort] -   worker  2x(41% 2s done:2s)
0:01:34 INF - read:1s write:1s sort:1s
0:01:35 INF [mbtiles] - 
0:01:35 INF [mbtiles] - Starting...
0:01:35 DEB [mbtiles:write] - Execute mbtiles: create table metadata (name text, value text);
0:01:35 DEB [mbtiles:write] - Execute mbtiles: create unique index name on metadata (name);
0:01:35 DEB [mbtiles:write] - Execute mbtiles: create table tiles (zoom_level integer, tile_column integer, tile_row, tile_data blob);
0:01:35 DEB [mbtiles:write] - Execute mbtiles: create unique index tile_index on tiles (zoom_level, tile_column, tile_row)
0:01:35 DEB [mbtiles:write] - Set mbtiles metadata: name=OpenMapTiles
0:01:35 DEB [mbtiles:write] - Set mbtiles metadata: format=pbf
0:01:35 DEB [mbtiles:write] - Set mbtiles metadata: description=A tileset showcasing all layers in OpenMapTiles. https://openmaptiles.org
0:01:35 DEB [mbtiles:write] - Set mbtiles metadata: attribution=<a href="https://www.openmaptiles.org/" target="_blank">&copy; OpenMapTiles</a> <a href="https://www.openstreetmap.org/copyright" target="_blank">&copy; OpenStreetMap contributors</a>
0:01:35 DEB [mbtiles:write] - Set mbtiles metadata: version=3.13.0
0:01:35 DEB [mbtiles:write] - Set mbtiles metadata: type=baselayer
0:01:35 DEB [mbtiles:write] - Set mbtiles metadata: bounds=-74.07,21.34,-17.84,43.55
0:01:35 DEB [mbtiles:write] - Set mbtiles metadata: center=-45.955,32.445,3
0:01:35 DEB [mbtiles:write] - Set mbtiles metadata: minzoom=0
0:01:35 DEB [mbtiles:write] - Set mbtiles metadata: maxzoom=14
0:01:35 DEB [mbtiles:write] - Set mbtiles metadata: json={"vector_layers":[{"id":"aerodrome_label","fields":{"name_int":"String","iata":"String","ele_ft":"Number","name_de":"String","name":"String","icao":"String","name:en":"String","class":"String","name_en":"String","name:latin":"String","ele":"Number"},"minzoom":10,"maxzoom":14},{"id":"aeroway","fields":{"ref":"String","class":"String"},"minzoom":10,"maxzoom":14},{"id":"boundary","fields":{"disputed":"Number","admin_level":"Number","maritime":"Number","disputed_name":"String"},"minzoom":0,"maxzoom":14},{"id":"building","fields":{"colour":"String","render_height":"Number","render_min_height":"Number"},"minzoom":13,"maxzoom":14},{"id":"housenumber","fields":{"housenumber":"String"},"minzoom":14,"maxzoom":14},{"id":"landcover","fields":{"subclass":"String","class":"String","_numpoints":"Number"},"minzoom":8,"maxzoom":14},{"id":"landuse","fields":{"class":"String"},"minzoom":4,"maxzoom":14},{"id":"mountain_peak","fields":{"name_int":"String","customary_ft":"Number","ele_ft":"Number","name_de":"String","name":"String","rank":"Number","class":"String","name_en":"String","name:latin":"String","ele":"Number"},"minzoom":7,"maxzoom":14},{"id":"park","fields":{"name_int":"String","name_de":"String","name":"String","name:en":"String","class":"String","name_en":"String","name:latin":"String"},"minzoom":6,"maxzoom":14},{"id":"place","fields":{"name:fy":"String","name_int":"String","capital":"Number","name:uk":"String","name:pl":"String","name:nl":"String","name:be":"String","name:ru":"String","name:ko":"String","name_de":"String","name":"String","rank":"Number","name:en":"String","name:eo":"String","class":"String","name:hu":"String","name:ta":"String","name:zh":"String","name_en":"String","name:latin":"String"},"minzoom":2,"maxzoom":14},{"id":"poi","fields":{"name_int":"String","level":"Number","name:nonlatin":"String","layer":"Number","name_de":"String","name":"String","subclass":"String","indoor":"Number","name:en":"String"
,"class":"String","name:zh":"String","name_en":"String","name:latin":"String"},"minzoom":12,"maxzoom":14},{"id":"transportation","fields":{"access":"String","brunnel":"String","expressway":"Number","surface":"String","bicycle":"String","level":"Number","ramp":"Number","mtb_scale":"String","toll":"Number","oneway":"Number","layer":"Number","network":"String","horse":"String","service":"String","subclass":"String","class":"String","foot":"String"},"minzoom":4,"maxzoom":14},{"id":"transportation_name","fields":{"name_int":"String","name:nonlatin":"String","route_4":"String","route_3":"String","route_2":"String","route_1":"String","layer":"Number","network":"String","ref":"String","name_de":"String","name":"String","subclass":"String","ref_length":"Number","class":"String","name_en":"String","name:latin":"String"},"minzoom":6,"maxzoom":14},{"id":"water","fields":{"intermittent":"Number","class":"String"},"minzoom":0,"maxzoom":14},{"id":"water_name","fields":{"name_int":"String","name:nonlatin":"String","name_de":"String","name":"String","intermittent":"Number","class":"String","name_en":"String","name:latin":"String"},"minzoom":9,"maxzoom":14},{"id":"waterway","fields":{"name_int":"String","brunnel":"String","name_de":"String","_relid":"Number","intermittent":"Number","name":"String","class":"String","name:latin":"String","name_en":"String"},"minzoom":4,"maxzoom":14}]}
0:01:36 INF [mbtiles:write] - Starting z0
0:01:36 INF [mbtiles:write] - Finished z0 in 0s cpu:0s avg:0, now starting z1
0:01:36 INF [mbtiles:write] - Finished z1 in 0s cpu:0s avg:0, now starting z2
0:01:36 INF [mbtiles:write] - Finished z2 in 0s cpu:0s avg:0, now starting z3
0:01:36 INF [mbtiles:write] - Finished z3 in 0s cpu:0s avg:35.9, now starting z4
0:01:36 INF [mbtiles:write] - Finished z4 in 0s cpu:0s avg:0, now starting z5
0:01:36 INF [mbtiles:write] - Finished z5 in 0s cpu:0s avg:0, now starting z6
0:01:36 INF [mbtiles:write] - Finished z6 in 0s cpu:0s avg:0, now starting z7
0:01:36 INF [mbtiles:write] - Finished z7 in 0s cpu:0s avg:1.9, now starting z8
0:01:37 INF [mbtiles:write] - Finished z8 in 1s cpu:2s avg:2, now starting z9
0:01:38 INF [mbtiles:write] - Finished z9 in 1s cpu:3s avg:2, now starting z10
0:01:38 INF [mbtiles:write] - Finished z10 in 0.1s cpu:0.2s avg:2, now starting z11
0:01:39 INF [mbtiles:write] - Finished z11 in 0.6s cpu:1s avg:2, now starting z12
0:01:41 INF [mbtiles:write] - Finished z12 in 3s cpu:5s avg:2, now starting z13
0:01:45 INF [mbtiles] -  features: [ 645k  12%  64k/s ] 268M  tiles: [ 290k  29k/s ] 46M  
    cpus: 2 gc:  5% mem: 1.3G/4.2G direct: 52M postGC: 841M
    read( 4%) -> (214/217) -> encode(56% 56%) -> (215/216) -> write( 9%)
    last tile: 13/2469/3048 (z13 4%) https://www.openstreetmap.org/#map=13/41.77131/-71.49902
0:01:55 INF [mbtiles:write] - Finished z13 in 13s cpu:26s avg:2, now starting z14
0:01:55 INF [mbtiles] -  features: [ 1.8M  34% 115k/s ] 268M  tiles: [   1M  79k/s ] 142M 
    cpus: 2 gc:  1% mem: 2.6G/4.2G direct: 52M postGC: 848M
    read( 5%) -> (155/217) -> encode(57% 54%) -> (215/216) -> write(21%)
    last tile: 14/4875/6639 (z14 2%) https://www.openstreetmap.org/#map=14/32.26856/-72.88330
0:02:04 INF [mbtiles:write] - Finished z14 in 10s cpu:14s avg:1.5
0:02:04 INF [mbtiles] -  features: [ 5.2M 100% 371k/s ] 268M  tiles: [ 4.1M 323k/s ] 514M 
    cpus: 1.5 gc:  1% mem: 3.1G/4.2G direct: 52M postGC: 909M
    read( -%) ->   (0/217) -> encode( -%  -%) ->   (0/216) -> write( -%)
    last tile: 14/7380/5985 (z14 100%) https://www.openstreetmap.org/#map=14/43.56447/-17.84180
0:02:04 DEB [mbtiles] - Tile stats:
0:02:04 DEB [mbtiles] - z0 avg:7.9k max:7.9k
0:02:04 DEB [mbtiles] - z1 avg:4k max:4k
0:02:04 DEB [mbtiles] - z2 avg:9.4k max:9.4k
0:02:04 DEB [mbtiles] - z3 avg:3.9k max:6.4k
0:02:04 DEB [mbtiles] - z4 avg:1.6k max:4.6k
0:02:04 DEB [mbtiles] - z5 avg:1.4k max:8.1k
0:02:04 DEB [mbtiles] - z6 avg:1.4k max:24k
0:02:04 DEB [mbtiles] - z7 avg:898 max:33k
0:02:04 DEB [mbtiles] - z8 avg:366 max:48k
0:02:04 DEB [mbtiles] - z9 avg:296 max:278k
0:02:04 DEB [mbtiles] - z10 avg:164 max:232k
0:02:04 DEB [mbtiles] - z11 avg:107 max:131k
0:02:04 DEB [mbtiles] - z12 avg:85 max:118k
0:02:04 DEB [mbtiles] - z13 avg:72 max:109k
0:02:04 DEB [mbtiles] - z14 avg:68 max:256k
0:02:04 DEB [mbtiles] - all avg:70 max:0
0:02:04 DEB [mbtiles] -  # features: 5,276,529
0:02:04 DEB [mbtiles] -     # tiles: 4,115,418
0:02:04 INF [mbtiles] - Finished in 29s cpu:53s avg:1.8
0:02:04 INF [mbtiles] -   read    1x(8% 2s wait:24s)
0:02:04 INF [mbtiles] -   encode  2x(45% 13s wait:7s)
0:02:04 INF [mbtiles] -   write   1x(37% 11s sys:1s wait:16s)
0:02:04 INF - Finished in 2m5s cpu:3m37s gc:4s avg:1.7
0:02:04 INF - FINISHED!
0:02:04 INF - 
0:02:04 INF - ----------------------------------------
0:02:04 INF - 	overall          2m5s cpu:3m37s gc:4s avg:1.7
0:02:04 INF - 	lake_centerlines 1s cpu:2s avg:1.8
0:02:04 INF - 	  read     1x(56% 0.6s)
0:02:04 INF - 	  process  2x(12% 0.1s)
0:02:04 INF - 	  write    1x(0% 0s wait:1s)
0:02:04 INF - 	water_polygons   27s cpu:48s gc:2s avg:1.8
0:02:04 INF - 	  read     1x(58% 16s wait:3s)
0:02:04 INF - 	  process  2x(28% 8s wait:12s)
0:02:04 INF - 	  write    1x(4% 1s wait:26s)
0:02:04 INF - 	natural_earth    11s cpu:17s avg:1.6
0:02:04 INF - 	  read     1x(89% 10s sys:1s)
0:02:04 INF - 	  process  2x(19% 2s wait:10s)
0:02:04 INF - 	  write    1x(0% 0s wait:10s)
0:02:04 INF - 	osm_pass1        3s cpu:6s avg:1.9
0:02:04 INF - 	  read     1x(4% 0.1s wait:3s)
0:02:04 INF - 	  process  1x(33% 1s wait:2s)
0:02:04 INF - 	  parse    1x(74% 3s)
0:02:04 INF - 	osm_pass2        36s cpu:1m12s avg:2
0:02:04 INF - 	  read     1x(0% 0s wait:4s done:33s)
0:02:04 INF - 	  process  2x(76% 28s)
0:02:04 INF - 	  write    1x(1% 0.4s wait:36s)
0:02:04 INF - 	boundaries       0s cpu:0.1s avg:1.3
0:02:04 INF - 	sort             4s cpu:6s avg:1.2
0:02:04 INF - 	  worker  2x(41% 2s done:2s)
0:02:04 INF - 	mbtiles          29s cpu:53s avg:1.8
0:02:04 INF - 	  read    1x(8% 2s wait:24s)
0:02:04 INF - 	  encode  2x(45% 13s wait:7s)
0:02:04 INF - 	  write   1x(37% 11s sys:1s wait:16s)
0:02:04 INF - ----------------------------------------
0:02:04 INF - 	features	268MB
0:02:04 INF - 	mbtiles	514MB
-rw-r--r-- 1 runner docker 55M Mar 17 07:47 run.jar

@@ -125,7 +126,9 @@ private ShapefileDataStore open(Path path) {
     } else {
       throw new IllegalArgumentException("Invalid shapefile input: " + path + " must be zip or shp");
     }
-    return new ShapefileDataStore(uri.toURL());
+    var store = new ShapefileDataStore(uri.toURL());
+    store.setCharset(Charset.forName("UTF8"));
Contributor

@msbarry msbarry Mar 16, 2022

Thanks for bringing this up! I noticed the same issue when trying to read the Natural Earth shapefile. One concern here is that the standard encoding for shapefiles is technically ISO-8859-1, so I'd hesitate to always use UTF-8. It looks like shapefiles that use a different encoding contain a ".cpg" file next to the .shp file with the name of the encoding as text, though, so maybe we could use that if it exists?
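
To make the failure mode concrete, here is a minimal standalone sketch (plain JDK, no GeoTools) of what happens when UTF-8 attribute bytes are decoded with the ISO-8859-1 default. The class name `MojibakeDemo` and the sample string are made up for illustration and are not part of planetiler.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; not part of planetiler or GeoTools.
final class MojibakeDemo {
  // Encode text with one charset and decode the bytes with another,
  // mimicking a reader that assumes the wrong attribute encoding.
  static String reinterpret(String text, Charset writtenAs, Charset readAs) {
    return new String(text.getBytes(writtenAs), readAs);
  }

  public static void main(String[] args) {
    String original = "\u00E4\u00F6\u00FC"; // a-umlaut, o-umlaut, u-umlaut
    // Each umlaut is two bytes in UTF-8, so a Latin-1 reader yields two characters per letter.
    System.out.println(reinterpret(original, StandardCharsets.UTF_8, StandardCharsets.ISO_8859_1));
    // Decoding with the charset the data was written in round-trips cleanly.
    System.out.println(reinterpret(original, StandardCharsets.UTF_8, StandardCharsets.UTF_8));
  }
}
```

The first call turns three letters into six characters, which is the kind of "strange letters" output the pull request describes; the second returns the original string unchanged.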

Contributor Author

Interesting point with the .cpg. QGIS seems to get the encoding right without further configuration:
image

Contributor

Hmm, you're right... the shapefile reader should be doing this automatically. From the GeoTools release page, it looks like starting with version 26 (we're on 26.3) they attempt to use the .cpg file to determine the encoding. I wonder why that's not working here?

Contributor Author

https://docs.geotools.org/stable/javadocs/org/geotools/data/shapefile/ShapefileDataStore.html#isTryCPGFile-- returns false, i.e., by default it does not access the .cpg file. Let's check if I can change the behavior with setTryCPGFile()...

@msbarry
Contributor

msbarry commented Mar 16, 2022

Thanks for making the change! Would it be possible to add a minimal test case that verifies a UTF-8 string from a shapefile is decoded properly?

@wipfli
Contributor Author

wipfli commented Mar 16, 2022

Works without hardcoding utf-8 now, by using the .cpg file. Thanks for the hint. Should I still look into a unit test or do we trust the library?

@msbarry
Contributor

msbarry commented Mar 16, 2022

> Works without hardcoding utf-8 now, by using the .cpg file. Thanks for the hint. Should I still look into a unit test or do we trust the library?

The unit test would be helpful! I'm hoping to switch to Apache SIS at some point, so it would be good to make sure it has the same handling. There is a tiny shapefile.zip in here. I haven't worked much with shapefiles; do you have an easy way of modifying the shapefile to have a .cpg file and adding a test UTF-8 attribute to one of the points?

Also, it might be useful to have an extra option on ShapefileReader to override the encoding just like you can override the projection. What do you think?

@wipfli
Contributor Author

wipfli commented Mar 17, 2022

I edited planetiler-core/src/test/resources/shapefile.zip in QGIS:

  1. Changed Van Dorn Street to Van Dörn Street and saved in the non-default encoding UTF-8.
  2. Added a shapefile.cpg file which is a text file containing the string UTF8

I updated the test file ShapeFileReaderTest.java to check if it finds the string Van Dörn Street. With the patch in this pull request, the test passes. Without, it fails.

@wipfli
Contributor Author

wipfli commented Mar 17, 2022

Are you suggesting to overload .addShapefileSource("EPSG:2056", ...) to something like .addShapefileSource("UTF8", "EPSG:2056", ...)?

@msbarry
Contributor

msbarry commented Mar 17, 2022

> I edited planetiler-core/src/test/resources/shapefile.zip in QGIS:
>
>   1. Changed Van Dorn Street to Van Dörn Street and saved in the non-default encoding UTF-8.
>   2. Added a shapefile.cpg file which is a text file containing the string UTF8
>
> I updated the test file ShapeFileReaderTest.java to check if it finds the string Van Dörn Street. With the patch in this pull request, the test passes. Without, it fails.

Awesome, looks great! Thank you!

> Are you suggesting to overload .addShapefileSource("EPSG:2056", ...) to something like .addShapefileSource("UTF8", "EPSG:2056", ...)?

Yeah, I could go either way on this. Since there are now 2 optional parameters (source encoding and projection), there could be 4 signatures based on which ones you want to set, and 8 if we add a third... I should refactor this to allow some kind of parameter object to make these kinds of settings easier to add. Maybe a compromise would be to just add a character encoding argument to the existing method that includes a projection, and ignore it if it's null? Also, I'd be fine doing nothing if adding the .cpg file is an OK workaround.
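
As a rough illustration of the parameter-object idea, the sketch below bundles the two optional settings into one immutable value so that a single method signature can accept any combination. `ShapefileOptions` and its methods are hypothetical names chosen for this example; planetiler's actual API may look quite different.

```java
import java.nio.charset.Charset;
import java.util.Optional;

// Hypothetical parameter object; not planetiler's real API.
final class ShapefileOptions {
  private final String projection; // e.g. "EPSG:2056"; null means "use what the file declares"
  private final Charset charset;   // null means "let the reader decide, e.g. from a .cpg file"

  private ShapefileOptions(String projection, Charset charset) {
    this.projection = projection;
    this.charset = charset;
  }

  // Start with nothing overridden.
  static ShapefileOptions defaults() {
    return new ShapefileOptions(null, null);
  }

  // Each "with" method returns a copy, so instances can be shared safely.
  ShapefileOptions withProjection(String projection) {
    return new ShapefileOptions(projection, this.charset);
  }

  ShapefileOptions withCharset(Charset charset) {
    return new ShapefileOptions(this.projection, charset);
  }

  Optional<String> projection() {
    return Optional.ofNullable(projection);
  }

  Optional<Charset> charset() {
    return Optional.ofNullable(charset);
  }
}
```

A caller would write something like `ShapefileOptions.defaults().withProjection("EPSG:2056").withCharset(StandardCharsets.UTF_8)`. Adding a third setting then means one more field and one more `with` method rather than doubling the number of overloads.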

@wipfli
Contributor Author

wipfli commented Mar 17, 2022

I think the .cpg file is the simplest solution. It is a text file which contains the encoding, e.g., UTF8, as plain text. This is also already documented in the shapefile standard...
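
For readers unfamiliar with the sidecar convention, the following sketch shows roughly what resolving an encoding from a .cpg file involves, using only the JDK. It is not GeoTools' implementation: the class `CpgCharset`, the ISO-8859-1 fallback (taken from the review comment above), and the handling of unrecognized names are all assumptions of this example. It also ignores zipped inputs and upper-case `.CPG` file names.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper; a sketch of the idea, not GeoTools' implementation.
final class CpgCharset {
  // Used when no usable .cpg file is present (assumed default for this sketch).
  static final Charset FALLBACK = StandardCharsets.ISO_8859_1;

  // Map the text content of a .cpg file to a Charset, or FALLBACK if Java does not recognize the name.
  static Charset parse(String cpgContents) {
    String name = cpgContents == null ? "" : cpgContents.trim();
    if (name.isEmpty()) {
      return FALLBACK;
    }
    try {
      return Charset.forName(name); // "UTF8" is accepted as an alias of UTF-8
    } catch (IllegalArgumentException e) { // covers illegal and unsupported charset names
      return FALLBACK;
    }
  }

  // Look for "<name>.cpg" next to "<name>.shp" and resolve the charset it names.
  static Charset forShapefile(Path shp) {
    String file = shp.getFileName().toString();
    int dot = file.lastIndexOf('.');
    String base = dot < 0 ? file : file.substring(0, dot);
    Path cpg = shp.resolveSibling(base + ".cpg");
    if (!Files.isRegularFile(cpg)) {
      return FALLBACK;
    }
    try {
      return parse(new String(Files.readAllBytes(cpg), StandardCharsets.US_ASCII));
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }
}
```

With a `shapefile.cpg` containing `UTF8`, `CpgCharset.forShapefile(Path.of("shapefile.shp"))` would resolve to UTF-8; without the sidecar it falls back to ISO-8859-1.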

@msbarry
Contributor

msbarry commented Mar 17, 2022

Sounds good to me! Thanks for making this change.

@msbarry msbarry merged commit 1cfcca2 into onthegomap:main Mar 17, 2022