Skip to content
This repository has been archived by the owner on Jan 12, 2024. It is now read-only.

Performance improvements #483

Open
nebulon42 opened this issue Jul 17, 2017 · 17 comments
Open

Performance improvements #483

nebulon42 opened this issue Jul 17, 2017 · 17 comments

Comments

@nebulon42
Copy link
Collaborator

This issue is meant to collect discussions about possible performance optimizations. More detailed sub tasks should be opened once identified.

@nebulon42
Copy link
Collaborator Author

nebulon42 commented Jul 17, 2017

About profiling

There is a simple built-in Node.js profiler (https://nodejs.org/en/docs/guides/simple-profiling/). Let's say we want to profile how carto processes osm-carto. We then run node --prof bin/carto -f test.xml path/to/openstreetmap-carto/project.mml from latest master we get a isolate-HASH-v8.log file in the working directory. We then have to post-process said file with node --prof-process isolate-HASH-v8.log > results.txt.

Then I got the following:

[JavaScript]:
   ticks  total  nonlib   name
   1023    6.2%    6.3%  LazyCompile: *[Symbol.replace] native regexp.js:481:31
    643    3.9%    3.9%  LazyCompile: *tree.Filterset.addable path/to/carto/lib/carto/tree/filterset.js:105:44
    324    2.0%    2.0%  Stub: CEntryStub
    305    1.8%    1.9%  Builtin: CallFunction_ReceiverIsAny
    289    1.7%    1.8%  LazyCompile: *RegExpReplace native regexp.js:356:23
    283    1.7%    1.7%  Builtin: Call_ReceiverIsAny
    265    1.6%    1.6%  Stub: StringAddStub
    240    1.5%    1.5%  LazyCompile: *tree.Filterset.cloneWith path/to/carto/lib/carto/tree/filterset.js:59:46
    231    1.4%    1.4%  Stub: FastCloneRegExpStub
    225    1.4%    1.4%  LazyCompile: *replace native string.js:57:23
    212    1.3%    1.3%  Builtin: ToString
    211    1.3%    1.3%  KeyedLoadIC: A keyed load IC from the snapshot {2}
    189    1.1%    1.2%  Builtin: RegExpPrototypeExec
...
 [C++]:
   ticks  total  nonlib   name
    635    3.8%    3.9%  v8::internal::StringReplaceGlobalRegExpWithString(v8::internal::Isolate*, v8::internal::Handle<v8::internal::String>, v8::internal::Handle<v8::internal::JSRegExp>, v8::internal::Handle<v8::internal::String>, v8::internal::Handle<v8::internal::JSObject>)
    550    3.3%    3.4%  v8::internal::Zone::DeleteAll()
    531    3.2%    3.2%  v8::internal::Runtime_StringReplaceGlobalRegExpWithString(int, v8::internal::Object**, v8::internal::Isolate*)
    394    2.4%    2.4%  v8::internal::String::GetFlatContent()
    352    2.1%    2.2%  v8::internal::Object* v8::internal::StringReplaceGlobalAtomRegExpWithString<v8::internal::SeqOneByteString>(v8::internal::Isolate*, v8::internal::Handle<v8::internal::String>, v8::internal::Handle<v8::internal::JSRegExp>, v8::internal::Handle<v8::internal::String>, v8::internal::Handle<v8::internal::JSObject>)
    339    2.1%    2.1%  v8::internal::CompiledReplacement::Compile(v8::internal::Handle<v8::internal::String>, int, int)
    270    1.6%    1.7%  v8::internal::FindStringIndicesDispatch(v8::internal::Isolate*, v8::internal::String*, v8::internal::String*, v8::internal::ZoneList<int>*, unsigned int, v8::internal::Zone*) [clone .constprop.128]
    270    1.6%    1.7%  memchr
    241    1.5%    1.5%  v8::internal::String::ComputeAndSetHash()
    208    1.3%    1.3%  void v8::internal::LookupIterator::Start<false>()
    185    1.1%    1.1%  write
    184    1.1%    1.1%  v8::internal::Zone::New(unsigned long)
    183    1.1%    1.1%  v8::internal::Runtime_KeyedGetProperty(int, v8::internal::Object**, v8::internal::Isolate*)
    165    1.0%    1.0%  v8::internal::StringTable::LookupString(v8::internal::Isolate*, v8::internal::Handle<v8::internal::String>)
    162    1.0%    1.0%  v8::internal::FindTwoByteStringIndices(v8::internal::Vector<unsigned short const>, unsigned short, v8::internal::ZoneList<int>*, unsigned int, v8::internal::Zone*)
...
 [C++ entry points]:
   ticks    cpp   total   name
   3674   44.1%   22.2%  v8::internal::Runtime_StringReplaceGlobalRegExpWithString(int, v8::internal::Object**, v8::internal::Isolate*)
   1607   19.3%    9.7%  v8::internal::Runtime_KeyedGetProperty(int, v8::internal::Object**, v8::internal::Isolate*)
    307    3.7%    1.9%  v8::internal::Runtime_RegExpExec(int, v8::internal::Object**, v8::internal::Isolate*)
    252    3.0%    1.5%  v8::internal::Runtime_ForInEnumerate(int, v8::internal::Object**, v8::internal::Isolate*)
    230    2.8%    1.4%  v8::internal::Runtime_StringAdd(int, v8::internal::Object**, v8::internal::Isolate*)
    194    2.3%    1.2%  v8::internal::Runtime_StoreIC_Miss(int, v8::internal::Object**, v8::internal::Isolate*)
    179    2.2%    1.1%  v8::internal::Runtime_Equal(int, v8::internal::Object**, v8::internal::Isolate*)
    179    2.2%    1.1%  v8::internal::Runtime_CompileLazy(int, v8::internal::Object**, v8::internal::Isolate*)
    169    2.0%    1.0%  v8::internal::Builtin_HandleApiCall(int, v8::internal::Object**, v8::internal::Isolate*)
    155    1.9%    0.9%  v8::internal::Runtime_KeyedStoreIC_Miss(int, v8::internal::Object**, v8::internal::Isolate*)
    139    1.7%    0.8%  v8::internal::Runtime_CompileOptimized_Concurrent(int, v8::internal::Object**, v8::internal::Isolate*)
    124    1.5%    0.8%  v8::internal::Runtime_SetProperty(int, v8::internal::Object**, v8::internal::Isolate*)
    110    1.3%    0.7%  v8::internal::Builtin_FunctionConstructor(int, v8::internal::Object**, v8::internal::Isolate*)
...
 [Bottom up (heavy) profile]:
  Note: percentage shows a share of a particular caller in the total
  amount of its parent calls.
  Callers occupying less than 2.0% are not shown.

   ticks parent  name
   1051    6.4%  UNKNOWN

   1023    6.2%  LazyCompile: *[Symbol.replace] native regexp.js:481:31
   1023  100.0%    LazyCompile: *replace native string.js:57:23
    987   96.5%      LazyCompile: *tree.Filterset.addable path/to/carto/lib/carto/tree/filterset.js:105:44
    987  100.0%        LazyCompile: *tree.Filterset.cloneWith path/to/carto/lib/carto/tree/filterset.js:59:46
    647   65.6%          LazyCompile: *addRules path/to/carto/lib/carto/renderer.js:473:18
    641   99.1%            LazyCompile: *inheritDefinitions path/to/carto/lib/carto/renderer.js:537:28
    336   34.0%          LazyCompile: *foldStyle path/to/carto/lib/carto/renderer.js:611:19
    336  100.0%            LazyCompile: ~render path/to/carto/lib/carto/renderer.js:160:50

    643    3.9%  LazyCompile: *tree.Filterset.addable path/to/carto/lib/carto/tree/filterset.js:105:44
    643  100.0%    LazyCompile: *tree.Filterset.cloneWith path/to/carto/lib/carto/tree/filterset.js:59:46
    439   68.3%      LazyCompile: *addRules path/to/carto/lib/carto/renderer.js:473:18
    426   97.0%        LazyCompile: *inheritDefinitions path/to/carto/lib/carto/renderer.js:537:28
    426  100.0%          LazyCompile: ~render path/to/carto/lib/carto/renderer.js:160:50
    426  100.0%            LazyCompile: ~compile path/to/carto/bin/carto:80:17
     13    3.0%        LazyCompile: ~inheritDefinitions path/to/carto/lib/carto/renderer.js:537:28
     13  100.0%          LazyCompile: ~render path/to/carto/lib/carto/renderer.js:160:50
     13  100.0%            LazyCompile: ~compile path/to/carto/bin/carto:80:17
    202   31.4%      LazyCompile: *foldStyle path/to/carto/lib/carto/renderer.js:611:19
    202  100.0%        LazyCompile: ~render path/to/carto/lib/carto/renderer.js:160:50
    202  100.0%          LazyCompile: ~compile path/to/carto/bin/carto:80:17
    202  100.0%            LazyCompile: ~load path/to/carto/lib/carto/mml.js:19:41
...

(Note: The last time I did this which I didn't document I remember getting some 12% in Filterset.addable or the like)

@StyXman
Copy link

StyXman commented Jul 25, 2017

I'm baffled. Befuddled. Bewildered.

Let's start with a recent profile:

 [Summary]:
   ticks  total  nonlib   name
  11097   60.1%   65.2%  C++
   5911   32.0%   34.7%  JavaScript
   1453    7.9%          Shared libraries
    163    0.9%    1.0%  GC
      7    0.0%          Unaccounted

 [C++ entry points]:
   ticks    cpp   total   name
   3493   28.5%   18.9%  v8::internal::Runtime_StringReplaceGlobalRegExpWithString(int, v8::internal::Object**, v8::internal::Isolate*)
   2054   16.7%   11.1%  v8::internal::Runtime_Apply(int, v8::internal::Object**, v8::internal::Isolate*)
   1900   15.5%   10.3%  v8::internal::Runtime_KeyedGetProperty(int, v8::internal::Object**, v8::internal::Isolate*)
    892    7.3%    4.8%  v8::internal::Runtime_BoundFunctionGetBindings(int, v8::internal::Object**, v8::internal::Isolate*)

 [JavaScript]:
   ticks  total  nonlib   name
    484    2.6%    2.8%  LazyCompile: *replace native string.js:146:23
    399    2.2%    2.3%  Stub: CEntryStub
    295    1.6%    1.7%  Stub: StringAddStub_CheckNone_NotTenured
    269    1.5%    1.6%  LazyCompile: ~InnerArrayFilter native array.js:891:26
    209    1.1%    1.2%  LazyCompile: ~<anonymous> native v8natives.js:1199:16
    205    1.1%    1.2%  Builtin: ArgumentsAdaptorTrampoline
    163    0.9%    1.0%  LazyCompile: ~tree.Filterset.cloneWith /var/lib/data/mdione/src/system/osm/carto/lib/carto/tree/filterset.js:59:46

 [Shared libraries]:
   ticks  total  nonlib   name
   1241    6.7%          /usr/bin/nodejs
    205    1.1%          /lib/x86_64-linux-gnu/libc-2.24.so
      3    0.0%          /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.24
      2    0.0%          [vdso]
      2    0.0%          /lib/x86_64-linux-gnu/libm-2.24.so

 [Bottom up (heavy) profile]:
  Note: percentage shows a share of a particular caller in the total
  amount of its parent calls.
  Callers occupying less than 2.0% are not shown.

   ticks parent  name
   1241    6.7%  /usr/bin/nodejs
    537   43.3%    v8::internal::Runtime_Apply(int, v8::internal::Object**, v8::internal::Isolate*)
    534   99.4%      LazyCompile: ~<anonymous> native v8natives.js:1199:16
    534  100.0%        LazyCompile: ~InnerArrayFilter native array.js:891:26
    534  100.0%          LazyCompile: *filter native array.js:915:21
    464   86.9%            LazyCompile: *tree.Variable.ev /var/lib/data/mdione/src/system/osm/carto/lib/carto/tree/variable.js:16:17
     55   10.3%            LazyCompile: *tree.Operation.ev /var/lib/data/mdione/src/system/osm/carto/lib/carto/tree/operation.js:16:39
     15    2.8%            LazyCompile: ~tree.Variable.ev /var/lib/data/mdione/src/system/osm/carto/lib/carto/tree/variable.js:16:17
    355   28.6%    v8::internal::Runtime_StringReplaceGlobalRegExpWithString(int, v8::internal::Object**, v8::internal::Isolate*)
    355  100.0%      LazyCompile: *replace native string.js:146:23
     55   15.5%        LazyCompile: *tree.Filterset.addable /var/lib/data/mdione/src/system/osm/carto/lib/carto/tree/filterset.js:105:44
     55  100.0%          LazyCompile: *tree.Filterset.cloneWith /var/lib/data/mdione/src/system/osm/carto/lib/carto/tree/filterset.js:59:46
     41   74.5%            LazyCompile: *addRules /var/lib/data/mdione/src/system/osm/carto/lib/carto/renderer.js:473:18
     12   21.8%            LazyCompile: *foldStyle /var/lib/data/mdione/src/system/osm/carto/lib/carto/renderer.js:611:19
      2    3.6%            LazyCompile: ~foldStyle /var/lib/data/mdione/src/system/osm/carto/lib/carto/renderer.js:611:19
     36   10.1%        Handler: 67092480#__default__#line-width {5}
     31   86.1%          LazyCompile: ~tree.Filterset.cloneWith /var/lib/data/mdione/src/system/osm/carto/lib/carto/tree/filterset.js:59:46
     22   71.0%            LazyCompile: ~addRules /var/lib/data/mdione/src/system/osm/carto/lib/carto/renderer.js:473:18
      9   29.0%            LazyCompile: *foldStyle /var/lib/data/mdione/src/system/osm/carto/lib/carto/renderer.js:611:19
      5   13.9%          LazyCompile: *tree.Filterset.cloneWith /var/lib/data/mdione/src/system/osm/carto/lib/carto/tree/filterset.js:59:46
      3   60.0%            LazyCompile: *foldStyle /var/lib/data/mdione/src/system/osm/carto/lib/carto/renderer.js:611:19
      2   40.0%            LazyCompile: ~addRules /var/lib/data/mdione/src/system/osm/carto/lib/carto/renderer.js:473:18
     36   10.1%        Handler: 67043328#__default__#line-color
     35   97.2%          LazyCompile: ~tree.Filterset.cloneWith /var/lib/data/mdione/src/system/osm/carto/lib/carto/tree/filterset.js:59:46
     24   68.6%            LazyCompile: ~addRules /var/lib/data/mdione/src/system/osm/carto/lib/carto/renderer.js:473:18
     11   31.4%            LazyCompile: *foldStyle /var/lib/data/mdione/src/system/osm/carto/lib/carto/renderer.js:611:19
      1    2.8%          LazyCompile: *tree.Filterset.cloneWith /var/lib/data/mdione/src/system/osm/carto/lib/carto/tree/filterset.js:59:46
      1  100.0%            LazyCompile: *foldStyle /var/lib/data/mdione/src/system/osm/carto/lib/carto/renderer.js:611:19
     32    9.0%        Handler: 67107840#__default__#line-color
     28   87.5%          LazyCompile: ~tree.Filterset.cloneWith /var/lib/data/mdione/src/system/osm/carto/lib/carto/tree/filterset.js:59:46
     18   64.3%            LazyCompile: ~addRules /var/lib/data/mdione/src/system/osm/carto/lib/carto/renderer.js:473:18
     10   35.7%            LazyCompile: *foldStyle /var/lib/data/mdione/src/system/osm/carto/lib/carto/renderer.js:611:19
      4   12.5%          LazyCompile: *tree.Filterset.cloneWith /var/lib/data/mdione/src/system/osm/carto/lib/carto/tree/filterset.js:59:46
      2   50.0%            LazyCompile: ~addRules /var/lib/data/mdione/src/system/osm/carto/lib/carto/renderer.js:473:18
      2   50.0%            LazyCompile: *foldStyle /var/lib/data/mdione/src/system/osm/carto/lib/carto/renderer.js:611:19
     30    8.5%        Handler: 67107840#__default__#line-width
     25   83.3%          LazyCompile: ~tree.Filterset.cloneWith /var/lib/data/mdione/src/system/osm/carto/lib/carto/tree/filterset.js:59:46
     15   60.0%            LazyCompile: *foldStyle /var/lib/data/mdione/src/system/osm/carto/lib/carto/renderer.js:611:19
     10   40.0%            LazyCompile: ~addRules /var/lib/data/mdione/src/system/osm/carto/lib/carto/renderer.js:473:18
      5   16.7%          LazyCompile: *tree.Filterset.cloneWith /var/lib/data/mdione/src/system/osm/carto/lib/carto/tree/filterset.js:59:46
      5  100.0%            LazyCompile: ~addRules /var/lib/data/mdione/src/system/osm/carto/lib/carto/renderer.js:473:18
     29    8.2%        Handler: 67107840#__default__#line-color {1}
     25   86.2%          LazyCompile: ~tree.Filterset.cloneWith /var/lib/data/mdione/src/system/osm/carto/lib/carto/tree/filterset.js:59:46
     19   76.0%            LazyCompile: ~addRules /var/lib/data/mdione/src/system/osm/carto/lib/carto/renderer.js:473:18
      6   24.0%            LazyCompile: *foldStyle /var/lib/data/mdione/src/system/osm/carto/lib/carto/renderer.js:611:19
      4   13.8%          LazyCompile: *tree.Filterset.cloneWith /var/lib/data/mdione/src/system/osm/carto/lib/carto/tree/filterset.js:59:46
      3   75.0%            LazyCompile: ~addRules /var/lib/data/mdione/src/system/osm/carto/lib/carto/renderer.js:473:18
      1   25.0%            LazyCompile: *foldStyle /var/lib/data/mdione/src/system/osm/carto/lib/carto/renderer.js:611:19
     28    7.9%        Handler: 67107840#__default__#line-cap
     25   89.3%          LazyCompile: ~tree.Filterset.cloneWith /var/lib/data/mdione/src/system/osm/carto/lib/carto/tree/filterset.js:59:46
     17   68.0%            LazyCompile: ~addRules /var/lib/data/mdione/src/system/osm/carto/lib/carto/renderer.js:473:18
      8   32.0%            LazyCompile: *foldStyle /var/lib/data/mdione/src/system/osm/carto/lib/carto/renderer.js:611:19
      3   10.7%          LazyCompile: *tree.Filterset.cloneWith /var/lib/data/mdione/src/system/osm/carto/lib/carto/tree/filterset.js:59:46
      2   66.7%            LazyCompile: ~addRules /var/lib/data/mdione/src/system/osm/carto/lib/carto/renderer.js:473:18
      1   33.3%            LazyCompile: *foldStyle /var/lib/data/mdione/src/system/osm/carto/lib/carto/renderer.js:611:19
     24    6.8%        Handler: 67076096#access#line-dasharray
     20   83.3%          LazyCompile: ~tree.Filterset.cloneWith /var/lib/data/mdione/src/system/osm/carto/lib/carto/tree/filterset.js:59:46
     10   50.0%            LazyCompile: ~addRules /var/lib/data/mdione/src/system/osm/carto/lib/carto/renderer.js:473:18
     10   50.0%            LazyCompile: *foldStyle /var/lib/data/mdione/src/system/osm/carto/lib/carto/renderer.js:611:19
      4   16.7%          LazyCompile: *tree.Filterset.cloneWith /var/lib/data/mdione/src/system/osm/carto/lib/carto/tree/filterset.js:59:46
      4  100.0%            LazyCompile: ~addRules /var/lib/data/mdione/src/system/osm/carto/lib/carto/renderer.js:473:18
     23    6.5%        Handler: 67106816#__default__#line-width
     19   82.6%          LazyCompile: ~tree.Filterset.cloneWith /var/lib/data/mdione/src/system/osm/carto/lib/carto/tree/filterset.js:59:46
     10   52.6%            LazyCompile: ~addRules /var/lib/data/mdione/src/system/osm/carto/lib/carto/renderer.js:473:18
      9   47.4%            LazyCompile: *foldStyle /var/lib/data/mdione/src/system/osm/carto/lib/carto/renderer.js:611:19
      4   17.4%          LazyCompile: *tree.Filterset.cloneWith /var/lib/data/mdione/src/system/osm/carto/lib/carto/tree/filterset.js:59:46
      3   75.0%            LazyCompile: ~addRules /var/lib/data/mdione/src/system/osm/carto/lib/carto/renderer.js:473:18
      1   25.0%            LazyCompile: *foldStyle /var/lib/data/mdione/src/system/osm/carto/lib/carto/renderer.js:611:19
     22    6.2%        Handler: 67092480#__default__#line-color
     17   77.3%          LazyCompile: ~tree.Filterset.cloneWith /var/lib/data/mdione/src/system/osm/carto/lib/carto/tree/filterset.js:59:46
     12   70.6%            LazyCompile: ~addRules /var/lib/data/mdione/src/system/osm/carto/lib/carto/renderer.js:473:18
      5   29.4%            LazyCompile: *foldStyle /var/lib/data/mdione/src/system/osm/carto/lib/carto/renderer.js:611:19
      5   22.7%          LazyCompile: *tree.Filterset.cloneWith /var/lib/data/mdione/src/system/osm/carto/lib/carto/tree/filterset.js:59:46
      3   60.0%            LazyCompile: ~addRules /var/lib/data/mdione/src/system/osm/carto/lib/carto/renderer.js:473:18
      2   40.0%            LazyCompile: *foldStyle /var/lib/data/mdione/src/system/osm/carto/lib/carto/renderer.js:611:19
     21    5.9%        Handler: 67076096#access#line-color {1}
     18   85.7%          LazyCompile: ~tree.Filterset.cloneWith /var/lib/data/mdione/src/system/osm/carto/lib/carto/tree/filterset.js:59:46
      9   50.0%            LazyCompile: ~addRules /var/lib/data/mdione/src/system/osm/carto/lib/carto/renderer.js:473:18
      9   50.0%            LazyCompile: *foldStyle /var/lib/data/mdione/src/system/osm/carto/lib/carto/renderer.js:611:19
      3   14.3%          LazyCompile: *tree.Filterset.cloneWith /var/lib/data/mdione/src/system/osm/carto/lib/carto/tree/filterset.js:59:46
      2   66.7%            LazyCompile: ~addRules /var/lib/data/mdione/src/system/osm/carto/lib/carto/renderer.js:473:18
      1   33.3%            LazyCompile: *foldStyle /var/lib/data/mdione/src/system/osm/carto/lib/carto/renderer.js:611:19
     12    3.4%        LazyCompile: *tree.Quoted.toString /var/lib/data/mdione/src/system/osm/carto/lib/carto/tree/quoted.js:10:23
      7   58.3%          LazyCompile: *DefaultString native runtime.js:657:23
      7  100.0%            LazyCompile: *ConvertToString native array.js:159:25
      2   16.7%          LazyCompile: ~<anonymous> /var/lib/data/mdione/src/system/osm/carto/lib/carto/tree/expression.js:23:48
      2  100.0%            LazyCompile: *InnerArrayMap native array.js:1001:23
      2   16.7%          LazyCompile: *DefaultNumber native runtime.js:642:23
      2  100.0%            LazyCompile: *ToPrimitive native runtime.js:528:21
      1    8.3%          LazyCompile: ~tree.Filter.toObject /var/lib/data/mdione/src/system/osm/carto/lib/carto/tree/filter.js:32:42
      1  100.0%            LazyCompile: ~tree.Filterset.toObject /var/lib/data/mdione/src/system/osm/carto/lib/carto/tree/filterset.js:8:45
    109    8.8%    v8::internal::Runtime_KeyedGetProperty(int, v8::internal::Object**, v8::internal::Isolate*)
     70   64.2%      Handler: 67076096#access#line-color
     64   91.4%        LazyCompile: ~tree.Filterset.cloneWith /var/lib/data/mdione/src/system/osm/carto/lib/carto/tree/filterset.js:59:46
     44   68.8%          LazyCompile: ~addRules /var/lib/data/mdione/src/system/osm/carto/lib/carto/renderer.js:473:18
     44  100.0%            LazyCompile: *inheritDefinitions /var/lib/data/mdione/src/system/osm/carto/lib/carto/renderer.js:537:28
     20   31.3%          LazyCompile: *foldStyle /var/lib/data/mdione/src/system/osm/carto/lib/carto/renderer.js:611:19
     20  100.0%            LazyCompile: render /var/lib/data/mdione/src/system/osm/carto/lib/carto/renderer.js:160:50
      6    8.6%        LazyCompile: *tree.Filterset.cloneWith /var/lib/data/mdione/src/system/osm/carto/lib/carto/tree/filterset.js:59:46
      6  100.0%          LazyCompile: ~addRules /var/lib/data/mdione/src/system/osm/carto/lib/carto/renderer.js:473:18
      6  100.0%            LazyCompile: *inheritDefinitions /var/lib/data/mdione/src/system/osm/carto/lib/carto/renderer.js:537:28
     23   21.1%      Handler: 67076096#access#line-join
     21   91.3%        LazyCompile: ~tree.Filterset.cloneWith /var/lib/data/mdione/src/system/osm/carto/lib/carto/tree/filterset.js:59:46
     13   61.9%          LazyCompile: ~addRules /var/lib/data/mdione/src/system/osm/carto/lib/carto/renderer.js:473:18
     13  100.0%            LazyCompile: *inheritDefinitions /var/lib/data/mdione/src/system/osm/carto/lib/carto/renderer.js:537:28
      8   38.1%          LazyCompile: *foldStyle /var/lib/data/mdione/src/system/osm/carto/lib/carto/renderer.js:611:19
      8  100.0%            LazyCompile: render /var/lib/data/mdione/src/system/osm/carto/lib/carto/renderer.js:160:50
      2    8.7%        LazyCompile: *tree.Filterset.cloneWith /var/lib/data/mdione/src/system/osm/carto/lib/carto/tree/filterset.js:59:46
      2  100.0%          LazyCompile: ~addRules /var/lib/data/mdione/src/system/osm/carto/lib/carto/renderer.js:473:18
      2  100.0%            LazyCompile: *inheritDefinitions /var/lib/data/mdione/src/system/osm/carto/lib/carto/renderer.js:537:28
     11   10.1%      LazyCompile: *tree.Filterset.addable /var/lib/data/mdione/src/system/osm/carto/lib/carto/tree/filterset.js:105:44
     11  100.0%        LazyCompile: *tree.Filterset.cloneWith /var/lib/data/mdione/src/system/osm/carto/lib/carto/tree/filterset.js:59:46
      9   81.8%          LazyCompile: *addRules /var/lib/data/mdione/src/system/osm/carto/lib/carto/renderer.js:473:18
      6   66.7%            LazyCompile: *inheritDefinitions /var/lib/data/mdione/src/system/osm/carto/lib/carto/renderer.js:537:28
      3   33.3%            LazyCompile: ~inheritDefinitions /var/lib/data/mdione/src/system/osm/carto/lib/carto/renderer.js:537:28
      1    9.1%          LazyCompile: ~foldStyle /var/lib/data/mdione/src/system/osm/carto/lib/carto/renderer.js:611:19
      1  100.0%            LazyCompile: render /var/lib/data/mdione/src/system/osm/carto/lib/carto/renderer.js:160:50
      1    9.1%          LazyCompile: *tree.Ruleset.flatten /var/lib/data/mdione/src/system/osm/carto/lib/carto/tree/ruleset.js:94:22
      1  100.0%            LazyCompile: *tree.Ruleset.flatten /var/lib/data/mdione/src/system/osm/carto/lib/carto/tree/ruleset.js:94:22

So the plan was to attack the calls to v8::internal::Runtime_StringReplaceGlobalRegExpWithString(). From the bottom up, I understand that most of these calls happen in lib/carto/tree/filterset.js:105:44, tree.Filterset.addable(). Notice that the line number, 105, is the line of where the definition of addable() starts. I can't find any reference about what the 44 means.

But there seem to be no such call in that function; the closest I could find is the regexp in line 110, so I thought: maybe that regexp is taking too long. I replaced the code like this:

    /*
    if (value.match(/^[+-]?[0-9]+(\.[0-9]*)?([e|E][+-]?[0-9]+)?$/)) {
        value = parseFloat(value);
    }
    */
    f = parseFloat(value);
    if (f !== NaN) {
        value = f;
    }

I understand that the code is not equivalent; in particular, if value is like 123abc, with the original code it would retain its string type and original value, while with the new one it would become the float 123.

Guess what: instead of ~18s in my system, now it takes ~4m30s!

I tried to find other, non-sampling profilers to fully profile the script and I can't find any. The closest I got was https://github.com/ralphv/zoran, but it's a 5 commit tool wrote in two days, one in 2015 and another in 2016, and I couldn't make it work with a CLI tool like carto.

As you can see, I'm not a JS developer, so I'm not familiar with its toolchain. I you could provide some pointers on how to actually tackle this, I'll be happy to follow them.

@nebulon42
Copy link
Collaborator Author

I also wanted to get rid of the regex already but that resulted in the weird behaviour you noticed which especially breaks OSM Bright.

I think the problem here is not the regex itself but that the system spends so much time in addable that it is the most important calculation. The filter system of carto is quite complex so I don't understand it fully yet. There is some experimental branch which tries to modify the filter system and also removes the particular function, see #206.

So far I didn't have the need for performance optimizations in JS, so I'm also quite new to the topic. At this point I haven't any specific advice but maybe @tmcw could share some thoughts how he would approach this.

@StyXman
Copy link

StyXman commented Jul 27, 2017

Well, one thing I learned is that regexps literals are automatically compiled by v8, and probably has a JIT for that too. I also tried caching the results of several toString() and toObject() methods, but nothing made it better and in some cases made it worse, so I also think the JIT is better than us.

I found new ways to profile with chromium (yes, the best JS tool for profiling is in a browser, but it kinda makes sense because of the genesis of the language) and running carto with a recent version of node and its --inspect option. I hope to get some news by the weekend.

@StyXman
Copy link

StyXman commented Jul 29, 2017

More info:

time ./bin/carto -a 3.0.6 -p=true -b ~/src/projects/osm/osm-carto/project.mml 2>&1 | awk '/^processing/ { layer=$3 }; /^Inheritance/ { print $3, layer }' | sort -n
[...]
113ms amenity-points
120ms amenity-points-poly
292ms landcover
792ms roads-fill
803ms tunnels
967ms bridges
2646ms text-point
2684ms text-poly

real    0m17.062s
user    0m17.513s
sys     0m0.187s

@StyXman
Copy link

StyXman commented Aug 1, 2017

So chomium/chrome's profiler is as bad as the command line one; that is, it provides no new information, just the same thing graphed in three ways: the bottom up, the top down (slightly more useful) and a flamegraph (ditto).

The latter at least tells us that we're dying of a gazillion of paper cuts:

screenshot_20170801_091714

Zoom to one of the more expensive inheritedDefinitions() / foldStyle() pair:

screenshot_20170801_091833

Zoom to a couple of addRules() and one cloneWith():

screenshot_20170801_092403

So the conclusion so far is: if we think we can gain some processing time, it will have to be by rethinking those 4 functions.

@nebulon42
Copy link
Collaborator Author

Thanks for the time you invested so far! About functions: it was my feeling too that it boils down to those. So we have found that out by research too. The question remains what needs to be done to gain something.

@HolgerJeromin
Copy link

HolgerJeromin commented Aug 2, 2017

I have no knowledge about carto but js optimisation. You loose lot of time in the matching.
Could this code be changed:

if (value.match(/^[+-]?[0-9]+(\.[0-9]*)?([e|E][+-]?[0-9]+)?$/)) {
    value = parseFloat(value);
}

to

let tempvalue = parseFloat(value);
if (!isNaN(tempvalue)) {
    value = tempvalue;
}

@StyXman suggested something like that but forgot that you can not compare NaN with NaN in JS!

NaN !== NaN
true
10 !== NaN
true

Thats why StyXman code dropped the condition and worked with the parseFloat´ed value in any case. No real wonder that carto exploded as it had to handle NaN instead of a string or number.

@nebulon42
Copy link
Collaborator Author

@HolgerJeromin How would you prevent something like that:

var tempvalue = parseFloat('1layer');
if (!isNaN(tempvalue)) {
    console.log('a number');
}

Ref #471 (shows a real world case for that problem)

@HolgerJeromin
Copy link

HolgerJeromin commented Aug 2, 2017

Backed with jsPerf i would do similar code as jQuery has (explanation):

let value = '1layer';
let tempvalue = parseFloat(value);
if (isFinite( tempvalue - value )) {
    value = tempvalue;
}

I hope this does not kill performance again :-)
I could do a blind PR if you like.

@StyXman
Copy link

StyXman commented Aug 2, 2017

When I used an unguarded parseFloat(), times went from 30s to 4m30s. Can you do the change and test locally? I should probably see if I should upgrade my node version.

@HolgerJeromin
Copy link

@StyXman Did you read this comment?

Thats why StyXman code dropped the condition and worked with the parseFloat´ed value in any case. No real wonder that carto exploded as it had to handle NaN instead of a string or number.

@nebulon42
Copy link
Collaborator Author

nebulon42 commented Aug 2, 2017

I did it like this

var key = filter.key.toString(),
        value = filter.val.toString(),
        valueIsNull = filter.val.is == 'keyword' && value == 'null',
        valueIsNumber = parseFloat(value);

    if (isFinite(value - valueIsNumber)) {
        value = valueIsNumber;
    }

Unfortunately, I didn't see any gains. The new code is a little bit slower (~100ms). :) But nice suggestion for the parseFloat workaround, thanks.

@kocio-pl
Copy link

Are there any work in progress with this issue?

@tmcw
Copy link
Contributor

tmcw commented Oct 14, 2017

I'd suspect the biggest gains lie in big-o/algorithmic optimization instead of JS microbenchmarking and the like. One thing I'd really think about is whether, instead of treating parsing as "generating an AST and then mutating it until it's the stylesheet", it could be treated as a full transform: the unprocessed input and processed output would be entirely different data representations and there could be an intermediate form that uses JavaScript's blessed new datastructures (sets and maps, etc) to optimize lookup and modification.

@StyXman
Copy link

StyXman commented Oct 15, 2017

Sorry that I dissapeared, my family just expanded. @HolgerJeromin's change doesn't seem to make a change, I still get ~17s for the same version of osm-carto.

@pnorman
Copy link
Contributor

pnorman commented Oct 18, 2017

I'd suspect the biggest gains lie in big-o/algorithmic optimization instead of JS microbenchmarking and the like.

I agree - there don't seem to be obvious big gains in the inner loops. #206 was mentioned above, and that would eliminate the hot function completely, and improve the resulting XML.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

6 participants