
Add memory usage with OTP26's call_memory? #37

Open
jrfondren opened this issue May 16, 2024 · 1 comment
@jrfondren

With a quick escript to demonstrate the feature:

```erlang
#! /usr/bin/env escript
%% Compile each argument as the body of f:f/0, run it in a fresh process
%% under a call_memory trace, and print the words allocated per snippet.
main(Args) ->
    Bins = [compile(A) || A <- Args],
    Words = [run(B) || B <- Bins],
    lists:foreach(
        fun({A, [{_Pid, _Count, W}]}) ->
            io:fwrite("~10B - ~ts~n", [W, A])
        end,
        lists:zip(Args, Words)
    ),
    erlang:halt(0).

compile(Code) ->
    Lines = ["-module(f).", "-export([f/0]).", "f() -> " ++ Code],
    Tokens = [
        begin
            {ok, T, _} = erl_scan:string(Line),
            T
        end
     || Line <- Lines
    ],
    Forms = [
        begin
            {ok, F} = erl_parse:parse_form(T),
            F
        end
     || T <- Tokens
    ],
    case compile:forms(Forms, [no_spawn_compiler_process, binary, return]) of
        {ok, _, Bin} -> Bin;
        {ok, _, Bin, _Warnings} -> Bin
    end.

run(Bin) ->
    {module, f} = code:load_binary(f, f, Bin),
    %% call_memory (OTP 26+) counts words allocated per traced call.
    1 = erlang:trace_pattern({f, f, 0}, true, [call_memory]),
    %% set_on_first_spawn makes the spawned worker inherit call tracing.
    1 = erlang:trace(self(), true, [call, set_on_first_spawn]),
    Self = self(),
    spawn(fun() ->
        f:f(),
        Self ! done
    end),
    receive
        done -> ok
    end,
    {call_memory, Words} = erlang:trace_info({f, f, 0}, call_memory),
    Words.
```

erlperf and this script's output on some list-shuffling functions from Stack Overflow:

```
$ erlperf 'shuffle:do(lists:seq(1,1000)).' 'shuffle:do2(lists:seq(1,1000)).' 'shuffle:do3(lists:seq(1,1000)).' 'shuffle:do4(lists:seq(1,1000)).'
Code                                    ||        QPS       Time   Rel
shuffle:do(lists:seq(1,1000)).           1      11337   88206 ns  100%
shuffle:do2(lists:seq(1,1000)).          1       3780     265 us   33%
shuffle:do3(lists:seq(1,1000)).          1       1288     777 us   11%
shuffle:do4(lists:seq(1,1000)).          1        444    2251 us    4%

$ ./memused 'shuffle:do(lists:seq(1,1000)).' 'shuffle:do2(lists:seq(1,1000)).' 'shuffle:do3(lists:seq(1,1000)).' 'shuffle:do4(lists:seq(1,1000)).'
     25415 - shuffle:do(lists:seq(1,1000)).
     36744 - shuffle:do2(lists:seq(1,1000)).
   2020372 - shuffle:do3(lists:seq(1,1000)).
    983250 - shuffle:do4(lists:seq(1,1000)).
```

This quickly shows that do (the cut answer) is the most economical with both time and space, while do3 (the list_to_tuple answer), though not the worst for time, generates the most garbage.
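Note that call_memory reports sizes in machine words, so the numbers above are word counts, not bytes. A quick shell session converts them (assuming a 64-bit emulator where a word is 8 bytes; `erlang:system_info(wordsize)` reports the actual size):

```erlang
1> WordSize = erlang:system_info(wordsize).
8
2> 25415 * WordSize.  %% shuffle:do/1's per-run allocation in bytes
203320
```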

@max-au max-au added the enhancement New feature or request label May 17, 2024
@max-au max-au modified the milestones: 2.3.0, 2.4.0 May 17, 2024
@max-au (Owner) commented May 18, 2024

Thanks!
It is indeed a good idea, especially considering the trace sessions added in OTP 27. It has to be a separate mode, similar to the timed/continuous modes: a "memory profile" mode. Otherwise memory profiling skews the benchmark. Moving this to v2.4.0 to think a bit longer about the implementation.
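For reference, a rough sketch of what a session-isolated memory profile could look like on OTP 27. The function names below follow the new `trace` module; the helper name and exact call shapes are assumptions to verify against the docs, not a proposed implementation:

```erlang
%% Hypothetical helper: measure M:F(Args) in a dedicated trace session,
%% so call_memory tracing neither disturbs nor is disturbed by any other
%% tracer running on the node.
profile_memory(M, F, Args) ->
    Arity = length(Args),
    Session = trace:session_create(erlperf_memprof, self(), []),
    1 = trace:function(Session, {M, F, Arity}, true, [call_memory]),
    1 = trace:process(Session, self(), true, [call]),
    apply(M, F, Args),
    {call_memory, Words} = trace:info(Session, {M, F, Arity}, call_memory),
    true = trace:session_destroy(Session),
    Words.
```

Keeping this in its own session would also let the benchmark's normal timed/continuous instrumentation run untouched alongside it.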
