
Add memory usage with OTP26's call_memory? #37

Open
jrfondren opened this issue May 16, 2024 · 1 comment
@jrfondren

With a quick escript to demonstrate the feature:

```erlang
#! /usr/bin/env escript
%% Compile each argument as the body of f:f/0, run it in a fresh process
%% under a call_memory trace, and print the words allocated per snippet.
main(Args) ->
    Bins = [compile(A) || A <- Args],
    Words = [run(B) || B <- Bins],
    lists:foreach(
        fun({A, [{_Pid, _Count, W}]}) ->
            io:fwrite("~10B - ~ts~n", [W, A])
        end,
        lists:zip(Args, Words)
    ),
    erlang:halt(0).

compile(Code) ->
    Lines = ["-module(f).", "-export([f/0]).", "f() -> " ++ Code],
    Tokens = [
        begin
            {ok, T, _} = erl_scan:string(Line),
            T
        end
     || Line <- Lines
    ],
    Forms = [
        begin
            {ok, F} = erl_parse:parse_form(T),
            F
        end
     || T <- Tokens
    ],
    case compile:forms(Forms, [no_spawn_compiler_process, binary, return]) of
        {ok, _, Bin} -> Bin;
        {ok, _, Bin, _Warnings} -> Bin
    end.

run(Bin) ->
    {module, f} = code:load_binary(f, f, Bin),
    %% call_memory (OTP 26+) counts words allocated per traced call.
    1 = erlang:trace_pattern({f, f, 0}, true, [call_memory]),
    %% set_on_first_spawn makes the spawned worker inherit call tracing.
    1 = erlang:trace(self(), true, [call, set_on_first_spawn]),
    Self = self(),
    spawn(fun() ->
        f:f(),
        Self ! done
    end),
    receive
        done -> ok
    end,
    {call_memory, Words} = erlang:trace_info({f, f, 0}, call_memory),
    Words.
```

erlperf and this script's output on some list-shuffling functions from Stack Overflow:

```
$ erlperf 'shuffle:do(lists:seq(1,1000)).' 'shuffle:do2(lists:seq(1,1000)).' 'shuffle:do3(lists:seq(1,1000)).' 'shuffle:do4(lists:seq(1,1000)).'
Code                                    ||        QPS       Time   Rel
shuffle:do(lists:seq(1,1000)).           1      11337   88206 ns  100%
shuffle:do2(lists:seq(1,1000)).          1       3780     265 us   33%
shuffle:do3(lists:seq(1,1000)).          1       1288     777 us   11%
shuffle:do4(lists:seq(1,1000)).          1        444    2251 us    4%

$ ./memused 'shuffle:do(lists:seq(1,1000)).' 'shuffle:do2(lists:seq(1,1000)).' 'shuffle:do3(lists:seq(1,1000)).' 'shuffle:do4(lists:seq(1,1000)).'
     25415 - shuffle:do(lists:seq(1,1000)).
     36744 - shuffle:do2(lists:seq(1,1000)).
   2020372 - shuffle:do3(lists:seq(1,1000)).
    983250 - shuffle:do4(lists:seq(1,1000)).
```

This quickly shows that do (the cut answer) is the most economical with both time and space, while do3 (the list_to_tuple answer), though not the worst for time, generates the most garbage.
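Note that call_memory reports sizes in machine words, so the numbers above are word counts, not bytes. A quick shell session converts them (assuming a 64-bit emulator where a word is 8 bytes; `erlang:system_info(wordsize)` reports the actual size):

```erlang
1> WordSize = erlang:system_info(wordsize).
8
2> 25415 * WordSize.  %% shuffle:do/1's per-run allocation in bytes
203320
```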

@max-au max-au added the enhancement New feature or request label May 17, 2024
@max-au max-au modified the milestones: 2.3.0, 2.4.0 May 17, 2024
@max-au (Owner) commented May 18, 2024

Thanks!
It is indeed a good idea, especially considering the trace sessions added in OTP 27. It has to be a separate mode, similar to the timed/continuous modes: a "memory profile" mode. Otherwise memory profiling skews the benchmark. Moving this to v2.4.0 to think a bit longer about the implementation.
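For reference, a rough sketch of what a session-isolated memory profile could look like on OTP 27. The function names below follow the new `trace` module; the helper name and exact call shapes are assumptions to verify against the docs, not a proposed implementation:

```erlang
%% Hypothetical helper: measure M:F(Args) in a dedicated trace session,
%% so call_memory tracing neither disturbs nor is disturbed by any other
%% tracer running on the node.
profile_memory(M, F, Args) ->
    Arity = length(Args),
    Session = trace:session_create(erlperf_memprof, self(), []),
    1 = trace:function(Session, {M, F, Arity}, true, [call_memory]),
    1 = trace:process(Session, self(), true, [call]),
    apply(M, F, Args),
    {call_memory, Words} = trace:info(Session, {M, F, Arity}, call_memory),
    true = trace:session_destroy(Session),
    Words.
```

Keeping this in its own session would also let the benchmark's normal timed/continuous instrumentation run untouched alongside it.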
