Use a pointer for PC instead of an instruction counter #776

Merged (1 commit) on Oct 16, 2023

pguyot
Collaborator

@pguyot pguyot commented Aug 21, 2023

Also mark code in modules as const.
This optimization yields an additional 20% speedup on the Sudoku
benchmark.
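
As a rough illustration of the idea in the title, the sketch below contrasts an interpreter loop that tracks its position with an index against one that tracks it with a pointer into code marked `const`. It is not AtomVM's actual interpreter: the three opcodes, the function names and `sample_program` are hypothetical and exist only for this example. Whether the pointer form is faster depends on the compiler and target.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical three-opcode bytecode, for illustration only. */
enum { OP_PUSH = 0, OP_ADD = 1, OP_HALT = 2 };

/* Variant 1: the program counter is an index into the code array.
 * Every fetch goes through code[i], i.e. base address plus index. */
int run_with_index(const uint8_t *code)
{
    assert(code != NULL);
    size_t i = 0;
    int acc = 0;
    for (;;) {
        switch (code[i]) {
            case OP_PUSH:
                acc = code[i + 1];
                i += 2;
                break;
            case OP_ADD:
                acc += code[i + 1];
                i += 2;
                break;
            default: /* OP_HALT or an unknown opcode stops the loop */
                return acc;
        }
    }
}

/* Variant 2: the program counter is a pointer to the next instruction.
 * The code is const, so the compiler knows the loop never writes to it. */
int run_with_pointer(const uint8_t *code)
{
    assert(code != NULL);
    const uint8_t *pc = code;
    int acc = 0;
    for (;;) {
        switch (*pc) {
            case OP_PUSH:
                acc = pc[1];
                pc += 2;
                break;
            case OP_ADD:
                acc += pc[1];
                pc += 2;
                break;
            default: /* OP_HALT or an unknown opcode stops the loop */
                return acc;
        }
    }
}

/* Hypothetical program: load 40, add 2, halt. */
const uint8_t sample_program[] = { OP_PUSH, 40, OP_ADD, 2, OP_HALT };
```

Both functions return the same result for `sample_program`; only the bookkeeping for the current position differs.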

These changes are made under both the "Apache 2.0" and the "GNU Lesser General
Public License 2.1 or later" license terms (dual license).

SPDX-License-Identifier: Apache-2.0 OR LGPL-2.1-or-later

@bettio
Collaborator

bettio commented Sep 11, 2023

I found out that for some reason this floating point test is way slower:

-module(pi).
-export([run/0, pi/4, ok/0]).

ok() ->
    ok.

run() ->
    pi(4, 3, -1, 1000000).

pi(Value, _N, _Sign, 0) ->
    Value;
pi(Value, N, Sign, Iterations) ->
    pi(Value + Sign * (4 / N), N + 2, -Sign, Iterations - 1).
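
For readers less familiar with Erlang, here is a minimal C rendering of what the module above computes: a Leibniz-style series for pi, run for 1,000,000 iterations. The names `pi_leibniz` and `pi_run` are made up for this sketch, and it uses `double` throughout, whereas the Erlang version starts from the integer 4 and switches to floats after the first addition. It says nothing about how AtomVM executes the benchmark.

```c
#include <assert.h>

/* Iterative equivalent of the tail-recursive pi/4 above:
 * each step adds sign * (4 / n), then advances n by 2 and flips the sign. */
double pi_leibniz(double value, long n, int sign, long iterations)
{
    assert(iterations >= 0);
    while (iterations > 0) {
        value += sign * (4.0 / (double) n);
        n += 2;
        sign = -sign;
        iterations--;
    }
    return value;
}

/* Same arguments as run/0 in the Erlang module. */
double pi_run(void)
{
    return pi_leibniz(4.0, 3, -1, 1000000);
}
```

The loop is dominated by one floating-point division, one multiplication and one addition per iteration, which is why the test stresses the floating-point path.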

@pguyot
Collaborator Author

pguyot commented Sep 11, 2023

> I found out that for some reason this floating point test is way slower:

I added the test to atomvm_benchmark in pguyot/atomvm_benchmark#3.

However, I don't get your result on macOS with OTP26.

paul@yuzu ~/P/A/A/build (master)> ./src/AtomVM ../../atomvm_benchmark/_build/default/lib/benchmark.avm libs/atomvmlib.avm
Running tests:
pingpong_speed_test: 2298446
prime_speed_test: 106290
prng_test: 624476
pi_test: 681115
sudoku_solution_test: 167670
sudoku_puzzle_test: 6324736
pingpong_speed_test [schedulers=1]: 175687
prime_speed_test [schedulers=1]: 373233
Return value: ok
paul@yuzu ~/P/A/A/build (w34/use-pc-pointer-instead-of-i)> ./src/AtomVM ../../atomvm_benchmark/_build/default/lib/benchmark.avm libs/atomvmlib.avm
Running tests:
pingpong_speed_test: 2260228
prime_speed_test: 90409
prng_test: 608688
pi_test: 627082
sudoku_solution_test: 156944
sudoku_puzzle_test: 6008711
pingpong_speed_test [schedulers=1]: 174490
prime_speed_test [schedulers=1]: 296484
Return value: ok

Everything was compiled with OTP26. Which platform and which OTP version did you test it with?

@pguyot
Collaborator Author

pguyot commented Sep 11, 2023

The results depend on the SDK/gcc versions.

sdk 5.1

with PR:
pi_test: 119369657

with master:
pi_test: 199435054

sdk 4.4

with PR:
pi_test: 129949250

with master:
pi_test: 82786098
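
To put the four figures above on a common footing, the relative change of the PR against master can be computed as below. This is only a sketch: the helper name `relative_change_percent` is made up here, and it assumes the figures are time-like, so that lower is better and a negative result means the PR is faster.

```c
#include <assert.h>

/* Percentage change of candidate relative to baseline.
 * Assumes lower-is-better figures: negative means the candidate is faster. */
double relative_change_percent(double baseline, double candidate)
{
    assert(baseline != 0.0);
    return (candidate - baseline) / baseline * 100.0;
}
```

With master as the baseline, `relative_change_percent(199435054.0, 119369657.0)` gives roughly -40% for sdk 5.1, while `relative_change_percent(82786098.0, 129949250.0)` gives roughly +57% for sdk 4.4, so the same change moves in opposite directions on the two toolchains.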

@pguyot
Collaborator Author

pguyot commented Sep 12, 2023

On top of #805, this brings a significant speed increase (38%) to the pi_test benchmark on ESP32 with SDK 4.4.

Also mark code in modules as const.

Signed-off-by: Paul Guyot <[email protected]>
@pguyot pguyot force-pushed the w34/use-pc-pointer-instead-of-i branch from 1233d4b to 360b979 Compare September 17, 2023 21:59
@bettio
Collaborator

bettio commented Oct 16, 2023

For future reference I'm posting some benchmarks and some further information that can be useful for further optimizations.

[Image: Screenshot_20231016_141457]

The following are normalized scores (lower is better here too):
[Image: Screenshot_20231016_140839]
[Image: Screenshot_20231016_140829]

So there is a general improvement, although the issue with the prime test (attached here) should be investigated.

prime.zip

The .S file has been generated using OTP 24.

@bettio bettio merged commit f43a03a into atomvm:master Oct 16, 2023
80 checks passed
@pguyot pguyot deleted the w34/use-pc-pointer-instead-of-i branch October 16, 2023 20:24