## Basic information

## Linux/system information

Official Ubuntu Image

## Benchmark results

### CPU

#### Power
- Maximum simulated power draw (`stress-ng --matrix 0`): ~2 W
- During Geekbench multicore benchmark: ~2 W
- During top500 HPL benchmark: TODO W
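For reproducibility, here is a minimal sketch of the synthetic load behind the first figure; the wattage itself is read from an external wall meter, so software alone can't capture it, and the timeout is an arbitrary choice:

```bash
# Spawn one matrix-math stressor per CPU core (--matrix 0 auto-detects
# the core count) and hold the load while reading the wall power meter.
stress-ng --matrix 0 --timeout 300s
```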
### Disk

PNY Elite-X 64GB micro SD
| Benchmark | Result |
| --- | --- |
| fio 1M sequential read | TODO MB/s |
| iozone 1M random read | TODO MB/s |
| iozone 1M random write | TODO MB/s |
| iozone 4K random read | TODO MB/s |
| iozone 4K random write | TODO MB/s |
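As a sketch of how rows like these are typically generated: the exact flags used for this board aren't recorded here, so the test file path, sizes, and queue depth below are assumptions:

```bash
# 1M sequential read via fio, direct I/O against a scratch file on the card
# (path and size are placeholders -- point them at the device under test).
fio --name=seqread --filename=/mnt/sdcard/fio-test --direct=1 \
    --rw=read --bs=1M --size=1G --numjobs=1 --ioengine=libaio --iodepth=64

# 1M and 4K random read/write via iozone
# (-e flush, -I direct I/O; -i 0 write, -i 1 read, -i 2 random read/write).
iozone -e -I -a -s 100M -r 1024k -r 4k -i 0 -i 1 -i 2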
### Network

iperf3 results (a reproduction sketch follows the list):

- `iperf3 -c $SERVER_IP`: TODO Mbps
- `iperf3 --reverse -c $SERVER_IP`: TODO Mbps
- `iperf3 --bidir -c $SERVER_IP`: TODO Mbps up, TODO Mbps down
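The three rows above map to the following client invocations; `$SERVER_IP` is a placeholder for a wired helper machine on the same network, and `--bidir` requires iperf3 3.7 or newer:

```bash
# On the helper machine acting as the far endpoint:
iperf3 -s

# On the board under test:
iperf3 -c $SERVER_IP             # transmit (board -> server)
iperf3 --reverse -c $SERVER_IP   # receive (server -> board)
iperf3 --bidir -c $SERVER_IP     # both directions simultaneously
```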
### GPU

TODO: Haven't determined a standardized benchmark yet. See Issue #2.
### Memory

tinymembench results:
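The collapsed output below was gathered roughly as follows; this is a minimal sketch assuming the upstream ssvb/tinymembench repo and a stock gcc toolchain:

```bash
# Build tinymembench from source and run it (no arguments needed;
# it prints the bandwidth and latency tables shown below).
git clone https://github.com/ssvb/tinymembench.git
cd tinymembench
make
./tinymembench
```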
<details>
<summary>Click to expand memory benchmark result</summary>

```
tinymembench v0.4.10 (simple benchmark for memory throughput and latency)
==========================================================================
== Memory bandwidth tests ==
== ==
== Note 1: 1MB = 1000000 bytes ==
== Note 2: Results for 'copy' tests show how many bytes can be ==
== copied per second (adding together read and writen ==
== bytes would have provided twice higher numbers) ==
== Note 3: 2-pass copy means that we are using a small temporary buffer ==
== to first fetch data into it, and only then write it to the ==
== destination (source -> L1 cache, L1 cache -> destination) ==
== Note 4: If sample standard deviation exceeds 0.1%, it is shown in ==
== brackets ==
C copy backwards : 1507.2 MB/s (0.6%)
C copy backwards (32 byte blocks) : 1524.4 MB/s (0.2%)
C copy backwards (64 byte blocks) : 1538.0 MB/s (0.4%)
C copy : 1464.8 MB/s (1.2%)
C copy prefetched (32 bytes step) : 1082.4 MB/s
C copy prefetched (64 bytes step) : 1000.3 MB/s
C 2-pass copy : 1294.4 MB/s
C 2-pass copy prefetched (32 bytes step) : 913.7 MB/s
C 2-pass copy prefetched (64 bytes step) : 773.7 MB/s
C fill : 5991.7 MB/s (0.3%)
C fill (shuffle within 16 byte blocks) : 5990.3 MB/s
C fill (shuffle within 32 byte blocks) : 5986.2 MB/s
C fill (shuffle within 64 byte blocks) : 5992.5 MB/s (0.1%)
NEON 64x2 COPY : 1533.9 MB/s (0.3%)
NEON 64x2x4 COPY : 1549.2 MB/s
NEON 64x1x4_x2 COPY : 1209.8 MB/s (1.0%)
NEON 64x2 COPY prefetch x2 : 340.1 MB/s
NEON 64x2x4 COPY prefetch x1 : 1598.2 MB/s
NEON 64x2 COPY prefetch x1 : 1656.7 MB/s
NEON 64x2x4 COPY prefetch x1 : 1598.3 MB/s
standard memcpy : 1548.2 MB/s (0.1%)
standard memset : 6000.6 MB/s (0.2%)
NEON LDP/STP copy : 1523.8 MB/s (0.5%)
NEON LDP/STP copy pldl2strm (32 bytes step) : 955.5 MB/s (1.2%)
NEON LDP/STP copy pldl2strm (64 bytes step) : 1160.9 MB/s (0.3%)
NEON LDP/STP copy pldl1keep (32 bytes step) : 1768.4 MB/s
NEON LDP/STP copy pldl1keep (64 bytes step) : 1768.0 MB/s
NEON LD1/ST1 copy : 1523.4 MB/s (0.7%)
NEON STP fill : 5995.0 MB/s (0.3%)
NEON STNP fill : 3854.1 MB/s (0.5%)
ARM LDP/STP copy : 1530.3 MB/s (0.6%)
ARM STP fill : 5999.8 MB/s (0.3%)
ARM STNP fill : 3904.9 MB/s (0.8%)
==========================================================================
== Memory latency test ==
== ==
== Average time is measured for random memory accesses in the buffers ==
== of different sizes. The larger is the buffer, the more significant ==
== are relative contributions of TLB, L1/L2 cache misses and SDRAM ==
== accesses. For extremely large buffer sizes we are expecting to see ==
== page table walk with several requests to SDRAM for almost every ==
== memory access (though 64MiB is not nearly large enough to experience ==
== this effect to its fullest). ==
== ==
== Note 1: All the numbers are representing extra time, which needs to ==
== be added to L1 cache latency. The cycle timings for L1 cache ==
== latency can be usually found in the processor documentation. ==
== Note 2: Dual random read means that we are simultaneously performing ==
== two independent memory accesses at a time. In the case if ==
== the memory subsystem can't handle multiple outstanding ==
== requests, dual random read has the same timings as two ==
== single reads performed one after another. ==
block size : single random read / dual random read, [MADV_NOHUGEPAGE]
1024 : 0.0 ns / 0.0 ns
2048 : 0.0 ns / 0.0 ns
4096 : 0.0 ns / 0.0 ns
8192 : 0.0 ns / 0.0 ns
16384 : 0.0 ns / 0.0 ns
32768 : 0.0 ns / 0.0 ns
65536 : 4.3 ns / 7.3 ns
131072 : 6.6 ns / 10.4 ns
262144 : 7.7 ns / 11.7 ns
524288 : 8.3 ns / 12.4 ns
1048576 : 10.1 ns / 14.8 ns
2097152 : 94.1 ns / 142.7 ns
4194304 : 141.2 ns / 186.7 ns
8388608 : 165.0 ns / 201.7 ns
16777216 : 178.7 ns / 211.1 ns
33554432 : 186.9 ns / 215.3 ns
67108864 : 191.3 ns / 217.2 ns
block size : single random read / dual random read, [MADV_HUGEPAGE]
1024 : 0.0 ns / 0.0 ns
2048 : 0.0 ns / 0.0 ns
4096 : 0.0 ns / 0.0 ns
8192 : 0.0 ns / 0.0 ns
16384 : 0.0 ns / 0.0 ns
32768 : 0.0 ns / 0.0 ns
65536 : 4.3 ns / 7.3 ns
131072 : 6.6 ns / 10.4 ns
262144 : 7.7 ns / 11.7 ns
524288 : 8.3 ns / 12.3 ns
1048576 : 10.0 ns / 14.6 ns
2097152 : 93.4 ns / 141.5 ns
4194304 : 135.3 ns / 178.2 ns
8388608 : 155.1 ns / 189.3 ns
16777216 : 164.3 ns / 193.2 ns
33554432 : 169.3 ns / 196.0 ns
67108864 : 172.0 ns / 197.9 ns
```

</details>

### sbc-bench results

Run sbc-bench and paste a link to the results here: gist
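For reference, sbc-bench's README documents a one-line invocation (requires root; URL per the upstream ThomasKaiser/sbc-bench repo):

```bash
# Fetch and run sbc-bench; on completion it uploads the full results
# and prints the link to paste above.
sudo /bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/ThomasKaiser/sbc-bench/master/sbc-bench.sh)"
```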
### Phoronix Test Suite

Results from `pi-general-benchmark.sh`:
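A minimal sketch of running the helper script; the download location is not recorded here, so `$SCRIPT_URL` is a placeholder for wherever `pi-general-benchmark.sh` actually lives:

```bash
# Hypothetical URL -- substitute the real raw link to pi-general-benchmark.sh.
curl -fsSL -o pi-general-benchmark.sh "$SCRIPT_URL"
chmod +x pi-general-benchmark.sh
./pi-general-benchmark.sh   # wraps the Phoronix Test Suite runs and prints results
```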