Orange Pi 5 #5

Open
geerlingguy opened this issue Jan 26, 2023 · 8 comments

Comments

@geerlingguy
Owner

geerlingguy commented Jan 26, 2023

Basic information

Linux/system information

(Running Armbian build because I couldn't find a download for Orange Pi OS today...)

# output of `neofetch`
                                 admin@orangepi5 
                                 --------------- 
      █ █ █ █ █ █ █ █ █ █ █      OS: Armbian (23.8.1) aarch64 
     ███████████████████████     Host: Orange Pi 5 
   ▄▄██                   ██▄▄   Kernel: 5.10.160-legacy-rk35xx 
   ▄▄██    ███████████    ██▄▄   Uptime: 2 mins 
   ▄▄██   ██         ██   ██▄▄   Packages: 533 (dpkg) 
   ▄▄██   ██         ██   ██▄▄   Shell: bash 5.2.15 
   ▄▄██   ██         ██   ██▄▄   Resolution: 1920x1080 
   ▄▄██   █████████████   ██▄▄   Terminal: /dev/pts/0 
   ▄▄██   ██         ██   ██▄▄   CPU: (8) @ 1.800GHz 
   ▄▄██   ██         ██   ██▄▄   Memory: 172MiB / 3921MiB 
   ▄▄██   ██         ██   ██▄▄
   ▄▄██                   ██▄▄                           
     ███████████████████████                             
      █ █ █ █ █ █ █ █ █ █ █

# output of `uname -a`
Linux orangepi5 5.10.160-legacy-rk35xx #1 SMP Mon Aug 28 01:21:24 UTC 2023 aarch64 GNU/Linux

Benchmark results

CPU

Power

  • Idle power draw (at wall): 1.0 W
  • Maximum simulated power draw (stress-ng --matrix 0): 10.0 W
  • During Geekbench multicore benchmark: 8.9 W
  • During top500 HPL benchmark: 11.5 W

Disk

SanDisk Extreme 128GB microSD

Benchmark                  Result
-------------------------  ----------
fio 1M sequential read     68.1 MB/s
iozone 1M random read      59.78 MB/s
iozone 1M random write     20.93 MB/s
iozone 4K random read      6.80 MB/s
iozone 4K random write     3.36 MB/s

KIOXIA XG6 1TB NVMe SSD

Benchmark                  Result
-------------------------  ----------
fio 1M sequential read     428 MB/s
iozone 1M random read      361 MB/s
iozone 1M random write     359 MB/s
iozone 4K random read      32.87 MB/s
iozone 4K random write     80.85 MB/s

curl https://raw.githubusercontent.com/geerlingguy/pi-cluster/master/benchmarks/disk-benchmark.sh | sudo bash

Run the benchmark on each attached storage device (e.g. eMMC, microSD, NVMe, SATA) and add the results under an additional heading. To test a specific device, download the script with curl -o disk-benchmark.sh [URL_HERE], then run sudo DEVICE_UNDER_TEST=/dev/sda DEVICE_MOUNT_PATH=/mnt/sda1 ./disk-benchmark.sh (assuming the device is sda).

Also consider running the PiBenchmarks.com script.
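As a rough sketch of the per-device steps described above, the helper below only prints the commands so they can be reviewed before anything runs with sudo. The function name `disk_benchmark_commands`, the NVMe device path, and the mount path are assumptions for illustration; the `chmod` step is added because a file saved with `curl -o` is not executable by default.

```shell
#!/usr/bin/env bash
# URL taken from the curl command in this issue.
SCRIPT_URL="https://raw.githubusercontent.com/geerlingguy/pi-cluster/master/benchmarks/disk-benchmark.sh"

# Hypothetical helper: print (do not execute) the commands for one device.
# $1 = block device, $2 = path where that device is mounted.
disk_benchmark_commands() {
  local device="$1" mount_path="$2"
  printf 'curl -o disk-benchmark.sh %s\n' "$SCRIPT_URL"
  printf 'chmod +x disk-benchmark.sh\n'
  printf 'sudo DEVICE_UNDER_TEST=%s DEVICE_MOUNT_PATH=%s ./disk-benchmark.sh\n' \
    "$device" "$mount_path"
}

# Example device and mount path are assumptions; check yours with lsblk first.
disk_benchmark_commands /dev/nvme0n1 /mnt/nvme0n1p1
```

Printing the commands first makes it easy to confirm the device and mount path before the benchmark runs against them.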

Network

iperf3 results:

  • iperf3 -c $SERVER_IP: 942 Mbps
  • iperf3 --reverse -c $SERVER_IP: 906 Mbps
  • iperf3 --bidir -c $SERVER_IP: 931 Mbps up / 340 Mbps down
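For anyone repeating these runs, here is a minimal sketch of pulling the receiver-side bitrate out of iperf3's text output. The function name `receiver_mbps` is made up for this example, and it assumes the runs use `-f m` so the summary lines report Mbits/sec; it is not how the figures above were collected.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of iperf3): read iperf3 text output on stdin
# and print the bitrate found on each "receiver" summary line.
# Assumes the run used -f m, so bitrates are reported in Mbits/sec.
receiver_mbps() {
  awk '/receiver/ { for (i = 2; i <= NF; i++) { if ($i == "Mbits/sec") print $(i - 1) } }'
}

# Usage (needs an iperf3 server listening at $SERVER_IP):
#   iperf3 -f m -c "$SERVER_IP" | receiver_mbps
#   iperf3 -f m --reverse -c "$SERVER_IP" | receiver_mbps
#   iperf3 -f m --bidir -c "$SERVER_IP" | receiver_mbps   # should print one value per direction
```

Searching for the `Mbits/sec` field rather than a fixed column keeps the helper working when `--bidir` adds an extra role column to each line.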

GPU

  • TODO: Haven't determined standardized benchmark yet. See Issue #2.

Memory

tinymembench results:

tinymembench v0.4.10 (simple benchmark for memory throughput and latency)

==========================================================================
== Memory bandwidth tests                                               ==
==                                                                      ==
== Note 1: 1MB = 1000000 bytes                                          ==
== Note 2: Results for 'copy' tests show how many bytes can be          ==
==         copied per second (adding together read and writen           ==
==         bytes would have provided twice higher numbers)              ==
== Note 3: 2-pass copy means that we are using a small temporary buffer ==
==         to first fetch data into it, and only then write it to the   ==
==         destination (source -> L1 cache, L1 cache -> destination)    ==
== Note 4: If sample standard deviation exceeds 0.1%, it is shown in    ==
==         brackets                                                     ==
==========================================================================

 C copy backwards                                     :  12027.4 MB/s (0.3%)
 C copy backwards (32 byte blocks)                    :  12000.8 MB/s
 C copy backwards (64 byte blocks)                    :  11994.9 MB/s
 C copy                                               :  12216.1 MB/s
 C copy prefetched (32 bytes step)                    :  12494.3 MB/s
 C copy prefetched (64 bytes step)                    :  12508.1 MB/s
 C 2-pass copy                                        :   5486.6 MB/s (0.1%)
 C 2-pass copy prefetched (32 bytes step)             :   9841.2 MB/s
 C 2-pass copy prefetched (64 bytes step)             :  10479.9 MB/s
 C fill                                               :  29387.2 MB/s (2.3%)
 C fill (shuffle within 16 byte blocks)               :  29387.6 MB/s
 C fill (shuffle within 32 byte blocks)               :  29371.2 MB/s
 C fill (shuffle within 64 byte blocks)               :  29264.7 MB/s
 NEON 64x2 COPY                                       :  12341.6 MB/s
 NEON 64x2x4 COPY                                     :  12388.7 MB/s
 NEON 64x1x4_x2 COPY                                  :  12417.8 MB/s
 NEON 64x2 COPY prefetch x2                           :  11351.8 MB/s
 NEON 64x2x4 COPY prefetch x1                         :  11709.1 MB/s
 NEON 64x2 COPY prefetch x1                           :  11419.9 MB/s
 NEON 64x2x4 COPY prefetch x1                         :  11718.4 MB/s
 ---
 standard memcpy                                      :  12420.1 MB/s
 standard memset                                      :  29405.7 MB/s (0.1%)
 ---
 NEON LDP/STP copy                                    :  12454.8 MB/s
 NEON LDP/STP copy pldl2strm (32 bytes step)          :  12279.3 MB/s
 NEON LDP/STP copy pldl2strm (64 bytes step)          :  12323.1 MB/s
 NEON LDP/STP copy pldl1keep (32 bytes step)          :  12455.0 MB/s
 NEON LDP/STP copy pldl1keep (64 bytes step)          :  12453.0 MB/s
 NEON LD1/ST1 copy                                    :  12383.3 MB/s
 NEON STP fill                                        :  29418.7 MB/s (0.2%)
 NEON STNP fill                                       :  29368.8 MB/s
 ARM LDP/STP copy                                     :  12430.3 MB/s
 ARM STP fill                                         :  29350.7 MB/s (0.1%)
 ARM STNP fill                                        :  29342.7 MB/s

==========================================================================
== Framebuffer read tests.                                              ==
==                                                                      ==
== Many ARM devices use a part of the system memory as the framebuffer, ==
== typically mapped as uncached but with write-combining enabled.       ==
== Writes to such framebuffers are quite fast, but reads are much       ==
== slower and very sensitive to the alignment and the selection of      ==
== CPU instructions which are used for accessing memory.                ==
==                                                                      ==
== Many x86 systems allocate the framebuffer in the GPU memory,         ==
== accessible for the CPU via a relatively slow PCI-E bus. Moreover,    ==
== PCI-E is asymmetric and handles reads a lot worse than writes.       ==
==                                                                      ==
== If uncached framebuffer reads are reasonably fast (at least 100 MB/s ==
== or preferably >300 MB/s), then using the shadow framebuffer layer    ==
== is not necessary in Xorg DDX drivers, resulting in a nice overall    ==
== performance improvement. For example, the xf86-video-fbturbo DDX     ==
== uses this trick.                                                     ==
==========================================================================

 NEON LDP/STP copy (from framebuffer)                 :   1962.5 MB/s (0.1%)
 NEON LDP/STP 2-pass copy (from framebuffer)          :   1668.2 MB/s
 NEON LD1/ST1 copy (from framebuffer)                 :   1958.7 MB/s
 NEON LD1/ST1 2-pass copy (from framebuffer)          :   1676.7 MB/s
 ARM LDP/STP copy (from framebuffer)                  :   1918.2 MB/s
 ARM LDP/STP 2-pass copy (from framebuffer)           :   1672.1 MB/s

==========================================================================
== Memory latency test                                                  ==
==                                                                      ==
== Average time is measured for random memory accesses in the buffers   ==
== of different sizes. The larger is the buffer, the more significant   ==
== are relative contributions of TLB, L1/L2 cache misses and SDRAM      ==
== accesses. For extremely large buffer sizes we are expecting to see   ==
== page table walk with several requests to SDRAM for almost every      ==
== memory access (though 64MiB is not nearly large enough to experience ==
== this effect to its fullest).                                         ==
==                                                                      ==
== Note 1: All the numbers are representing extra time, which needs to  ==
==         be added to L1 cache latency. The cycle timings for L1 cache ==
==         latency can be usually found in the processor documentation. ==
== Note 2: Dual random read means that we are simultaneously performing ==
==         two independent memory accesses at a time. In the case if    ==
==         the memory subsystem can't handle multiple outstanding       ==
==         requests, dual random read has the same timings as two       ==
==         single reads performed one after another.                    ==
==========================================================================

block size : single random read / dual random read
      1024 :    0.0 ns          /     0.0 ns 
      2048 :    0.0 ns          /     0.0 ns 
      4096 :    0.0 ns          /     0.0 ns 
      8192 :    0.0 ns          /     0.0 ns 
     16384 :    0.0 ns          /     0.0 ns 
     32768 :    0.0 ns          /     0.0 ns 
     65536 :    0.0 ns          /     0.0 ns 
    131072 :    1.1 ns          /     1.5 ns 
    262144 :    2.2 ns          /     2.8 ns 
    524288 :    3.4 ns          /     3.9 ns 
   1048576 :    9.9 ns          /    12.9 ns 
   2097152 :   13.7 ns          /    15.6 ns 
   4194304 :   35.7 ns          /    53.1 ns 
   8388608 :   74.8 ns          /   155.9 ns 
  16777216 :  198.8 ns          /   245.6 ns 
  33554432 :  223.1 ns          /   257.5 ns 
  67108864 :  236.4 ns          /   263.0 ns 

Phoronix Test Suite

Results from pi-general-benchmark.sh:

  • pts/encode-mp3: 12.269 sec
  • pts/x264 4K: 3.52 fps
  • pts/x264 1080p: 22.69 fps
  • pts/phpbench: 420027
  • pts/build-linux-kernel (defconfig): 1321.267 sec

@github-actions

This issue has been marked 'stale' due to lack of recent activity. If there is no further activity, the issue will be closed in another 30 days. Thank you for your contribution!

Please read this blog post to see the reasons why I mark issues as stale.

@geerlingguy
Owner Author

I tried downloading Orange Pi OS today and all the Google Drive folders for Arch and OH (whatever that is?) were empty :(

So sticking with Armbian for my testing.

@dkarter

dkarter commented Dec 28, 2023

Hi Jeff,

First, huge thank you for all your work and for making technology more accessible and easy to understand!

I'm wondering if you've done, or plan to do, a review of the Orange Pi 5 Plus. The specs seem VERY compelling, even compared with the Raspberry Pi 5, particularly the 32GB RAM, built-in NVMe, 2x 2.5Gbps ports, NPU and GPU. Too good to be true? :)

http://www.orangepi.org/html/hardWare/computerAndMicrocontrollers/details/Orange-Pi-5-plus-32GB.html

Cheers!

@geerlingguy
Owner Author

@dkarter - I would, but it seems impossible to get my hands on one... It seems to be even harder to get than a Raspberry Pi 5 lately.

@leon332157

It appears that the OH version is the Orange Pi OS build based on OpenHarmony.

@amrutprabhu

Hey @geerlingguy ,

I was doing some research about the Orange Pi 5 and its new MAX version.

In the Geekbench 6 scores you mentioned above, the CPU is shown as running at only 1.8 GHz, while the Orange Pi 5 specs list A76 cores @ 2.4 GHz and A55 cores @ 1.8 GHz.

Is it that Geekbench only uses the A55 cores for the multicore test?

@geerlingguy
Owner Author

@amrutprabhu - Geekbench is always a bit funky with CPU + frequency detection. Sometimes it thinks there's just 1 core with 4 threads for a 4 core Arm SoC, sometimes it just reports 1/1, sometimes it reports a lower frequency or higher... I wouldn't put much stock into the Geekbench results.

It's better to run something like sbc-bench to see actual frequencies on all cores. Honestly I was also testing this board with Armbian as the Orange Pi OS version was hard to find when I was testing... so that could have something to do with it too.
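As a quick cross-check before reaching for a full tool, the kernel's cpufreq sysfs files report the maximum frequency per core. This is only a sketch, not what sbc-bench does internally; the function name `cpu_max_freqs` is made up here, and it assumes the running kernel exposes `cpuinfo_max_freq` (reported in kHz) for each core.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print each core's maximum frequency as reported by cpufreq.
# $1 (optional) = sysfs CPU directory; defaults to the standard location.
cpu_max_freqs() {
  local root="${1:-/sys/devices/system/cpu}"
  local f cpu khz
  for f in "$root"/cpu[0-9]*/cpufreq/cpuinfo_max_freq; do
    [ -r "$f" ] || continue          # skip cores without cpufreq data
    cpu="${f#"$root"/}"              # e.g. cpu0/cpufreq/cpuinfo_max_freq
    cpu="${cpu%%/*}"                 # e.g. cpu0
    khz="$(cat "$f")"                # cpufreq reports kHz
    printf '%s %d MHz\n' "$cpu" "$((khz / 1000))"
  done
}

cpu_max_freqs
```

On a big.LITTLE SoC, the two core clusters should show up with different maximums, which is the detail a single reported frequency hides. This reads the advertised maximum only; measuring the frequency actually reached under load is what a benchmark tool is for.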

@amrutprabhu

Ohh, thanks for this. I haven't explored this tool yet.
Geekbench kind of made it simple to compare scores across devices.
