Results for ebay ThunderX super SBC #38

Closed
fifteenhex opened this issue Jan 21, 2022 · 7 comments

Comments

@fifteenhex

| bigboy | no cpufreq support | 5.16 | Sid arm64 | 102630 | 166930 | 511940 | 4180 | 17120 | - | http://ix.io/3N0h |

https://twitter.com/linux_chenxing/status/1449603003057532930

@ThomasKaiser
Owner

It would be nice if you could provide the output of lscpu -e and, using the latest 0.9.2 version, the first ~110 lines of the output of sbc-bench -m.

@fifteenhex
Author

daniel@bigboy:~$ lscpu -e
CPU NODE CLUSTER CORE ONLINE
  0    0       0    0    yes
  1    0       0    1    yes
  2    0       0    2    yes
  3    0       0    3    yes
  4    0       0    4    yes
  5    0       0    5    yes
  6    0       0    6    yes
  7    0       0    7    yes
  8    0       0    8    yes
  9    0       0    9    yes
 10    0       0   10    yes
 11    0       0   11    yes
 12    0       0   12    yes
 13    0       0   13    yes
 14    0       0   14    yes
 15    0       0   15    yes
 16    0       0   16    yes
 17    0       0   17    yes
 18    0       0   18    yes
 19    0       0   19    yes
 20    0       0   20    yes
 21    0       0   21    yes
 22    0       0   22    yes
 23    0       0   23    yes
 24    0       0   24    yes
 25    0       0   25    yes
 26    0       0   26    yes
 27    0       0   27    yes
 28    0       0   28    yes
 29    0       0   29    yes
 30    0       0   30    yes
 31    0       0   31    yes
 32    0       0   32    yes
 33    0       0   33    yes
 34    0       0   34    yes
 35    0       0   35    yes
 36    0       0   36    yes
 37    0       0   37    yes
 38    0       0   38    yes
 39    0       0   39    yes
 40    0       0   40    yes
 41    0       0   41    yes
 42    0       0   42    yes
 43    0       0   43    yes
 44    0       0   44    yes
 45    0       0   45    yes
 46    0       0   46    yes
 47    0       0   47    yes
 48    1       1   48    yes
 49    1       1   49    yes
 50    1       1   50    yes
 51    1       1   51    yes
 52    1       1   52    yes
 53    1       1   53    yes
 54    1       1   54    yes
 55    1       1   55    yes
 56    1       1   56    yes
 57    1       1   57    yes
 58    1       1   58    yes
 59    1       1   59    yes
 60    1       1   60    yes
 61    1       1   61    yes
 62    1       1   62    yes
 63    1       1   63    yes
 64    1       1   64    yes
 65    1       1   65    yes
 66    1       1   66    yes
 67    1       1   67    yes
 68    1       1   68    yes
 69    1       1   69    yes
 70    1       1   70    yes
 71    1       1   71    yes
 72    1       1   72    yes
 73    1       1   73    yes
 74    1       1   74    yes
 75    1       1   75    yes
 76    1       1   76    yes
 77    1       1   77    yes
 78    1       1   78    yes
 79    1       1   79    yes
 80    1       1   80    yes
 81    1       1   81    yes
 82    1       1   82    yes
 83    1       1   83    yes
 84    1       1   84    yes
 85    1       1   85    yes
 86    1       1   86    yes
 87    1       1   87    yes
 88    1       1   88    yes
 89    1       1   89    yes
 90    1       1   90    yes
 91    1       1   91    yes
 92    1       1   92    yes
 93    1       1   93    yes
 94    1       1   94    yes
 95    1       1   95    yes
daniel@bigboy:~$ 
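
The topology above (CPUs 0-47 on node 0, CPUs 48-95 on node 1) can be summarized with a few lines of Python. This is only a sketch: the helper name `cpus_per_node` is hypothetical, and it assumes the text starts with the `lscpu -e` header row and contains the NODE and ONLINE columns shown in the listing.

```python
from collections import Counter


def cpus_per_node(lscpu_text):
    """Count online CPUs per NUMA node from `lscpu -e` style text.

    Assumes the first non-empty line is the header row
    (e.g. "CPU NODE CLUSTER CORE ONLINE").
    """
    lines = [ln for ln in lscpu_text.splitlines() if ln.strip()]
    header = lines[0].split()
    node_col = header.index("NODE")
    online_col = header.index("ONLINE")

    counts = Counter()
    for ln in lines[1:]:
        fields = ln.split()
        if len(fields) != len(header):
            continue  # skip shell prompts or malformed rows
        if fields[online_col] == "yes":
            counts[int(fields[node_col])] += 1
    return dict(counts)


# Shortened sample in the same layout as the listing above.
sample = """CPU NODE CLUSTER CORE ONLINE
  0    0       0    0    yes
 47    0       0   47    yes
 48    1       1   48    yes
 95    1       1   95    yes
"""
print(cpus_per_node(sample))
```

Fed with the full listing, this should report 48 online CPUs for each of the two nodes.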

@fifteenhex
Author

daniel@bigboy:~/sbcbench$ ./sbc-bench.sh -m
2 x ThunderX CN8890, Kernel: aarch64, Userland: arm64
CPU sysfs topology (clusters, cpufreq members, clockspeeds)
                 cpufreq   min    max
 CPU    cluster  policy   speed  speed   core type
  0        0        -       -       -    Cavium ThunderX 88XX / r1p1
  1        0        -       -       -    Cavium ThunderX 88XX / r1p1
  2        0        -       -       -    Cavium ThunderX 88XX / r1p1
  3        0        -       -       -    Cavium ThunderX 88XX / r1p1
  4        0        -       -       -    Cavium ThunderX 88XX / r1p1
  5        0        -       -       -    Cavium ThunderX 88XX / r1p1
  6        0        -       -       -    Cavium ThunderX 88XX / r1p1
  7        0        -       -       -    Cavium ThunderX 88XX / r1p1
  8        0        -       -       -    Cavium ThunderX 88XX / r1p1
  9        0        -       -       -    Cavium ThunderX 88XX / r1p1
 10        0        -       -       -    Cavium ThunderX 88XX / r1p1
 11        0        -       -       -    Cavium ThunderX 88XX / r1p1
 12        0        -       -       -    Cavium ThunderX 88XX / r1p1
 13        0        -       -       -    Cavium ThunderX 88XX / r1p1
 14        0        -       -       -    Cavium ThunderX 88XX / r1p1
 15        0        -       -       -    Cavium ThunderX 88XX / r1p1
 16        0        -       -       -    Cavium ThunderX 88XX / r1p1
 17        0        -       -       -    Cavium ThunderX 88XX / r1p1
 18        0        -       -       -    Cavium ThunderX 88XX / r1p1
 19        0        -       -       -    Cavium ThunderX 88XX / r1p1
 20        0        -       -       -    Cavium ThunderX 88XX / r1p1
 21        0        -       -       -    Cavium ThunderX 88XX / r1p1
 22        0        -       -       -    Cavium ThunderX 88XX / r1p1
 23        0        -       -       -    Cavium ThunderX 88XX / r1p1
 24        0        -       -       -    Cavium ThunderX 88XX / r1p1
 25        0        -       -       -    Cavium ThunderX 88XX / r1p1
 26        0        -       -       -    Cavium ThunderX 88XX / r1p1
 27        0        -       -       -    Cavium ThunderX 88XX / r1p1
 28        0        -       -       -    Cavium ThunderX 88XX / r1p1
 29        0        -       -       -    Cavium ThunderX 88XX / r1p1
 30        0        -       -       -    Cavium ThunderX 88XX / r1p1
 31        0        -       -       -    Cavium ThunderX 88XX / r1p1
 32        0        -       -       -    Cavium ThunderX 88XX / r1p1
 33        0        -       -       -    Cavium ThunderX 88XX / r1p1
 34        0        -       -       -    Cavium ThunderX 88XX / r1p1
 35        0        -       -       -    Cavium ThunderX 88XX / r1p1
 36        0        -       -       -    Cavium ThunderX 88XX / r1p1
 37        0        -       -       -    Cavium ThunderX 88XX / r1p1
 38        0        -       -       -    Cavium ThunderX 88XX / r1p1
 39        0        -       -       -    Cavium ThunderX 88XX / r1p1
 40        0        -       -       -    Cavium ThunderX 88XX / r1p1
 41        0        -       -       -    Cavium ThunderX 88XX / r1p1
 42        0        -       -       -    Cavium ThunderX 88XX / r1p1
 43        0        -       -       -    Cavium ThunderX 88XX / r1p1
 44        0        -       -       -    Cavium ThunderX 88XX / r1p1
 45        0        -       -       -    Cavium ThunderX 88XX / r1p1
 46        0        -       -       -    Cavium ThunderX 88XX / r1p1
 47        0        -       -       -    Cavium ThunderX 88XX / r1p1
 48        1        -       -       -    Cavium ThunderX 88XX / r1p1
 49        1        -       -       -    Cavium ThunderX 88XX / r1p1
 50        1        -       -       -    Cavium ThunderX 88XX / r1p1
 51        1        -       -       -    Cavium ThunderX 88XX / r1p1
 52        1        -       -       -    Cavium ThunderX 88XX / r1p1
 53        1        -       -       -    Cavium ThunderX 88XX / r1p1
 54        1        -       -       -    Cavium ThunderX 88XX / r1p1
 55        1        -       -       -    Cavium ThunderX 88XX / r1p1
 56        1        -       -       -    Cavium ThunderX 88XX / r1p1
 57        1        -       -       -    Cavium ThunderX 88XX / r1p1
 58        1        -       -       -    Cavium ThunderX 88XX / r1p1
 59        1        -       -       -    Cavium ThunderX 88XX / r1p1
 60        1        -       -       -    Cavium ThunderX 88XX / r1p1
 61        1        -       -       -    Cavium ThunderX 88XX / r1p1
 62        1        -       -       -    Cavium ThunderX 88XX / r1p1
 63        1        -       -       -    Cavium ThunderX 88XX / r1p1
 64        1        -       -       -    Cavium ThunderX 88XX / r1p1
 65        1        -       -       -    Cavium ThunderX 88XX / r1p1
 66        1        -       -       -    Cavium ThunderX 88XX / r1p1
 67        1        -       -       -    Cavium ThunderX 88XX / r1p1
 68        1        -       -       -    Cavium ThunderX 88XX / r1p1
 69        1        -       -       -    Cavium ThunderX 88XX / r1p1
 70        1        -       -       -    Cavium ThunderX 88XX / r1p1
 71        1        -       -       -    Cavium ThunderX 88XX / r1p1
 72        1        -       -       -    Cavium ThunderX 88XX / r1p1
 73        1        -       -       -    Cavium ThunderX 88XX / r1p1
 74        1        -       -       -    Cavium ThunderX 88XX / r1p1
 75        1        -       -       -    Cavium ThunderX 88XX / r1p1
 76        1        -       -       -    Cavium ThunderX 88XX / r1p1
 77        1        -       -       -    Cavium ThunderX 88XX / r1p1
 78        1        -       -       -    Cavium ThunderX 88XX / r1p1
 79        1        -       -       -    Cavium ThunderX 88XX / r1p1
 80        1        -       -       -    Cavium ThunderX 88XX / r1p1
 81        1        -       -       -    Cavium ThunderX 88XX / r1p1
 82        1        -       -       -    Cavium ThunderX 88XX / r1p1
 83        1        -       -       -    Cavium ThunderX 88XX / r1p1
 84        1        -       -       -    Cavium ThunderX 88XX / r1p1
 85        1        -       -       -    Cavium ThunderX 88XX / r1p1
 86        1        -       -       -    Cavium ThunderX 88XX / r1p1
 87        1        -       -       -    Cavium ThunderX 88XX / r1p1
 88        1        -       -       -    Cavium ThunderX 88XX / r1p1
 89        1        -       -       -    Cavium ThunderX 88XX / r1p1
 90        1        -       -       -    Cavium ThunderX 88XX / r1p1
 91        1        -       -       -    Cavium ThunderX 88XX / r1p1
 92        1        -       -       -    Cavium ThunderX 88XX / r1p1
 93        1        -       -       -    Cavium ThunderX 88XX / r1p1
 94        1        -       -       -    Cavium ThunderX 88XX / r1p1
 95        1        -       -       -    Cavium ThunderX 88XX / r1p1

Time      CPU n/a    load %cpu %sys %usr %nice %io %irq   Temp
01:21:42:   ---      0.90   0%   0%   0%   0%   0%   0%     0°C
01:21:47:   ---      0.83   0%   0%   0%   0%   0%   0%     0°C
01:21:52:   ---      0.76   0%   0%   0%   0%   0%   0%     0°C
01:21:57:   ---      0.70   0%   0%   0%   0%   0%   0%     0°C
01:22:02:   ---      0.72   0%   0%   0%   0%   0%   0%     0°C
01:22:07:   ---      0.75   0%   0%   0%   0%   0%   0%     0°C
01:22:12:   ---      0.69   0%   0%   0%   0%   0%   0%     0°C
01:22:17:   ---      0.71   0%   0%   0%   0%   0%   0%     0°C

@ThomasKaiser
Owner

ThomasKaiser commented Jan 21, 2022

Thank you! With the latest commit, cluster/NUMA handling should now work, and the script should stop logging like crazy during the multithreaded 7-zip benchmark on the 96-core machine. Please give it a try, preferably with cpuminer scores included: sbc-bench -c :)

@fifteenhex
Author

| bigboy | no cpufreq support | 5.16 | Sid arm64 | 107180 | 110340 | 340750 | 4180 | 17130 | - | http://ix.io/3N5c |

@ThomasKaiser
Owner

ThomasKaiser commented Jan 22, 2022

Interestingly the ThunderX in the 2nd socket performs slightly better.

Let's do another comparison just for fun, since both systems achieve the same total 7-Zip MIPS score (which is kind of irrelevant, though: the basic assumption that a host with twice the 7-Zip MIPS performs twice as fast with 'server stuff in general' is severely flawed for designs this wide that differ so much in memory and per-core performance).

That's your king-sized SBC with 2 x ThunderX CN8890 / DDR4-2133:

RAM size:   32039 MB,  # CPU hardware threads:  96
RAM usage:  21180 MB,  # Benchmark threads:     96

                       Compressing  |                  Decompressing
Dict     Speed Usage    R/U Rating  |      Speed Usage    R/U Rating
         KiB/s     %   MIPS   MIPS  |      KiB/s     %   MIPS   MIPS

22:      51725  8686    580  50319  |    1905804  8725   1864 162515
23:      52447  8624    620  53438  |    1862970  8900   1812 161198
24:      54708  8691    677  58822  |    1851752  8976   1812 162521
25:      51843  8769    675  59193  |    1912189  9147   1861 170156
----------------------------------  | ------------------------------
Avr:            8692    638  55443  |             8937   1837 164097
Tot:            8815   1238 109770

Now compare that with an Amazon m6g.8xlarge instance based on Graviton2 (Neoverse-N1), limited to 32 N1 cores clocked at 2.5GHz, with some virtualization overhead:

RAM size:  126038 MB,  # CPU hardware threads:  32
RAM usage:   7060 MB,  # Benchmark threads:     32

                       Compressing  |                  Decompressing
Dict     Speed Usage    R/U Rating  |      Speed Usage    R/U Rating
         KiB/s     %   MIPS   MIPS  |      KiB/s     %   MIPS   MIPS

22:     110223  2852   3760 107226  |    1299248  3183   3481 110799
23:     102714  2800   3738 104654  |    1274412  3181   3466 110276
24:     101936  2878   3808 109603  |    1248819  3182   3445 109613
25:      99609  2907   3912 113730  |    1210036  3154   3414 107685
----------------------------------  | ------------------------------
Avr:            2859   3805 108803  |             3175   3452 109593
Tot:            3017   3628 109198

The N1 setup with 1/3 of the cores is twice as fast when compressing but achieves 'only' 2/3 of the decompression rating of the dual CN8890 setup (the latter sucking here a bit: CPU utilization is only 93% -> 8937/9600).
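
The ratios quoted above can be reproduced from the 'Avr' rows of the two 7-Zip tables. The following is a minimal sketch; the dictionary layout and the `total_rating` helper are illustrative only, and the numbers are copied from the tables.

```python
# Average 7-Zip ratings (MIPS) and decompression CPU usage (%),
# copied from the "Avr" rows of the two result tables.
thunderx = {"threads": 96, "compress": 55443, "decompress": 164097,
            "usage_decompress": 8937}
graviton2 = {"threads": 32, "compress": 108803, "decompress": 109593,
             "usage_decompress": 3175}


def total_rating(result):
    # The "Tot" rating in both tables matches the mean of the
    # compressing and decompressing ratings.
    return (result["compress"] + result["decompress"]) / 2


compress_ratio = graviton2["compress"] / thunderx["compress"]
decompress_ratio = graviton2["decompress"] / thunderx["decompress"]

# 96 threads at 100% each would give a usage figure of 9600.
utilization = thunderx["usage_decompress"] / (thunderx["threads"] * 100)

print(f"Total rating ThunderX:  {total_rating(thunderx):.0f}")
print(f"Total rating Graviton2: {total_rating(graviton2):.0f}")
print(f"Compression ratio (N1 / CN8890):   {compress_ratio:.2f}")
print(f"Decompression ratio (N1 / CN8890): {decompress_ratio:.2f}")
print(f"ThunderX decompression utilization: {utilization:.1%}")
```

The near-identical totals hide very different profiles: roughly 2x in favour of the N1 cores when compressing and roughly 2/3 when decompressing.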

The Amazon setup really benefits from faster memory access and larger caches:

Tinymembench results for the ThunderX CN8890:

 standard memcpy                                      :   4184.6 MB/s
 standard memset                                      :  17134.4 MB/s

block size : single random read / dual random read, [MADV_NOHUGEPAGE]
      1024 :    0.0 ns          /     0.0 ns 
      2048 :    0.0 ns          /     0.0 ns 
      4096 :    0.0 ns          /     0.0 ns 
      8192 :    0.0 ns          /     0.0 ns 
     16384 :    0.0 ns          /     0.0 ns 
     32768 :    0.0 ns          /     0.0 ns 
     65536 :    9.5 ns          /    19.5 ns 
    131072 :   14.3 ns          /    29.3 ns 
    262144 :   20.9 ns          /    42.6 ns 
    524288 :   24.3 ns          /    49.3 ns 
   1048576 :   26.0 ns          /    52.9 ns 
   2097152 :   39.4 ns          /    79.6 ns 
   4194304 :   46.2 ns          /    92.9 ns 
   8388608 :   49.3 ns          /    99.6 ns 
  16777216 :   57.8 ns          /   116.4 ns 
  33554432 :   89.9 ns          /   179.6 ns 
  67108864 :  108.2 ns          /   217.0 ns 

block size : single random read / dual random read, [MADV_HUGEPAGE]
      1024 :    0.0 ns          /     0.0 ns 
      2048 :    0.0 ns          /     0.0 ns 
      4096 :    0.0 ns          /     0.0 ns 
      8192 :    0.0 ns          /     0.0 ns 
     16384 :    0.0 ns          /     0.0 ns 
     32768 :    0.0 ns          /     0.0 ns 
     65536 :    9.5 ns          /    19.5 ns 
    131072 :   14.3 ns          /    29.3 ns 
    262144 :   16.7 ns          /    34.2 ns 
    524288 :   17.9 ns          /    36.6 ns 
   1048576 :   18.5 ns          /    37.9 ns 
   2097152 :   18.7 ns          /    38.5 ns 
   4194304 :   18.9 ns          /    38.8 ns 
   8388608 :   19.4 ns          /    38.9 ns 
  16777216 :   19.3 ns          /    39.5 ns 
  33554432 :   56.1 ns          /   112.7 ns 
  67108864 :   74.1 ns          /   148.8 ns 

Amazon m6g.8xlarge:

 standard memcpy                                      :  16146.8 MB/s
 standard memset                                      :  39948.4 MB/s

block size : single random read / dual random read, [MADV_NOHUGEPAGE]
      1024 :    0.0 ns          /     0.0 ns 
      2048 :    0.0 ns          /     0.0 ns 
      4096 :    0.0 ns          /     0.0 ns 
      8192 :    0.0 ns          /     0.0 ns 
     16384 :    0.0 ns          /     0.0 ns 
     32768 :    0.0 ns          /     0.0 ns 
     65536 :    0.0 ns          /     0.0 ns 
    131072 :    1.4 ns          /     2.0 ns 
    262144 :    2.1 ns          /     2.6 ns 
    524288 :    3.7 ns          /     4.4 ns 
   1048576 :    5.8 ns          /     8.1 ns 
   2097152 :   19.0 ns          /    27.1 ns 
   4194304 :   26.4 ns          /    33.7 ns 
   8388608 :   30.5 ns          /    36.2 ns 
  16777216 :   38.1 ns          /    44.2 ns 
  33554432 :   49.7 ns          /    60.0 ns 
  67108864 :   80.9 ns          /   102.2 ns 

block size : single random read / dual random read, [MADV_HUGEPAGE]
      1024 :    0.0 ns          /     0.0 ns 
      2048 :    0.0 ns          /     0.0 ns 
      4096 :    0.0 ns          /     0.0 ns 
      8192 :    0.0 ns          /     0.0 ns 
     16384 :    0.0 ns          /     0.0 ns 
     32768 :    0.0 ns          /     0.0 ns 
     65536 :    0.0 ns          /     0.0 ns 
    131072 :    1.4 ns          /     2.0 ns 
    262144 :    2.2 ns          /     2.6 ns 
    524288 :    2.5 ns          /     2.8 ns 
   1048576 :    3.1 ns          /     3.4 ns 
   2097152 :   17.5 ns          /    25.5 ns 
   4194304 :   24.8 ns          /    32.0 ns 
   8388608 :   28.4 ns          /    34.1 ns 
  16777216 :   30.2 ns          /    34.8 ns 
  33554432 :   31.3 ns          /    35.5 ns 
  67108864 :   67.1 ns          /    87.2 ns 
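
To put the bandwidth gap between the two tinymembench runs into numbers, here is a small sketch using the memcpy/memset figures from the listings above. The variable names are illustrative only.

```python
# Bandwidth figures in MB/s, copied from the tinymembench output above.
memcpy = {"thunderx": 4184.6, "m6g": 16146.8}
memset = {"thunderx": 17134.4, "m6g": 39948.4}

# How many times faster the m6g.8xlarge instance is than the CN8890.
memcpy_ratio = memcpy["m6g"] / memcpy["thunderx"]
memset_ratio = memset["m6g"] / memset["thunderx"]

print(f"memcpy: m6g.8xlarge is {memcpy_ratio:.2f}x faster")
print(f"memset: m6g.8xlarge is {memset_ratio:.2f}x faster")
```

These are single-threaded measurements, so they describe per-core memory access rather than the aggregate bandwidth of either machine.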

@fifteenhex
Author

Interesting. Memory performance might increase a bit in a week or so when I have modules to populate the two remaining channels for each socket.
