Benchmarks with more than one thread on an Apple M1

Maybe I was too stupid, or just not that involved in chess engine testing, but only after a little digging in the Stockfish 13 source code did I figure out how to run the benchmark with more than one thread:

benchmark.cpp:

/// setup_bench() builds a list of UCI commands to be run by bench. There
/// are five parameters: TT size in MB, number of search threads that
/// should be used, the limit value spent for each position, a file name
/// where to look for positions in FEN format, the type of the limit:
/// depth, perft, nodes and movetime (in millisecs), and evaluation type
/// mixed (default), classical, NNUE.
///
/// bench -> search default positions up to depth 13
/// bench 64 1 15 -> search default positions up to depth 15 (TT = 64MB)
/// bench 64 4 5000 current movetime -> search current position with 4 threads for 5 sec
/// bench 64 1 100000 default nodes -> search default positions for 100K nodes each
/// bench 16 1 5 default perft -> run a perft 5 on default positions

The relevant form for a multi-threaded run is therefore:

bench 64 [number of threads]
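
As a small illustration of that argument order, here is a sketch that prints the bench command for several thread counts. The helper name `bench_command` and the binary path `./stockfish` are assumptions made for this example, not part of Stockfish; the 64 MB hash size is taken from the examples in the source comment.

```shell
# Hypothetical helper: print the bench command line for one engine and thread count.
# Argument order follows the setup_bench() comment: TT size in MB first,
# then the number of threads; all remaining parameters keep their defaults.
bench_command() {
    engine=$1
    threads=$2
    hash_mb=${3:-64}   # 64 MB unless a third argument is given
    printf '%s bench %s %s\n' "$engine" "$hash_mb" "$threads"
}

# Print (not run) the commands for 1, 2, 4 and 8 threads.
# "./stockfish" is an assumed path to the engine binary.
for t in 1 2 4 8; do
    bench_command ./stockfish "$t"
done
```

Printing the command rather than running it makes it easy to check the argument order before starting a long benchmark.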

Update: A smart guy in the TalkChess forum gave me some advice, so I changed the settings. Suppressing the output and repeating the bench a few times should give more reliable results. I also deleted the rows for engines where I wasn't sure the benchmarks are comparable. This test is only meant to show roughly where the M1 sits in the hardware ranking, but my first try was probably too sloppy.

In addition I had some fun tonight digging up long-forgotten knowledge. I built a small script that runs each bench ten times and calculates the average.

#!/bin/zsh

########################################################
# usage: bench.sh [engine] [number of threads]         #
# for the Honey family:                                #
# bench.sh [engine] [number of threads] 13 [true|false]#
########################################################

# start from a clean temporary file
test -e bench.tmp && rm bench.tmp

# make sure the output directory exists
mkdir -p benchmarks

i=0

# run the bench ten times: discard stdout, collect stderr in bench.tmp
while [ $i -lt 10 ]; do
    $1  bench 64 $2 $3 $4  1>/dev/null 2>>bench.tmp
    let i=i+1
done

# remove the results of a previous run
test -e benchmarks/$2-threads-$1.txt && rm benchmarks/$2-threads-$1.txt

# keep only the last field of each line containing "second" (the nodes per second)
grep 'second' bench.tmp | sed -e 's/.*[ \t]//'>>benchmarks/$2-threads-$1.txt

# add up the ten values and print the integer average
sum=`cat benchmarks/$2-threads-$1.txt | awk '{sum+=$1} END{print sum}'`

avg=$((sum / 10))
echo $avg

rm bench.tmp
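
The averaging step can also be sketched on its own, without hard-coding the number of runs. This is only an illustration: the function name `average_nps` is made up, and it assumes summary lines of the form `Nodes/second    : 2320252`, which is what the `grep 'second'` in the script relies on. The sample values below are invented, not measurements.

```shell
# Average the nodes-per-second values found on stdin.
# Assumes each summary line contains "second" and ends with the number.
average_nps() {
    awk '/second/ { sum += $NF; n++ }
         END { if (n > 0) printf "%d\n", sum / n; else print "no samples" }'
}

# Example with three made-up values (not measured results): prints 210
printf 'Nodes/second    : 100\nNodes/second    : 200\nNodes/second    : 330\n' | average_nps
```

Dividing by the number of lines actually found, rather than by a fixed 10, keeps the average correct if one of the runs produced no summary line.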

 

So here you go, the numbers are the nodes per second during the benchmark test:

Engine                             1 thread   2 threads   4 threads   8 threads
amoeba-3.2                        2,850,225   5,569,507  10,999,492  14,946,973
Black-Diamond-v13 (classical)     3,002,359   6,147,954  12,277,000  17,303,000
Black-Diamond-v13 (nnue)          1,962,538   4,108,865   8,159,830  11,206,000
cfish-12                          2,399,213   4,766,505   9,387,348  12,803,333
cfish-17022021                    2,715,285   5,446,693  10,684,916  15,042,888
corchess-nnue-1.3                 2,327,636   4,630,767   9,150,978  12,950,410
crystal-3.1                       2,149,787   4,630,295   9,382,968  13,315,242
Honey-v13 (classical)             2,024,407
Honey-v13 (nnue)                  1,317,226
Oki-Maguro (classical)            3,205,223   6,622,363  13,317,000  18,754,000
Oki-Maguro (nnue)                 2,189,189   4,505,202   9,075,129  13,401,000
RubiChess-2.1-dev (classical)     4,801,175   7,664,483  13,296,037  19,547,936
RubiChess-2.1-dev (nnue)          2,140,918   3,628,272   6,797,738   9,257,131
stockfish-12                      2,252,653   4,517,488   8,924,218  12,138,502
stockfish-13                      2,320,252   4,557,075   9,414,908  13,183,130
stockfish-13-osx*                 1,464,662   2,978,481   6,003,444   8,495,631
sugar-AI-1.50                     2,290,729   4,573,459   9,050,257  12,646,313
sugar-AI-ICCF-140a                2,149,396   4,424,464   9,032,318  13,010,932

*x86_64 binaries
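
One way to read these numbers is the scaling factor between two thread counts, that is, the nodes per second at 8 threads divided by the value at 1 thread. A minimal sketch, with a made-up function name `speedup`; the figures in the example are the stockfish-13 row from the table.

```shell
# Print the ratio of two nodes-per-second values with two decimals.
# Arguments are plain integers without thousands separators.
speedup() {
    awk -v base="$1" -v scaled="$2" 'BEGIN { printf "%.2f\n", scaled / base }'
}

# stockfish-13, 1 thread vs. 8 threads (values from the table): prints 5.68
speedup 2320252 13183130
```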

For some engines I haven't yet figured out how to run a multi-threaded benchmark.
