Maybe I was too stupid, or just not that involved in chess engine testing, but only after a little digging in the Stockfish 13 source code did I figure out how to run the benchmark with more than one thread:
benchmark.cpp:

    /// setup_bench() builds a list of UCI commands to be run by bench. There
    /// are five parameters: TT size in MB, number of search threads that
    /// should be used, the limit value spent for each position, a file name
    /// where to look for positions in FEN format, the type of the limit:
    /// depth, perft, nodes and movetime (in millisecs), and evaluation type
    /// mixed (default), classical, NNUE.
    ///
    /// bench -> search default positions up to depth 13
    /// bench 64 1 15 -> search default positions up to depth 15 (TT = 64MB)
    /// bench 64 4 5000 current movetime -> search current position with 4 threads for 5 sec
    /// bench 64 1 100000 default nodes -> search default positions for 100K nodes each
    /// bench 16 1 5 default perft -> run a perft 5 on default positions

So the command to use is:
bench 64 [Number of threads]
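Going by the comment block above, the first argument is the TT size in MB and the second is the thread count; leaving out the limit should fall back to the default depth of 13. Here is a rough sketch of how the command lines for a thread sweep could be put together. `ENGINE` and `bench_args` are names I made up for the illustration, and `./stockfish` is only a placeholder path.

```shell
#!/bin/sh
# Sketch only: print one bench command line per thread count.
# ENGINE is a placeholder; point it at your own engine binary.
ENGINE=${ENGINE:-./stockfish}

# bench_args THREADS
# Builds the argument list "bench <TT size in MB> <threads>" with TT fixed at 64.
bench_args() {
    printf 'bench 64 %s' "$1"
}

for threads in 1 2 4 8; do
    # Only prints the command; remove the echo to actually run the engine.
    echo "$ENGINE $(bench_args "$threads")"
done
```

With the `echo` in place nothing is executed, so it is safe to try before pointing it at a real binary.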
Update: A smart guy on the talkchess forum gave me some advice, so I changed the settings. Suppressing the output and repeating the bench a few times should give more reliable results. I also deleted the rows for engines where I wasn’t sure the benchmarks are comparable. This test is only meant to show roughly where the M1 sits in the hardware ranking, but my first try was probably too sloppy.
In addition I had some fun tonight digging up long-forgotten knowledge. I built a small script that runs each bench ten times and calculates the average.
    #!/bin/zsh

    ########################################################
    # usage: bench.sh [engine] [number of threads]         #
    # for the Honey family:                                #
    # bench.sh [engine] [number of threads] 13 [true|false]#
    ########################################################

    runs=10
    result="benchmarks/$2-threads-$1.txt"

    # Make sure the output directory exists and start from clean files.
    mkdir -p benchmarks
    rm -f bench.tmp "$result"

    # The bench summary arrives on stderr and is collected in bench.tmp;
    # the search output on stdout is discarded.
    i=0
    while [ $i -lt $runs ]; do
      "$1" bench 64 $2 $3 $4 1>/dev/null 2>>bench.tmp
      let i=i+1
    done

    # Keep only the number at the end of each line containing "second".
    grep 'second' bench.tmp | sed -e 's/.*[ \t]//' > "$result"

    sum=$(awk '{sum+=$1} END{print sum}' "$result")

    avg=$((sum / runs))
    echo $avg

    rm bench.tmp
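The grep, sed and awk steps of the script above can also be folded into a single awk pass. This is only a sketch: `average_nps` is a name I made up, the sample lines are invented rather than real measurements, and I am assuming the summary line looks like `Nodes/second : <number>`, as Stockfish prints it.

```shell
#!/bin/sh
# Sketch only: average the last field of every line containing "second".
# Unlike the fixed divisor in the script, this divides by the number of
# matching lines actually found.
average_nps() {
    awk '/second/ { sum += $NF; n++ } END { if (n) printf "%d\n", sum / n }'
}

# Invented sample input in the assumed format, not real benchmark output.
printf '%s\n' \
    'Total time (ms) : 1000' \
    'Nodes/second    : 2000000' \
    'Nodes/second    : 2000100' | average_nps
```

For the invented sample this prints 2000050. Dividing by the number of matching lines rather than by a fixed 10 means a run that crashed does not silently drag the average down.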
So here you go, the numbers are the nodes per second during the benchmark test:
Engine | 1 thread | 2 threads | 4 threads | 8 threads |
---|---|---|---|---|
amoeba-3.2 | 2,850,225 | 5,569,507 | 10,999,492 | 14,946,973 |
Black-Diamond-v13 (classical) | 3,002,359 | 6,147,954 | 12,277,000 | 17,303,000 |
Black-Diamond-v13 (nnue) | 1,962,538 | 4,108,865 | 8,159,830 | 11,206,000 |
cfish-12 | 2,399,213 | 4,766,505 | 9,387,348 | 12,803,333 |
cfish-17022021 | 2,715,285 | 5,446,693 | 10,684,916 | 15,042,888 |
corchess-nnue-1.3 | 2,327,636 | 4,630,767 | 9,150,978 | 12,950,410 |
crystal-3.1 | 2,149,787 | 4,630,295 | 9,382,968 | 13,315,242 |
Honey-v13 (classical) | 2,024,407 | | | |
Honey-v13 (nnue) | 1,317,226 | | | |
Oki-Maguro (classical) | 3,205,223 | 6,622,363 | 13,317,000 | 18,754,000 |
Oki-Maguro (nnue) | 2,189,189 | 4,505,202 | 9,075,129 | 13,401,000 |
RubiChess-2.1-dev (classical) | 4,801,175 | 7,664,483 | 13,296,037 | 19,547,936 |
RubiChess-2.1-dev (nnue) | 2,140,918 | 3,628,272 | 6,797,738 | 9,257,131 |
stockfish-12 | 2,252,653 | 4,517,488 | 8,924,218 | 12,138,502 |
stockfish-13 | 2,320,252 | 4,557,075 | 9,414,908 | 13,183,130 |
stockfish-13-osx* | 1,464,662 | 2,978,481 | 6,003,444 | 8,495,631 |
sugar-AI-1.50 | 2,290,729 | 4,573,459 | 9,050,257 | 12,646,313 |
sugar-AI-ICCF-140a | 2,149,396 | 4,424,464 | 9,032,318 | 13,010,932 |
*x86_64 binaries
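To read the table in terms of scaling, the multi-thread number can be divided by the single-thread number. This is a small sketch of that arithmetic, using the stockfish-13 row as input; `speedup` is a helper name I made up, and "per-thread efficiency" here just means the speedup divided by the thread count.

```shell
#!/bin/sh
# Sketch only: speedup BASE_NPS MULTI_NPS THREADS
# Prints how much faster the multi-thread run is than the 1-thread run,
# and that factor as a percentage of perfect linear scaling.
speedup() {
    awk -v base="$1" -v multi="$2" -v threads="$3" 'BEGIN {
        s = multi / base
        printf "%.2fx speedup, %.0f%% per-thread efficiency\n", s, 100 * s / threads
    }'
}

# stockfish-13 row from the table: 1 thread vs. 8 threads
speedup 2320252 13183130 8
```

For the stockfish-13 row this prints `5.68x speedup, 71% per-thread efficiency`.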
For some engines I haven’t yet figured out how to run a multi-threaded benchmark.