Why is it important to run your own evaluation?
Every test I found has at least one of these problems:
- Specific hardware, which I do not have.
- No hardware specified at all (the Chess.com competition: what hardware do they use?)
- Engines I can't get, so why should I care about them?
- Time controls that are too short or too long
It is fine if an engine does well on some powerful PC or CPU, but I do not have one. I need the best engine for my home hardware, which in my case is a Mac Mini M2.
So I started my own tournaments. I collected more than 80 engines that run on my Mac and played many tournaments to rank them all. Finally I created a Super League, a 1st League and a 2nd League, and then ran a tournament for the Super League, the most powerful engines I have.
Tournament format:
- Round robin, 4 games per pair with sides swapped
- 5 minutes per game + 3 seconds per move
- Openings used
- Up to 2 CPU cores for each engine
- 256 MB hash size
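As a quick sanity check on this format, here is a minimal Python sketch of how many games a round robin with 4 games per pair produces. The function name `schedule_size` is only illustrative, and the field of 15 engines is taken from the Super League results.

```python
from math import comb

def schedule_size(num_engines: int, games_per_pair: int) -> tuple[int, int]:
    """Return (games per engine, total games) for a round robin."""
    # Every engine meets each of the other engines games_per_pair times.
    games_per_engine = (num_engines - 1) * games_per_pair
    # Every unordered pair of engines plays games_per_pair games.
    total_games = comb(num_engines, 2) * games_per_pair
    return games_per_engine, total_games

per_engine, total = schedule_size(num_engines=15, games_per_pair=4)
print(per_engine, total)  # 56 420
```

With 15 engines this gives 56 games per engine, which matches the "games" column of the cross table.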
My tournament results:
Cross table:
---------------------------------------------------------------------------------------------------------------------
  # name                      score games  1    2    3    4    5    6    7    8    9    10   11   12   13   14   15
 1. Stockfish 16               42.0    56  x    ==== ==== 1=== ==1= ==11 1=== 1=11 =1=1 11== 1=1= 11=1 1111 1111 11=1
 2. Stockfish 15.1             41.5    56  ==== x    ==== ==== =1== 1=1= ==== ===1 1=1= =111 1=1= 1111 1111 1111 1111
 3. Fat Fritz 2                40.0    56  ==== ==== x    ==== =1== ==== 1=== 1111 1=1= =111 =1=1 1=1= 1=11 =111 11=1
 4. Berserk 11.1               36.5    56  0=== ==== ==== x    ==== ==== 1=== =111 1=1= =1=1 ==== =1=1 1=1= 1=1= 1111
 5. RubiChess 20230407 (neon)  35.5    56  ==0= =0== =0== ==== x    ==== =1== =1=1 ==1= ==1= 01=1 1111 11== 1=1= 1111
 6. Koivisto 9.0               35.0    56  ==00 0=0= ==== ==== ==== x    0=== =11= ==11 ==1= 1=11 1111 1=1= 11=1 1=1=
 7. Lc0 v0.29.0+git.dirty      32.0    56  0=== ==== 0=== 0=== =0== 1=== x    =1=0 1=== ===1 ==1= ==== 1=1= =111 11=1
 8. Halogen 11                 26.0    56  0=00 ===0 0000 =000 =0=0 =00= =0=1 x    ==1= ==== =1=1 11=1 ==11 11== ===1
 9. Viridithas 8.1.0           24.5    56  =0=0 0=0= 0=0= 0=0= ==0= ==00 0=== ==0= x    ==1= 1=0= 1=0= =1== 1=11 ===1
10. Komodo 14                  23.5    56  00== =000 =000 =0=0 ==0= ==0= ===0 ==== ==0= x    1=== =1== ==1= 1=== ==1=
11. HIARCS 15.2                23.0    56  0=0= 0=0= =0=0 ==== 10=0 0=00 ==0= =0=0 0=1= 0=== x    1=== 1=== ===1 ==1=
12. Winter 2.0 SSE4.2          17.5    56  00=0 0000 0=0= =0=0 0000 0000 ==== 00=0 0=1= =0== 0=== x    1=== 1=10 ==1=
13. Stash v34.0                17.0    56  0000 0000 0=00 0=0= 00== 0=0= 0=0= ==00 =0== ==0= 0=== 0=== x    ==== 11=1
14. pawn 1.0                   14.5    56  0000 0000 =000 0=0= 0=0= 00=0 =000 00== 0=00 0=== ===0 0=01 ==== x    ==11
15. Counter 5.0                11.5    56  00=0 0000 00=0 0000 0000 0=0= 00=0 ===0 ===0 ==0= ==0= ==0= 00=0 ==00 x
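To read the cross table: each cell holds the 4 game results against one opponent, seen from the row engine's side. The sketch below assumes the usual notation (`1` = win, `=` = draw, `0` = loss, `x` = the engine itself) and adds up one row; the helper names are only illustrative.

```python
# Assumed notation: 1 = win, = is a draw, 0 = loss (from the row engine's view).
POINTS = {"1": 1.0, "=": 0.5, "0": 0.0}

def cell_score(cell: str) -> float:
    """Points scored in one mini-match, e.g. '1=11' -> 3.5."""
    return sum(POINTS[ch] for ch in cell)

def row_score(cells: list[str]) -> float:
    """Total score of one cross-table row, skipping the 'x' self cell."""
    return sum(cell_score(c) for c in cells if c != "x")

# Row of Stockfish 16, copied from the cross table.
stockfish_16 = ("x ==== ==== 1=== ==1= ==11 1=== 1=11 "
                "=1=1 11== 1=1= 11=1 1111 1111 11=1").split()
print(row_score(stockfish_16))  # 42.0
```

The result agrees with the 42.0 in the "score" column, which supports this reading of the symbols.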
Technical statistics (nodes, depth and time/m are averages per move, the other columns are per game), counted for computed moves only, ignoring moves with zero nodes:
  # name                      nodes/m     NPS depth/m time/m moves  time #fails
 1. Stockfish 16               12229K 1969813    49.7    6.2  81.4 505.2
 2. Stockfish 15.1             22178K 3550669    50.7    6.2  83.3 520.3
 3. Fat Fritz 2                30943K 5099135    49.2    6.1  84.6 513.2
 4. Berserk 11.1               34492K 6305783    36.3    5.5 104.6 572.2
 5. RubiChess 20230407 (neon)  35980K 5427455    36.4    6.6  74.9 496.8
 6. Koivisto 9.0               33770K 5789046    37.3    5.8  95.6 557.6
 7. Lc0 v0.29.0+git.dirty         67K   10751     8.8    6.2  85.4 532.8      1
 8. Halogen 11                 41616K 6580212    31.2    6.3  81.9 518.2
 9. Viridithas 8.1.0           14994K 2679851    28.4    5.6  78.0 436.6
10. Komodo 14                  36823K 6208955    35.6    5.9  84.2 499.2
11. HIARCS 15.2                 9316K 1582063    26.0    5.9  80.1 471.7
12. Winter 2.0 SSE4.2          17305K 2601970    31.6    6.7  72.3 481.0
13. Stash v34.0                55994K 8939063    39.9    6.3  80.3 502.9
14. pawn 1.0                   35923K 5685854    33.9    6.3  81.2 513.0
15. Counter 5.0                 6105K  988014    21.3    6.2  85.2 526.2
all ---                        25412K 4264102    34.5    6.1  83.5 509.8      1
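The columns of this table can be cross-checked against each other and against the time control. The sketch below is only a rough check under two assumptions that the table does not state: that "K" means 1000 nodes, and that time/m and time are in seconds. The function names are illustrative.

```python
def implied_nps(nodes_per_move_k: float, seconds_per_move: float) -> float:
    """Nodes per second implied by the nodes/m and time/m columns.

    Assumes 'K' means 1000 nodes and time/m is in seconds.
    """
    return nodes_per_move_k * 1000 / seconds_per_move

def clock_budget(base_seconds: float, increment_seconds: float, moves: float) -> float:
    """Total clock time available to one side over a game of `moves` moves."""
    return base_seconds + increment_seconds * moves

# Stockfish 16: 12229K nodes/m at 6.2 s per move.
print(round(implied_nps(12229, 6.2)))  # roughly 1.97 million, close to the 1969813 in the NPS column

# "all" row: 83.5 moves per game at 5 minutes + 3 seconds per move.
print(clock_budget(300, 3, 83.5))  # 550.5
```

Under these assumptions the average engine used about 510 of the roughly 550 seconds it had available per game, so most of the clock was spent.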