Priest321 wrote: ↑Fri Aug 04, 2023 9:04 pm
Looking forward to seeing the results of the testing.
Keith
Hi Keith, hi all
Here are the Khmelnitsky test results!
I had to cancel the first run of the test after it was 85% done. I already had growing concerns about the interim results, which looked way below my expectations (given the high playing strength), and I finally got evidence that the displayed score is often not sufficient on its own to assess a position. It seems to only "scratch the surface". Nevertheless, once a few moves are actually played from a given position, the score converges towards relevant values and gives a balanced evaluation of the initial position.
As the Khmelnitsky test relies not only on identifying the best move but also asks which side has an edge or is maybe winning, I had to find a workaround.
The key point is the blazing fast response time of the L6-v2. I have now played many games, and I noticed only a couple of them in which the L6-v2 at level 22 (max) used one or two seconds for the whole game, according to the time display! The other games did not report a single whole second spent.
Other chess computers running the Khmelnitsky test are granted three minutes of thinking time per position, so I considered it fair enough to have the L6-v2 play several moves in a row (alternating the hint feature and the usual computer move) in order to reveal a kind of principal variation and note the displayed scores. Usually 4 to 5 moves are enough to get a stabilized score (ideally one reaches a "quiescent position"). This process costs only around half a minute of operator time, and the computing time used by the L6-v2 remains barely noticeable.
OK, the L6-v2 is not an analysis tool (as already stated about the L6-v1); anyway, let's focus on the results now:
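For anyone who wants to apply the same "play a few moves until the score settles" idea to their own notes, here is a minimal sketch in Python. It is only an illustration: the function name, the window of two readings, the 0.3 tolerance and the sample readings (assumed to be in pawns) are all hypothetical choices of mine, not anything the L6-v2 provides or that was used in the test.

```python
def stabilized_score(scores, window=2, tolerance=0.3):
    """Return the mean of the first run of `window` consecutive readings
    whose spread (max - min) is at most `tolerance`, or None if the
    readings never settle. All parameters are illustrative assumptions."""
    for end in range(window, len(scores) + 1):
        recent = scores[end - window:end]
        if max(recent) - min(recent) <= tolerance:
            return sum(recent) / window
    return None


# Hypothetical display readings noted after each move played from the test position
readings = [0.1, 0.9, 1.6, 1.5, 1.5]
print(stabilized_score(readings))
```

A wider window or a tighter tolerance makes the check stricter, at the cost of playing more moves before accepting a score.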
It achieved 1954 KT-Elo, which is close to the score achieved by the Fidelity 2265 Designer Mach III (1974 KT-Elo).
According to the calibration graph I previously shared (page 4 of this thread), this suggests a potential strength above 2100 "computer Elo".
Here is the graph:
link (if not displayed above)
Comparison to the L6-v1:
link
Comparison to the Designer MIII:
link
Comparison to an average 1954 Elo human player:
link
It is a strong counterattacker, manages the opening fairly well, and, rather unusually, has good strategic skills, even better than the average player's. It is not a great attacker though, and a bit weak in tactics and sacrifices. The low score in tactics of course relates to the extremely short thinking time used.
I am still running tournaments in order to evaluate the computer Elo through real games; so far the L6-v2 appears to deserve a rough estimate above the 2300/2350 mark. Maybe even more. A new blitz monster?
Tibono