Search or Evaluation?

You can discuss all aspects of programming and technical matters here.

Moderators: Harvey Williamson, Watchman

User avatar
ed
Member
Posts: 77
Joined: Tue Oct 02, 2007 8:54 pm
Location: Netherlands
Contact:

Search or Evaluation?

Post by ed »

ed wrote:
Mark Uniacke wrote:Thanks George for finding that interesting article. However, it was NOT the one I was thinking of! :shock:

The one I am thinking of talks about how users and testers perceive the programs' play, and the influence of the search difference between the players on that perception. I think it was by Chris but I can't find it at the moment.

Steve is right, there were a number of articles in SelSearch Magazine; I may have to trawl back through those to find it (if it was in there) :(

Can anyone help?
Hi Mark!

Perhaps you meant this one?

Regards,

Ed

------------------------------------------------------------


If you're a slow (knowledge) program, you can beat a fast one by having essential chess knowledge. Maybe you find some theme or weakness or king attack or whatever, go for it, sit on it, exploit it and maybe get a win from it. Equally, you can find this stuff but not be able to convert it.

If you're a slow program, and you get into a game where these exploitation possibilities don't exist for some reason, then, effectively, the game turns into slow bean-counter against fast bean-counter, with the inevitable conclusion.

We all see these games. In fact you don't need my program to show them, because they happen all the time in comp-comp. These game types are the norm for bean vs bean.

Take a scenario. Your program now, Ferret, against your program 4 years ago. Or even your program now against your own program on slow hardware. Result inevitable? Probably. Game style and type? Probably predictable, like so:

Ferret(fast) will have 1, 2, 3 or 4 nominal plies on Ferret(slow). Game style and type will be strongly dependent on the nominal ply gap.

a) High gap. Ferret(slow) will likely go down in rapid material collapse. Ferret(fast) may even produce some flashy pyrotechnics to demonstrate it. A naive reviewer could call Ferret(fast) a spectacular attacking program. He could call Ferret(slow) a stupid bean-counter, a typical computer.

b) Medium gap. Ferret(fast) will slowly grind Ferret(slow) down. Ferret(slow) will keep finding, at its higher iterations, possible loss of material. It will go into panic mode and find a way to avoid the material loss by conceding doubled pawns instead, or whatever. A naive reviewer will call Ferret(fast) a great positional player. He'll call Ferret(slow) dumb and accuse it of not having simple knowledge like doubled pawns, or whatever.

c) Small gap. Probably you'll get reasonable games. The reviewer can't tell much, so he'll likely start making things up. Human style, or more interesting play, or some other nonsense that says nothing.

What I'm trying to say to you is that Ferret is none of these things. It has none of these 'naive reviewer properties'. The properties all emerge from the search gap, and therefore depend on the opponent. It knows everything and nothing, all at the same time.

Which is why Genius was thought to be the greatest thing, and now you all think it is boring. It isn't either, or it's both. Schrödinger's cat.

Which is why programs seem to keep making progress on the SSDF list. And why reviewers, either dumb, or with axes to grind, wax lyrical about the latest programs.

It's the search gap. Gettit? Out of this search gap comes all the naive speculation and nonsense that gets written. The program has every style and no style; it has no consistency to play against, only materialism. You can't learn from it; tomorrow it will be different (found another mine in the search gap), only the difference is just a reflection of - whoops, trod on another mine. What can you do with such a program? Use the take-back key and try again - and imagine this helps you improve or learn?

Now, I claim this search gap has no meaning or understanding possibilities for a human. That a human can't relate his heuristics to it. That you can't extract the knowledge out of it and represent it to a human. That you can't even extract the knowledge out of it and represent it to yourself. You can't get heuristics from it. So I call it counting beans - useless for us humans.

Now, take a knowledge program: you can play it and see the play style. You can try to work out what it does and why. There'll be a reason, based on human chess heuristics. The game has a plan, and flow, and doesn't consist of hidden minefields. It won't grind you down by search; it will try speculative ideas which it might, or might not, be able to get to work. You can see the speculative ideas, and try them yourself. I think you can, as a human, relate to this type of program. If you know the programmer, maybe you can see patterns in the program that come from him, and so on. I think these types of programs are infused with some force, in so far as any chunk of silicon can be.

I hate materialists.

Chris Whittington
User avatar
Mark Uniacke
Hiarcs Author
Posts: 1458
Joined: Sun Jul 29, 2007 1:32 pm
Location: United Kingdom
Contact:

Post by Mark Uniacke »

Hi Ed,

Thanks Ed, that is interesting.

That article certainly starts to mention some of the stuff I was thinking of. The article I remember (or not so well!) talked about what the testers would say about one program having more knowledge (when really it searched deeper). It mentioned doubled pawns as one example of a concession the shallower-searching program makes to the deeper searcher. I know I have read the article in question within the last couple of years, but I just cannot find it.

I am pretty sure it came from Chris and I am sure we have both read it, but the fact that I cannot find it is a little worrying for me; I wonder, did I imagine it all?! :shock:
Best wishes,
Mark

https://www.hiarcs.com
User avatar
ed
Member
Posts: 77
Joined: Tue Oct 02, 2007 8:54 pm
Location: Netherlands
Contact:

Post by ed »

Mark Uniacke wrote:Hi Ed,

Thanks Ed, that is interesting.

That article certainly starts to mention some of the stuff I was thinking of. The article I remember (or not so well!) talked about what the testers would say about one program having more knowledge (when really it searched deeper). It mentioned doubled pawns as one example of a concession the shallower-searching program makes to the deeper searcher. I know I have read the article in question within the last couple of years, but I just cannot find it.

I am pretty sure it came from Chris and I am sure we have both read it, but the fact that I cannot find it is a little worrying for me; I wonder, did I imagine it all?! :shock:
Hi Mark,

Regarding the topic: it's my understanding that search can't do without eval and vice versa, but that in the long term chess knowledge will be decisive. And maybe we are already there. IMHO the success of Rybka and the gap it created can't be explained by superior search alone; your engine and other programs have an excellent search as well and do not differ that much in tactics. It must be an evaluation issue.

I believe in Tord's words: the main weakness in my program is its evaluation function.

Ed
Uri Blass
Member
Posts: 82
Joined: Sun Aug 12, 2007 1:40 pm

Post by Uri Blass »

I think that in order to talk about playing style you simply need to get a 50% result.

If one program is significantly stronger, then give the weaker program more time.

After having 50% result you can watch games and use your observations to get an opinion about playing style of the programs.

I do not understand why all the testers that I know insist on giving programs equal time controls.
The result is that it is simply hard to say, based on the games, which program has more knowledge in its evaluation, and it may be hard to have an unbiased opinion about the positional weaknesses and strengths of the programs.

It is possible that some 2700 program has more evaluation knowledge than some 2900 program, but when testers play equal time controls they will never notice it.

It may be better to have a new organization that simply does not give chess programs equal time controls, but instead gives stronger programs less time, so that every program gets a result near 50%.

My opinion is that when A scores 50% against B, you can say that A is better than B positionally if you see that A scores more than 50% in the subset of games where both programs believe they are better for some moves.

My opinion is that if A scores 80% against B, then it is hard to know which program is better positionally based on the games.
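
To put numbers on it: in the standard Elo model the expected score follows directly from the rating gap, so an 80% score corresponds to a gap of roughly 240 Elo that the time handicap would have to compensate. A minimal sketch of the formula (the function name is illustrative, not from any engine):

Code: Select all

/* Expected score from an Elo gap (standard Elo model).
   Illustrative sketch; names are mine, not from any rating tool. */
#include <math.h>
#include <stdio.h>

static double expected_score(double elo_diff)
{
    return 1.0 / (1.0 + pow(10.0, -elo_diff / 400.0));
}

int main(void)
{
    printf("%.3f\n", expected_score(0.0));    /* 0.500: even match     */
    printf("%.3f\n", expected_score(240.0));  /* ~0.799: an 80% result */
    return 0;
}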

Uri
Uri Blass
Member
Posts: 82
Joined: Sun Aug 12, 2007 1:40 pm

Post by Uri Blass »

ed wrote:
Hi Mark,

Regarding the topic: it's my understanding that search can't do without eval and vice versa, but that in the long term chess knowledge will be decisive. And maybe we are already there. IMHO the success of Rybka and the gap it created can't be explained by superior search alone; your engine and other programs have an excellent search as well and do not differ that much in tactics. It must be an evaluation issue.

I believe in Tord's words: the main weakness in my program is its evaluation function.

Ed
Hi Ed,

I think that what you consider as tactics is misleading here.
My opinion is that Rybka's strength is not in finding forced lines but in outsearching the opponent when there is no forced line.

Search clearly can explain superior performance even if the evaluation is the same.
I do not know if ProDeo is superior positionally relative to Rybka, but the right test for you is to play ProDeo against Rybka with an unequal time control and watch the games (you can use Arena for that purpose and set the strength of Rybka 2.3.2a 32-bit to 8-10% in order to get a result near 50%).


Give ProDeo 40 minutes/40 moves and play games.

Take the subset of games where Rybka and ProDeo disagree and both programs consider themselves better for some consecutive moves (not all games are like that; there are games where both programs agree about the evaluation).

The main question is which program wins most of these games.
My opinion is that the program that wins most of these games has the better evaluation.
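
A minimal sketch of such a filter, assuming each program's evaluation of its own side (in centipawns) has been logged for every move; the names and thresholds are illustrative:

Code: Select all

/* Flag games where both engines considered themselves better for at
   least MIN_RUN consecutive moves - the games this method scores.
   eval_a/eval_b: each engine's own-side eval per move, in centipawns. */
#include <stddef.h>

#define MIN_RUN   5    /* consecutive moves required */
#define THRESHOLD 30   /* cp margin to count as "better" */

int both_think_better(const int eval_a[], const int eval_b[], size_t n)
{
    size_t run = 0;
    for (size_t i = 0; i < n; i++) {
        if (eval_a[i] > THRESHOLD && eval_b[i] > THRESHOLD)
            run++;
        else
            run = 0;
        if (run >= MIN_RUN)
            return 1;   /* a genuine disagreement game */
    }
    return 0;
}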

Uri
User avatar
Mark Uniacke
Hiarcs Author
Posts: 1458
Joined: Sun Jul 29, 2007 1:32 pm
Location: United Kingdom
Contact:

Post by Mark Uniacke »

ed wrote:
Hi Mark,

Regarding the topic: it's my understanding that search can't do without eval and vice versa, but that in the long term chess knowledge will be decisive. And maybe we are already there. IMHO the success of Rybka and the gap it created can't be explained by superior search alone; your engine and other programs have an excellent search as well and do not differ that much in tactics. It must be an evaluation issue.

I believe in Tord's words: the main weakness in my program is its evaluation function.

Ed
Ed,

If I take almost any program and give one version an extra ply, we see it outplay its shallower-searching brother. And I do not see it outplay its brother tactically, but positionally! :shock:

Why?

Because the shallower-searching version cannot keep up with the deeper searcher and so has to make concessions. These concessions are rarely big tactical ones, but rather smaller positional concessions it does not want to make but MUST make in order to avoid some tactic the deeper-searching opponent has seen deep in the tree and we the observers miss.

So we, the viewers of the games, get the impression the deeper searcher is playing with a better evaluation.

Ed, when we look back, don't you agree that the big jumps in chess programs have mainly come from new search enhancements, while eval has made steady progress without huge jumps?

Sure, when the programs were primitive, big jumps in eval were possible, but once we reached a certain level, progress in eval became much harder. Even when I add new knowledge that helps greatly in some types of positions, when it plays matches the improvement is nearly always much less than expected!
Best wishes,
Mark

https://www.hiarcs.com
User avatar
Mark Uniacke
Hiarcs Author
Posts: 1458
Joined: Sun Jul 29, 2007 1:32 pm
Location: United Kingdom
Contact:

Post by Mark Uniacke »

Uri Blass wrote:
Hi Ed,

I think that what you consider as tactics is misleading here.
My opinion is that Rybka's strength is not in finding forced lines but in outsearching the opponent when there is no forced line.

Search clearly can explain superior performance even if the evaluation is the same.
I do not know if ProDeo is superior positionally relative to Rybka, but the right test for you is to play ProDeo against Rybka with an unequal time control and watch the games (you can use Arena for that purpose and set the strength of Rybka 2.3.2a 32-bit to 8-10% in order to get a result near 50%).

Give ProDeo 40 minutes/40 moves and play games.

Take the subset of games where Rybka and ProDeo disagree and both programs consider themselves better for some consecutive moves (not all games are like that; there are games where both programs agree about the evaluation).

The main question is which program wins most of these games.
My opinion is that the program that wins most of these games has the better evaluation.

Uri
Uri,

A perceptive answer and in general I think I agree with you.

I believe Rybka plays the percentages: it searches where it helps in games, but it is not a 'finder' program because it does not pursue some of the tactical themes deeply.

In chess it is more important not to make mistakes than to find a brilliant winning move. I think Genius also had this approach in the 80s and 90s due to Richard's unique search paradigm.
Best wishes,
Mark

https://www.hiarcs.com
User avatar
ed
Member
Posts: 77
Joined: Tue Oct 02, 2007 8:54 pm
Location: Netherlands
Contact:

Mexico - Zappa 5.5 v Rybka 4.5 - Zappa wins $10,000

Post by ed »

Mark Uniacke wrote:
Ed,

If I take almost any program and give one version an extra ply, we see it outplay its shallower-searching brother. And I do not see it outplay its brother tactically, but positionally! :shock:

Why?

Because the shallower-searching version cannot keep up with the deeper searcher and so has to make concessions. These concessions are rarely big tactical ones, but rather smaller positional concessions it does not want to make but MUST make in order to avoid some tactic the deeper-searching opponent has seen deep in the tree and we the observers miss.

So we, the viewers of the games, get the impression the deeper searcher is playing with a better evaluation.

Ed, when we look back, don't you agree that the big jumps in chess programs have mainly come from new search enhancements, while eval has made steady progress without huge jumps?

Sure, when the programs were primitive, big jumps in eval were possible, but once we reached a certain level, progress in eval became much harder. Even when I add new knowledge that helps greatly in some types of positions, when it plays matches the improvement is nearly always much less than expected!
Hi Mark,

All understood. Deeper search is about 90-95% improving the quality of the best move so far, or finding a better positional move, and 5-10% about tactics. Agree so far?

Regarding depth, in the last 10 years there have been two major breakthroughs in search:

1) Recursive Nullmove
2) History pruning (or LMR)

Both algorithms are fully described and discussed in detail, and every engine programmer with a strong (2500+) chess engine can add them immediately (see the sketch below). Agree so far?
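
For readers who haven't implemented these, a minimal sketch of both ideas inside a negamax search. The position helpers (make_move, in_check, etc.) are assumed stubs, and the constants and conditions are illustrative; real engines differ in the details (reduction amounts, verification searches, move ordering):

Code: Select all

/* Sketch of recursive null-move pruning and history pruning/LMR
   in a negamax search. Position and the helpers are assumed stubs. */
#define NULL_R     2   /* null-move depth reduction */
#define FULL_MOVES 4   /* moves searched at full depth before reducing */

typedef struct Position Position;           /* assumed engine type */
extern int  evaluate(const Position *p);    /* assumed helpers     */
extern int  in_check(const Position *p);
extern int  gen_moves(Position *p, int moves[]);
extern int  is_capture(const Position *p, int move);
extern void make_move(Position *p, int move);
extern void unmake_move(Position *p, int move);
extern void make_null(Position *p);
extern void unmake_null(Position *p);

int search(Position *p, int depth, int alpha, int beta)
{
    if (depth <= 0)
        return evaluate(p);                 /* stand-in for qsearch */

    /* Recursive null move: give the opponent a free move and search
       reduced; if we still fail high, prune the whole node. */
    if (depth >= 3 && !in_check(p)) {
        make_null(p);
        int v = -search(p, depth - 1 - NULL_R, -beta, -beta + 1);
        unmake_null(p);
        if (v >= beta)
            return beta;
    }

    int moves[256];
    int n = gen_moves(p, moves);
    for (int i = 0; i < n; i++) {
        /* LMR: late quiet moves get a reduced-depth try first */
        int reduce = i >= FULL_MOVES && depth >= 3 &&
                     !is_capture(p, moves[i]) && !in_check(p);
        make_move(p, moves[i]);
        int v;
        if (reduce) {
            v = -search(p, depth - 2, -alpha - 1, -alpha);
            if (v > alpha)                  /* looked good: re-search */
                v = -search(p, depth - 1, -beta, -alpha);
        } else {
            v = -search(p, depth - 1, -beta, -alpha);
        }
        unmake_move(p, moves[i]);
        if (v >= beta)
            return beta;                    /* fail-hard beta cutoff */
        if (v > alpha)
            alpha = v;
    }
    return alpha;
}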

So where is the difference coming from?

It must be in the positional area; it can't be otherwise.

Rybka, Zappa, Hiarcs, Shredder and Fritz all have an excellent search; with an equal eval they would probably score 50% against each other.

After a break of 3 years I have given engine programming another try. Watching the games, I saw the following patterns:

Against Fritz: losing too many games because Fritz has a better passed-pawn evaluation. Search: Pro Deo usually searches deeper, yet it cannot beat Fritz in a match.

Against Glaurung: hard-fought battles; unclear why Pro Deo wins or loses, no pattern recognized. Both engines are about equal. Search: Glaurung clearly searches deeper.

Against Fruit: a fixed pattern; Pro Deo wins its games mainly because of a better king-safety evaluation. Search: Fruit clearly searches deeper.

Elsewhere you wrote:

Code: Select all

I believe Rybka plays the percentages: it searches where it helps in games, but it is not a 'finder' program because it does not pursue some of the tactical themes deeply.
You make a very valid point here. The question is how to accomplish that, because it looks like mission impossible. I have tried, you have tried, etc., but never really succeeded. But if Vas has found a way, then surely my proposition staggers.

BTW, it's all easy to test: we would need Rybka, Zappa, Hiarcs, Shredder and Fritz versions that contain identical piece-square tables (PST) as the only evaluation criterion (a sketch of such a stripped eval follows).
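
What such a stripped-down eval would look like, roughly; a minimal sketch, where the board representation and table values are illustrative placeholders, not any engine's real values:

Code: Select all

/* PST-only evaluation: material plus piece-square tables, nothing
   else. Representation and values are illustrative placeholders. */
enum { WHITE, BLACK };
enum { PAWN, KNIGHT, BISHOP, ROOK, QUEEN, KING, NPIECE };

static const int material[NPIECE] = { 100, 320, 330, 500, 900, 0 };
static int pst[NPIECE][64];   /* filled at startup, e.g. centre bonus */

/* board[sq]: piece type or -1 if empty; colour[sq]: WHITE/BLACK;
   stm: side to move. Returns score from the side to move's view. */
int evaluate_pst_only(const int board[64], const int colour[64], int stm)
{
    int score = 0;
    for (int sq = 0; sq < 64; sq++) {
        if (board[sq] < 0)
            continue;
        int pc = board[sq];
        /* mirror the square for Black so one table serves both sides */
        int v = material[pc] + pst[pc][colour[sq] == WHITE ? sq : sq ^ 56];
        score += colour[sq] == stm ? v : -v;
    }
    return score;
}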

Ed
Uri Blass
Member
Posts: 82
Joined: Sun Aug 12, 2007 1:40 pm

Re: Mexico - Zappa 5.5 v Rybka 4.5 - Zappa wins $10,000

Post by Uri Blass »

ed wrote:
Hi Mark,

All understood. Deeper search is about 90-95% improving the quality of the best move so far, or finding a better positional move, and 5-10% about tactics. Agree so far?

Regarding depth, in the last 10 years there have been two major breakthroughs in search:

1) Recursive Nullmove
2) History pruning (or LMR)

Both algorithms are fully described and discussed in detail, and every engine programmer with a strong (2500+) chess engine can add them immediately. Agree so far?

So where is the difference coming from?

It must be in the positional area; it can't be otherwise.

Rybka, Zappa, Hiarcs, Shredder and Fritz all have an excellent search; with an equal eval they would probably score 50% against each other.

After a break of 3 years I have given engine programming another try. Watching the games, I saw the following patterns:

Against Fritz: losing too many games because Fritz has a better passed-pawn evaluation. Search: Pro Deo usually searches deeper, yet it cannot beat Fritz in a match.

Against Glaurung: hard-fought battles; unclear why Pro Deo wins or loses, no pattern recognized. Both engines are about equal. Search: Glaurung clearly searches deeper.

Against Fruit: a fixed pattern; Pro Deo wins its games mainly because of a better king-safety evaluation. Search: Fruit clearly searches deeper.

Elsewhere you wrote:

Code: Select all

I believe Rybka plays the percentages: it searches where it helps in games, but it is not a 'finder' program because it does not pursue some of the tactical themes deeply.
You make a very valid point here. The question is how to accomplish that, because it looks like mission impossible. I have tried, you have tried, etc., but never really succeeded. But if Vas has found a way, then surely my proposition staggers.

BTW, it's all easy to test: we would need Rybka, Zappa, Hiarcs, Shredder and Fritz versions that contain identical piece-square tables (PST) as the only evaluation criterion.

Ed
Ed, things are not so simple about search.

1) When we talk about history-based pruning, different programmers have different implementations of it, and one implementation may be 50 Elo better than another.
2) When we talk about search, speed is also important, and it may be that Rybka simply searches 3 times faster than ProDeo with the same quality of evaluation, thanks to better data structures or to not having counterproductive parts in the evaluation.

Imagine two programs that have the same quality of evaluation: one has a relatively simple evaluation and the second has a more complex evaluation, but the quality of the evaluation, ignoring speed, is the same.

In practice speed is a factor, and the slower program is going to be weaker.


Uri
User avatar
mclane
Senior Member
Posts: 1600
Joined: Sun Jul 29, 2007 9:04 am
Location: Luenen, germany, US of europe
Contact:

Post by mclane »

What if Rybka has a table of piece-square tables depending on the chosen opening?
So that it has a natural "plan" when there are no tactics on the board.
The ECO code differentiates the position; the values would have been taken from grandmaster games.
The big table for each ECO would be the reason Rybka's size is so big.
What seems like a fairy tale today may be reality tomorrow.
Here we have a fairy tale of the day after tomorrow....
Uri Blass
Member
Posts: 82
Joined: Sun Aug 12, 2007 1:40 pm

Post by Uri Blass »

mclane wrote:What if Rybka has a table of piece-square tables depending on the chosen opening?
So that it has a natural "plan" when there are no tactics on the board.
The ECO code differentiates the position; the values would have been taken from grandmaster games.
The big table for each ECO would be the reason Rybka's size is so big.
Strelka 1.8 is very similar to Rybka beta and does not seem to have very big tables, and Strelka is clearly smaller than Rybka.

I suspect that the reason for the large size may be that Rybka tries to hide its code to make reverse engineering harder.

Programmers have no time to write big piece-square tables for every opening, and I doubt that doing it automatically, based on statistics from PGN games, is a good idea.

Uri
User avatar
mclane
Senior Member
Posts: 1600
Joined: Sun Jul 29, 2007 9:04 am
Location: Luenen, germany, US of europe
Contact:

Post by mclane »

IMO this can be done automatically.
Make an array for every piece.
Differentiate via ECO.
Replay each game and fill the arrays so that each piece movement is registered for its ECO code.
You now have piece-square tables from grandmaster games.
You only have to load them the moment YOUR position reaches that ECO position.
Let's say you have 1,000,000 GM games with players >2400 Elo.
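
A sketch of that accumulation step, in C; the ECO indexing, piece encoding and centipawn scaling are all illustrative assumptions, and a real tool would add the PGN parsing around it:

Code: Select all

/* Sketch of mclane's idea: accumulate per-ECO piece-square
   statistics from GM games, then turn the counts into PST bonuses. */
#include <stdint.h>

#define ECO_CODES 500   /* A00..E99 */
#define PIECES     12   /* 6 piece types x 2 colours */
#define SQUARES    64

static uint32_t visits[ECO_CODES][PIECES][SQUARES];

/* Register one move while replaying a game: the destination square
   of 'piece' in a game classified under ECO index 'eco'. */
void record_move(int eco, int piece, int to_square)
{
    visits[eco][piece][to_square]++;
}

/* Convert raw counts into a rough centipawn PST bonus: how much more
   (or less) often GMs put this piece on this square than average. */
int pst_bonus(int eco, int piece, int sq)
{
    uint64_t total = 0;
    for (int s = 0; s < SQUARES; s++)
        total += visits[eco][piece][s];
    if (total == 0)
        return 0;
    double avg = (double)total / SQUARES;
    return (int)(10.0 * ((double)visits[eco][piece][sq] - avg) / avg);
}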
What seems like a fairy tale today may be reality tomorrow.
Here we have a fairy tale of the day after tomorrow....
User avatar
ed
Member
Posts: 77
Joined: Tue Oct 02, 2007 8:54 pm
Location: Netherlands
Contact:

Re: Mexico - Zappa 5.5 v Rybka 4.5 - Zappa wins $10,000

Post by ed »

Uri Blass wrote:Ed, things are not so simple about search.
Hi Uri,
Uri Blass wrote:1) When we talk about history-based pruning, different programmers have different implementations of it, and one implementation may be 50 Elo better than another.
That's highly unlikely. 50 Elo can only be gained by a fresh new idea, like null move or history pruning.
Uri Blass wrote:2) When we talk about search, speed is also important, and it may be that Rybka simply searches 3 times faster than ProDeo with the same quality of evaluation, thanks to better data structures or to not having counterproductive parts in the evaluation.

Imagine two programs that have the same quality of evaluation: one has a relatively simple evaluation and the second has a more complex evaluation, but the quality of the evaluation, ignoring speed, is the same.

In practice speed is a factor, and the slower program is going to be weaker.
Sure, everything is possible in theory; it's just highly unlikely. All chess programmers are good programmers, or else they couldn't write a decent chess program at all. Good programmers write good data structures and good code. Between the top engines there are no big differences on the issue you raise. Why? Otherwise they would not belong to the top.

Ed
Uri Blass
Member
Posts: 82
Joined: Sun Aug 12, 2007 1:40 pm

Re: Mexico - Zappa 5.5 v Rybka 4.5 - Zappa wins $10,000

Post by Uri Blass »

ed wrote:Hi Uri,

That's highly unlikely. 50 Elo can only be gained by a fresh new idea, like null move or history pruning.

Sure, everything is possible in theory; it's just highly unlikely. All chess programmers are good programmers, or else they couldn't write a decent chess program at all. Good programmers write good data structures and good code. Between the top engines there are no big differences on the issue you raise. Why? Otherwise they would not belong to the top.

Ed
Hi Ed,
I do not see it as unlikely that there are big differences between programmers in speed and in search implementation.

I know that Movei has a slightly higher rating than ProDeo on the CCRL list, and I think it may be possible to improve its speed by at least a factor of 2 with better data structures.

Another problem of Movei is an expensive evaluation, part of which is probably counterproductive, though I have not investigated exactly which parts.

If you are 20 times slower than your opponents you may have problems getting to the top, but if you are 3 times slower things are different: even Rybka beta slowed down by a factor of 3 is better than Movei and ProDeo (and I think that is true even for the 32-bit version of Rybka).

When we talk about late move reductions, I do not believe the improvement from them is constant; it may be possible to gain even 200 Elo from late move reductions if you find the right conditions for reducing.

I have the code of Strelka, which is very similar to Rybka beta.
It is clear that Strelka is small and fast relative to Movei.

It is also clear that Strelka's evaluation is small, and I doubt that this small evaluation is better than ProDeo's or Movei's evaluation if you assume the same speed.

Strelka is clearly faster than Movei; it is almost 3 times faster in nodes per second based on my tests.

I guess Strelka's advantages relative to Movei are basically better data structures and some search implementation details (search implementation can be more than history reductions, and it is clearly easier to know what you are doing wrong than to do it right).

Uri
User avatar
Mark Uniacke
Hiarcs Author
Posts: 1458
Joined: Sun Jul 29, 2007 1:32 pm
Location: United Kingdom
Contact:

Post by Mark Uniacke »

ed wrote:All understood. Deeper search is about 90-95% improving the quality of the best move so far, or finding a better positional move, and 5-10% about tactics. Agree so far?
Yes, agreed; it really is a matter of definition, of course. Finding tactical moves is clear-cut because the material margins are clear, but positional moves are effectively long-term tactics. The point is, of course, that the search helps nearly all categories of moves, with some small exceptions for very long-term static structures.

ed wrote:Regarding depth, in the last 10 years there have been two major breakthroughs in search:

1) Recursive Nullmove
2) History pruning (or LMR)

Both algorithms are fully described and discussed in detail, and every engine programmer with a strong (2500+) chess engine can add them immediately. Agree so far?

So where is the difference coming from?

It must be in the positional area; it can't be otherwise.
I agree they are the two major general-purpose advances, although there are others that also offer extra Elo, like futility pruning or even eliminating losing captures from the quiescence search. There are also many other search improvements which overlap with 1 & 2 and hence are much less effective; if 1 & 2 did not exist, these other improvements would be effective.

Also, although not a general technique, I have found search extensions to be extremely effective, and I can see how the implementation of search extensions could have another big impact on the strength (or otherwise) of a chess program.
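
The classic example is the check extension; a minimal sketch, reusing the kind of assumed stub helpers from the search sketch earlier in the thread:

Code: Select all

/* Sketch of one common search extension: positions where the side
   to move is in check get an extra ply, so forcing lines are
   searched deeper. Position and in_check are assumed stubs. */
typedef struct Position Position;
extern int in_check(const Position *p);

int extended_depth(const Position *p, int depth)
{
    if (in_check(p))
        return depth + 1;   /* don't let the horizon cut a forcing line */
    return depth;
}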

I don't think it is possible to rely on the displayed search depths, because some programs don't display their true depths. Additionally, as you well know, there are many factors; sometimes it is not the iteration depth but the importance of not pruning a critical new line of play that is key.

Fifteen years ago a number of us were guilty of tuning against position test sets, which were usually tactical. This was compounded because important publications like CSS used to run features on how new programs performed on the BT test or the LCT2 test or ...

So it became even commercially important to do well in these test sets. Of course there is a very loose relationship between test positions and chess strength in games, and I think understanding the reasons for that is very important in understanding where chess strength in games comes from.

It is your last statement where we disagree. It is clear to me that the big + and - in Elo comes from search, while the gradual accumulation of rating points comes from eval. I believe the two search breakthroughs mentioned above can be implemented in various ways, and the range of strength improvement is still quite large.

So I think there is still plenty of room to improve chess programs in both search and eval, but I believe ultimately more strength comes from the search than from the eval.
Best wishes,
Mark

https://www.hiarcs.com