GM Joel Benjamin vs Leela Knight Odds Classical match

lkaufman
Posts: 6215
Joined: Sun Jan 10, 2010 6:15 am
Location: Maryland USA
Full name: Larry Kaufman

Re: GM Joel Benjamin vs Leela Knight Odds Classical match

Post by lkaufman »

Uri Blass wrote: Wed Jan 29, 2025 7:41 am I do not see a reason to assume that the best player in normal chess is going to be the best player with huge odds at bullet,

I will not be surprised if some player with fide rating 2600 may perform better than carlsen at bullet with queen odds against lc0.

I consider even without odds bullet and normal chess to be different games and I will not be surprised to see in the future a world champion in normal chess that is not a top bullet player at 1+0.

It is not important to know to move the mouse fast in long time control.
Of course the best Classical player need not be a great bullet player, in fact I don't think Gukesh is particularly strong at bullet chess. But the best bullet player in the world will always be a strong GM I think, because some top players are also bullet specialists, so they should be better at it than ordinary GMs even if they only play bullet. Currently Hikaru is still the world's best bullet player, and he is number 2 in the live classical ratings. Carlsen is probably in top 3 in bullet. Skill at queen odds may differ from normal chess, but if we are talking about bullet chess then I think bullet chess skill is the dominant factor. Maybe Naroditsky, who is at least close to Carlsen in bullet strength, might be as good at this particular challenge, but I would bet on Carlsen to do better than anyone except perhaps Hikaru.
Komodo rules!
Jouni
Posts: 3614
Joined: Wed Mar 08, 2006 8:15 pm
Full name: Jouni Uski

Re: GM Joel Benjamin vs Leela Knight Odds Classical match

Post by Jouni »

So misleading title Knight Odds? Better title Knight + 1 vs 100 time Odds!
Jouni
lkaufman
Posts: 6215
Joined: Sun Jan 10, 2010 6:15 am
Location: Maryland USA
Full name: Larry Kaufman

Re: GM Joel Benjamin vs Leela Knight Odds Classical match

Post by lkaufman »

Jouni wrote: Wed Jan 29, 2025 7:40 pm So misleading title Knight Odds? Better title Knight + 1 vs 100 time Odds!
True, but taking more time would not have helped Leela, testing shows it would have hurt!
Komodo rules!
royb
Posts: 557
Joined: Thu Mar 09, 2006 12:53 am

Re: GM Joel Benjamin vs Leela Knight Odds Classical match

Post by royb »

lkaufman wrote: Wed Jan 29, 2025 8:22 pm
Jouni wrote: Wed Jan 29, 2025 7:40 pm So misleading title Knight Odds? Better title Knight + 1 vs 100 time Odds!
True, but taking more time would not have helped Leela, testing shows it would have hurt!
Wait ... if you gave Leela more time (more nodes to search I guess in the case of the Joel Benjamin match) that it would have played worse? I'm probably misunderstanding but I need to ask to clarify.

Thanks.
lkaufman
Posts: 6215
Joined: Sun Jan 10, 2010 6:15 am
Location: Maryland USA
Full name: Larry Kaufman

Re: GM Joel Benjamin vs Leela Knight Odds Classical match

Post by lkaufman »

royb wrote: Thu Jan 30, 2025 4:29 am
lkaufman wrote: Wed Jan 29, 2025 8:22 pm
Jouni wrote: Wed Jan 29, 2025 7:40 pm So misleading title Knight Odds? Better title Knight + 1 vs 100 time Odds!
True, but taking more time would not have helped Leela, testing shows it would have hurt!
Wait ... if you gave Leela more time (more nodes to search I guess in the case of the Joel Benjamin match) that it would have played worse? I'm probably misunderstanding but I need to ask to clarify.

Thanks.
Yes, strange as it may seem. There is some optimum number of nodes for each handicap, because searching deeper just convinces the engine that everything is hopeless and it chooses the move that would last longest against say Stockfish. The reason the bot is so effective against humans is that it doesn't search too much deeper than the human is likely to be able to see, so it won't be afraid to try a move that will work better against the assumed human level but would be worse against an engine. But the problem is that it doesn't know who the opponent is or what his time remaining (or increment) is. So we assume that the opponent is an engine that simulates a GM playing Rapid, normally the strongest opponent we would face on LiChess, and determined by testing that 20,000 nodes (roughly a one second think) is about optimal for knight odds (we use only a tiny 800 nodes for queen odds!), so we started the match that way. But after the first day, when we did poorly, we concluded that we should raise it since Joel was playing at a Classical time control and would see more than the assumed opponent. So we raised it to 30,000 for the remaining games, and we did "win" that portion of the match by one game, though there could be other reasons for this, including luck. Perhaps we could have gone a bit higher, but we had no data and wanted to keep to the stated "bullet chess" speed. Since the raise, which applied to all the games played by that bot, not just against Joel, results have improved against strong players, including an incredible 8.5 out of 9 in 3'2" blitz vs Caruana, and a 137.5 to 2.5 unbeaten score at that time control against all opponents rated 2600 to 2800 on LiChess blitz! So we are keeping the increase, but results are now so good that I don't want to risk going even higher. I can add that it should be beneficial to raise the node limit once we are no longer losing, a change that we will probably see in the near future.
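To make the node settings above concrete, here is a minimal sketch of how a per-handicap node budget could be turned into a standard UCI `go nodes` command. The table name `NODE_BUDGET` and the function `uci_go_command` are hypothetical and are not the bot's actual code; the only grounded parts are the node counts quoted in this post and the fact that UCI engines accept `go nodes N`.

```python
# Hypothetical sketch: per-handicap node budgets, using the figures quoted in this post.
NODE_BUDGET = {
    "knight": 30_000,  # raised from 20,000 after the first day of the match
    "queen": 800,      # the "tiny" budget mentioned for queen odds
}


def uci_go_command(handicap: str) -> str:
    """Return a UCI 'go nodes N' command for the given handicap."""
    try:
        nodes = NODE_BUDGET[handicap]
    except KeyError:
        raise ValueError(f"no node budget configured for handicap {handicap!r}")
    return f"go nodes {nodes}"


if __name__ == "__main__":
    for name in NODE_BUDGET:
        print(name, "->", uci_go_command(name))
```

Keeping the budget in one table means a change such as the 20,000 to 30,000 raise is a one-line edit that applies to every game the bot plays, which matches the behaviour described above.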
Komodo rules!
Uri Blass
Posts: 10773
Joined: Thu Mar 09, 2006 12:37 am
Location: Tel-Aviv Israel

Re: GM Joel Benjamin vs Leela Knight Odds Classical match

Post by Uri Blass »

lkaufman wrote: Thu Jan 30, 2025 5:58 am
royb wrote: Thu Jan 30, 2025 4:29 am
lkaufman wrote: Wed Jan 29, 2025 8:22 pm
Jouni wrote: Wed Jan 29, 2025 7:40 pm So misleading title Knight Odds? Better title Knight + 1 vs 100 time Odds!
True, but taking more time would not have helped Leela, testing shows it would have hurt!
Wait ... if you gave Leela more time (more nodes to search I guess in the case of the Joel Benjamin match) that it would have played worse? I'm probably misunderstanding but I need to ask to clarify.

Thanks.
Yes, strange as it may seem. There is some optimum number of nodes for each handicap, because searching deeper just convinces the engine that everything is hopeless and it chooses the move that would last longest against say Stockfish. The reason the bot is so effective against humans is that it doesn't search too much deeper than the human is likely to be able to see, so it won't be afraid to try a move that will work better against the assumed human level but would be worse against an engine. But the problem is that it doesn't know who the opponent is or what his time remaining (or increment) is. So we assume that the opponent is an engine that simulates a GM playing Rapid, normally the strongest opponent we would face on LiChess, and determined by testing that 20,000 nodes (roughly a one second think) is about optimal for knight odds (we use only a tiny 800 nodes for queen odds!), so we started the match that way. But after the first day, when we did poorly, we concluded that we should raise it since Joel was playing at a Classical time control and would see more than the assumed opponent. So we raised it to 30,000 for the remaining games, and we did "win" that portion of the match by one game, though there could be other reasons for this, including luck. Perhaps we could have gone a bit higher, but we had no data and wanted to keep to the stated "bullet chess" speed. Since the raise, which applied to all the games played by that bot, not just against Joel, results have improved against strong players, including an incredible 8.5 out of 9 in 3'2" blitz vs Caruana, and a 137.5 to 2.5 unbeaten score at that time control against all opponents rated 2600 to 2800 on LiChess blitz! So we are keeping the increase, but results are now so good that I don't want to risk going even higher. I can add that it should be beneficial to raise the node limit once we are no longer losing, a change that we will probably see in the near future.
1) The search strategy is wrong if searching deeper convinces the engine that everything is lost when in practice the engine is winning.
A correct search should assign probabilities to the opponent's moves and calculate the expected result based on those probabilities.

2) It seems that at least 30000 is better than 20000 for knight odds based on the results that you give, and I wonder why you thought earlier that 20000 is better.

Did you find that 20000 is better than 30000 against engines, and if so, was it better against engines that Lc0 scores close to 50% against?
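The idea in point 1 can be illustrated with a toy example. This is only a sketch with made-up numbers, not Lc0's search: the position, the move names, and the reply probabilities are all hypothetical. It compares a worst-case (minimax) evaluation with an expected-value evaluation that weights each opponent reply by an assumed probability.

```python
from typing import Callable, Dict, List, Tuple

# Each candidate move maps to a list of (reply_probability, engine_score) pairs.
# Scores are expected game results for the engine in [0, 1]. All numbers are invented.
Replies = List[Tuple[float, float]]

TOY_POSITION: Dict[str, Replies] = {
    # Quiet move: every reply leaves the engine slowly losing.
    "solid": [(0.5, 0.30), (0.5, 0.35)],
    # Tricky move: refuted by one hard-to-find reply, very good otherwise.
    "trap": [(0.2, 0.10), (0.8, 0.90)],
}


def minimax_value(replies: Replies) -> float:
    """Assume the opponent always finds the reply that is worst for the engine."""
    return min(score for _, score in replies)


def expected_value(replies: Replies) -> float:
    """Weight each reply by the assumed probability that the opponent plays it."""
    return sum(prob * score for prob, score in replies)


def best_move(position: Dict[str, Replies],
              evaluate: Callable[[Replies], float]) -> str:
    """Pick the move with the highest value under the given evaluation rule."""
    return max(position, key=lambda move: evaluate(position[move]))


if __name__ == "__main__":
    print("minimax picks:", best_move(TOY_POSITION, minimax_value))          # solid
    print("expected-value picks:", best_move(TOY_POSITION, expected_value))  # trap
```

With these numbers the worst-case rule prefers the quiet move (0.30 against 0.10), while the probability-weighted rule prefers the tricky one (0.74 against 0.325). The hard part in practice is obtaining realistic reply probabilities for a given human opponent, which this sketch simply assumes.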
lkaufman
Posts: 6215
Joined: Sun Jan 10, 2010 6:15 am
Location: Maryland USA
Full name: Larry Kaufman

Re: GM Joel Benjamin vs Leela Knight Odds Classical match

Post by lkaufman »

Uri Blass wrote: Thu Jan 30, 2025 6:22 am
lkaufman wrote: Thu Jan 30, 2025 5:58 am
royb wrote: Thu Jan 30, 2025 4:29 am
lkaufman wrote: Wed Jan 29, 2025 8:22 pm
Jouni wrote: Wed Jan 29, 2025 7:40 pm So misleading title Knight Odds? Better title Knight + 1 vs 100 time Odds!
True, but taking more time would not have helped Leela, testing shows it would have hurt!
Wait ... if you gave Leela more time (more nodes to search I guess in the case of the Joel Benjamin match) that it would have played worse? I'm probably misunderstanding but I need to ask to clarify.

Thanks.
Yes, strange as it may seem. There is some optimum number of nodes for each handicap, because searching deeper just convinces the engine that everything is hopeless and it chooses the move that would last longest against say Stockfish. The reason the bot is so effective against humans is that it doesn't search too much deeper than the human is likely to be able to see, so it won't be afraid to try a move that will work better against the assumed human level but would be worse against an engine. But the problem is that it doesn't know who the opponent is or what his time remaining (or increment) is. So we assume that the opponent is an engine that simulates a GM playing Rapid, normally the strongest opponent we would face on LiChess, and determined by testing that 20,000 nodes (roughly a one second think) is about optimal for knight odds (we use only a tiny 800 nodes for queen odds!), so we started the match that way. But after the first day, when we did poorly, we concluded that we should raise it since Joel was playing at a Classical time control and would see more than the assumed opponent. So we raised it to 30,000 for the remaining games, and we did "win" that portion of the match by one game, though there could be other reasons for this, including luck. Perhaps we could have gone a bit higher, but we had no data and wanted to keep to the stated "bullet chess" speed. Since the raise, which applied to all the games played by that bot, not just against Joel, results have improved against strong players, including an incredible 8.5 out of 9 in 3'2" blitz vs Caruana, and a 137.5 to 2.5 unbeaten score at that time control against all opponents rated 2600 to 2800 on LiChess blitz! So we are keeping the increase, but results are now so good that I don't want to risk going even higher. I can add that it should be beneficial to raise the node limit once we are no longer losing, a change that we will probably see in the near future.
1) The search strategy is wrong if searching deeper convinces the engine that everything is lost when in practice the engine is winning.
A correct search should assign probabilities to the opponent's moves and calculate the expected result based on those probabilities.

2) It seems that at least 30000 is better than 20000 for knight odds based on the results that you give, and I wonder why you thought earlier that 20000 is better.

Did you find that 20000 is better than 30000 against engines, and if so, was it better against engines that Lc0 scores close to 50% against?
I agree with point 1. However, no one has found a better search strategy yet for these odds games. The search used by Lc0 is still rather new and likely to be improved, even for normal chess, unlike the situation with Alpha-Beta search. It is possible that normal alpha-beta engines with NNUE like Stockfish or Torch would be even better at odds play with properly trained nets for the odds, but no one has done this yet so we don't know. I think they would be more "stable" in terms of the eval not collapsing with depth, but may be worse in other ways; for example, minimax inherently assumes best play by the opponent, unlike MCTS and its variants.

We test against engines that are trained to play like human GMs playing fast games, with settings calibrated to give fairly even chances to each side. With such an opponent, 15000 nodes tested as best, and we decided on 20,000 as it is much safer to be above rather than below the optimal number. I don't know why 30,000 seems better against actual humans, but it seems to be true. Excluding three games lost on time due to rebooting or crashing from excessive demand to play the bots, LeelaKnightOdds has only lost two games since the increase out of nearly 1700, and one of those was to a strong GM who played it nearly forty blitz games until he won one! Today it played four blitz (3'2") games with prodigy IM Faustino Oro, a super-strong blitz player rated 2923 Lichess blitz, and despite the additional handicap of the move (Leela playing Black every game instead of the usual White), Leela won all four games. I think there is still plenty to improve, but we won't be able to tell from blitz games at knight odds, we need more Rapid games with top players to challenge the bot.
Komodo rules!
cbash
Posts: 4
Joined: Fri Nov 22, 2024 7:12 pm
Full name: Caleb Bash

Re: GM Joel Benjamin vs Leela Knight Odds Classical match

Post by cbash »

Very recently someone made this modification to search, https://github.com/amjshl/lc0_v31_sc, meant to improve play at odds using an asymmetrical variant of MCTS. On Discord they claimed to get improvements at queen odds over the standard configuration while using 20000 nodes and various levels of "search contempt" to limit updates of the opponent side's policy distribution. Although they only ran a few games, the results look promising. I am very interested to see if this could help at knight odds.
Father
Posts: 1793
Joined: Sun Mar 19, 2006 4:39 am
Location: Colombia
Full name: Pablo Ignacio Restrepo

Re: GM Joel Benjamin vs Leela Knight Odds Classical match

Post by Father »

lkaufman wrote: Thu Jan 30, 2025 6:56 pm
Uri Blass wrote: Thu Jan 30, 2025 6:22 am
lkaufman wrote: Thu Jan 30, 2025 5:58 am
royb wrote: Thu Jan 30, 2025 4:29 am
lkaufman wrote: Wed Jan 29, 2025 8:22 pm
Jouni wrote: Wed Jan 29, 2025 7:40 pm So misleading title Knight Odds? Better title Knight + 1 vs 100 time Odds!
True, but taking more time would not have helped Leela, testing shows it would have hurt!
Wait ... if you gave Leela more time (more nodes to search I guess in the case of the Joel Benjamin match) that it would have played worse? I'm probably misunderstanding but I need to ask to clarify.

Thanks.
Yes, strange as it may seem. There is some optimum number of nodes for each handicap, because searching deeper just convinces the engine that everything is hopeless and it chooses the move that would last longest against say Stockfish. The reason the bot is so effective against humans is that it doesn't search too much deeper than the human is likely to be able to see, so it won't be afraid to try a move that will work better against the assumed human level but would be worse against an engine. But the problem is that it doesn't know who the opponent is or what his time remaining (or increment) is. So we assume that the opponent is an engine that simulates a GM playing Rapid, normally the strongest opponent we would face on LiChess, and determined by testing that 20,000 nodes (roughly a one second think) is about optimal for knight odds (we use only a tiny 800 nodes for queen odds!), so we started the match that way. But after the first day, when we did poorly, we concluded that we should raise it since Joel was playing at a Classical time control and would see more than the assumed opponent. So we raised it to 30,000 for the remaining games, and we did "win" that portion of the match by one game, though there could be other reasons for this, including luck. Perhaps we could have gone a bit higher, but we had no data and wanted to keep to the stated "bullet chess" speed. Since the raise, which applied to all the games played by that bot, not just against Joel, results have improved against strong players, including an incredible 8.5 out of 9 in 3'2" blitz vs Caruana, and a 137.5 to 2.5 unbeaten score at that time control against all opponents rated 2600 to 2800 on LiChess blitz! So we are keeping the increase, but results are now so good that I don't want to risk going even higher. I can add that it should be beneficial to raise the node limit once we are no longer losing, a change that we will probably see in the near future.
1) The search strategy is wrong if searching deeper convinces the engine that everything is lost when in practice the engine is winning.
A correct search should assign probabilities to the opponent's moves and calculate the expected result based on those probabilities.

2) It seems that at least 30000 is better than 20000 for knight odds based on the results that you give, and I wonder why you thought earlier that 20000 is better.

Did you find that 20000 is better than 30000 against engines, and if so, was it better against engines that Lc0 scores close to 50% against?
I agree with point 1. However, no one has found a better search strategy yet for these odds games. The search used by Lc0 is still rather new and likely to be improved, even for normal chess, unlike the situation with Alpha-Beta search. It is possible that normal alpha-beta engines with NNUE like Stockfish or Torch would be even better at odds play with properly trained nets for the odds, but no one has done this yet so we don't know. I think they would be more "stable" in terms of the eval not collapsing with depth, but may be worse in other ways; for example, minimax inherently assumes best play by the opponent, unlike MCTS and its variants.

We test against engines that are trained to play like human GMs playing fast games, with settings calibrated to give fairly even chances to each side. With such an opponent, 15000 nodes tested as best, and we decided on 20,000 as it is much safer to be above rather than below the optimal number. I don't know why 30,000 seems better against actual humans, but it seems to be true. Excluding three games lost on time due to rebooting or crashing from excessive demand to play the bots, LeelaKnightOdds has only lost two games since the increase out of nearly 1700, and one of those was to a strong GM who played it nearly forty blitz games until he won one! Today it played four blitz (3'2") games with prodigy IM Faustino Oro, a super-strong blitz player rated 2923 Lichess blitz, and despite the additional handicap of the move (Leela playing Black every game instead of the usual White), Leela won all four games. I think there is still plenty to improve, but we won't be able to tell from blitz games at knight odds, we need more Rapid games with top players to challenge the bot.
Good afternoon Mr. Larry Kaufman. I have played some games with black pieces against LeelaKnightOdds and the computer has always rejected me and prevented me from playing against LeelaKnightOdds with white pieces. It has always been the same, he never allows me to play with white and I have prepared a repertoire that I believe will make the horse stone. Didn't the computer agree to play against a human again, with the human driving the white horse? If you can help me I would be very happy. Thank you.
I am thinking chess is in a coin.Human beings for ever playing in one face.Now I am playing in the other face:"Antichess". Computers are as a fortres where owner forgot to close a little door behind. You must enter across this door.Forget the front.
lkaufman
Posts: 6215
Joined: Sun Jan 10, 2010 6:15 am
Location: Maryland USA
Full name: Larry Kaufman

Re: GM Joel Benjamin vs Leela Knight Odds Classical match

Post by lkaufman »

Father wrote: Thu Jan 30, 2025 9:28 pm
Good afternoon Mr. Larry Kaufman. I have played some games with black pieces against LeelaKnightOdds and the computer has always rejected me and prevented me from playing against LeelaKnightOdds with white pieces. It has always been the same, he never allows me to play with white and I have prepared a repertoire that I believe will make the horse stone. Didn't the computer agree to play against a human again, with the human driving the white horse? If you can help me I would be very happy. Thank you.
For playing White at knight odds, you have to follow a different procedure. Just go to the link under the bot name for "Knight and Move" odds and follow the instructions there. When the bot was initially created, it could only play White; later we added the option to force Leela to play Black. There is no way I can reject a player from playing one particular color, as long as you follow the instructions. I can ban players from playing for suspected cheating, for starting new games without resigning, and other obvious abuses, but you have never done anything wrong.
Komodo rules!