How does Monte Carlo analysis work?
Moderator: Ras
-
- Posts: 512
- Joined: Thu Dec 27, 2007 9:34 pm
How does Monte Carlo analysis work?
How does Monte Carlo analysis work? Does it use the evaluation function or does it simply play random moves in a given position without evaluating them?
Re: How does Monte Carlo analysis work?
Monte Carlo can surely be used in a lot of ways.
If you play random games then the Monte Carlo search (ie the outcome of games) is your evaluation function. There are some good papers on how to backup the result of the mc search to combine it with alpha beta.
I suppose the purpose of mc is to play random games, if you use an evaluation function there won't be any randomness anymore. Anyway an eval function is useless if you play the full game because you will know the true score (win/draw/loss) of your mc game.
HJ.
If you play random games then the Monte Carlo search (ie the outcome of games) is your evaluation function. There are some good papers on how to backup the result of the mc search to combine it with alpha beta.
I suppose the purpose of mc is to play random games, if you use an evaluation function there won't be any randomness anymore. Anyway an eval function is useless if you play the full game because you will know the true score (win/draw/loss) of your mc game.
HJ.
-
- Posts: 4562
- Joined: Tue Jul 03, 2007 4:30 am
Re: How does Monte Carlo analysis work?
The way Rybka does it is changing the evaluation of the moves to get sure of never repeating the same game, that way you get a montecarlo in where the games played have the quality of chess masters' games.Harald Johnsen wrote:I suppose the purpose of mc is to play random games, if you use an evaluation function there won't be any randomness anymore.
-
- Posts: 20943
- Joined: Mon Feb 27, 2006 7:30 pm
- Location: Birmingham, AL
Re: How does Monte Carlo analysis work?
The basic idea for Monte Carlo simulations is the following:Uri wrote:How does Monte Carlo analysis work? Does it use the evaluation function or does it simply play random moves in a given position without evaluating them?
Suppose you have an odd area painted on a wall and you want to know how many square inches the area covers. But it is not possible to measure it and compute because of the odd shape. So you bound the area by drawing a square or rectangle around it. Now you can measure this rectangle and compute its area, and you know the area you want to measure is less than this but how much?
Now you start the simulation and randomly shoot arrows into that rectangle. Each X,Y coordinate where an arrow will "hit" is comprised of two random numbers. Suppose you decide to shoot 1,000,000 arrows for starters. Then you count the number that hit _inside_ the shape you want to measure. And you get exactly 500,000. So the area of the shape is roughly 1/2 the known area of the rectangle. The more accuracy you want, the more arrows you shoot.
Vas probably does something similar to assess the value of a position, although it is _not_ going to be very accurate. All you can do to "shoot the arrows" is to play a very fast game. And if you play 1,000,000 games from a particular position, and you win 700,000 of them, then you could say that position has a .70 probability of a win.
Personally I do not think this is worth much. Why? Because to play enough games, you have to play very _fast_ games. Which means low-quality but numerous games. Computers miss plenty of tactics at such speeds so that winning probability is going to be based on weak data. I suppose the idea is to play enough games to average out such mistakes, but whether it will work or not I am not sure. I've run tons of matches on our cluster and discovered early on that matches with very fast time controls often end up significantly different when played at longer time controls, so I am not sure what this kind of analysis is going to prove...
-
- Posts: 216
- Joined: Thu Mar 09, 2006 9:54 pm
Re: How does Monte Carlo analysis work?
Speaking of fast games, I understand that the Rybka team tests each evaluation change with something like 80,000 games at game in 1 second to measure increases as small as 1 elo. I was surprised that differences of only 1 elo could even be measured, but it looks like it works out well based on the success of Rybka.bob wrote:The basic idea for Monte Carlo simulations is the following:Uri wrote:How does Monte Carlo analysis work? Does it use the evaluation function or does it simply play random moves in a given position without evaluating them?
Suppose you have an odd area painted on a wall and you want to know how many square inches the area covers. But it is not possible to measure it and compute because of the odd shape. So you bound the area by drawing a square or rectangle around it. Now you can measure this rectangle and compute its area, and you know the area you want to measure is less than this but how much?
Now you start the simulation and randomly shoot arrows into that rectangle. Each X,Y coordinate where an arrow will "hit" is comprised of two random numbers. Suppose you decide to shoot 1,000,000 arrows for starters. Then you count the number that hit _inside_ the shape you want to measure. And you get exactly 500,000. So the area of the shape is roughly 1/2 the known area of the rectangle. The more accuracy you want, the more arrows you shoot.
Vas probably does something similar to assess the value of a position, although it is _not_ going to be very accurate. All you can do to "shoot the arrows" is to play a very fast game. And if you play 1,000,000 games from a particular position, and you win 700,000 of them, then you could say that position has a .70 probability of a win.
Personally I do not think this is worth much. Why? Because to play enough games, you have to play very _fast_ games. Which means low-quality but numerous games. Computers miss plenty of tactics at such speeds so that winning probability is going to be based on weak data. I suppose the idea is to play enough games to average out such mistakes, but whether it will work or not I am not sure. I've run tons of matches on our cluster and discovered early on that matches with very fast time controls often end up significantly different when played at longer time controls, so I am not sure what this kind of analysis is going to prove...
-
- Posts: 20943
- Joined: Mon Feb 27, 2006 7:30 pm
- Location: Birmingham, AL
Re: How does Monte Carlo analysis work?
They might think they are measuring changes as small as one Elo with 80K games, but they aren't... But 80K games can give you a very accurate idea of whether A' is better than A, even if A' is only different from A in a minor way (that is, the difference between A and A' is very small).Mark wrote:Speaking of fast games, I understand that the Rybka team tests each evaluation change with something like 80,000 games at game in 1 second to measure increases as small as 1 elo. I was surprised that differences of only 1 elo could even be measured, but it looks like it works out well based on the success of Rybka.bob wrote:The basic idea for Monte Carlo simulations is the following:Uri wrote:How does Monte Carlo analysis work? Does it use the evaluation function or does it simply play random moves in a given position without evaluating them?
Suppose you have an odd area painted on a wall and you want to know how many square inches the area covers. But it is not possible to measure it and compute because of the odd shape. So you bound the area by drawing a square or rectangle around it. Now you can measure this rectangle and compute its area, and you know the area you want to measure is less than this but how much?
Now you start the simulation and randomly shoot arrows into that rectangle. Each X,Y coordinate where an arrow will "hit" is comprised of two random numbers. Suppose you decide to shoot 1,000,000 arrows for starters. Then you count the number that hit _inside_ the shape you want to measure. And you get exactly 500,000. So the area of the shape is roughly 1/2 the known area of the rectangle. The more accuracy you want, the more arrows you shoot.
Vas probably does something similar to assess the value of a position, although it is _not_ going to be very accurate. All you can do to "shoot the arrows" is to play a very fast game. And if you play 1,000,000 games from a particular position, and you win 700,000 of them, then you could say that position has a .70 probability of a win.
Personally I do not think this is worth much. Why? Because to play enough games, you have to play very _fast_ games. Which means low-quality but numerous games. Computers miss plenty of tactics at such speeds so that winning probability is going to be based on weak data. I suppose the idea is to play enough games to average out such mistakes, but whether it will work or not I am not sure. I've run tons of matches on our cluster and discovered early on that matches with very fast time controls often end up significantly different when played at longer time controls, so I am not sure what this kind of analysis is going to prove...
Some of what they do is probably less about proving as opposed to disproving. For example, both are strong chess players and probably have good ideas about what should or should not be in an eval. So when they make a change, they are already pretty sure it is a good one, and then 80K games can be viewed more of "did this make it play weaker?" rather than "did this make it play stronger?" Then, so long as it wasn't weaker, the change can be kept as testing verified what human judgement said was reasonable. That's a bit different from trying to test a range of guesses about something, because then you are trying to prove which is best, not is one worse...
-
- Posts: 28395
- Joined: Fri Mar 10, 2006 10:06 am
- Location: Amsterdam
- Full name: H G Muller
Re: How does Monte Carlo analysis work?
Well, truly strategic features of the evaluation should, by definition, be quite independent of search depth. What you have to be careful with is changes in your search.
I use Monte-Carlo sampling of win probabilities quite often, to determine piece values, and I have observed that the results are usually quite independent of search depth. They can be dependent on evaluation, though. If I play a Bishop pair against two Knights and a Pawn (all other pieces being present as well) from shuffled opening setups, the score (between exactly equal engines) is the same within statistical erros from 40/10 to 40/0:10. But other engines might get a different score. This difference largely disappears when you normalize the excess score on the pawn-odds advantage the various engine are scoring.
This is almost independent of how strong the engine is, in particular what it thinks the piece values should be. Even if you make a Rook worth less than a Bishop, a Rook will still systematically beat the Bishop, as the side with the Bishop will also value the Bishop higher, and avoid trading it for the Rook until it is too late.
This all within reason: if you would play engines that are so damaged that they, say, cannot generate Knight moves, obviously a Knight would be of zero value to them. So the engines must play something that at least remotely resembles proper Chess, for the piece values to translate to scores.
I use Monte-Carlo sampling of win probabilities quite often, to determine piece values, and I have observed that the results are usually quite independent of search depth. They can be dependent on evaluation, though. If I play a Bishop pair against two Knights and a Pawn (all other pieces being present as well) from shuffled opening setups, the score (between exactly equal engines) is the same within statistical erros from 40/10 to 40/0:10. But other engines might get a different score. This difference largely disappears when you normalize the excess score on the pawn-odds advantage the various engine are scoring.
This is almost independent of how strong the engine is, in particular what it thinks the piece values should be. Even if you make a Rook worth less than a Bishop, a Rook will still systematically beat the Bishop, as the side with the Bishop will also value the Bishop higher, and avoid trading it for the Rook until it is too late.
This all within reason: if you would play engines that are so damaged that they, say, cannot generate Knight moves, obviously a Knight would be of zero value to them. So the engines must play something that at least remotely resembles proper Chess, for the piece values to translate to scores.
-
- Posts: 10903
- Joined: Thu Mar 09, 2006 12:37 am
- Location: Tel-Aviv Israel
Re: How does Monte Carlo analysis work?
I believe that very fast time control with relatively weak engines can lead to more draws.hgm wrote:Well, truly strategic features of the evaluation should, by definition, be quite independent of search depth. What you have to be careful with is changes in your search.
I use Monte-Carlo sampling of win probabilities quite often, to determine piece values, and I have observed that the results are usually quite independent of search depth. They can be dependent on evaluation, though. If I play a Bishop pair against two Knights and a Pawn (all other pieces being present as well) from shuffled opening setups, the score (between exactly equal engines) is the same within statistical erros from 40/10 to 40/0:10. But other engines might get a different score. This difference largely disappears when you normalize the excess score on the pawn-odds advantage the various engine are scoring.
This is almost independent of how strong the engine is, in particular what it thinks the piece values should be. Even if you make a Rook worth less than a Bishop, a Rook will still systematically beat the Bishop, as the side with the Bishop will also value the Bishop higher, and avoid trading it for the Rook until it is too late.
This all within reason: if you would play engines that are so damaged that they, say, cannot generate Knight moves, obviously a Knight would be of zero value to them. So the engines must play something that at least remotely resembles proper Chess, for the piece values to translate to scores.
The engine may be unable to win KQ vs K and if the engine is very weak it may make stalemates even in better positions because it gets only depth 1 and has nothing in the evaluation to prevent stalemate.
Another problem may be repetitions that lead to draw.
All of this is not relevant to the rybka team because rybka at 1 second per game has no problem to get to bigger depths.
Uri
-
- Posts: 28395
- Joined: Fri Mar 10, 2006 10:06 am
- Location: Amsterdam
- Full name: H G Muller
Re: How does Monte Carlo analysis work?
Well, I am not talking here about only doing 1-ply search or buggy engines. Even at 40/0:10, a typical engine will search over 100,000 nodes, good for 4-ply even with a branching factor of 10.
And even if they cannot win KQK, and are much more draw prone: the Pawn odds score suffers almost equally for this than any other material advantage. So if a strong engine would reach 68% with Pawn odds, and 59% with NNP vs BB imbalances, a weaker engine might only convert those advantages to 62% and 56%, respectively. But (56-50)/(62-50) would still be equal to (59-50)/(68-50), namely 0.5. This is what I see in practice.
And even if they cannot win KQK, and are much more draw prone: the Pawn odds score suffers almost equally for this than any other material advantage. So if a strong engine would reach 68% with Pawn odds, and 59% with NNP vs BB imbalances, a weaker engine might only convert those advantages to 62% and 56%, respectively. But (56-50)/(62-50) would still be equal to (59-50)/(68-50), namely 0.5. This is what I see in practice.
-
- Posts: 2053
- Joined: Wed Mar 08, 2006 8:30 pm
Re: How does Monte Carlo analysis work?
Theoretically, 80K games, with 1/3 draws, gives a standard deviation (SD) of 0.144%, that is (x 7) 1 Elo.bob wrote:[They might think they are measuring changes as small as one Elo with 80K games, but they aren't...
And 2 SD (95% probability) is 2 Elo