One game without the b1 knight and the other without the g1 knight. I really don't see the GM winning a 6-game match on your system (save the Armageddon scoring).
Why do you believe that those four engines are a reasonable simulation of a human 2500 GM? I'm not saying you are wrong, just want to know your reasoning.
I am not sure what is reasonable for a 2500 GM; however, these are 4 low-rated random engines that have given humans trouble in the past on 1 thread, so I really don't think that your system would be easier.
Can you provide some specifics as to what those four engines have accomplished vs. humans? Are we talking about GMs (or near-GMs), and were the games Rapid or Classical as opposed to blitz?
What do they need to 'accomplish' to give a human GM or lower a hard time at rapid time controls that they have not already accomplished?
Okay, those are two good examples for those two engines. What about Joker and Colossus, do they have similar accomplishments vs. strong human players?
As I said before, these are just 4 'random' low-rated engines by today's standards. Colossus 2021b just had a +117 Elo test result versus Fruit (est. 2750), and Joker is rated CCRL 2300.
I only picked them at random because they are low rated by today's standards; some use endgame bases and some do not. They differ in each engine's positive and negative traits, and having a record vs. humans was not a requirement; that they might give some humans a hard time was good enough for me. Dragon is much stronger than any of them on 1 thread going head to head. So I wanted to see how they would fare at the match time controls of your event.
"Good decisions come from experience, and experience comes from bad decisions."
__________________________________________________________________
Ted Summers
Okay, I'll agree that Zarkov, Colossus, and Tiger are all reasonable choices to simulate the GM, but Joker seems to be of a much lower class based on its rating, and the 1 to 1 score vs. Dragon with knight odds worries me a bit. I think that the main problem with these simulations is that most if not all of these GM-level engines don't fully appreciate the importance of trading pieces when up a piece or so. I generally find that using Komodo versions with a sizable negative Contempt and an appropriate Skill level is a better human simulation for odds matches; based on this approach I expect a close match, although even this simulation is far from ideal.
I did some checking of engines in the CCRL 2500-2600 Rapid range to find one that clearly understands the value of trading down when a piece up, and Gaviota 0.84 32 bit seems to really fit the specs. If I start with the knight odds position, get the score, and then remove pairs of pieces (knights, rooks, queens, not cumulatively), the eval drops markedly, as it should. There may be other engines in that range that show this behavior, but most do not. So this seems an ideal choice for simulating a human GM playing Rapid at knight odds. Since it is 2570 CCRL and the CCRL engines in that range are clearly stronger on modern hardware than similarly rated human GMs playing Rapid, it might actually be about right to have both engines play 3' + 2" blitz; I would estimate that this would bring Gaviota down to about the level of a human 2500 GM playing Rapid.
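For anyone who wants to repeat the trade-down probe described above, here is a minimal sketch that only builds the FEN strings to paste into an engine or GUI; getting the score is left to whatever interface you use. It assumes that a "pair" means one piece of that type from each side, and the squares chosen for each pair (g1/g8, a1/a8, d1/d8) are illustrative guesses suited to b1 odds, not something stated in the post. All function names are hypothetical.

```python
# Sketch: build FEN strings for the "remove a pair of pieces" probe.
# The squares chosen for each removed pair are illustrative assumptions.

def start_board():
    """Return the standard start position as 8 rows, rank 8 first."""
    rows = ["rnbqkbnr", "pppppppp"] + ["........"] * 4 + ["PPPPPPPP", "RNBQKBNR"]
    return [list(row) for row in rows]

def remove_piece(board, square):
    """Empty a square given in algebraic notation, e.g. 'b1'."""
    file_index = ord(square[0]) - ord("a")
    rank = int(square[1])
    board[8 - rank][file_index] = "."

def castling_rights(board):
    """Keep only castling rights consistent with the king and rook squares."""
    rights = ""
    if board[7][4] == "K":
        rights += "K" if board[7][7] == "R" else ""
        rights += "Q" if board[7][0] == "R" else ""
    if board[0][4] == "k":
        rights += "k" if board[0][7] == "r" else ""
        rights += "q" if board[0][0] == "r" else ""
    return rights or "-"

def to_fen(board):
    """Serialize the board as a FEN string with White to move."""
    ranks = []
    for row in board:
        text, empty = "", 0
        for cell in row:
            if cell == ".":
                empty += 1
                continue
            if empty:
                text += str(empty)
                empty = 0
            text += cell
        if empty:
            text += str(empty)
        ranks.append(text)
    return "/".join(ranks) + " w " + castling_rights(board) + " - 0 1"

def probe_positions(odds_square="b1"):
    """Baseline knight-odds position plus one position per removed pair
    (not cumulative: each pair is removed from a fresh baseline)."""
    pairs = {"knights": ("g1", "g8"), "rooks": ("a1", "a8"), "queens": ("d1", "d8")}
    baseline = start_board()
    remove_piece(baseline, odds_square)
    positions = {"baseline": to_fen(baseline)}
    for name, squares in pairs.items():
        board = start_board()
        remove_piece(board, odds_square)
        for square in squares:
            remove_piece(board, square)
        positions[name] = to_fen(board)
    return positions

for name, fen in probe_positions().items():
    print(f"{name:9s} {fen}")
```

Compare the engine's score on the baseline line with its score on each of the other three; an engine that values trading down should show a clearly different eval once a pair is gone.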
That is good to hear. Question: Do you have Gaviota using its Gaviota Tablebases? I would lie to set it up the same here.
Thanks
AdminX wrote: ↑Fri Oct 22, 2021 8:38 am
That is good to hear. Question: Do you have Gaviota using its Gaviota Tablebases? I would lie to set it up the same here.
Thanks
Correction: That is good to hear. Question: Do you have Gaviota using its Gaviota Tablebases? I would like to set it up the same here.
I just downloaded the engine and left default settings unchanged, so I assume that means no Tablebases. Probably this is best for the purpose, since humans don't use tablebases.
I do NOT believe that GM Perelshteyn can play as well as Gaviota version 0.84 at knight odds with both engines using the same CPU, and with Mr. Kaufman's super 32-core Threadripper there is even more reason to believe that he will not stand a chance. Anyway, here is the first game.
I know that 1.d4 is NOT a good opening for White with the b1 knight missing; I will see what happens with 1.e4. Komodo Dragon 2.5x should at least get a draw opening with 1.e4.
You don't state the time controls. Was this with the simulated match conditions (3' + 2" for Dragon, 15' + 10" for Gaviota)? If so, then I agree: Gaviota 0.84 32 bit is stronger in Rapid than a 2500 human GM. My latest studies suggest the following conversion formula for estimating human FIDE Elo in Rapid (15' + 10") play from CCRL Rapid ratings: human FIDE = (2/3) × CCRL + 1000. So with Gaviota at 2570 CCRL Rapid, this gives 2713 as the estimated human FIDE rating needed to score 50% vs. Gaviota 0.84 32 bit on one core of a modern i7 at Rapid. That's why I suggested having both engines play with 3' + 2"; that would reduce the estimated strength of Gaviota to a level much closer to human 2500 (still above it, but not way above it). Regarding hardware, my latest simulations suggest that using more than 16 threads for this type of match is actually counterproductive, so I'll probably limit Dragon to 16 threads unless I get new data. Regarding the opening, it is well known that with b1 missing, 1.e4 is dubious due to 1...d5! It's very difficult for White to get any opening advantage (ignoring the missing piece) when giving b1 odds. Human GMs will at least work out the best initial 3 or 4 moves against reasonable opening tries by White at knight odds; that's one reason they have a better chance than similar-strength engines with no book.
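The conversion formula quoted above is easy to check with a few lines of Python. This is only a sketch of the arithmetic as stated in the post; the function names are hypothetical, and the inverse function is simply the same formula solved for CCRL rather than something claimed in the thread.

```python
def estimated_human_fide_rapid(ccrl_rapid):
    """Formula from the post: human FIDE = (2/3) * CCRL + 1000."""
    return 2.0 / 3.0 * ccrl_rapid + 1000.0

def ccrl_rapid_for_human_fide(human_fide):
    """Same formula solved for CCRL (an illustrative inverse)."""
    return (human_fide - 1000.0) * 3.0 / 2.0

gaviota_ccrl = 2570  # CCRL Rapid rating cited for Gaviota 0.84 32 bit
print(round(estimated_human_fide_rapid(gaviota_ccrl)))  # 2713
print(ccrl_rapid_for_human_fide(2500))                  # 2250.0
```

Under this formula, an engine would need a CCRL Rapid rating of about 2250 to match a 2500 human at Rapid, which is consistent with the post's point that Gaviota at 2570 is well above that level at equal time controls.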