I *love* how CCRL takes the trouble of determining the strength of all those chess programs. They do it properly, they're friendly and helpful to the developers. Also, they do not get paid for it.
JohnWoe wrote: ↑Tue Feb 23, 2021 11:48 am
Somewhere buried amongst the vicious internet hyenas.
That chess programmers are so stupid. That they need somebody to tell them how strong their engine is???
Really? That's ridiculous. I know exactly how strong my engines are!
If I could send 10 different versions to these lists and pick the strongest one after 1 billion games. Loop ad nauseam.
Then I could see some benefit.
As a programmer. Not consumer.
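The "send 10 versions and pick the strongest" idea points at a real statistical trap: with finite games, the best-scoring version among identical engines is just the luckiest one. A minimal Python sketch of that selection effect (the `measured_elo` helper, the no-draw match model, and all numbers are illustrative assumptions, not any rating list's actual methodology):

```python
import math
import random

def measured_elo(true_score=0.5, games=1000, rng=None):
    """Simulate a no-draw match against a reference opponent and
    convert the observed score into an apparent Elo difference."""
    rng = rng or random.Random()
    wins = sum(rng.random() < true_score for _ in range(games))
    score = min(max(wins / games, 1e-9), 1 - 1e-9)  # keep the log finite
    return -400 * math.log10(1 / score - 1)

rng = random.Random(42)
# Ten identical versions: every true Elo difference is exactly 0,
# but measurement noise spreads the apparent ratings around 0.
results = [measured_elo(games=1000, rng=rng) for _ in range(10)]
print(f"best of 10: {max(results):+.1f} Elo")  # the 'winner' is pure luck
```

Over 1000 games the per-version noise is on the order of ±10 Elo, so the maximum of ten identical versions will typically look measurably "stronger" than the rest despite no real difference.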
In this case you're actually attacking the reputation of Vondele, the Stockfish maintainer who published that fishtest result.
Because Albert has actually described his testing procedure, which makes some sense, but is slightly unusual.
connor_mcmonigle wrote: ↑Tue Feb 23, 2021 4:20 am
These results are consistent with many other testers' results. Prematurely labeling this testing a "scam" is unnecessarily incendiary and immature. In fact, the only testing that's inconsistent with the general consensus which has formed as to the FF2 network's strength (-5 to -20 Elo to SF 13) is Albert's published testing on ChessBase. Why aren't you so vocally decrying Albert's testing as a scam?
Peter Berger wrote: ↑Tue Feb 23, 2021 3:34 pm
Because Albert has actually described his testing procedure, which makes some sense, but is slightly unusual.
...
Albert's procedure (as I understand it) is a bit similar to what I have read in posts by Frank Quisinsky before: I want my chess program to make sense in all kinds of setups (you can do this by forcing all ECO codes, e.g.). And he has trained it with human games and games played against itself, starting from a similar approach.
What is so unusual and illogical in assuming that this may lead to different results?
...
For any two somewhat competitive networks, trained on different data, it is possible to construct a cooked book where one will erroneously outperform the other. Assuming that Albert's test results, for some biased book, are honest, the results still aren't very interesting. I see zero potential benefits for the computer chess community. Albert has contributed nothing meaningful to the advancement of computer chess, though he'll gladly take credit for the hard work of others in lengthy ChessBase "articles".
Peter Berger wrote: ↑Tue Feb 23, 2021 3:34 pm
Actually, this is what bookcooks have been doing for like forever, though not really in a very scientific way. Yeah, I understand the ethical implications - this is not rocket science anyway, probably this is just some Stockfish with very minor changes - and now sell it for 100 dollars? Bad guy. But he is a single guy, and he produced something of at least minor interest to the community - there have to be logical reasons for it: if FF2.0 is of similar strength as Stockfish, but still somehow different, how come? When it is about evil ChessBase or evil FatFritz (what an ugly name, btw) - so be it. But can't we slowly start to think of potential benefits for the general development of the freeforce crew?
As the CCC TD, I heard claims from authors about how strong their engines were all the time. This new net gave +X Elo; the latest patches finally pushed it past Y other engine.
JohnWoe wrote: ↑Tue Feb 23, 2021 11:48 am
...
Testing is really hard. I'm grateful for CCRL testing Sapeli when I was actively developing it. I dropped it a long time ago. I only gained like +150 Elo. Ended up at 1900 Elo, if I remember correctly? Not even 2000 Elo.
the_real_greco wrote: ↑Tue Feb 23, 2021 6:21 pm
As the CCC TD, I heard claims from authors about how strong their engines were all the time. This new net gave +X Elo; the latest patches finally pushed it past Y other engine.
JohnWoe wrote: ↑Tue Feb 23, 2021 11:48 am
...
I could count on one hand the number of times those claims were actually true. But it's completely natural to see your engine through rose-tinted glasses.
Rating lists are nice because they don't care about any engine in particular. They have established procedures that everyone knows are reasonable, and that you can actually trust.
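Part of why an established list's numbers can be trusted is simply game count: the Elo error bar shrinks with the square root of the number of games. A rough sketch under a simplified no-draw model (the 1.96 factor is the usual 95% normal approximation; real rating lists model draws and opponent pools more carefully, so this is only an order-of-magnitude guide):

```python
import math

def elo_halfwidth(games, score=0.5):
    """Approximate 95% confidence half-width, in Elo, for a measured
    score over `games` independent win/loss results (draws ignored)."""
    se = math.sqrt(score * (1 - score) / games)          # std error of the score
    slope = 400 / (math.log(10) * score * (1 - score))   # d(Elo)/d(score)
    return 1.96 * se * slope

# Error bars shrink with the square root of the game count:
for n in (100, 1000, 10000):
    print(f"{n:>6} games: +/- {elo_halfwidth(n):.1f} Elo")
```

Under this model, 100 games leaves an error bar of roughly ±68 Elo, and even 1000 games still leaves about ±22 Elo, which is why single short matches say very little about small strength differences.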