Anti-cheating

Discussion of anything and everything relating to chess playing software and machines.

Moderators: hgm, Rebel, chrisw

Ferdy
Posts: 4833
Joined: Sun Aug 10, 2008 3:15 pm
Location: Philippines

Re: Anti-cheating

Post by Ferdy »

scchess wrote: Mon Sep 13, 2021 2:38 am Another issue is how to estimate the complexity of a position. This is a very important feature for anti-cheating but it's hard. Chess.com definitely has a way to estimate the complexity. I remember there was an old paper using the frequency of PV switching during iterative deepening as an estimation. Also, the depth of the search and width of the search could also be used.

The Maia paper published last year has some insights. I quote:
... One way to measure this complexity is to consider the difference in quality between the best and second best moves in the position. If this difference is small there are at least two good moves to choose from, and if it’s large there is only
one good move. ...
The ability not to blunder and the ability to follow computer PV lines as the complexity of the position varies with time pressure are going to be key factors in anti-cheating. There should be a correlation; we should be able to fit a curve, like how Stockfish fits its WDL sigmoid curve. Anyone falling statistically outside this curve could be a cheater.
There is a complexity-of-position measure from Guid and Bratko; it uses the eval difference to get the complexity value. I did an experiment using the search depth instead.

There are also complexity metrics computed by a NN.

Time pressure, measured from the clock time left, is indeed a good feature to include.
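A minimal sketch of the best-vs-second-best idea from the Maia quote above, assuming MultiPV scores in centipawns from the side to move. The 300cp cap is an arbitrary illustrative choice, not Guid and Bratko's actual formula:

```python
def complexity(multipv_cp):
    """Proxy for position complexity: the smaller the gap between the
    best and second-best move scores, the more plausible choices there
    are, hence the harder the position. Returns a value in [0, 1]."""
    if len(multipv_cp) < 2:
        return 0.0  # forced move: no real choice, no complexity
    best, second = sorted(multipv_cp, reverse=True)[:2]
    gap = best - second
    # 0cp gap -> 1.0 (several equally good moves); >=300cp -> 0.0 (only move)
    return max(0.0, 1.0 - min(gap, 300) / 300)
```

With a score like this per position, low-complexity positions can be down-weighted when counting how often a player matches the engine.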
scchess
Posts: 45
Joined: Mon Jan 04, 2021 4:12 pm
Full name: Ted Wong

Re: Anti-cheating

Post by scchess »

Ferdy wrote: Mon Sep 13, 2021 6:23 am
scchess wrote: Mon Sep 13, 2021 2:38 am Another issue is how to estimate the complexity of a position. This is a very important feature for anti-cheating but it's hard. Chess.com definitely has a way to estimate the complexity. I remember there was an old paper using the frequency of PV switching during iterative deepening as an estimation. Also, the depth of the search and width of the search could also be used.

The Maia paper published last year has some insights. I quote:
... One way to measure this complexity is to consider the difference in quality between the best and second best moves in the position. If this difference is small there are at least two good moves to choose from, and if it’s large there is only
one good move. ...
The ability not to blunder and the ability to follow computer PV lines as the complexity of the position varies with time pressure are going to be key factors in anti-cheating. There should be a correlation; we should be able to fit a curve, like how Stockfish fits its WDL sigmoid curve. Anyone falling statistically outside this curve could be a cheater.
There is a complexity-of-position measure from Guid and Bratko; it uses the eval difference to get the complexity value. I did an experiment using the search depth instead.

There are also complexity metrics computed by a NN.

Time pressure, measured from the clock time left, is indeed a good feature to include.
Exactly! By data mining everybody in your rating group, it should become obvious how someone in your rating group typically performs against an opponent from another rating group, i.e. how likely you are to outplay a player of a given strength. Carlsen will certainly have the best performance here: just because you reach an equal position against him in an equal game doesn't mean he won't be able to outplay you. Now your own performance can be compared to the pre-computed statistics, and any outlier could be a cheater.
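A toy sketch of that comparison, in the spirit of the z-score idea mentioned earlier in the thread. The group numbers are hypothetical, and a real system would use many features rather than a single engine-match rate:

```python
from statistics import mean, pstdev

def z_score(player_rate, group_rates):
    """How many standard deviations the player's engine-match (top1)
    rate sits above the mean of their rating group."""
    return (player_rate - mean(group_rates)) / pstdev(group_rates)

# hypothetical top1 rates (%) for a peer rating group
group = [48.0, 52.5, 55.0, 50.5, 47.0, 53.0]
suspicious = z_score(85.0, group) > 3.0   # simple 3-SD flagging rule
```

Tightening or loosening the SD threshold is exactly the sensitivity vs. false-positive trade-off discussed below.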
Ferdy
Posts: 4833
Joined: Sun Aug 10, 2008 3:15 pm
Location: Philippines

Re: Anti-cheating

Post by Ferdy »

scchess wrote: Mon Sep 13, 2021 3:03 am
scchess wrote: Mon Sep 13, 2021 2:11 am
Guenther wrote: Sun Sep 12, 2021 7:38 pm
cyrill57 wrote: Sun Sep 12, 2021 7:23 pm I think the interesting idea is to train a neural network to detect engine/human games. After you train it on a large enough dataset, it will be relatively easy to check statistics for false positives for such a network.
Well, that's exactly what they did for lichess.
https://github.com/clarkerubber/irwin
Correct. Lichess, true to the spirit, did publish the source code in the open. However, in practice it's useless to the outside world: there is no documentation of the input features or of what exactly the model is predicting. More importantly, there is no mention of the data format.
While I don't have lichess's data, it *looks* like the model is predicting whether someone is a cheater or not, a binary decision. For example, https://github.com/clarkerubber/irwin/b ... ng.py#L105 looks like it forms an array of binary labels. The loss function is binary_crossentropy (https://github.com/clarkerubber/irwin/b ... del.py#L81), so that assumption looks correct.

By far the most important part of the project is the input data. It looks like the project requires a connection to lichess's labelled cheaters in their DB (https://github.com/clarkerubber/irwin/b ... ing.py#L42). The input data seems to be analyzed PGN data from Fishnet (https://github.com/clarkerubber/irwin/b ... me.py#L163).

As for the actual model, it looks like an LSTM: https://github.com/clarkerubber/irwin/b ... meModel.py. The model makes sense if the input data is game moves, because game moves do indeed form a sequence.

Overall, it looks like the project takes game-analysis data from Fishnet and performs a binary classification. It doesn't look like it takes the player's performance history into account, but I could be wrong. There's definitely more anti-cheating code somewhere else in the main part of the lichess project.
That repo is interesting; I have not read the code, but what I have in mind is to identify the features that will be used to detect cheaters. The usual evaluation errors must be there: stats on the top 3 best moves or so, the number of blunders, mistakes and dubious moves, the number of difficult/complex positions where the best moves are found, a high number of good moves found in complex positions with little time left, and others. Then at the end we need a label: 1 for a cheater and 0 for a non-cheater, or a range of values from 0 to 1, where closer to 1 means a higher probability of cheating. Then build a model from it, extract the weights of those features, and apply those weights to a game (analyzed to extract the same features) that is under examination for cheating.

Another idea is to create models per rating range, so there can be a model to evaluate players in the rating range 1000 to 1500, and so on.
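As an illustration of the extract-weights-and-apply idea: the weights and feature names below are invented for the example; in practice they would be fitted (e.g. by logistic regression) on labelled games:

```python
import math

# Hypothetical weights over a few of the features listed above;
# in a real system these would be learned from labelled games.
WEIGHTS = {"top1_rate": 6.0, "mean_error_cp": -0.02, "blunders": -1.5}
BIAS = -4.0

def cheat_score(features):
    """Logistic model: weighted feature sum squashed into [0, 1],
    where values closer to 1 mean a higher probability of cheating."""
    s = BIAS + sum(WEIGHTS[name] * value for name, value in features.items())
    return 1.0 / (1.0 + math.exp(-s))

# high engine match, tiny errors, no blunders -> suspicious
engine_like = cheat_score({"top1_rate": 0.95, "mean_error_cp": 5.0, "blunders": 0})
# ordinary human profile -> low score
human_like = cheat_score({"top1_rate": 0.45, "mean_error_cp": 60.0, "blunders": 2})
```

Per-rating-range models, as suggested above, would simply mean fitting a separate set of weights for each rating band.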
Ferdy
Posts: 4833
Joined: Sun Aug 10, 2008 3:15 pm
Location: Philippines

Re: Anti-cheating

Post by Ferdy »

There was a recent embarrassing incident where the team from my country, the Philippines, was disqualified from the FIDE Online Olympiad because one player was found to have violated the fair-play rules.

I made a quick analysis of the 4 suspected games with some basic stats.

Error - the difference in cp between the engine's move score and the human's move score, counted only while the position is not yet winning or losing; winning is defined as +300cp or more, losing as -300cp or less.
Blunder - a move that turns a non-losing position into a losing one.
Mistake - a move with an error of 100cp or more, where the position is neither losing nor winning even after the mistake.
Dubious - a move with an error of 15cp or more but less than 100cp, under the same position condition as a Mistake.
top1 - the percentage of moves that match the engine's top move.
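Read literally, the definitions above could be sketched like this (all scores in cp from the mover's point of view; the actual analysis tool may differ in details):

```python
WIN, LOSS = 300, -300   # thresholds from the definitions above

def classify(before, after_played, after_best):
    """Label one move given the eval before it and the evals after the
    played move and after the engine's best move (mover's viewpoint)."""
    error = after_best - after_played
    if before > LOSS and after_played <= LOSS:
        return "blunder"                    # non-losing made losing
    if LOSS < after_played < WIN:           # neither losing nor winning after
        if error >= 100:
            return "mistake"
        if 15 <= error < 100:
            return "dubious"
    return "ok"
```

For example, classify(50, -350, 20) comes out as a blunder, while a 20cp slip in a level position is merely dubious.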

Analyzing engine: SF14 at depth 20, starting at move 12. The subject is the player named aaa. The games were played at rapid, TC = 15m+5s.

Observations:
* This young girl does not blunder; that is possible if the opponent is weak, or if the opponent blunders or commits mistakes first.
* Game 2 is interesting: only 4 dubious moves and a 2.2cp mean error in a game that lasted 47 moves, against a higher-rated opponent.
* Her average top1 feature is around 63%.

Code: Select all

2021 FIDE Online Olympiad


game: 1
=======

white: Putar, Lara
result: 0
mean error: 65.1 cp
blunder: 1
mistake: 2
dubious: 6
top1: 38.2%

black: aaa
result: 1
mean error: 41.8 cp
blunder: 0
mistake: 2
dubious: 3
top1: 73.5%


game: 2
=======

white: aaa
result: 1
mean error: 2.2 cp
blunder: 0
mistake: 0
dubious: 4
top1: 63.9%

black: Sovetbekova, Nurai
result: 0
mean error: 30.2 cp
blunder: 0
mistake: 2
dubious: 5
top1: 48.6%


game: 3
=======

white: Li, Xinyu
result: 0.5
mean error: 26.2 cp
blunder: 2
mistake: 1
dubious: 6
top1: 40.0%

black: aaa
result: 0.5
mean error: 23.4 cp
blunder: 0
mistake: 1
dubious: 4
top1: 62.5%


game: 4
=======

white: aaa
result: 1
mean error: 20.3 cp
blunder: 0
mistake: 2
dubious: 2
top1: 57.9%

black: Noshin, Anjum
result: 0
mean error: 43.3 cp
blunder: 2
mistake: 3
dubious: 3
top1: 44.4%

I also checked her 9 most recent older games (rapid TC) in the U15 event. Performance there is also strong. That game 2 is perfect (so perfect that I began doubting my analysis tool :) but upon manual examination, running SF14 in the background, the moves really are good); perhaps this was preparation, as the game lasted only 28 moves against a higher-rated opponent. The mean top1 rate is lower than in the Olympiad games. Surely FIDE also looked at this performance.

Code: Select all

Asian Schools Chess Championships - Under 15 Girls


game: 1
=======

white: Bayasgalan, Khishigbaatar
result: 0
mean error: 26 cp
blunder: 2
mistake: 3
dubious: 3
top1: 51.9%

black: aaa
result: 1
mean error: 32 cp
blunder: 0
mistake: 2
dubious: 5
top1: 59.6%


game: 2
=======

white: JOHARI, Atena
result: 0
mean error: 40 cp
blunder: 2
mistake: 2
dubious: 1
top1: 47.1%

black: aaa
result: 1
mean error: 0 cp
blunder: 0
mistake: 0
dubious: 0
top1: 76.5%


game: 3
=======

white: aaa
result: 1
mean error: 18 cp
blunder: 0
mistake: 2
dubious: 6
top1: 59.5%

black: Amin-Erdene, Bayanmunkh
result: 0
mean error: 19 cp
blunder: 1
mistake: 2
dubious: 9
top1: 53.7%


game: 4
=======

white: SANUDULA, K M Dahamdi
result: 0
mean error: 56 cp
blunder: 1
mistake: 2
dubious: 9
top1: 36.4%

black: aaa
result: 1
mean error: 59 cp
blunder: 0
mistake: 5
dubious: 0
top1: 57.6%


game: 5
=======

white: aaa
result: 0
mean error: 102 cp
blunder: 1
mistake: 3
dubious: 0
top1: 30.8%

black: CHONG, Kai Ni Agnes
result: 1
mean error: 70 cp
blunder: 0
mistake: 1
dubious: 2
top1: 30.8%


game: 6
=======

white: BORAMANIKAR, Tanisha S
result: 0
mean error: 30 cp
blunder: 1
mistake: 3
dubious: 5
top1: 54.5%

black: aaa
result: 1
mean error: 18 cp
blunder: 0
mistake: 3
dubious: 3
top1: 67.3%


game: 7
=======

white: aaa
result: 1
mean error: 52 cp
blunder: 0
mistake: 2
dubious: 4
top1: 78.9%

black: RENGANAYAKI, V
result: 0
mean error: 18 cp
blunder: 2
mistake: 1
dubious: 7
top1: 39.5%


game: 8
=======

white: THAMIL MANNI, Swetha
result: 0.5
mean error: 12 cp
blunder: 0
mistake: 1
dubious: 1
top1: 75.0%

black: aaa
result: 0.5
mean error: 12 cp
blunder: 0
mistake: 0
dubious: 4
top1: 62.5%


game: 9
=======

white: aaa
result: 1
mean error: 30 cp
blunder: 0
mistake: 2
dubious: 5
top1: 71.1%

black: NITHYASHREE, Saravanan
result: 0
mean error: 27 cp
blunder: 2
mistake: 1
dubious: 6
top1: 47.7%
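As a quick cross-check of the top1 observations, averaging aaa's per-game top1 rates copied from the two tables above shows the U15 mean does indeed come out slightly lower than the Olympiad mean:

```python
# aaa's per-game top1 rates, copied from the two tables above
olympiad = [73.5, 63.9, 62.5, 57.9]                            # FIDE Online Olympiad
u15 = [59.6, 76.5, 59.5, 57.6, 30.8, 67.3, 78.9, 62.5, 71.1]   # U15 girls event

def mean(xs):
    return sum(xs) / len(xs)

print(f"olympiad mean top1: {mean(olympiad):.1f}%")
print(f"u15 mean top1:      {mean(u15):.1f}%")
```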

User avatar
yurikvelo
Posts: 710
Joined: Sat Dec 06, 2014 1:53 pm

Re: Anti-cheating

Post by yurikvelo »

Anti-cheating will work only against a straightforward, dumb cheater who just wants to go from ELO 1600 to 2400 in a few days with single-PV play.

If a cheater employs random blunders, random mistakes, never playing the top MultiPV move, and a long-term variable ELO curve (up and down), you will not find him.
scchess
Posts: 45
Joined: Mon Jan 04, 2021 4:12 pm
Full name: Ted Wong

Re: Anti-cheating

Post by scchess »

yurikvelo wrote: Mon Sep 13, 2021 3:56 pm Anti-cheating will work only against a straightforward, dumb cheater who just wants to go from ELO 1600 to 2400 in a few days with single-PV play.

If a cheater employs random blunders, random mistakes, never playing the top MultiPV move, and a long-term variable ELO curve (up and down), you will not find him.
Wrong. Totally wrong. You have no idea what you're talking about.
Cornfed
Posts: 511
Joined: Sun Apr 26, 2020 11:40 pm
Full name: Brian D. Smith

Re: Anti-cheating

Post by Cornfed »

scchess wrote: Mon Sep 13, 2021 4:05 pm
yurikvelo wrote: Mon Sep 13, 2021 3:56 pm Anti-cheating will work only against a straightforward, dumb cheater who just wants to go from ELO 1600 to 2400 in a few days with single-PV play.

If a cheater employs random blunders, random mistakes, never playing the top MultiPV move, and a long-term variable ELO curve (up and down), you will not find him.
Wrong. Totally wrong. You have no idea what you're talking about.
I find myself agreeing with you once again... and I have heard/seen people who actually do the testing agree with you on this as well. I seem to remember a good article years back in Chess Life by... I think it was GM James Tarjan (?) on the subject of cheat detection; I wish I could find it. I might try the USCF online archive if I can find the time.

The instance cited above - really, 4 games with such play is not enough to be definitive... just perhaps enough to raise a red flag for further investigation.

I know that online I play in a few different ways:

1. To pass the time - often while watching some TV or listening to a podcast... that is to say, not particularly seriously, and against clearly weaker players I will sometimes 'just go bonkers'...
2. Training - when I asked an IM friend of mine a few years back how I specifically could get better, he emphasized 'making things difficult' for myself. I've played him many OTB tournament games, and as long as things are somewhat 'controlled', I play really well. He advised intentionally sacrificing pawns/exchanges and in general going into complications where normally I would not. The idea is to get better in unbalanced situations and to better tune my 'risk/reward' tendencies. I do this quite a lot now and feel more at ease in those positions... I even enjoy them! Someone else of course might be focusing on some other aspect of the game...
3. Games where I really am trying hard from start to finish. Here I run a quick check of my games and often find what would seem to be an unusually good score, via Chessbase 16 or Chess24 analysis, just because those tools are 'easy' and right there for me to look at. BTW, even here I play my decidedly risky/dubious Black defenses - Scandinavian gambits and the Classical Dutch. They are relatively rare systems for my opponents, and playing them lets me further hone my play in them.

To include all 3 types in a cheating analysis with the same weight is obviously wrong... and that does not even account for the obvious fact that your opponents could have their own varied reasons for playing each and every game, sometimes giving you easier games to play. I rarely if ever play tournaments online. Mostly I play 3/0 - and lose too many games on time... as Aimchess is relentless in pointing out. I do occasionally play as slow as 2/12 on ICC... but online chess is perfect for lots of quicker time controls... even the super GMs don't play standard OTB time controls online.

I do think every site should require more than just a few games in their 'cheating search', and ultimately a good human player should be equipped and paid to review the data and the games themselves before declaring that someone has cheated. With so many being caught, that may just be a final level reserved for when the situation is more important - e.g. tournaments, especially where $$ is involved.

On Chess.com my 'daily' rating grew to north of 2300 before I quit at some point for various reasons - ONE of which was how many people were deemed to be 'cheating'. It's insane how many chess.com was catching. In a way I did not mind all the cheaters, because it forced me to play better... but then those tournaments had no $$ involved. I would have sorely hated the situation if there had been.

I might add that I think one should be required to obtain a premium membership to play any online tournament where $$ is involved.
That alone should help discourage cheaters. If they get to play with a cheap or free membership, they have little to lose versus the possible reward. They should be allowed to play in these tournaments ONLY after completing a hefty number of games on the site, which the site would then use as data to compare one's $$ tournament games against. To paraphrase a saying I heard in an old movie about some savvy thieves: 'You don't crap where you work.'
Last edited by Cornfed on Mon Sep 13, 2021 6:19 pm, edited 2 times in total.
scchess
Posts: 45
Joined: Mon Jan 04, 2021 4:12 pm
Full name: Ted Wong

Re: Anti-cheating

Post by scchess »

Yes, there is no way to catch a cheater from just one excellent chess game. Actually we could - it would improve the sensitivity of the model - but then the false-positive rate would also skyrocket.

Anti-cheating is just another form of credit-card fraud detection. How does your credit-card company work out whether a transaction is fraudulent? They most likely keep a database of known fraudulent and legitimate transactions and keep improving whatever model is being used; it could be something like a deep-learning CNN. Each time there is a new model, they re-run it on the known data to work out the false-positive rate and any other metrics that could be useful. I think this process is known as backtesting.
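A bare-bones sketch of that backtesting bookkeeping. The labels and predictions below are made up; the point is only how the false-positive rate falls out of the confusion counts:

```python
def confusion(labels, predicted):
    """Count true/false positives and negatives for a detector
    evaluated against known ground-truth labels (1 = cheater)."""
    tp = sum(1 for y, p in zip(labels, predicted) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(labels, predicted) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(labels, predicted) if y == 1 and p == 0)
    tn = sum(1 for y, p in zip(labels, predicted) if y == 0 and p == 0)
    return tp, fp, fn, tn

# toy ground truth vs. a candidate model's verdicts
labels    = [1, 1, 0, 0, 0, 1, 0, 0]
predicted = [1, 0, 0, 1, 0, 1, 0, 0]
tp, fp, fn, tn = confusion(labels, predicted)
false_positive_rate = fp / (fp + tn)   # fraction of innocent players flagged
```

Sweeping the model's decision threshold and recomputing these counts at each point is exactly how a ROC curve is plotted.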
User avatar
towforce
Posts: 11554
Joined: Thu Mar 09, 2006 12:57 am
Location: Birmingham UK

Re: Anti-cheating

Post by towforce »

scchess wrote: Mon Sep 13, 2021 2:23 am A simple toy example is the "z-score". Given the null hypothesis that you're not a cheater, how does your performance stand against everybody else in your rating group on a scaled normal distribution? Are we confident that you're a cheater because your performance is 5 standard deviations away? What does the false-positive rate look like if the rule is 5 SD, and what does it look like at 3 SD? Can we plot something like a ROC curve? What would the confusion matrix look like?

How would that avoid a false-positive in the case of a player who has greatly improved their play since the last game?
Writing is the antidote to confusion.
It's not "how smart you are", it's "how are you smart".
Your brain doesn't work the way you want, so train it!
scchess
Posts: 45
Joined: Mon Jan 04, 2021 4:12 pm
Full name: Ted Wong

Re: Anti-cheating

Post by scchess »

towforce wrote: Mon Sep 13, 2021 6:11 pm
scchess wrote: Mon Sep 13, 2021 2:23 am A simple toy example is the "z-score". Given the null hypothesis that you're not a cheater, how does your performance stand against everybody else in your rating group on a scaled normal distribution? Are we confident that you're a cheater because your performance is 5 standard deviations away? What does the false-positive rate look like if the rule is 5 SD, and what does it look like at 3 SD? Can we plot something like a ROC curve? What would the confusion matrix look like?

How would that avoid a false-positive in the case of a player who has greatly improved their play since the last game?
That's why we need to factor in the confidence we have in one's ELO rating, the performance history, and the speed of human chess learning. Humans are not LC0; there is no reinforcement learning that can greatly improve a human's chess in just a few days. Even a super-talented junior needs years to reach GM level. Humans, no matter how talented, have a limit to how fast they can learn. They are not machines.