Yeah. The ideas are good. Also, features that are not related to the game itself such as ELO history, when was the account created etc.Ferdy wrote: ↑Mon Sep 13, 2021 9:26 amThat repo is interesting I have not read the code, but what I have in mind is to identify the features that will be used to detect cheaters. So the usual evaluation errors must be there, the stats on top 3 best moves or so, the number of blunders, mistakes, dubious moves, the number of difficult/complex positions where the best moves are found, high number of good moves found in complex positions with low time left and others. Then at the end we need a label of whether this set of features has 1 for a cheater or 0 for non cheater or a range of values from 1 to 0, where closer to 1 has a higher probability of cheating, then build a model from it, extract the weights of those features and apply those weights to a game (to be analyzed to get the features) that will be under examination for cheating.scchess wrote: ↑Mon Sep 13, 2021 3:03 amWhile I don't have lichess's data. It ---- looks ---- like the model is predicting if someone is a cheater or not, a binary decision. For example, https://github.com/clarkerubber/irwin/b ... ng.py#L105 looks like forming an array of binary labels. The loss function is binary_crossentropy (https://github.com/clarkerubber/irwin/b ... del.py#L81) so the assumption looks correct.scchess wrote: ↑Mon Sep 13, 2021 2:11 amCorrect. Lichess, true to the spirit, did publish the source code open. However, it's useless to the outside world in practice. There is no documentation the input features and what exactly the model is predicting. More importantly, there is no mention at the data format.Guenther wrote: ↑Sun Sep 12, 2021 7:38 pmWell, that's exactly what they did for lichess.
https://github.com/clarkerubber/irwin
By far the most important in the project is the input data. It looks like the project requires a connection to lichess' labelled cheater in their DB (https://github.com/clarkerubber/irwin/b ... ing.py#L42). The input data seems to be analyzed PGN data from Fishnet (https://github.com/clarkerubber/irwin/b ... me.py#L163).
For the actual model, it looks like an LSTM model. https://github.com/clarkerubber/irwin/b ... meModel.py. The model looks alright if the input data is game moves, because the game moves indeed form a sequence.
Overall, it looks like the project takes game analysis data from Fishnet to form a binary classification. Doesn't look like it takes the player's performance history, but I could be wrong. There's definitely more cheating code somewhere else in the main part of the lichess project.
Another idea is to create models based on rating range, so there can be a model to evaluate players in the rating range 1000 to 1500 or so.
Anti-cheating
Moderators: hgm, Rebel, chrisw
-
- Posts: 45
- Joined: Mon Jan 04, 2021 4:12 pm
- Full name: Ted Wong
Re: Anti-cheating
-
- Posts: 4840
- Joined: Sun Aug 10, 2008 3:15 pm
- Location: Philippines
Re: Anti-cheating
Make a model with keras and TF from random data, with basic features such as mean error, blunder counts and others and make the model predict on some data after training.
Generate 1000 random data points, there are 50 samples below.
Input layer has 5 features (meanerror, blunder, mistake, dubious, top1pct) there are 2 hidden layers used and output layer has one node with sigmoid activation function.
The model is then evaluated getting an accuracy 97.90%
Tests on some data of how the model will predict.
The input has [meanerror, blunder, mistake, dubious, top1pct]
There is 1 false positive in 20 sampling tests.
Generate 1000 random data points, there are 50 samples below.
Code: Select all
meanerror blunder mistake dubious top1pct cheat
0 166 1 14 2 25.0 0
1 48 1 19 1 34.0 0
2 132 7 10 5 24.0 0
3 172 0 6 5 64.0 1
4 109 10 16 21 0.0 0
5 124 8 18 24 91.0 0
6 48 8 15 11 28.0 0
7 86 7 0 20 46.0 0
8 153 2 0 41 74.0 0
9 12 1 4 15 52.0 0
10 173 0 12 36 19.0 1
11 16 3 8 14 38.0 0
12 197 4 16 19 84.0 0
13 178 5 17 50 8.0 0
14 75 9 20 25 91.0 0
15 22 5 8 37 15.0 0
16 69 5 10 36 69.0 0
17 97 7 8 37 41.0 0
18 171 7 10 29 55.0 0
19 47 4 1 13 49.0 0
20 48 1 1 48 89.0 1
21 118 5 20 26 58.0 0
22 3 5 15 38 67.0 0
23 104 9 12 39 46.0 0
24 98 9 14 47 52.0 0
25 161 1 4 45 54.0 0
26 56 9 1 45 57.0 0
27 48 2 10 30 26.0 0
28 193 0 0 5 91.0 1
29 151 8 0 36 64.0 0
30 138 5 19 18 42.0 0
31 118 2 3 40 65.0 0
32 126 1 10 11 74.0 0
33 157 7 20 2 69.0 0
34 145 1 17 20 84.0 0
35 162 7 20 46 92.0 0
36 89 10 7 24 15.0 0
37 91 4 1 47 87.0 0
38 32 1 7 8 39.0 0
39 199 6 13 18 31.0 0
40 51 2 0 25 72.0 1
41 109 1 20 10 29.0 0
42 8 10 20 4 31.0 0
43 152 8 18 33 90.0 0
44 130 3 16 41 98.0 0
45 59 0 10 18 38.0 1
46 41 6 7 42 96.0 0
47 2 9 18 12 23.0 0
48 87 0 13 8 40.0 1
49 25 10 17 45 22.0 0
Code: Select all
model = Sequential()
model.add(Dense(8, input_dim=5, activation='relu'))
model.add(Dense(5, activation='relu'))
model.add(Dense(1, activation='sigmoid'))
# compile
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
# fit data
model.fit(X, y, epochs=epoch, batch_size=32, verbose=0)
Tests on some data of how the model will predict.
The input has [meanerror, blunder, mistake, dubious, top1pct]
Code: Select all
input: [166.0, 1.0, 14.0, 2.0, 25.0], model_prediction: 0.32107 (actual 0.0)
input: [48.0, 1.0, 19.0, 1.0, 34.0], model_prediction: 0.24669 (actual 0.0)
input: [132.0, 7.0, 10.0, 5.0, 24.0], model_prediction: 0.00001 (actual 0.0)
input: [172.0, 0.0, 6.0, 5.0, 64.0], model_prediction: 0.79535 (actual 1.0)
input: [109.0, 10.0, 16.0, 21.0, 0.0], model_prediction: 0.00000 (actual 0.0)
input: [124.0, 8.0, 18.0, 24.0, 91.0], model_prediction: 0.00001 (actual 0.0)
input: [48.0, 8.0, 15.0, 11.0, 28.0], model_prediction: 0.00006 (actual 0.0)
input: [86.0, 7.0, 0.0, 20.0, 46.0], model_prediction: 0.00006 (actual 0.0)
input: [153.0, 2.0, 0.0, 41.0, 74.0], model_prediction: 0.13261 (actual 0.0)
input: [12.0, 1.0, 4.0, 15.0, 52.0], model_prediction: 0.96744 (actual 0.0)
input: [173.0, 0.0, 12.0, 36.0, 19.0], model_prediction: 0.73907 (actual 1.0)
input: [16.0, 3.0, 8.0, 14.0, 38.0], model_prediction: 0.02528 (actual 0.0)
input: [197.0, 4.0, 16.0, 19.0, 84.0], model_prediction: 0.00395 (actual 0.0)
input: [178.0, 5.0, 17.0, 50.0, 8.0], model_prediction: 0.00002 (actual 0.0)
input: [75.0, 9.0, 20.0, 25.0, 91.0], model_prediction: 0.00000 (actual 0.0)
input: [22.0, 5.0, 8.0, 37.0, 15.0], model_prediction: 0.00000 (actual 0.0)
input: [69.0, 5.0, 10.0, 36.0, 69.0], model_prediction: 0.00102 (actual 0.0)
input: [97.0, 7.0, 8.0, 37.0, 41.0], model_prediction: 0.00006 (actual 0.0)
input: [171.0, 7.0, 10.0, 29.0, 55.0], model_prediction: 0.00002 (actual 0.0)
input: [47.0, 4.0, 1.0, 13.0, 49.0], model_prediction: 0.00313 (actual 0.0)
Code: Select all
input: [12.0, 1.0, 4.0, 15.0, 52.0], model_prediction: 0.96744 (actual 0.0)
-
- Posts: 4840
- Joined: Sun Aug 10, 2008 3:15 pm
- Location: Philippines
Re: Anti-cheating
History is indeed useful, as generally the strength of players will stabilize at some point, and after that there could be improvements but only by a small increment, sudden change in the trend could be a suspect. Here is a plot where p2 player seems to be doing something.scchess wrote: ↑Mon Sep 13, 2021 6:33 pmYeah. The ideas are good. Also, features that are not related to the game itself such as ELO history, when was the account created etc.Ferdy wrote: ↑Mon Sep 13, 2021 9:26 amThat repo is interesting I have not read the code, but what I have in mind is to identify the features that will be used to detect cheaters. So the usual evaluation errors must be there, the stats on top 3 best moves or so, the number of blunders, mistakes, dubious moves, the number of difficult/complex positions where the best moves are found, high number of good moves found in complex positions with low time left and others. Then at the end we need a label of whether this set of features has 1 for a cheater or 0 for non cheater or a range of values from 1 to 0, where closer to 1 has a higher probability of cheating, then build a model from it, extract the weights of those features and apply those weights to a game (to be analyzed to get the features) that will be under examination for cheating.scchess wrote: ↑Mon Sep 13, 2021 3:03 amWhile I don't have lichess's data. It ---- looks ---- like the model is predicting if someone is a cheater or not, a binary decision. For example, https://github.com/clarkerubber/irwin/b ... ng.py#L105 looks like forming an array of binary labels. The loss function is binary_crossentropy (https://github.com/clarkerubber/irwin/b ... del.py#L81) so the assumption looks correct.scchess wrote: ↑Mon Sep 13, 2021 2:11 amCorrect. Lichess, true to the spirit, did publish the source code open. However, it's useless to the outside world in practice. There is no documentation the input features and what exactly the model is predicting. More importantly, there is no mention at the data format.Guenther wrote: ↑Sun Sep 12, 2021 7:38 pmWell, that's exactly what they did for lichess.
https://github.com/clarkerubber/irwin
By far the most important in the project is the input data. It looks like the project requires a connection to lichess' labelled cheater in their DB (https://github.com/clarkerubber/irwin/b ... ing.py#L42). The input data seems to be analyzed PGN data from Fishnet (https://github.com/clarkerubber/irwin/b ... me.py#L163).
For the actual model, it looks like an LSTM model. https://github.com/clarkerubber/irwin/b ... meModel.py. The model looks alright if the input data is game moves, because the game moves indeed form a sequence.
Overall, it looks like the project takes game analysis data from Fishnet to form a binary classification. Doesn't look like it takes the player's performance history, but I could be wrong. There's definitely more cheating code somewhere else in the main part of the lichess project.
Another idea is to create models based on rating range, so there can be a model to evaluate players in the rating range 1000 to 1500 or so.
-
- Posts: 511
- Joined: Sun Apr 26, 2020 11:40 pm
- Full name: Brian D. Smith
Re: Anti-cheating
Not sure I understand all this, but am curious as to if/how you account for the situation in my last post...the 3 different...lets call them 'moods' I play in. Basically:Ferdy wrote: ↑Tue Sep 14, 2021 5:40 am
History is indeed useful, as generally the strength of players will stabilize at some point, and after that there could be improvements but only by a small increment, sudden change in the trend could be a suspect. Here is a plot where p2 player seems to be doing something.
1. Passing the time
2. Training by ambitious even risky play
3. Serious play.
-
- Posts: 4840
- Joined: Sun Aug 10, 2008 3:15 pm
- Location: Philippines
Re: Anti-cheating
In the passing time mode you might lose some rating points. In the training mode, it could be close to equal. In the serious mode you might increase your rating. All these will not increase or decrease your rating by a wide increment/decrement, especially when you already have some games played, so by rating history criteria you are just fine.Cornfed wrote: ↑Tue Sep 14, 2021 6:37 amNot sure I understand all this, but am curious as to if/how you account for the situation in my last post...the 3 different...lets call them 'moods' I play in. Basically:Ferdy wrote: ↑Tue Sep 14, 2021 5:40 am
History is indeed useful, as generally the strength of players will stabilize at some point, and after that there could be improvements but only by a small increment, sudden change in the trend could be a suspect. Here is a plot where p2 player seems to be doing something.
1. Passing the time
2. Training by ambitious even risky play
3. Serious play.
At some point the strength of a player will stabilize at certain rating range. For example my lichess blitz rating is around 2200 but so far I find it difficult to hold at that rating, I can go 2100 to 2250. I can hold at 2100 and below 2200, I need more training to hold the 2200 and above. If I play unfairly and suddenly (after 3 months or so) can hold at 2300 comfortably I could become a suspect.
-
- Posts: 4840
- Joined: Sun Aug 10, 2008 3:15 pm
- Location: Philippines
Re: Anti-cheating
With clock or time left info in the game notation, time per move or movetime can be calculated.
Most novice cheater plays with almost constant movetime. By calculating the stdev, mean and CV - coefficient of variation or stdev/mean without looking at the plot, one can tell that a player is probably doing something wrong if movetime CV is below 1.
Example white player c is probably using an engine, its CV is 0.4, low mean error, zero blunder and low mistake count and won the game.
Movetime plot in the actual game.
Code: Select all
1. e4 { [%clk 0:03:00] } 1... e5 { [%clk 0:03:00] } 2. Nf3 { [%clk 0:02:56] } 2... Nc6 { [%clk 0:03:00] } ...
Example white player c is probably using an engine, its CV is 0.4, low mean error, zero blunder and low mistake count and won the game.
Code: Select all
white: c
result: 1
mean error: 22.2 cp
blunder: 0
mistake: 1
dubious: 9
top1: 50.0%
movetime stdev: 1.3 sec
movetime min: 0.0 sec
movetime max: 7.0 sec
movetime mean: 3.0 sec
movetime CV (coefficient of variation): 0.4
black: p
result: 0
mean error: 60.8 cp
blunder: 0
mistake: 5
dubious: 5
top1: 56.5%
movetime stdev: 5.5 sec
movetime min: 0.0 sec
movetime max: 22.0 sec
movetime mean: 4.5 sec
movetime CV (coefficient of variation): 1.2
Movetime plot in the actual game.
-
- Posts: 4840
- Joined: Sun Aug 10, 2008 3:15 pm
- Location: Philippines
Re: Anti-cheating
Detecting anomalies in rating history. Plot the actual rating, take the mean and stdev from the rolling 14 days (take the mean in 14 days ...), plot the mean with confidence interval (CI) at twice the stdev. The blue is the actual rating, the green is the mean, the red covers the confidence interval.
Any point outside the CI could be a suspect (excluding at the start). In the plot there are 3 locations that seems to be outliers. That is possible when you beat a higher rated player in a tournament for example so you gain more rating. It is also possible that one is cheating every now and then especially when there is tournament with prizes.
Zooming on the three locations.
Any point outside the CI could be a suspect (excluding at the start). In the plot there are 3 locations that seems to be outliers. That is possible when you beat a higher rated player in a tournament for example so you gain more rating. It is also possible that one is cheating every now and then especially when there is tournament with prizes.
Zooming on the three locations.
-
- Posts: 710
- Joined: Sat Dec 06, 2014 1:53 pm
Re: Anti-cheating
I have no idea about your Target audience, what is their motivation and what is typical "portrait of a criminal" you are going to fight with.
Dumb cheater will have animalities in rating, and animalities in in particular game which doesn't correspond to human game at all, or to humans with his particular ELO rating at the moment.
I guess 99% of your target audience are extremely unmotivated cheaters (occasional by-passers) with zero intention to spend time or money on developing cheating algorithm.
Motivated cheater will have long-term strategy for ELO (can be copy-pasted from some other person known as non-cheater), strategy on blunders, mistakes, never-top-moves, strategy on move time.
-
- Posts: 511
- Joined: Sun Apr 26, 2020 11:40 pm
- Full name: Brian D. Smith
Re: Anti-cheating
Let him play enough games...and he will get caught.yurikvelo wrote: ↑Thu Sep 16, 2021 6:44 pmI have no idea about your Target audience, what is their motivation and what is typical "portrait of a criminal" you are going to fight with.
Dumb cheater will have animalities in rating, and animalities in in particular game which doesn't correspond to human game at all, or to humans with his particular ELO rating at the moment.
I guess 99% of your target audience are extremely unmotivated cheaters (occasional by-passers) with zero intention to spend time or money on developing cheating algorithm.
Motivated cheater will have long-term strategy for ELO (can be copy-pasted from some other person known as non-cheater), strategy on blunders, mistakes, never-top-moves, strategy on move time.
And I've always thought it was good (where applicable) to sort of cross reference one's online results with their OTB player. An 1800 OTB player is not likely to be a 2600 Online player...
-
- Posts: 45
- Joined: Mon Jan 04, 2021 4:12 pm
- Full name: Ted Wong
Re: Anti-cheating
It's all about sensitivity vs specificity, as in life nothing will ever be perfect. A very powerful sensitive algorithm will generate a lot of false positives. Not giving any false positives will just mean do nothing. As I mentioned repeatedly, this field is much more than just engine evaluations, it's more like a statistical problem where controlling false positive rates is critical here. Also, anyone who has a plan of doing such a thing may go below the sensitivity threshold, but the time spent against benefits won't match up. There're a lot of things to do in life than such a long tedious plan.yurikvelo wrote: ↑Thu Sep 16, 2021 6:44 pmI have no idea about your Target audience, what is their motivation and what is typical "portrait of a criminal" you are going to fight with.
Dumb cheater will have animalities in rating, and animalities in in particular game which doesn't correspond to human game at all, or to humans with his particular ELO rating at the moment.
I guess 99% of your target audience are extremely unmotivated cheaters (occasional by-passers) with zero intention to spend time or money on developing cheating algorithm.
Motivated cheater will have long-term strategy for ELO (can be copy-pasted from some other person known as non-cheater), strategy on blunders, mistakes, never-top-moves, strategy on move time.