Correlation test reliability concerning clone issue.

Aaron Becker
Posts: 292
Joined: Tue Jul 07, 2009 4:56 am

Re: Correlation test reliability concerning clone issue.

Post by Aaron Becker »

Statistics does not only apply to random events. It can also tell us things about populations (or distributions) of individuals, even when there's nothing random about the individuals themselves. In fact, I would say that this is its main use.

In our case, consider the population of all legal chess positions. If we knew how each engine responded to every position, we would be able to make very strong statements about how closely two engines are related (although, I would argue, this method would never provide proof of cloning). Clearly this is impossible to execute in practice, so instead we draw positions from this population and use statistical methods to compare the engines. There's nothing fundamentally unsound about it, as long as you're careful not to make any claims that aren't supported by the evidence.
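
As a rough sketch of the sampling idea above: the two "engines" below are hypothetical deterministic functions (nothing like real chess engines), the population is a small range of integer position ids, and the Wilson score interval is just one common choice for putting an interval on a proportion. None of this is the actual correlation test discussed in the thread.

```python
import math
import random

def engine_a(position: int) -> int:
    """Hypothetical deterministic engine: maps a position id to a move id."""
    return (position * 7 + 3) % 5

def engine_b(position: int) -> int:
    """Second hypothetical engine; uses a different rule on every third position."""
    if position % 3:
        return engine_a(position)
    return (position * 11 + 1) % 5

def wilson_interval(matches: int, n: int, z: float = 1.96) -> tuple:
    """Wilson score interval for a proportion (z = 1.96 gives roughly 95%)."""
    p = matches / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return centre - half, centre + half

def sample_agreement(population, sample_size: int, rng: random.Random) -> int:
    """Count positions in a random sample on which both engines pick the same move."""
    sample = rng.sample(population, sample_size)
    return sum(engine_a(p) == engine_b(p) for p in sample)

population = range(150_000)   # stand-in for "all positions"
rng = random.Random(0)        # fixed seed so the run is repeatable
n = 2_000
matches = sample_agreement(population, n, rng)
low, high = wilson_interval(matches, n)

# Only possible here because the toy population is small enough to enumerate.
true_rate = sum(engine_a(p) == engine_b(p) for p in population) / len(population)
print(f"sample agreement: {matches / n:.3f}, 95% interval: [{low:.3f}, {high:.3f}]")
print(f"true agreement:   {true_rate:.3f}")
```

Nothing about the engines is random here; the only randomness is in which positions get drawn, and that is what the interval describes.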

More to your point, it's absolutely not true that "with statistics you can prove anything you want". Math is math: you can only prove things that are true. You can convince a lot of people of things that aren't true if they don't understand statistics themselves and you feed them unsound conclusions, but that's hardly statistics' fault.
MattieShoes
Posts: 718
Joined: Fri Mar 20, 2009 8:59 pm

Re: Correlation test reliability concerning clone issue.

Post by MattieShoes »

I'm not a statistics guru but I think they generally don't "prove" things in the mathematical sense. Take the famous "is this coin fair" example with a bunch of flips. Even if you get 1000 heads in 1000 flips, you can't say it's proven to be unfair, just that it's incredibly likely to be unfair -- there is a vanishingly small chance that a completely fair coin could give those results. Sort of like the justice system's "beyond a reasonable doubt".
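
To put a number on "vanishingly small", here is a minimal sketch that computes the exact chance of a fair coin producing at least a given number of heads. The function name is made up for illustration, and the flips are assumed to be independent.

```python
from fractions import Fraction
from math import comb

def prob_at_least(heads: int, flips: int) -> Fraction:
    """Exact probability that a fair coin shows at least `heads` heads in `flips` flips."""
    favourable = sum(comb(flips, k) for k in range(heads, flips + 1))
    return Fraction(favourable, 2 ** flips)

p_all_heads = prob_at_least(1000, 1000)
print(float(p_all_heads))  # about 9.3e-302: tiny, but not zero
```

The result is strictly greater than zero, which is exactly why the flips are overwhelming evidence rather than proof.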

So what the test is saying is something more along the lines of "we're 95% sure that engine A is more similar to engine B than would normally be expected" or some such. To "prove" clonishness, you'd have to look at the source or decompile the program, something like that.

Given the work involved in proving such a claim, the tool could certainly help for indicating which engines deserve a harder look.
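
One way the "more similar than would normally be expected" statement could be made concrete is a one-sided binomial test. This is only a sketch: the baseline match rate of 0.55 for unrelated engines and the counts of 620 matches in 1000 positions are invented numbers, and treating positions as independent trials is an assumption.

```python
from math import exp, lgamma, log

def binomial_tail(matches: int, n: int, baseline: float) -> float:
    """P(X >= matches) for X ~ Binomial(n, baseline), with 0 < baseline < 1.

    Terms are computed in log space so large n does not overflow.
    """
    total = 0.0
    for k in range(matches, n + 1):
        log_pmf = (lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
                   + k * log(baseline) + (n - k) * log(1 - baseline))
        total += exp(log_pmf)
    return total

# Hypothetical numbers: 620 identical moves out of 1000 sampled positions,
# against an assumed baseline match rate of 0.55 between unrelated engines.
p_value = binomial_tail(620, 1000, 0.55)
print(f"p-value under the assumed baseline: {p_value:.2e}")
```

A small p-value here only says the match rate is unusual relative to the assumed baseline. It flags a pair of engines for a harder look and proves nothing about cloning.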
Hood
Posts: 657
Joined: Mon Feb 08, 2010 12:52 pm
Location: Polska, Warszawa

Re: Correlation test reliability concerning clone issue.

Post by Hood »

"More to your point it's absolutely not true that "with statistics you can prove anything you want". Math is math, you can only prove things that are true. You can convince a lot of people of things that aren't true if they don't understand statistics themselves and you convince them of your unsound conclusions, but that's hardly statistics' fault."

I don't put much faith in statistics, as you can tell :-)
The premise p is its weak point.
You have to decide (that is, assume) whether the variable follows a Gaussian or some other distribution. A blind believer in statistics can do a lot of harm.

Example:
The consulting company McKinley prepared a special method for evaluating office workers. They assumed that people's work performance follows a Gaussian distribution, so they decided that evaluations should be assigned according to it: the number of people who receive the highest evaluation is set by the 1 sigma range of the Gaussian distribution.
The number of people who were to receive the lowest grade was set by the Gaussian distribution as well.
The consequence was that the manager had to act on those principles and assign evaluations to the workers every month. He had to assign the grades according to the Gaussian distribution, so there had to be a certain percentage of highest, medium and lowest grades. It was possible, and it did happen, that even people who worked well had to receive a low evaluation to fulfil the distribution, and vice versa.

It would be funny if not for the further consequence: after receiving a few low grades, someone was in the queue to be fired.

How do you like that?

That showed me that we have to be very careful with statistics.
Every time you want to use statistics, try to imagine that you are the subject of those statistics!

Rgds
Chris
Polish National tragedy in Smoleńsk. President and all delegation murdered or killed.
Cui bono ?

There are not bugs free programs.
There are programs with undiscovered bugs.




Ashes to ashes dust to dust. Alleluia.
Hood
Posts: 657
Joined: Mon Feb 08, 2010 12:52 pm
Location: Polska, Warszawa

Re: Correlation test reliability concerning clone issue.

Post by Hood »

MattieShoes wrote: To "prove" clonishness, you'd have to look at the source or decompile the program, something like that.
I agree that only comparing the sources can be proof.

Disassembled code would be more of a statistical animal ;-) in my opinion.

Rgds
Chris
MattieShoes
Posts: 718
Joined: Fri Mar 20, 2009 8:59 pm

Re: Correlation test reliability concerning clone issue.

Post by MattieShoes »

Hood wrote:"More to your point it's absolutely not true that "with statistics you can prove anything you want". Math is math, you can only prove things that are true. You can convince a lot of people of things that aren't true if they don't understand statistics themselves and you convince them of your unsound conclusions, but that's hardly statistics' fault."

I don't put much faith in statistics, as you can tell :-)
The premise p is its weak point.
You have to decide (that is, assume) whether the variable follows a Gaussian or some other distribution. A blind believer in statistics can do a lot of harm.

Example:
The consulting company McKinley prepared a special method for evaluating office workers. They assumed that people's work performance follows a Gaussian distribution, so they decided that evaluations should be assigned according to it: the number of people who receive the highest evaluation is set by the 1 sigma range of the Gaussian distribution.
The number of people who were to receive the lowest grade was set by the Gaussian distribution as well.
The consequence was that the manager had to act on those principles and assign evaluations to the workers every month. He had to assign the grades according to the Gaussian distribution, so there had to be a certain percentage of highest, medium and lowest grades. It was possible, and it did happen, that even people who worked well had to receive a low evaluation to fulfil the distribution, and vice versa.

It would be funny if not for the further consequence: after receiving a few low grades, someone was in the queue to be fired.

How do you like that?

That showed me that we have to be very careful with statistics.
Every time you want to use statistics, try to imagine that you are the subject of those statistics!

Rgds
Chris
Given a large enough sample (very large), a certain percentage will deserve to be fired. Jack Welch was famous for this at GE, firing the bottom 10% of his managers and giving large bonuses and stock options to the top 20%. GE flourished during his tenure.

The bigger issue I think would be getting honest evaluations from managers. I've worked for men who never gave bad reviews of attractive women. I've also worked for women who would always give bad reviews of other women but let males who were slacking off get away with it.
Aaron Becker
Posts: 292
Joined: Tue Jul 07, 2009 4:56 am

Re: Correlation test reliability concerning clone issue.

Post by Aaron Becker »

Hood wrote: It would be funny if not for the further consequence: after receiving a few low grades, someone was in the queue to be fired.

How do you like that?

That showed me that we have to be very careful with statistics.
Every time you want to use statistics, try to imagine that you are the subject of those statistics!

Rgds
Chris
Do you have specific problems with this method? I certainly agree that statistics can be abused, but I don't think that's what's happening here. Of course we should be careful, but we should also not dismiss useful methods just because statistics have been abused by others.
Hood
Posts: 657
Joined: Mon Feb 08, 2010 12:52 pm
Location: Polska, Warszawa

Re: Correlation test reliability concerning clone issue.

Post by Hood »

"Do you have specific problems with this method? I certainly agree that statistics can be abused, but I don't think that's what's happening here. "

Yes, I do. It assumes that there have to be (!) bad workers in a particular group, even a small one, without checking whether the assumptions are correct.

It is like saying that because 40% of the people on Earth have yellow skin, 40% of the posters on this forum must have yellow skin!
But that can be totally absurd.
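
A small sketch of how far a small group can drift from the population rate, using the 40% figure from the analogy above and a generic "trait". The group size of 10 is made up, and the group is assumed to be a random draw from the population, which a forum or an office team certainly is not.

```python
from math import comb

def group_count_distribution(group_size: int, population_rate: float) -> list:
    """Binomial probabilities of 0..group_size members of a random group having a trait."""
    return [
        comb(group_size, k)
        * population_rate ** k
        * (1 - population_rate) ** (group_size - k)
        for k in range(group_size + 1)
    ]

dist = group_count_distribution(10, 0.40)
print(f"P(exactly 4 of 10 have the trait) = {dist[4]:.3f}")
print(f"P(none of 10 have the trait)      = {dist[0]:.4f}")
```

Even under the random-draw assumption, only about a quarter of such groups of 10 match the population rate exactly, so forcing the population's proportions onto a small group is not justified by the arithmetic.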

Another statistical manipulation is the average salary in some population.

The average salary can be enough to live on, but in practice a lot of people can be dying of hunger.
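
A toy illustration of the salary point, with invented numbers: ten hypothetical monthly salaries and a hypothetical subsistence level of 600, in whatever currency you like.

```python
from statistics import mean, median

# Hypothetical monthly salaries: most people earn little, one earns a lot.
salaries = [400] * 7 + [900, 1200, 9000]
subsistence = 600  # hypothetical minimum needed to live on

average = mean(salaries)
middle = median(salaries)
below = sum(s < subsistence for s in salaries)

print(f"mean salary:   {average}")
print(f"median salary: {middle}")
print(f"{below} of {len(salaries)} people are below the subsistence level")
```

The mean sits comfortably above the subsistence level while most of the group is below it. The median tells a different story, which is why the choice of summary matters.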

Most people use statistical tools with no idea what they are for or when they can be used.

Rgds
Chris
Last edited by Hood on Sun Feb 14, 2010 1:02 pm, edited 2 times in total.
Hood
Posts: 657
Joined: Mon Feb 08, 2010 12:52 pm
Location: Polska, Warszawa

Re: Correlation test reliability concerning clone issue.

Post by Hood »

MattieShoes wrote:
Given a large enough sample (very large), a certain percentage will deserve to be fired. Jack Welch was famous for this at GE, firing the bottom 10% of his managers and giving large bonuses and stock options to the top 20%. GE flourished during his tenure.

The bigger issue I think would be getting honest evaluations from managers. I've worked for men who never gave bad reviews of attractive women. I've also worked for women who would always give bad reviews of other women but let males who were slacking off get away with it.
I do not agree that a certain percentage has to be fired...
People's work is not a statistical animal ;-) and cannot be treated as a distribution.
What happened at GE is robbery.

Proper evaluation by managers is a separate but important matter.

Rgds
Chris
Aaron Becker
Posts: 292
Joined: Tue Jul 07, 2009 4:56 am

Re: Correlation test reliability concerning clone issue.

Post by Aaron Becker »

Hood wrote:"Do you have specific problems with this method? I certainly agree that statistics can be abused, but I don't think that's what's happening here. "

Yes, I do. It assumes that there have to be (!) bad workers in a particular group, even a small one, without checking whether the assumptions are correct.

It is like saying that because 40% of the people on Earth have yellow skin, 40% of the posters on this forum must have yellow skin!
But that can be totally absurd.

Another statistical manipulation is the average salary in some population.

Rgds
Chris
Sorry, I must not have been clear. When I said "this method", I meant the move correlation test that is the subject of the thread, not your obviously objectionable example.
Hood
Posts: 657
Joined: Mon Feb 08, 2010 12:52 pm
Location: Polska, Warszawa

Re: Correlation test reliability concerning clone issue.

Post by Hood »

"Sorry, I must not have been clear. When I said "this method", I meant the move correlation test that is the subject of the thread, not your obviously objectionable example."

OK, that was a misunderstanding between us.

Statistics has its origin in probability theory, which was developed for random phenomena. Using it for non-random phenomena is very risky and mostly gives wrong results.

I can agree that coin flipping gives random results, but an algorithm choosing moves is not a random process and cannot be treated with statistics. It would be if the algorithm chose its move on the basis of a random function.

It is always the problem with 'p': if p is false, then any result is true.

Rgds
Chris