Quick ratings estimate?

Carey · Post by **Carey** » Mon Jul 28, 2008 5:50 pm

Is there any semi-reliable way to get a ratings estimate with minimal work?

Say +/- 100 ratings points?

I'm thinking of the old Bratko-Kopec test. I don't have the full paper unfortunately (what's on Mr. Bratko's website is incomplete), but I know they attempted to come up with a way to get a rough rating for both people and programs.

I know there were several later papers revising & revisiting their work. Trying to correct errors and make adjustments to the ratings.

I am not, however, familiar with any later research on this. I've never bothered to look.

I'm just thinking about how to get a score that's within +/- 100 points or something. Where you can take a couple dozen programs, run some quick automated tests and rank them into general categories of strength. (Closer would be better, provided the amount of work would be minimal.)

Something where if somebody claims their program is 2800 or stronger than XYZ and you can run a 20 minute test and say, "Nope, it's 2100 +/-100 with 95% probability." kind of thing.

I know there have been a number of discussions on doing testing with high accuracy and at various time controls and with various ratings methods etc.

I'm not wanting to get into those kinds of discussions and I'm not caring about those kinds of accuracies anyway.

I'm just interested in rough estimates with minimal effort.

Incidentally, the subject of rating your chess program and doing automated testing and the accuracy levels needed, etc. would be a great subject for the chess programming wiki. All it has is basic testing for bugs kind of stuff.

cyberfish · Post by **cyberfish** » Wed Jul 30, 2008 7:36 am

Get an FICS computer account. Run the program automated on there for a night or two, and you will get a reasonable estimate. Not to mention the fun of watching your program slaughtering real humans, or being slaughtered, as in my case.

Carey · Post by **Carey** » Wed Jul 30, 2008 4:08 pm

Not practical.

#1 I'm wanting more of a process to get quick estimates, than the results.

#2 It's not just for me.

#3 some are stand-alone systems and not programs, meaning I'd have to sit there and manually make moves.

#4 some of the programs aren't mine and aren't designed for anything but human play. Meaning I'd have to sit there and manually make the moves.

#5 I'm just looking for some quick estimates, without having to play lots of games. I'm quite willing to sacrifice accuracy for speed.

The old Bratko-Kopec method was great, except the original process was poorly designed and the results were overly optimistic.

Some of the later modifications fixed some stuff, but I'm still not so sure how reliable it is. Even for my relaxed requirements.

Since I'm not very familiar with research in the area of quick-ratings estimates, I thought I'd ask.

bob · Post by **bob** » Thu Jul 31, 2008 12:01 am

Carey wrote: Not practical.

#1 I'm wanting more of a process to get quick estimates, than the results.

#2 It's not just for me.

#3 some are stand-alone systems and not programs, meaning I'd have to sit there and manually make moves.

#4 some of the programs aren't mine and aren't designed for anything but human play. Meaning I'd have to sit there and manually make the moves.

#5 I'm just looking for some quick estimates, without having to play lots of games. I'm quite willing to sacrifice accuracy for speed.

The old Bratko-Kopec method was great, except the original process was poorly designed and the results were overly optimistic.

Some of the later modifications fixed some stuff, but I'm still not so sure how reliable it is. Even for my relaxed requirements.

Since I'm not very familiar with research in the area of quick-ratings estimates, I thought I'd ask.

Unfortunately such tests are no good. But you can play automated tournaments and feed the results into something like BayesElo to get ratings...
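To show why game counts matter for a rating estimate, here is a minimal sketch that turns a win/draw/loss record against a single opponent into an Elo difference with an approximate 95% interval. It is not BayesElo's algorithm; it assumes the plain logistic Elo formula and a normal approximation to the score, and the function names `elo_diff` and `match_estimate` are made up for this example.

```python
import math

def elo_diff(score):
    """Elo difference implied by a score fraction (0 < score < 1), logistic model."""
    return -400.0 * math.log10(1.0 / score - 1.0)

def match_estimate(wins, draws, losses, z=1.96):
    """Return (elo, low, high) for one program against a single opponent.

    z=1.96 corresponds to a two-sided 95% interval under a normal approximation.
    """
    n = wins + draws + losses
    score = (wins + 0.5 * draws) / n
    # Per-game variance of the result (1, 0.5 or 0) around the mean score.
    var = (wins * (1.0 - score) ** 2
           + draws * (0.5 - score) ** 2
           + losses * (0.0 - score) ** 2) / n
    margin = z * math.sqrt(var / n)
    # Clamp so the logarithm stays defined for lopsided results.
    eps = 1e-6
    clamp = lambda s: min(max(s, eps), 1.0 - eps)
    return (elo_diff(clamp(score)),
            elo_diff(clamp(score - margin)),
            elo_diff(clamp(score + margin)))

if __name__ == "__main__":
    for record in [(10, 0, 10), (100, 0, 100)]:
        elo, low, high = match_estimate(*record)
        print(record, f"{elo:+.0f} Elo, 95% interval {low:+.0f} .. {high:+.0f}")
```

Under these assumptions an even score over 20 decisive games still leaves an interval of roughly +/- 160 Elo, so even a +/- 100 target needs more games than that.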

Carey · Post by **Carey** » Thu Jul 31, 2008 12:37 am

bob wrote: Unfortunately such tests are no good. But you can play automated tournaments and feed the results into something like BayesElo to get ratings...

Unfortunately, as I explained, automated testing isn't an option.

It would have to be done manually. And I'm not willing to spend that much time doing it.

There are a couple stand-alone machines, as well as programs that have no ability to do any automated testing or standard interface that winboard etc. could be hooked into.

Old chess programs running under an emulator, others running on the PC but having only a GUI with moves done by the mouse, others (TBelle, for example) recompiled to run on modern hardware, etc. etc.

Automated testing is an impossibility.

That's why I was wanting a reduced test. I was willing to increase the error bounds in order to get any results, as long as they were somewhere in the right range.

That's why my question was done as a separate post rather than in one of the other discussions about calculating ratings.

I know the B-K test is outdated and the results aren't representative. Even with a couple recalibrations etc., I figured 24 positions just weren't enough to get a feel for what a program was really capable of doing.

But I also figured there had probably been more research into it which might be more representative.

Even +/- 100 points or more would be good enough.

I'm wanting the most "bang for the buck", and I'm willing to use a shot-gun instead of a rifle, so to speak. In other words, easy results with increased error bounds.

There's bound to be more current research into estimating program / player strength. I just don't know about it.

I suppose I could just go ahead and do the BK tests and report those results. They've been publicized enough and used often enough that it would at least give something to compare with.

Richard Allbert · Post by **Richard Allbert** » Thu Jul 31, 2008 2:27 pm

Then no, probably not.

I've tried many times doing what you described, but it is just too inaccurate.

The "elo" you get from the testsuite is a "testsuite elo", and has no real relation to in game performance.

I ended up buying a cheap s/h laptop which sits running 1'1" tournaments.

Richard

Ron Murawski · Post by **Ron Murawski** » Sat Aug 02, 2008 1:36 am

You can download the epd mega-download from the computer-chess wiki
http://computer-chess.org/doku.php?id=c ... load:index
Click on the 'more details' link to see which ones are Elo tests.

Jim Monaghan's IQ test is, I believe, the latest. Here's a direct link to Jim's original file.
http://horizonchess.com/Jim/IQ81.rar
The file contains the formula to convert percentage of correct solutions to Elo.

The IQ test asks to test each position for 10 seconds. Maybe by making this 5 seconds (or less) the Elo scores might be somewhat believable -- but probably not...
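For anyone scoring such a suite by hand, a minimal sketch of the bookkeeping: read the `bm` (best move) and `id` operations from each EPD record and work out the percentage solved. The two sample records are made up for illustration, the names `parse_epd` and `percent_solved` are hypothetical, and the parser assumes no semicolons inside quoted operands. The percentage-to-Elo formula is whatever ships with the suite and is not reproduced here.

```python
def parse_epd(line):
    """Split one EPD record into (position, operations dict)."""
    fields = line.strip().split(None, 4)
    position = " ".join(fields[:4])
    ops = {}
    if len(fields) > 4:
        for op in fields[4].split(";"):
            op = op.strip()
            if op:
                opcode, _, operand = op.partition(" ")
                ops[opcode] = operand.strip().strip('"')
    return position, ops

def normalize(move):
    """Drop check, mate and annotation suffixes so 'Qd1+' matches 'Qd1'."""
    return move.rstrip("+#!?")

def percent_solved(epd_lines, chosen_moves):
    """chosen_moves maps a position id to the SAN move the program played."""
    solved = total = 0
    for line in epd_lines:
        _, ops = parse_epd(line)
        if "bm" not in ops:
            continue  # skip records without a best-move operation
        total += 1
        best = {normalize(m) for m in ops["bm"].split()}
        chosen = chosen_moves.get(ops.get("id", ""))
        if chosen is not None and normalize(chosen) in best:
            solved += 1
    return 100.0 * solved / total if total else 0.0

if __name__ == "__main__":
    # Made-up records, not taken from any published test suite.
    suite = [
        'rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - bm e4 d4; id "demo.1";',
        'rnbqkbnr/pppppppp/8/8/8/5N2/PPPPPPPP/RNBQKB1R b KQkq - bm d5 Nf6; id "demo.2";',
    ]
    print(percent_solved(suite, {"demo.1": "e4", "demo.2": "c5"}))
```

Because the chosen moves are just a dictionary, they can be typed in by hand for stand-alone machines or GUI-only programs.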

Ron
