mjlef wrote:
  You present no data and suggest worthless modifications to programs to ruin what data might be collected. Kai just showed the scaling effect in the time control ranges he presented. It is possible that scaling could change remarkably at a much longer time control. But we have not said it would, and neither has Kai.
  You are not taking this seriously, so I will stop taking you seriously too.
What Kai shows is just contempt since data is from the multiple opponents tournament which is useless for scaling purposes as already proven without doubt (one just needs to look at draw percentages, only a blind man or someone advertising a product would not see the obvious). You also never show real data. You just "show" numbers. And since you are selling a product, sorry that I don't believe your numbers. Show us some real PGNs and then we can talk. Till then I call BS on your results about superior scaling and I am certainly not the only one here.
Scaling of engines from FGRL rating list
-
- Posts: 4190
- Joined: Wed Nov 25, 2009 1:47 am
Re: Scaling of engines from FGRL rating list.
-
- Posts: 4367
- Joined: Fri Mar 10, 2006 5:23 am
- Location: http://www.arasanchess.org
Re: Scaling of engines from FGRL rating list
I don't think it is so strong that it is bumping against the limits of what is possible in terms of strength.
But I am quite amazed at how strong it is tactically, even compared to Houdini and Komodo.
--Jon
-
- Posts: 10948
- Joined: Wed Jul 26, 2006 10:21 pm
- Full name: Kai Laskos
Re: Scaling of engines from FGRL rating list.
Milos wrote:
  mjlef wrote:
    You present no data and suggest worthless modifications to programs to ruin what data might be collected. Kai just showed the scaling effect in the time control ranges he presented. It is possible that scaling could change remarkably at a much longer time control. But we have not said it would, and neither has Kai.
    You are not taking this seriously, so I will stop taking you seriously too.
  What Kai shows is just contempt since data is from the multiple opponents tournament which is useless for scaling purposes as already proven without doubt (one just needs to look at draw percentages, only a blind man or someone advertising a product would not see the obvious). You also never show real data. You just "show" numbers. And since you are selling a product, sorry that I don't believe your numbers. Show us some real PGNs and then we can talk. Till then I call BS on your results about superior scaling and I am certainly not the only one here.
I took the 10 engines shown in the excellent FGRL Top 10 rating list. I was not intending to compare Komodo and Stockfish, and they are anyway close in scaling, at least in my first results, maybe within error margins. Andscacs 0.89 and Fritz 15 stand out as scaling well and badly, respectively, probably outside error margins. It's not very hard to come up with even an inversion due to scaling. I knew that the Ippos scale badly, so I compared RobboLito 0.10, a predecessor of Houdini 5, with the very closely rated Komodo 5, a predecessor of Komodo 10.
100ms/move:
Games Completed = 1000 of 1000 (Avg game length = 14.311 sec)
Settings = Gauntlet/32MB/100ms per move/M 600cp for 3 moves, D 120 moves/EPD:C:\LittleBlitzer\2moves_v1.epd(32000)
Time = 3695 sec elapsed, 0 sec remaining
1. Komodo 5 64-bit 474.5/1000 341-392-267 (L: m=1 t=0 i=0 a=391) (D: r=121 i=44 f=10 s=1 a=91) (tpm=110.3 d=12.92 nps=1710990)
2. RobboLito 0.10 SMP x64 525.5/1000 392-341-267 (L: m=1 t=0 i=0 a=340) (D: r=121 i=44 f=10 s=1 a=91) (tpm=108.2 d=12.61 nps=2330838)
500ms/move:
Games Completed = 1000 of 1000 (Avg game length = 72.125 sec)
Settings = Gauntlet/32MB/500ms per move/M 600cp for 3 moves, D 120 moves/EPD:C:\LittleBlitzer\2moves_v1.epd(32000)
Time = 18221 sec elapsed, 0 sec remaining
1. Komodo 5 64-bit 540.0/1000 364-284-352 (L: m=0 t=0 i=0 a=284) (D: r=155 i=66 f=7 s=4 a=120) (tpm=506.1 d=15.97 nps=1670582)
2. RobboLito 0.10 SMP x64 460.0/1000 284-364-352 (L: m=0 t=0 i=0 a=364) (D: r=155 i=66 f=7 s=4 a=120) (tpm=510.7 d=15.60 nps=2321888)
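For readers who want to turn the two result tables above into Elo terms, here is a minimal sketch. It assumes the usual logistic Elo model, and the function name `elo_diff` is made up for this illustration. The win/loss/draw counts are Komodo 5's, copied from the LittleBlitzer output above.

```python
import math

def elo_diff(wins, losses, draws):
    """Elo difference implied by a match score under the logistic Elo model."""
    games = wins + losses + draws
    score = (wins + 0.5 * draws) / games
    return -400.0 * math.log10(1.0 / score - 1.0)

# Komodo 5 results as (wins, losses, draws), taken from the two gauntlets above.
results = {
    "100ms/move": (341, 392, 267),
    "500ms/move": (364, 284, 352),
}

for tc, (w, l, d) in results.items():
    draw_rate = d / (w + l + d)
    print(f"{tc}: Komodo 5 {elo_diff(w, l, d):+.1f} Elo vs RobboLito, draws {draw_rate:.1%}")
```

With these counts Komodo 5 goes from roughly 18 Elo behind at 100ms/move to roughly 28 Elo ahead at 500ms/move, which is the inversion being discussed.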
-
- Posts: 4190
- Joined: Wed Nov 25, 2009 1:47 am
Re: Scaling of engines from FGRL rating list.
Laskos wrote:
  Andscacs 0.89 and Fritz 15 stand out as scaling well and badly, respectively, probably outside error margins. It's not very hard to come up with even an inversion due to scaling. I knew that the Ippos scale badly, so I compared RobboLito 0.10, a predecessor of Houdini 5, with the very closely rated Komodo 5, a predecessor of Komodo 10.
  100ms/move:
  Games Completed = 1000 of 1000 (Avg game length = 14.311 sec)
  Settings = Gauntlet/32MB/100ms per move/M 600cp for 3 moves, D 120 moves/EPD:C:\LittleBlitzer\2moves_v1.epd(32000)
  Time = 3695 sec elapsed, 0 sec remaining
  1. Komodo 5 64-bit 474.5/1000 341-392-267 (L: m=1 t=0 i=0 a=391) (D: r=121 i=44 f=10 s=1 a=91) (tpm=110.3 d=12.92 nps=1710990)
  2. RobboLito 0.10 SMP x64 525.5/1000 392-341-267 (L: m=1 t=0 i=0 a=340) (D: r=121 i=44 f=10 s=1 a=91) (tpm=108.2 d=12.61 nps=2330838)
  500ms/move:
  Games Completed = 1000 of 1000 (Avg game length = 72.125 sec)
  Settings = Gauntlet/32MB/500ms per move/M 600cp for 3 moves, D 120 moves/EPD:C:\LittleBlitzer\2moves_v1.epd(32000)
  Time = 18221 sec elapsed, 0 sec remaining
  1. Komodo 5 64-bit 540.0/1000 364-284-352 (L: m=0 t=0 i=0 a=284) (D: r=155 i=66 f=7 s=4 a=120) (tpm=506.1 d=15.97 nps=1670582)
  2. RobboLito 0.10 SMP x64 460.0/1000 284-364-352 (L: m=0 t=0 i=0 a=364) (D: r=155 i=66 f=7 s=4 a=120) (tpm=510.7 d=15.60 nps=2321888)
  The result is outside 2SD interval. This inversion cannot be assigned to any kind of Contempt. Engines can scale differently, if you take an ancient engine like Mephisto Gideon in modern conditions, you will see that its doubling at 40/60'' is roughly 60 Elo points, while a modern engine can bring 120 Elo points at this time control.
This is a nice example of inversion, but I would not really attribute it to actual strength scaling. I would attribute it more to the fact that Komodo at that time was known to be a slow engine, i.e. particularly bad at hyperbullet, probably due to extra-cautious time management and slow search initialization. Robbo, on the other hand, was known for time management really optimized for hyperbullet, as well as very quick search initialization and very well optimized code, which resulted in an inflated rating at hyperbullet.
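The "outside 2SD" statement quoted above can be checked roughly from the win/loss/draw counts. The sketch below is one reading of that claim, not necessarily the calculation Kai used: it assumes the 1000 games in each match are independent, estimates the standard error of each match score from the trinomial outcome counts, and compares the shift in Komodo 5's score between the two time controls with the combined standard error. The helper name `score_and_se` is made up for this illustration.

```python
import math

def score_and_se(wins, losses, draws):
    """Match score and its standard error, treating games as independent trials."""
    games = wins + losses + draws
    score = (wins + 0.5 * draws) / games
    # Per-game variance of the outcome (1, 0.5 or 0) around the mean score.
    variance = (
        wins * (1.0 - score) ** 2
        + draws * (0.5 - score) ** 2
        + losses * (0.0 - score) ** 2
    ) / games
    return score, math.sqrt(variance / games)

# Komodo 5 (wins, losses, draws) at each time control, from the quoted output.
s100, se100 = score_and_se(341, 392, 267)
s500, se500 = score_and_se(364, 284, 352)

shift = s500 - s100                   # change in Komodo 5's score
se_shift = math.hypot(se100, se500)   # standard error of the difference
z = shift / se_shift

print(f"score shift = {shift:.4f}, SE = {se_shift:.4f}, shift/SE = {z:.2f}")
```

Under these assumptions the shift is well over two standard errors, so it is unlikely to be noise. Whether the cause is search scaling or time management at hyperbullet is a separate question that these numbers cannot settle.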
-
- Posts: 2283
- Joined: Sat Jun 02, 2012 2:13 am
Re: Scaling of engines from FGRL rating list
Isaac wrote:
  Is the following guess/interpretation gibberish or plausible?
  Since Stockfish 8 seems to be the highest rated engine in the list, it is closer to a perfect player and finds "the best" move quicker than all the other engines in average. Hence it cannot scale much better with more time control because there is not as much strength to improve compared to any other engine.
  Thank you math/logic guys for answering.
The fallacy in the interpretation you posted is that the top engine should be extremely close to perfection at any time control (TC), no matter how short, thus leaving no room for fast scaling. Even if you could allow for near-perfect strength at LTC, it cannot hold true at very short time controls (STC).
Stockfish, or whatever engine is best right now, is not that close to being perfect, and even less so at short time controls. Lots of room should therefore remain for 'scaling' towards more strength with more time given.
Of course, the strongest engine at a given time control may not necessarily be the better/best 'scaler', and this will not guarantee being on top at all TCs.
CL
-
- Posts: 1494
- Joined: Thu Mar 30, 2006 2:08 pm
Re: Scaling of engines from FGRL rating list.
jhellis3 wrote:
  Quote:
    Science and absolute certainty:
  Thanks for the lecture prof Mark, you are such a smart guy. Nothing I like more than being talked down to....
  Quote:
    I think your flippant remarks are not helping you convince people.
  Like I said earlier (perhaps you are a bit slow on the uptake?), I am not here to convince anybody. I am not here to promote an agenda *cough*. I present my viewpoints, and let other people do with them what they may.
  In my view, false belief is its own punishment.
Selective editing I see. I said:
Quote:
  In science there is no "absolute certainty"
You responded:
  Actually, there is. It is called reality.
That is incorrect. My statement is just what this conversation is about. You need lots of data in science to show something is likely right. "absolute reality" is not science and has nothing to do with my point. Instead of responding to the issues you launch ineffective rhetoric.
I am not trying to talk down to you, but you do like rhetoric which in this case is not useful. There is a lot to learn here if we are all willing to listen.
-
- Posts: 546
- Joined: Sat Aug 17, 2013 12:36 am
Re: Scaling of engines from FGRL rating list.
mjlef wrote:
  There is a lot to learn here if we are all willing to listen.
Just not for you right... gross.
mjlef wrote:
  You need lots of data in science to show something is likely right
More down talking... Jesus Wept.
mjlef wrote:
  I am not trying to talk down to you, but you do like rhetoric which in this case is not useful.
Right.... which is why you did it again... At any rate I will take well reasoned "rhetoric" (if actual games played by engines is called that now) over snake oil and bad science any day of the week...
-
- Posts: 265
- Joined: Sat Feb 22, 2014 8:37 pm
Re: Scaling of engines from FGRL rating list
carldaman wrote:
  Isaac wrote:
    Is the following guess/interpretation gibberish or plausible?
    Since Stockfish 8 seems to be the highest rated engine in the list, it is closer to a perfect player and finds "the best" move quicker than all the other engines in average. Hence it cannot scale much better with more time control because there is not as much strength to improve compared to any other engine.
    Thank you math/logic guys for answering.
  The fallacy in the interpretation you posted is that the top engine should be extremely close to perfection at any time control (TC), no matter how short, thus leaving no room for fast scaling. Even if you could allow for near-perfect strength at LTC, it cannot hold true at a very STC.
  Stockfish or whatever engine is best right now, is not that close to being perfect, and even more so at short time controls. Lots of room should therefore remain for 'scaling' towards more strength with more time given.
  Of course, the strongest engine at a given time control may not necessarily be the better/best 'scaler', and this will not guarantee being on top at all TCs.
  CL
I agree with you, thank you for the reply. It makes sense.
-
- Posts: 265
- Joined: Sat Feb 22, 2014 8:37 pm
Re: Scaling of engines from FGRL rating list.
jhellis3 wrote:
  Instead of saying Andscacs scales well with increasing time, we might say it actually just scales horribly with decreasing time. The problem is we only looked at 2 data points and have no way of knowing for sure, without broadening our scope. And it doesn't even have to be one or the other, but could potentially be a combination of both, where Andscacs does scale better with more time but not nearly as much as it first appears because it also scales relatively poorly with less time.
Hello Joseph, I would like to understand you here, but I fail to see the difference between scaling better with increasing time and scaling badly with decreasing time. To me, it is exactly the same, just another way of describing the same effect.
For example, I can't imagine a way to scale well at both increasing and decreasing time control. Will you (or anyone else) please help me figure this particular case out? Thank you.
-
- Posts: 2204
- Joined: Sat Jan 18, 2014 10:24 am
- Location: Andorra
Re: Scaling of engines from FGRL rating list.
Isaac wrote:
  Hello Joseph, I would like to understand you here but I fail to see the difference between scaling better with increasing time and scaling badly with decreasing time. To me, it is exactly the same, just another way of describing the same effect.
  For example I can't imagine a way to scale both well at increasing and decreasing time control. Will you (or any other) please help me to figure this particular case out? Thank you.
You can tweak the search of an engine to make it play worse at short time control, but more or less equal at LTC. I am not talking about doing this on purpose, but about some improvements to the engine producing this effect instead of the more desirable one of improving it at all time controls.
Anyway, this is more in the realm of speculation, as the "character" of an engine is due to so many parts that it is open to any interpretation.
Daniel José - http://www.andscacs.com
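To make the "only 2 data points" issue from this exchange concrete, here is a small sketch. All ratings in it are hypothetical numbers invented for the illustration, not measurements of Andscacs or any other engine, and `elo_per_doubling` is a made-up helper. It shows two engine profiles that look identical between a short and a medium time control and only differ once a third, longer time control is added.

```python
import math

def elo_per_doubling(points):
    """Elo gained per doubling of time between consecutive (ms_per_move, elo) points."""
    return [
        (elo1 - elo0) / math.log2(t1 / t0)
        for (t0, elo0), (t1, elo1) in zip(points, points[1:])
    ]

# Hypothetical ratings relative to a fixed opponent pool (NOT measured data).
# Both profiles agree at 100ms and 500ms per move.
keeps_scaling = [(100, 0.0), (500, 40.0), (2500, 80.0)]
handicapped_at_stc = [(100, 0.0), (500, 40.0), (2500, 45.0)]

for name, profile in [
    ("keeps scaling", keeps_scaling),
    ("handicapped at STC", handicapped_at_stc),
]:
    slopes = ", ".join(f"{s:.1f}" for s in elo_per_doubling(profile))
    print(f"{name}: {slopes} Elo per doubling")
```

In the first profile the relative gain continues at the longer time control. In the second it nearly vanishes, which is what one would expect if the engine was simply handicapped at the shortest time control. With only the first two points the two cases cannot be told apart.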