Don wrote: diep wrote: Don wrote: diep wrote:
We seldom agree, but in this case we do.
Rybka 3 is at 3134 on CCRL, and Houdini, which searches something like 7 plies deeper yet has much the same eval, is at 3208.
That's roughly 70 Elo points for 7 plies.
In Komodo's case, however, LMR is already winning 2 plies at a depth of 10, even though Komodo prunes heavily in the last few plies.
It means Komodo without LMR is hopelessly inefficient, and that changes the equation, as I assume Don didn't cut and paste Rybka's evaluation.
The point is that Don's search is totally inefficient *without* LMR.
Komodo without LMR is still stronger than most programs. But what does that matter? This would be like me saying Diep is totally dependent on alpha/beta pruning or else it's hopelessly inefficient. That would be a true but meaningless statement because what matters is how you put everything together.
This basically means Don gets more out of LMR than Rybka does, which could well be true, as Rybka of course already forward prunes a lot in the last few plies.
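For readers following along, the LMR being argued about works roughly like this: late moves at a node are searched to reduced depth first, and only re-searched at full depth if they unexpectedly beat alpha. Below is a minimal sketch over a hand-made toy tree; the tree format, static evals, and reduction rule are all invented for illustration, nothing from Komodo or Diep.

```python
# Toy negamax with late move reductions (LMR). Internal node =
# (static_eval, [children]); leaf = int. All evals are from the
# perspective of the side to move at that node. Numbers are made up.

INF = 10**9

def is_leaf(node):
    return isinstance(node, int)

def evaluate(node):
    # Static eval from the side to move's perspective.
    return node if is_leaf(node) else node[0]

def negamax(node, depth, alpha=-INF, beta=INF, use_lmr=True):
    if depth == 0 or is_leaf(node):
        return evaluate(node)
    best = -INF
    for i, child in enumerate(node[1]):
        if use_lmr and depth >= 2 and i >= 1:
            # Late move: try a reduced-depth search first.
            score = -negamax(child, depth - 2, -beta, -alpha, use_lmr)
            if score > alpha:
                # Unexpected fail high: re-search at full depth.
                score = -negamax(child, depth - 1, -beta, -alpha, use_lmr)
        else:
            score = -negamax(child, depth - 1, -beta, -alpha, use_lmr)
        best = max(best, score)
        alpha = max(alpha, score)
        if alpha >= beta:
            break
    return best

tree = (0, [
    (2,  [3, -2, 5]),   # true value 2  -> root sees -2
    (-1, [1, 1, 1]),    # true value -1 -> root sees +1 (best move)
    (0,  [4, 0, 2]),    # true value 0  -> root sees 0
])

print(negamax(tree, 2, use_lmr=False))  # plain full-depth search
print(negamax(tree, 2, use_lmr=True))   # with LMR: same result on this tree
```

The point of the re-search clause is that the reduction is only a first guess; the Elo numbers being debated in the thread come from how often that guess is wrong at a given depth.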
Also I advise Don, to quote Johan de Koning, not to do incest testing, as incest testing is never a good idea.
Vincent
I have no idea what you are talking about, but we do all our testing against other programs, not Komodo vs Komodo. Is that what you are talking about?
Don
Even the most stupid way of forward pruning, if I enable it in Diep, makes Diep 100 Elo stronger in blitz, and 300 Elo stronger in Diep vs. Diep. Yet it's 60 Elo weaker if I play against other programs.
The same with multi-cut: it's a lot of Elo stronger at fast time controls on a single core, especially in Diep vs. Diep, and at slower time controls it's a LOT weaker against other programs. Most importantly, I noticed that enabling it makes Diep search a ply deeper, which matters most at 10 to 12 plies.
Once Diep reaches at least a 14-ply search depth, multi-cut no longer gives Elo.
I see multi-cut as a drop-in replacement for null move pruning, but probably not quite as good. We did a lot of experimenting with it, and the multi-cut version was about 20 Elo weaker if I remember correctly. But I'm not one to pass judgment too hastily; it could be that we never stumbled on the right formula for doing it.
We can mathematically prove null move to be a stronger algorithm than multi-cut for computer chess. Make no mistake: multi-cut is a real invention.
It's a big thing to invent; all credit to Yngvi there, who invented it in the 90s.
It has a weakness similar to fail-high reductions, plus another one; both have to do with the hash table.
The hash table weakness is that there is no cheap way of verifying that the 3 cutoff moves don't all transpose into the 100% same refutation.
In chess this is a big issue, as it's very common for paths to converge into the same refutation.
Say we can play Bxh6+ here, Qxg6+ or Qd8+.
All 3 transpose, further away from the root, into the same position, upon which we then base our entire multi-cut.
So the algorithm is really transposition-sensitive. There will undoubtedly be ways to work around this, but that obviously makes your transposition table work less well.
If we null move in the same position, we tend to reduce with a bigger R than we can afford with multi-cut. Giving the opponent a free move is a very strong assumption.
And we do not suffer from the really ugly transpositions then, unlike with multi-cut.
So null move in general is far more powerful than multi-cut, as we can reduce more and we do not suffer from the transposition table effect.
QED.
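The null-move side of this argument can be sketched as a single zero-window search at depth - 1 - R after handing the opponent a free move, with R typically larger than multi-cut can afford. `fake_search`, the dict-based position format, and R=3 are all illustrative stand-ins here, not any real engine's code.

```python
# Minimal null-move pruning sketch: one reduced zero-window search after
# passing the move, versus multi-cut's need for several reduced cutoffs.
# R=3 is a common modern choice; everything here is a toy stand-in.

def try_null_move(search, position, depth, beta, R=3):
    """Return True if a null-move search justifies a beta cutoff."""
    if depth <= R:
        return False                       # not enough depth to reduce
    null_pos = position.copy()             # "make" the null move: just flip side
    null_pos["side"] = "black" if position["side"] == "white" else "white"
    # Zero-window search: we only care whether we still reach beta.
    score = -search(null_pos, depth - 1 - R, -beta, -beta + 1)
    return score >= beta

# Hypothetical static stand-in for a real search function:
def fake_search(pos, depth, alpha, beta):
    return -300 if pos["side"] == "black" else 300

pos = {"side": "white"}
print(try_null_move(fake_search, pos, depth=8, beta=250))  # True
print(try_null_move(fake_search, pos, depth=3, beta=250))  # False: too shallow
```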
So with drop-in I assume you mean that they try to reduce basically the same search space. This is entirely correct, yet null move does it in a far more correct manner than multi-cut.
Also, where null move doesn't work because the opponent can capture a piece of ours, yet normal search gives a cutoff, multi-cut usually has a problem giving a cutoff too, as it needs 3 cutoffs. It doesn't happen often that you can capture a piece in 3 different ways. AND IF YOU CAN, THEN WE HAVE THE TRANSPOSITION BUG AGAIN IN SEVERAL CASES.
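The multi-cut decision being criticized here can be sketched as: search the first M moves to reduced depth, and prune the node once C of them fail high over beta. The parameter values M=6, C=3, R=2 follow the commonly published ones, not Komodo's, and the move scores are invented; note that the three check moves from the example above could all be backed by one and the same transposed refutation out of the hash table.

```python
# Sketch of the multi-cut pruning decision: C reduced-depth fail-highs
# among the first M moves are taken as evidence the node is a cut-node.

def multicut(search_reduced, moves, depth, beta, M=6, C=3, R=2):
    """Return True if the node may be pruned (assumed to score >= beta)."""
    if depth <= R + 1:
        return False                     # too shallow to reduce
    cutoffs = 0
    for move in moves[:M]:               # only the first M moves
        score = search_reduced(move, depth - 1 - R)
        if score >= beta:
            cutoffs += 1
            if cutoffs >= C:             # C reduced fail-highs: prune
                return True
    return False

# Hypothetical reduced-depth scores standing in for real searches:
scores = {"Bxh6+": 250, "Qxg6+": 310, "Qd8+": 270, "a3": -40}
probe = lambda move, d: scores[move]

print(multicut(probe, ["Bxh6+", "Qxg6+", "Qd8+", "a3"], depth=8, beta=200))  # True
print(multicut(probe, ["Bxh6+", "a3"], depth=8, beta=200))                   # False
```

The transposition objection in the thread is exactly that the three scores feeding `cutoffs` may not be independent evidence: all three reduced searches can bottom out in the same hash table entry.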
I experimented a lot with multi-cut as well, and I'm sure that with some tricks you can reduce the transposition table problem a tad.
Yet even then the reduction factor you need for multi-cut is rather huge. You need to reduce by at least 3 plies or so (on top of the normal reduction) to get any benefit at all out of multi-cut.
In principle you reduce your position P by a bunch of plies without giving the opponent some sort of compensation like with null move.
That's very tricky, to say the least.
Doing that, I can win 1 ply of search depth with Diep. Is it worth winning 1 ply while carrying a total reduction of 3 plies in what is possibly a critical line?
For some who just test bullet I'm sure it might work, and some will find it working at all levels for them, but I guarantee you, null move is a far stronger assumption than multi-cut.
Now you claim super-bullet time controls, and something that gives you 2 plies at a 10-ply search depth, this for an engine that easily gets 20-25 plies,
and you test Komodo versus Komodo.
That's not science.
Science must first start with facts and you apparently don't have any idea about what we do because you have this all wrong.
We are very concerned with scalability and if you look at the rating lists you will notice that the longer the time control the better Komodo does. We have put a significant amount of time into understanding what works and how it's affected by depth.
We don't test Komodo versions against each other. We test in gauntlet fashion where each new candidate plays several foreign programs and not other Komodo versions.
So you are just saying stupid things that are not factual.
We agree that you have to start somewhere.
What I simply see is that you played Komodo without LMR against Komodo with LMR at a small search depth.
Now, there are a ton of algorithms that will give Diep BIG Elo at bullet/blitz on a single core.
I gave a few examples.
Especially if we consider that you also forward prune heavily in the last few plies, in practice your 10-ply search depth equals a Diep search depth of 7 or so, from a tactical viewpoint.
At that depth, of course, *anything* that gets me searching deeper will give Elo.
I do agree on one thing, some types of changes can help at fast time controls and hurt at long time controls - there is no doubt about that. There are also things that help only at long time controls and those things sometimes don't make it into most programs because they are too hard to test where you need 20,000 games to prove an idea.
I know better than anyone that with just 1 machine at home you can't do much.
In this case, though, a difference of more than 150 Elo points is rather easy to prove.
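As a back-of-the-envelope check of both claims above (150 Elo being easy to prove, the earlier 20,000 games for small ideas), the usual logistic Elo model gives a rough number of games needed before a score edge is statistically visible. This is textbook statistics, not either author's actual methodology; the per-game standard deviation of 0.5 is a simplifying assumption that folds draws in.

```python
import math

# Rough games-needed estimate under the standard logistic Elo model:
# how many games until the observed score sits z sigmas above 50%?

def expected_score(elo_diff):
    # Expected score of the stronger side at a given Elo difference.
    return 1.0 / (1.0 + 10.0 ** (-elo_diff / 400.0))

def games_needed(elo_diff, z=1.96, sigma=0.5):
    """Games for a z-sigma (95%) detection of the score edge."""
    edge = expected_score(elo_diff) - 0.5
    return math.ceil((z * sigma / edge) ** 2)

for d in (150, 70, 20):
    print(d, games_needed(d))
```

A 150-Elo gap shows up within a few dozen games, while a 20-Elo tweak needs on the order of a thousand or more; that asymmetry is the whole disagreement about what 500-game or 20,000-game tests can and cannot prove.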
Additionally, it's you who already posted in the 90s the observation that at bigger search depths the Elo win from basically anything is smaller.
Claiming the opposite now contradicts that, and is a rather naive claim.
I personally think you are too quick to draw conclusions about what works and what doesn't work. You talk about science but you don't use science at all, everything about you is intuition driven. Even your program has strengths and weaknesses that are driven by whether your intuition was good or bad.
I'm not making the claim that LMR is worth 150 Elo points at a super-bullet of say 0.1 seconds a move and that THEREFORE it gives more Elo at slower time controls.
That's not only counter-intuitive, it's also dead wrong science.
Furthermore, many of the forward pruning experiments I did were actually at a BIGGER search depth than your super-bullet, and that was already around 11-12 years ago.
Sure, it was done on 36 computers, and you didn't have 36 computers. Usually it took me a whole day to set up the 36 machines, besides the driving time to Jan Louwman. He then reported back after 3 weeks or so.
So each experiment also took way longer than what you posted here.
Usually it was around 500 games with and 500 games without X.
Obviously I couldn't repeat this too often, and it put a big strain on Jan.
Calling that intuition driven is dead wrong. I probably had more accurate data than anyone, except a few chess programmers with machines at home, to prove or disprove specific algorithms.
Another hard fact is that you need far fewer games at slower time controls than you'd need in bullet to prove anything.
Just about all chess programmers I know confirmed this at the tournaments they showed up at.
Some algorithms require overhead. Even at super-bullet you won't have enough system time for that overhead.
There will be many algorithms that super-bullet will never discover.
And many simple algorithms, no matter how big an invention, just do not work very well in computer chess, yet in many programs they work at super-bullet.
The pattern I see over and over again in computer chess is that it is full of superstition and conjecture. The only way to have a really strong program today (other than copying someone else and calling it yours) is to leave your superstitions at the door and open up your mind and never take yourself (or your opinions) too seriously or you will just end up painting yourself into a corner.
I'm not here to impose what sort of social behaviour others must follow. I do notice a lot of bad science though.
Here is a thought experiment. Imagine that computers continue to get faster and faster until they are again 100 times faster than today. Are we still going to have the argument that things that work at 1 minute will not work at 1 hour? Because 1 hour now is like 36 seconds will be then.
A single core today is not a factor of 100 faster than 12 years ago.
The cores I have here are 2.5 GHz Core 2 Xeons, and 64 of them.
Back in 2001 I ran a forward pruning experiment on Jan Louwman's computers with 1000 games in total (500 with the pruning and around 500 without); the pruning version scored 20% worse against the world top of those days, at slow time controls.
We used mostly K7 machines as well as P3s. The slower P3s at 800 MHz I put at a time control of 9 hours a game. The 1.x GHz K7s were put at 40 moves in 2 hours.
That's 3 minutes a move. So in GHz-minutes per move that's 1.6 GHz * 3 = 4.8 GHz-minutes a move.
Your super-bullet: I don't know what your hardware is, could be a high-clocked i7 or something, but if you had 2.5 GHz Core 2 / i7 based hardware, that's:
2.5 GHz * 0.1 seconds / 60 = 0.0042 GHz-minutes a move.
Even testing on hardware a factor of 100 faster, at super-bullet you won't do better than what I did in 2001 on single-core machines.
Yet you know just as well as I do that a single core 10 years from now won't be a factor of 100 faster. You'll have to scale with SMP.
If we had had this discussion 20 years ago, what would we have concluded?
We would've concluded that searching at 2-ply depths would be pretty stupid.
What most here have forgotten is my statement from the end of the 90s: you first need to get through a tactical barrier.
For today's programs that barrier lies somewhere around 17+ plies. For software from 10 years or so ago it was somewhere around 12-14 plies.
Above that, testing becomes easier, as a lot of the tactical noise you stumble upon goes away, and positional and strategic factors become more important.
The huge reduction factors used for null move and even LMR nowadays definitely indicate that tactical barrier.
If it weren't there, you would not be able to use such huge reduction factors.
It seems to me that in 10 or 20 years we will have to "adjust" our programs over and over again to be strong at the only levels we can reasonably test. When we work on scalability we take the cognitive shortcut of assuming that there are only 2 search depths, anything below a few ply which are "fast" and everything above that, which is "slow" and that beyond that an idea either "works" or "does not work." So you get language such as, "it does not work at long time controls." But long and short are highly relative concepts. Komodo at game in 1 second would be like Sargon on the z80 playing a correspondence game.
At factor-100-faster hardware you would first need to get an actual speedup of factor 100, and even then you barely reach the accuracy of the algorithmic experiments I did in 2001: 0.1 seconds a move on factor-100-faster hardware is 2.5 GHz * 100 * 0.1 / 60 = 0.42 GHz-minutes a move, still less than the 4.8 GHz-minutes a move I tested at in 2001, and a game in 1 second is far less still.
An entire game in 1 second at factor-100-faster hardware, with games probably 100 moves long if not more by then, means you can take at most 10 milliseconds a move.
So on multi-socket machines you'll get burned, as the finest timer granularity there is going to eat 33 milliseconds or so from the run queue.
But let's continue the thought experiment.
10 milliseconds a move on the hypothetical 250 GHz Core 2 hardware is, in GHz-minutes a move:
250 GHz * 0.01 / 60 = 0.042 GHz-minutes a move.
That's roughly ten times the 2.5 GHz * 0.1 / 60 = 0.0042 GHz-minutes a move of your current super-bullet, and still a factor of 100 slower than what I tested at in 2001.
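The GHz-minutes-per-move arithmetic from this exchange, collected in one place. The hardware and time-control figures are the ones assumed in the thread (1.6 GHz K7 at 3 minutes a move in 2001, a 2.5 GHz core at 0.1 s/move today, and the hypothetical 100x-faster core at 10 ms/move):

```python
# Compute per move, measured in GHz-minutes, for the three scenarios
# discussed above. Figures are the thread's assumptions, not benchmarks.

def ghz_minutes_per_move(ghz, seconds_per_move):
    return ghz * seconds_per_move / 60.0

louwman_2001  = ghz_minutes_per_move(1.6,   180.0)  # K7, 40 moves in 2h
superbullet   = ghz_minutes_per_move(2.5,     0.1)  # 0.1 s/move today
future_game1s = ghz_minutes_per_move(250.0,   0.01) # game in 1 s, 100x hw

print(round(louwman_2001, 3))   # 4.8
print(round(superbullet, 4))    # 0.0042
print(round(future_game1s, 4))  # 0.0417
```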
Thanks,
Vincent