18 days from SF4 release and about ~30+ ELO gain!

phenri
Posts: 284
Joined: Tue Aug 13, 2013 9:44 am

Re: 18 days from SF4 release and about ~30+ ELO gain!

Post by phenri »

Houdini wrote:(...)

Commercial engines will have to adapt to the new situation or disappear.

(...)
It only remains for you to release the source code so that all of us can see what Houdini contained. ;)
Albert Silver
Posts: 3026
Joined: Wed Mar 08, 2006 9:57 pm
Location: Rio de Janeiro, Brazil

Re: 18 days from SF4 release and about ~30+ ELO gain!

Post by Albert Silver »

kranium wrote:
I think the systems mentioned above are beyond reach of most developers
(the cheapest modern 32 core IBM system I could find is $44,648.00)
Is it really so advantageous to have a single 32-core rig as opposed to eight quads? I know the eight quads would be a lot cheaper, and not slower, but I am wondering about the costs to run them.
"Tactics are the bricks and sticks that make up a game, but positional play is the architectural blueprint."
gladius
Posts: 568
Joined: Tue Dec 12, 2006 10:10 am
Full name: Gary Linscott

Re: 18 days from SF4 release and about ~30+ ELO gain!

Post by gladius »

Albert Silver wrote:
kranium wrote:
I think the systems mentioned above are beyond reach of most developers
(the cheapest modern 32 core IBM system I could find is $44,648.00)
Is it really so advantageous to have a single 32-core rig as opposed to eight quads? I know the eight quads would be a lot cheaper, and not slower, but I am wondering about the costs to run them.
IMO - yes. Managing eight machines would be a nightmare! You have to worry about all the components: 8x more things that can go wrong than with one system. I'm sure the 32-core machine is much cheaper in electricity costs as well.

Btw, you can get much, much cheaper 32-core machines from Dell, especially the AMD systems. Example: http://configure.us.dell.com/dellstore/ ... =bsd&cs=04 for 3.6k USD right now.
Don
Posts: 5106
Joined: Tue Apr 29, 2008 4:27 pm

Re: 18 days from SF4 release and about ~30+ ELO gain!

Post by Don »

mcostalba wrote:
Houdini wrote:Indeed, the pace of Stockfish improvement is amazing, the development framework constructed by Gary is awesome.
Clearly no individual or two-person team can keep up with this in the long run, so this could mean the end of commercial chess engines as we currently know them. Maybe in 2 years' time only Stockfish and derivatives will continue to be developed.

Robert
Improvement cannot be foreseen in advance. It could be that next month we gain 0 ELO; it happens, and you know it.

But there is another side effect of open development that could be more threatening for commercial engines, a threat that was not foreseen and that even I didn't realize would be a problem. This is the obsolescence of the release process: just a few days after Stockfish 4 came out, almost all power users have dropped it in favor of the latest nightly build (I have just come from Playchess, where there is not even one SF 4, only nightly builds). This is something commercial engines have no defense against; they simply cannot do this. As long as the ELO gap is big it is OK, but when the openly developed engine reaches the level of the commercial ones, a new compile each day can badly affect a commercial release because it greatly speeds up its obsolescence.

I have to say that this was not foreseen and I am sorry for it. It is not our target (daily binaries are even built outside of the SF team), but it is something that, considering the open nature of the development, is almost impossible to avoid.
This is easily possible with commercial engines. You charge for the engine and the customer is promised daily upgrades, or upgrades as they become available. When a new version is released, the customer no longer gets the updates unless of course they pay for it. So you are buying all the versions from one release to the next.

We often make 3 or 4 versions per day - not all are improvements but that is true of Stockfish too.
Capital punishment would be more effective as a preventive measure if it were administered prior to the crime.
Don
Posts: 5106
Joined: Tue Apr 29, 2008 4:27 pm

Re: 18 days from SF4 release and about ~30+ ELO gain!

Post by Don »

Adam Hair wrote:
kranium wrote:I believe 'ideas' are fairly easy to come by, for ex:
bonus for rook behind passed pawn
penalty for undefended piece
etc.

The difficult part is the testing...
i.e. the truly beneficial ideas are found and verified thru extensive testing...
and the more processing power you have, the more ideas can be tested
(as result: more good ones can/may be identified)
My biggest assumptions are that it takes time to encode ideas and that a single developer will, at times, have no useful ideas or need some time to recharge. Stockfish suffers from neither of these problems.
kranium wrote: I think the systems we are talking about are beyond reach of most developers
(the cheapest modern 32 core IBM system I could find is $44,648.00)

A group of older systems is certainly better than nothing, but has the disadvantage of being very slow compared to today's modern systems
i.e. the same # of tests would take much much longer to complete
I suspect that the average core in the Stockfish framework is not more than twice as fast as what I now have (Xeon L5420 2.5 GHz quad processors), and going half as fast is not much of a detriment considering that the queue of tests coming from one person is going to be shorter than that from a group of people.
In my opinion it's not the number of ideas - it's the amount of testing that is possible. CPU power has always been a major bottleneck for us. The ideas have never been a bottleneck.

Having many developers could even be a detriment because they CREATE yet another testing bottleneck. Suppose you have 50 developers queuing up 48 bad ideas - you once again have a testing bottleneck.
Don
Posts: 5106
Joined: Tue Apr 29, 2008 4:27 pm

Re: 18 days from SF4 release and about ~30+ ELO gain!

Post by Don »

kranium wrote:I believe 'ideas' are fairly easy to come by, for ex:
bonus for rook behind passed pawn
penalty for undefended piece
etc.

The difficult part is the testing...
You hit the nail on the head. As I already mentioned in another post, testing is a SERIOUS bottleneck for us, ideas are not.

In a way this is sad for me - I feel that I can compete with anyone with ideas, but I cannot compete with testing horsepower which is really where it's at. It seems like chess has been locked forever in a horsepower race. It used to be whoever has the biggest machine wins the tournament, now it's whoever has the most testing resources. I am considering switching over to something that will reward creativity a lot more.

We have had people with serious hardware offer it to us - but not for the kind of testing that really makes a difference. They claim to want to support us substantially but it turns out that what they want is for us to provide them with a constant supply of updates so that they can play games on the servers. When the testers demand more than they return - they are not useful to us, even if they are well meaning.


i.e. the truly beneficial ideas are found and verified thru extensive testing...
and the more processing power you have, the more ideas can be tested
(as result: more good ones can/may be identified and implemented faster)
this is where the Stockfish distributed testing network shines...(the sky is the limit and it's free!)
Not necessarily. The secret is the good idea to bad idea ratio. If you have a lot of hardware, you can tolerate a higher bad idea ratio but that's not the best use of resources. I don't understand the Stockfish testing framework but presumably they have some sort of process to avoid testing EVERY idea someone may get - in most open source projects someone is in charge of what gets accepted and tested, and presumably that is how it works with this project.
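To put rough numbers on the ratio argument, here is a minimal sketch. The function name and all figures in the usage note are hypothetical and chosen only for illustration; none of them describe any real engine's testing setup.

```python
def good_patches_per_day(games_per_day, games_per_test, good_ratio):
    """Expected number of good ideas confirmed per day.

    games_per_day  -- total games the hardware can play in a day (assumed)
    games_per_test -- games needed to decide one idea (assumed constant)
    good_ratio     -- fraction of queued ideas that are actually good
    """
    tests_per_day = games_per_day / games_per_test
    return tests_per_day * good_ratio


# Small rig, well-filtered queue: 1 idea in 10 is good.
small_filtered = good_patches_per_day(100_000, 20_000, 0.10)

# Five times the hardware, unfiltered queue: 1 idea in 50 is good.
big_unfiltered = good_patches_per_day(500_000, 20_000, 0.02)
```

With these made-up inputs both setups confirm about half a good patch per day, which is the point: extra hardware can compensate for a worse ratio, but only by spending the extra capacity on bad ideas.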

R. Houdart:
"New ideas or code changes are first validated by playing against the current development version of Houdini. If the outcome is promising, the new version plays a tournament against 7 or 9 different opponents. In both stages typically 10,000 to 50,000 games are played, depending on the results. I use 2 servers (16-core and 32-core) to play about 100,000 games per day, with each game taking about 20 to 30 seconds."
(see Martin Thoresen's interview: http://www.tcec-chess.net/viewtopic.php?f=17&t=154)

I think the systems mentioned above are beyond reach of most developers
(the cheapest modern 32 core IBM system I could find is $44,648.00)

A group of older systems is certainly better than nothing, but has the disadvantage of being very slow compared to today's modern systems
i.e. the same # of tests would take much much longer to complete

I believe Houdini's author may have a huge advantage in this regard...
he credits a wealthy Abu Dhabi businessman (with whom I'm personally acquainted) whose customers include IBM on his web page:
"The Houdini 3 development was greatly aided by the kind support of Mr. Ahmed Mansoor"

I commented on Ahmed's 48 core Ivy Bridge E2697, and asked him about the relationship between him and R. Houdart here:
http://talkchess.com/forum/viewtopic.ph ... &start=120
but could not get an answer

My 2 cents-
Norm
hgm
Posts: 27894
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: 18 days from SF4 release and about ~30+ ELO gain!

Post by hgm »

mcostalba wrote: But there is another side effect of open development that could be more threatening for commercial engines, a threat that was not foreseen and that even I didn't realize would be a problem. This is the obsolescence of the release process: just a few days after Stockfish 4 came out, almost all power users have dropped it in favor of the latest nightly build (I have just come from Playchess, where there is not even one SF 4, only nightly builds). This is something commercial engines have no defense against; they simply cannot do this.
I don't see why commercials could not do this. It seems quite trivial to post last night's compile (if it is any good) on a website to which only customers have the password. Passwords can be made to expire after a certain Elo gain has been reached relative to the time when people bought that password, after which they have to buy a new one (just as they have to buy a new release now). The passwords would not even have to be known to the people buying them; you could just sell them a download client with a unique ID, and the server could keep track of which ID is used with which IP address, and block any IDs that are used from too many different IP addresses.
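The server-side bookkeeping could look roughly like the sketch below. It is only an illustration of the idea: the class name, the method names, and the thresholds (3 IP addresses, 50 Elo) are all made up, and a real service would also need authentication and persistent storage.

```python
from collections import defaultdict


class DownloadServer:
    """Hypothetical tracker for client IDs, IP addresses and Elo-based expiry."""

    def __init__(self, max_ips=3, elo_allowance=50):
        self.max_ips = max_ips              # assumed limit on distinct IPs per ID
        self.elo_allowance = elo_allowance  # assumed Elo gain covered by one purchase
        self.elo_at_purchase = {}           # client ID -> engine Elo when bought
        self.ips_seen = defaultdict(set)    # client ID -> IPs it was used from
        self.blocked = set()                # IDs shared too widely

    def sell(self, client_id, current_elo):
        """Register a newly sold download client."""
        self.elo_at_purchase[client_id] = current_elo

    def may_download(self, client_id, ip, build_elo):
        """Return True if this ID may fetch the build with rating build_elo."""
        if client_id not in self.elo_at_purchase or client_id in self.blocked:
            return False
        # The ID expires once the engine has gained more than the allowance.
        if build_elo - self.elo_at_purchase[client_id] > self.elo_allowance:
            return False
        # Record the address; block the ID if it shows up from too many.
        self.ips_seen[client_id].add(ip)
        if len(self.ips_seen[client_id]) > self.max_ips:
            self.blocked.add(client_id)
            return False
        return True
```

A blocked ID stays blocked in this sketch even when it is later used from a known address; whether that is desirable is a business decision, not a technical one.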
stevenaaus
Posts: 608
Joined: Wed Oct 13, 2010 9:44 am
Location: Australia

Re: 18 days from SF4 release and about ~30+ ELO gain!

Post by stevenaaus »

Houdini wrote:Indeed, the pace of Stockfish improvement is amazing, the development framework constructed by Gary is awesome.
Clearly no individual or two-person team can keep up with this in the long run, so this could mean the end of commercial chess engines as we currently know them. Maybe in 2 years' time only Stockfish and derivatives will continue to be developed.
mcostalba wrote: But there is another side effect of open development that could be more threatening for commercial engines, a threat that was not foreseen and that even I didn't realize would be a problem. This is the obsolescence of the release process: .... but it is something that, considering the open nature of the development, is almost impossible to avoid.
The analogies with operating systems and open source projects are interesting. Especially Android, which is really powering ahead as a general purpose computing platform. These processes are an evolution all of their own, and have strong and interesting analogies with the process of natural life.
mcostalba
Posts: 2684
Joined: Sat Jun 14, 2008 9:17 pm

Re: 18 days from SF4 release and about ~30+ ELO gain!

Post by mcostalba »

Don wrote: Not necessarily. The secret is the good idea to bad idea ratio. If you have a lot of hardware, you can tolerate a higher bad idea ratio but that's not the best use of resources. I don't understand the Stockfish testing framework but presumably they have some sort of process to avoid testing EVERY idea someone may get - in most open source projects someone is in charge of what gets accepted and tested, and presumably that is how it works with this project.
This was another decision that was made: not to pre-filter / veto any patch that is queued for testing. And it worked.

Yes, you see dubious/crazy patches, and yes, you see people you would call enthusiasts more than engine developers. At the beginning it was something of a bet to let anyone queue up tests. But now I can say this is one of the key reasons for the success of the framework. We have had good patches that we were happy to commit from people with a long track record of failures: one day the patch they wrote was good, or had some very good stuff in it that nobody had seen before. Allowing everyone to join the development increases lateral thinking and encourages experimenting with unusual, non-mainstream stuff. This is where the biggest surprises come from; very rarely, but they do come.

Of course, to make it work you need a very reliable testing procedure, and in fact the tests undergo a very strict and demanding process so that we are quite sure that whatever we commit is good. If we have any doubts, we don't commit.

So one of the positive side effects of trying to balance resources, allowing anybody to test while discarding bad patches in an efficient way, was the development of a very sophisticated statistical method based on SPRT, something really new in this field and I think the first time an advanced statistical method for testing has been used in chess engine development. This is thanks to a bunch of mathematicians in our forum who laid out the quantitative foundation and ran the simulations to validate the procedure. And it is still not finished: these days we are discussing an even more advanced way to efficiently test what we call "simplifications", i.e. harmless removal of old code. This has been recognized to be an even harder problem than detecting good patches.
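For readers who have not met SPRT (the sequential probability ratio test), here is a minimal textbook-style sketch. It is not the Stockfish framework's code: it ignores draws and treats every decisive game as a coin flip under the logistic Elo model, and the function names and default bounds (0 and 5 Elo, 5% error rates) are assumptions made for illustration.

```python
import math


def expected_score(elo):
    """Win probability for a given Elo advantage under the logistic model."""
    return 1.0 / (1.0 + 10.0 ** (-elo / 400.0))


def log_likelihood_ratio(wins, losses, elo0, elo1):
    """Log-likelihood ratio of H1 (patch worth elo1) against H0 (worth elo0)."""
    p0, p1 = expected_score(elo0), expected_score(elo1)
    return (wins * math.log(p1 / p0)
            + losses * math.log((1.0 - p1) / (1.0 - p0)))


def sprt(wins, losses, elo0=0.0, elo1=5.0, alpha=0.05, beta=0.05):
    """Return 'accept', 'reject' or 'continue' after the games played so far.

    alpha -- tolerated chance of accepting a patch that is only worth elo0
    beta  -- tolerated chance of rejecting a patch that is worth elo1
    """
    llr = log_likelihood_ratio(wins, losses, elo0, elo1)
    lower = math.log(beta / (1.0 - alpha))   # stop and reject at or below this
    upper = math.log((1.0 - beta) / alpha)   # stop and accept at or above this
    if llr >= upper:
        return "accept"
    if llr <= lower:
        return "reject"
    return "continue"
```

The attraction for a shared test queue is that the test stops as soon as the evidence is strong enough either way, so clearly bad patches are thrown out after relatively few games instead of consuming a fixed-length match.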
Uri Blass
Posts: 10420
Joined: Thu Mar 09, 2006 12:37 am
Location: Tel-Aviv Israel

Re: 18 days from SF4 release and about ~30+ ELO gain!

Post by Uri Blass »

Adam Hair wrote:
Uri Blass wrote:
mar wrote:
Uri Blass wrote: I think that the main advantage is practically not the number of people but the number of machines that they can use when commercial programmers usually do not have 200 cores that work 24 hours per day for them.

In spite of that, I am not sure that Stockfish is going to be developed faster.

Stockfish also has the disadvantage that it is open source, so others can learn from it to improve their programs.
I think Adam is right. Do you really think that commercials can't build such framework?!
As for the rest, do you understand what it means being one step ahead? :)
P.S. I certainly won't miss Houdini.
I think that it is going to cost too much for the commercials to build the framework and use it.

It is clear that it is only a question of money, and if the commercials are interested they can probably also buy most of the people who give ideas for Stockfish, and I think that it is going to cost them less money than the money they need to run 200 cores 24 hours per day.
Older 8 core Xeon servers are cheap and plentiful in the US on ebay. A commercial author based in the US could replicate the framework for under $5,000. I do not know if it can be done as cheaply in Europe, but the total cost is not prohibitive for a small business investment.

I really believe that the production of legitimate ideas is the limiting factor.
The question is not what is the price of buying the framework but what is the price of using it 24 hours per day.

As Don said, I do not believe that the main problem of the commercial programmers is lack of ideas, especially when they can look at the patches that people contribute to Stockfish and get additional ideas, and finding the time to implement 10 new ideas every day is not a problem for commercial programmers.

I think that they can clearly compete if they have enough money to buy the time of testers.

If they can find many testers who give a lot of computer time for free or almost for free then it is possible for them to compete but I am not sure if they can get it.