Glaurung 2 and SMP

hcyrano

Glaurung 2 and SMP

Post by hcyrano »

Hi all,

Sorry, my English is not perfect :(

I have been reading the Glaurung 2 code, only the SMP (multiprocessing) part. I am writing an Othello program,

and I have one question: why is all the data in class Thread so poorly synchronized?

It is protected on write, but not protected on read? In rare cases you may get "stale" data.

Have you had any problems with this code?

Thanks
Tord Romstad
Posts: 1808
Joined: Wed Mar 08, 2006 9:19 pm
Location: Oslo, Norway

Re: Glaurung 2 and SMP

Post by Tord Romstad »

Hello Bruno,

Nothing wrong with your English, as far as I can see -- but I am also no native speaker. :)
hcyrano wrote:Hi all,

Sorry, my English is not perfect :(

I have been reading the Glaurung 2 code, only the SMP (multiprocessing) part. I am writing an Othello program,

and I have one question: why is all the data in class Thread so poorly synchronized?

It is protected on write, but not protected on read? In rare cases you may get "stale" data.
I just had a quick look at all the places where I use the Thread[] array, and I don't see anything suspicious. Could you please give an example or two of the problem you see?
Have you had any problems with this code?
No, not that I am aware of. The program is very well tested on 2 and 4 CPUs; people have played thousands of games on several different OSes, and nobody has reported problems. I have very little data for more than 4 threads, so I am not sure how well it works there.

Tord
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: Glaurung 2 and SMP

Post by bob »

hcyrano wrote:Hi all,

Sorry, my English is not perfect :(

I have been reading the Glaurung 2 code, only the SMP (multiprocessing) part. I am writing an Othello program,

and I have one question: why is all the data in class Thread so poorly synchronized?

It is protected on write, but not protected on read? In rare cases you may get "stale" data.

Have you had any problems with this code?

Thanks
How exactly can you get stale data when the cache controllers on each processor are specifically designed to avoid this? There are no ordering constraints or synchronization necessary when reading data; only when you try to read-modify-write do you need to serialize access.
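
To make the distinction concrete, here is a minimal C++ sketch. It assumes C++11 std::atomic and std::thread, and it is not claimed to be how Glaurung or Crafty is written. The names nodes_searched and count_nodes are hypothetical. The increment is a read-modify-write and has to be serialized, here with fetch_add; the final read takes no lock.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <thread>
#include <vector>

// Hypothetical shared counter, e.g. a node count updated by all search threads.
std::atomic<std::uint64_t> nodes_searched{0};

// Starts num_threads workers that each increment the counter per_thread times,
// waits for them, and returns the final count.
std::uint64_t count_nodes(int num_threads, std::uint64_t per_thread) {
    assert(num_threads > 0);
    nodes_searched.store(0);
    std::vector<std::thread> workers;
    for (int i = 0; i < num_threads; ++i) {
        workers.emplace_back([per_thread] {
            // fetch_add is an atomic read-modify-write: no increment is lost
            // even when several threads hit the counter at the same time.
            for (std::uint64_t n = 0; n < per_thread; ++n)
                nodes_searched.fetch_add(1, std::memory_order_relaxed);
        });
    }
    for (auto& w : workers) w.join();
    // Reading the value needs no lock.
    return nodes_searched.load();
}
```

With a plain, non-atomic ++ in place of fetch_add, concurrent increments could be lost, which is the read-modify-write case that needs serializing.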
hcyrano

Re: Glaurung 2 and SMP

Post by hcyrano »

bob wrote:How exactly can you get stale data when the cache controllers on each processor are specifically designed to avoid this? There are no ordering constraints or synchronization necessary when reading data; only when you try to read-modify-write do you need to serialize access.
Right,

Example: the master thread writes Thread[slave_id].stop while thread slave_id reads it.
hcyrano

Re: Glaurung 2 and SMP

Post by hcyrano »

I see one other small thing.

Many methods are not synchronized, e.g. idle_thread_exists() and thread_is_available(), so several threads can get the same information,

but only one of them can take the slave thread (split() is synchronized).

The code seems correct, but isn't this a waste of time? Couldn't many of the calls to these methods be removed?

PS: Really a great job, congrats. I am thinking of using your code in my program.
Tord Romstad

Re: Glaurung 2 and SMP

Post by Tord Romstad »

hcyrano wrote:
bob wrote:How exactly can you get stale data when the cache controllers on each processor are specifically designed to avoid this? There are no ordering constraints or synchronization necessary when reading data; only when you try to read-modify-write do you need to serialize access.
Right,

Example: the master thread writes Thread[slave_id].stop while thread slave_id reads it.
Thread[slave_id].stop is a boolean variable, which is probably compiled to an int or a char in machine code. Aren't writes to ints and chars atomic operations?
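
The pattern in question, one thread setting a stop flag that another thread polls, might look like the following sketch. It assumes C++11 std::atomic<bool> rather than the plain bool used in Glaurung 2. The ThreadState struct, slave_loop, and run_and_stop are hypothetical names for illustration only. With the atomic type, the write is indivisible and the polling loop observes it without any lock.

```cpp
#include <atomic>
#include <cassert>
#include <functional>
#include <thread>

// Hypothetical stand-in for one entry of the Thread[] array.
struct ThreadState {
    std::atomic<bool> stop{false};     // written by the master, read by the slave
    std::atomic<long> iterations{0};   // progress counter, for demonstration only
};

// Slave: polls the flag on every pass, without a lock, until it is set.
void slave_loop(ThreadState& t) {
    while (!t.stop.load(std::memory_order_acquire))
        t.iterations.fetch_add(1, std::memory_order_relaxed);
}

// Master: starts a slave, waits until it has made some progress,
// raises the stop flag, and returns how many iterations the slave ran.
long run_and_stop() {
    ThreadState t;
    std::thread slave(slave_loop, std::ref(t));
    while (t.iterations.load(std::memory_order_relaxed) == 0)
        std::this_thread::yield();
    t.stop.store(true, std::memory_order_release);  // single atomic write
    slave.join();
    assert(t.stop.load());
    return t.iterations.load();
}
```

The slave may run a few more iterations after the master's store before it sees the flag; for a stop signal that delay is harmless.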

Tord
Tord Romstad

Re: Glaurung 2 and SMP

Post by Tord Romstad »

hcyrano wrote:I see one other small thing.

Many methods are not synchronized, e.g. idle_thread_exists() and thread_is_available(), so several threads can get the same information,

but only one of them can take the slave thread (split() is synchronized).

The code seems correct, but isn't this a waste of time? Couldn't many of the calls to these methods be removed?
I haven't tested, but actually I think it saves time. I call idle_thread_exists() without locking from search_pv() and search(), when deciding whether to split. Because all threads are usually busy, idle_thread_exists() will almost always return false. Avoiding locking here saves time, and is (as far as I can see) entirely safe.

In the split() function, I call idle_thread_exists() again, to make sure some other thread hasn't stolen the idle threads in the meantime. At this time I do use locking, which is necessary to avoid race conditions.

Put another way: it is true that I waste some time when two threads try to grab the same slave thread simultaneously, but I think this is more than compensated for by not having to lock and unlock so often. Locks are very expensive.
PS: Really a great job, congrats. I am thinking of using your code in my program.
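
The two-step scheme, a cheap unlocked check followed by a re-check under the lock, can be sketched as follows. This is not the Glaurung 2 source: the idle[] array, split_lock, MAX_THREADS, and maybe_split() are hypothetical names, the function bodies are heavily simplified, and C++11 std::mutex and std::atomic are assumed.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>

constexpr int MAX_THREADS = 4;               // hypothetical pool size

std::atomic<bool> idle[MAX_THREADS] = {};    // true while a thread has no work
std::mutex split_lock;                       // guards handing out slave threads

// Unlocked check: cheap, but the answer may be out of date a moment later.
bool idle_thread_exists(int master) {
    for (int i = 0; i < MAX_THREADS; ++i)
        if (i != master && idle[i].load(std::memory_order_relaxed))
            return true;
    return false;
}

// Locked step: re-checks under the lock and claims one idle thread.
// Returns the claimed thread id, or -1 if another master got there first.
int split(int master) {
    std::lock_guard<std::mutex> guard(split_lock);
    for (int i = 0; i < MAX_THREADS; ++i) {
        if (i != master && idle[i].load()) {
            idle[i].store(false);            // the slave now belongs to this master
            return i;
        }
    }
    return -1;
}

// What the search would do at a potential split point.
int maybe_split(int master) {
    assert(master >= 0 && master < MAX_THREADS);
    if (!idle_thread_exists(master))         // common case: all busy, no lock taken
        return -1;
    return split(master);                    // rare case: pay for the lock
}
```

If two masters both pass the unlocked check for the same idle thread, only one of them claims it inside split(); the other gets -1 and continues searching without splitting.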
Thanks! I hope you find a way to use it. :)

Tord
hcyrano

Re: Glaurung 2 and SMP

Post by hcyrano »

Tord Romstad wrote:Thread[slave_id].stop is a boolean variable, which is probably compiled to an int or a char in machine code. Aren't writes to ints and chars atomic operations?
Hehe, that is the question ;)
bob

Re: Glaurung 2 and SMP

Post by bob »

hcyrano wrote:
bob wrote:How exactly can you get stale data when the cache controllers on each processor are specifically designed to avoid this? There are no ordering constraints or synchronization necessary when reading data; only when you try to read-modify-write do you need to serialize access.
Right,

Example: the master thread writes Thread[slave_id].stop while thread slave_id reads it.

That is exactly what the cache controller handles using bus snooping.
bob

Re: Glaurung 2 and SMP

Post by bob »

Tord Romstad wrote:
hcyrano wrote:I see one other small thing.

Many methods are not synchronized, e.g. idle_thread_exists() and thread_is_available(), so several threads can get the same information,

but only one of them can take the slave thread (split() is synchronized).

The code seems correct, but isn't this a waste of time? Couldn't many of the calls to these methods be removed?
I haven't tested, but actually I think it saves time. I call idle_thread_exists() without locking from search_pv() and search(), when deciding whether to split. Because all threads are usually busy, idle_thread_exists() will almost always return false. Avoiding locking here saves time, and is (as far as I can see) entirely safe.

In the split() function, I call idle_thread_exists() again, to make sure some other thread hasn't stolen the idle threads in the meantime. At this time I do use locking, which is necessary to avoid race conditions.

Put another way: it is true that I waste some time when two threads try to grab the same slave thread simultaneously, but I think this is more than compensated for by not having to lock and unlock so often. Locks are very expensive.
PS: Really a great job, congrats. I am thinking of using your code in my program.
Thanks! I hope you find a way to use it. :)

Tord
That is exactly the way Crafty does it as well...