I had given the standard deviation, and the data show that 7 threads is weaker than 6 threads by more than the error margins.
bob wrote: First thing that jumps out at me is the inconsistency. Worse with 7 than with 6? Tells me more positions or more time per position is needed. For all of my SMP results, I generally use at LEAST one minute per position, and that is always with max cores, meaning fewer cores are going to run a LOT longer.
I have only seen one machine where HT helps Crafty, my MacBook, because the process scheduler has no idea that when I run two threads, they ought to run on two different physical cores. If I watch to see what is happening, I see 1-2 busy, 1-3 busy, 1-4 busy, 2-3 busy, 2-4 busy, or 3-4 busy. However, half of those are using one physical core. With 4 threads, I at least get to use both cores fully.
Not exactly the best justification for using hyper threading...
I used 4,800 positions per data point. At one minute or more per position, the test would take a month or so. Did you run at least 5,000 positions at one minute or more each? A significantly lower number of positions gives larger error margins, so people who claim who knows what based on 20 positions (or 20 runs of the same position) have no idea what they are talking about.
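To put a rough number on the point about error margins, here is a minimal sketch. It assumes the positions are independent and that the quantity being averaged has the same per-position spread in both tests, so the margin shrinks with the square root of the number of positions. The function names are made up for illustration and are not from any engine or testing tool.

```python
import math

def standard_error(sigma: float, n: int) -> float:
    """Standard error of a mean over n independent positions,
    given a per-position standard deviation sigma."""
    if n <= 0:
        raise ValueError("n must be positive")
    return sigma / math.sqrt(n)

def margin_ratio(n_small: int, n_large: int) -> float:
    """How many times wider the error margin is with n_small positions
    than with n_large positions, assuming the same per-position spread."""
    return standard_error(1.0, n_small) / standard_error(1.0, n_large)

# 20 positions versus 4,800 positions per data point
ratio = margin_ratio(20, 4800)
print(f"error margin with 20 positions is about {ratio:.1f}x wider")
```

Under those assumptions a 20-position test has error margins roughly 15 times wider than a 4,800-position test. Running the same position 20 times does not even meet the independence assumption as far as differences between positions are concerned.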
I don't know your MacBook, but Windows 7 assigns threads reasonably. You could check the true behavior of your MacBook by turning HT off in the BIOS, or by assigning the correct affinities. I did that for 4 threads on my 4-core machine, and I saw that setting affinities to 0,2,4,6 gives the same result as HT off in the BIOS.
My result was for an i7 2600 quad; maybe other i7s are a bit different.
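For anyone who wants to reproduce the affinity setting of 0,2,4,6, here is a small sketch of how the corresponding bitmask can be computed. It assumes that logical CPUs 2k and 2k+1 are the two hyper-threads of physical core k, which matches what I saw on this machine but may not hold on every system. The helper names are hypothetical.

```python
def one_thread_per_core(physical_cores: int, threads_per_core: int = 2) -> list[int]:
    """Pick the first logical CPU of each physical core, assuming the
    siblings of one core are numbered consecutively (an assumption)."""
    return [core * threads_per_core for core in range(physical_cores)]

def affinity_mask(logical_cpus: list[int]) -> int:
    """Build an affinity bitmask with one bit set per logical CPU index."""
    mask = 0
    for cpu in logical_cpus:
        if cpu < 0:
            raise ValueError("logical CPU index must be non-negative")
        mask |= 1 << cpu
    return mask

cpus = one_thread_per_core(4)   # logical CPUs 0, 2, 4, 6
mask = affinity_mask(cpus)      # bits 0, 2, 4 and 6 set
print(cpus, hex(mask))
```

For a quad core with HT this gives the mask 0x55. On Windows such a mask can be passed in hexadecimal to the `start /affinity` command when launching the engine. Note that a process-wide mask only restricts the threads to those four logical CPUs; it does not pin each thread to its own core.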