For Intel Q9450 @2.66GHz and 3200MHz with 4 CPUs it gets 265 kN/s
For Intel E6750 @2.66GHz and 3203MHz with 2 CPUs it gets 142 kN/s
In this case Rybka shows an 86.6% speed increase for doubling cores.
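For anyone who wants to check the arithmetic, here is a minimal sketch using only the two node rates quoted above. The function name is just illustrative.

```python
def percent_increase(slow_knps, fast_knps):
    """Percentage gain in node rate when going from the slow to the fast setup."""
    return (fast_knps / slow_knps - 1.0) * 100.0

two_core = 142.0   # kN/s, the E6750 figure quoted above
four_core = 265.0  # kN/s, the Q9450 figure quoted above

gain = percent_increase(two_core, four_core)
print(f"{gain:.1f}% faster")  # prints "86.6% faster"
```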
But I heard somewhere that doubling cores leads to about a 70% speed increase.
1. Is this true or just true for the old Rybka?
Also, in the Fritz benchmarks at http://www.jens.tauchclub-krems.at/diverses/Schach/fritz9_benchmarks.html I saw that doubling cores usually gave a 92%-98% speed increase on many newer processors.
2. Assuming that Rybka does get an 86.6% increase in speed from doubling, would 2%-8% of the total 13.4% waste be due to bottlenecks and mechanical stuff, while 5.87%-11.63% is due to parallel inefficiency in Rybka?
3. If all this is true, is it safe to say that the kN/s Stockfish shows is a TRUE reflection of the speed increase that doubling the cores gives?
> I saw that doubling cores usually gave 92%-98% speed increase in many newer processors.
That doesn't tell you anything about true speed. Core 1 and core 2 look at the same moves at some points, so there is overhead.
Rybka tries to account for this and should show about a 70% speedup. If she is showing 86.6%, it's probably because Vas overshot on his node recalculation.
The only way to check this is to give a time handicap to the engine running on 4 CPUs, so that on average both engines search the same number of nodes per move (in this case, giving 86.6% more time to the engine running on 2 CPUs). The result should be 50-50; if it isn't, Rybka's node recalculation is wrong.
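A rough sketch of how the time odds for such a test could be worked out from the reported node rates. The 60-second base time is a made-up example and the helper name is hypothetical; only the two kN/s figures come from this thread.

```python
def handicap_time(base_seconds, fast_knps, slow_knps):
    """Time the slower setup needs to search as many nodes as the faster one
    searches in base_seconds, assuming the reported node rates are accurate."""
    return base_seconds * fast_knps / slow_knps

# Hypothetical base time: 60 s for the 4-CPU engine
t = handicap_time(60.0, 265.0, 142.0)
print(f"2-CPU engine gets {t:.0f} s")  # about 86.6% more than 60 s
```

If a match at these time odds comes out far from 50-50, the reported node rates are not a fair measure of the real speedup.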
But the 70% number is OLD, I think from Rybka 2. Maybe Vas has improved the scaling a lot since then. So it is either an improvement or a wrong node calculation, as you said.
Up to 8% loss of efficiency, as shown by the Fritz benchmarks, will happen even if Rybka has 100% efficiency in scaling. The rest of the loss occurs in search/scaling inefficiency.
I would like Vas to comment on this.
> For Intel Q9450 @2.66Ghz and 3200Mhz for 4cpu it gets 265kN/s
> For Intel E6750 @2.66Ghz and 3203Mhz for 2cpu it gets 142kN/s
This comparison doesn't make much sense, as the E6750 is a Conroe chip and the Q9450 a newer Yorkfield. In particular, the E6750 has 4MB of cache while the Q9450 has 12MB, plus an optimized architecture, northbridge/memory controller and other things, not to mention that there might be big differences in the OS, motherboard, memory size and timings, and so on. So obviously the speed-up is higher than it would be between E6750 and Q6750.
This is a pretty useless (artificial) benchmark; it even gives nice speed-ups for hyperthreading, so simply forget about it.
> Is this true or just true for the old Rybka?
It is sadly true for all software in which the calculations of one core have to be synchronized with the others. If an image/movie program can calculate each picture independently of the others, the overhead of parallelization is of course a lot lower than in chess, where calculations have to be "synchronized"...
> would the 2%-8% of the total 13.4% waste be due to bottleneck and mechanical stuff while 5.87%-11.63% is due to parallel inefficiency for Rybka?
The overhead for parallelization in chess (on a serious position) is way above 5.87-11.63%... and this has nothing to do with "inefficiency", btw.
> But I heard somewhere that doubling cores leads to about a 70% speed increase.
The speed-up from doubling cores is in every case lower than the 86.6% you posted here. Expect a high 7x% number in the best case; the average will be in the middle/lower 7x%. As you might imagine, the speedup also depends on the position used and on how well the calculation can be parallelized.
The speedup also depends a little on the system used. If you have a dual core, quad core or eight core, you can easily check the speedup on your system yourself: simply set the number of threads to 1 (or 2, 4, 8), run your test position, then (ideally after rebooting or restarting your GUI/engine) set it to 2 (or 4, 8) and run it again.
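Once you have the timings, a small sketch like this turns them into speedup numbers. It assumes you wrote down the seconds needed to reach the same fixed depth at each thread count (my assumption about how to compare runs). The timings below are invented placeholders, not measurements.

```python
def speedup_table(seconds_by_threads):
    """Map thread count -> (speedup vs. 1 thread, parallel efficiency)."""
    base = seconds_by_threads[1]
    return {
        n: (base / t, base / t / n)
        for n, t in sorted(seconds_by_threads.items())
    }

# Hypothetical timings: seconds to reach the same depth on one test position
measured = {1: 100.0, 2: 58.0, 4: 35.0}

table = speedup_table(measured)
for n, (speedup, efficiency) in table.items():
    print(f"{n} threads: speedup {speedup:.2f}x, efficiency {efficiency:.0%}")
```

Repeat this over several positions and average, since the speedup varies from position to position.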