The problem is to measure whether a non-functional speed optimization (NEW) is faster than the current version (MASTER).
Running a single speed bench test, as everybody knows, gives very noisy information: it can detect only the grossest differences in speed, and when the speeds are similar (as in real cases) it is not reliable.
We can run more than one test and average the results, but how reliable is the averaged result? So the problem is:
Given N noisy measurements of the speed of NEW and N noisy measurements of the speed of MASTER, we want to know:
1. Whether NEW is faster than MASTER with 95% reliability
2. The maximum speed-up of NEW, defined as (Speed NEW - Speed MASTER) / Speed MASTER, that is guaranteed with 95% reliability
The second point needs a clarification. Suppose NEW is faster than MASTER by 0.5% with a 95% guarantee; then it is also faster by at least 0.4%, or by any smaller amount, with at least the same guarantee. On the contrary, it will not be faster by 0.8% with 95% reliability, maybe just with 30% reliability. We want to find the maximum speed-up (in this case 0.5%) that has 95% reliability.
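To make the second point concrete, here is a minimal Python sketch. It uses a plain normal approximation and assumes independent, roughly Gaussian noise and positive speeds. The function names and the measurement values are invented for illustration only. It estimates how confident we can be that the speed-up is at least some threshold t, and shows that the confidence drops as t grows.

```python
import math
import statistics


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def confidence_speedup_at_least(new, master, t):
    """Approximate confidence that (S_new - S_master) / S_master >= t.

    Assumption (not from the post): independent, roughly Gaussian noise,
    so the difference of sample means is treated as normally distributed.
    """
    # speed-up >= t  <=>  S_new - (1 + t) * S_master >= 0
    scale = 1.0 + t
    mean_d = statistics.mean(new) - scale * statistics.mean(master)
    var_d = (statistics.variance(new) / len(new)
             + scale ** 2 * statistics.variance(master) / len(master))
    return normal_cdf(mean_d / math.sqrt(var_d))


# Invented example measurements (e.g. thousands of nodes per second).
master = [1000, 1004, 996, 1002, 998, 1001, 999, 1003]
new = [1006, 1009, 1001, 1008, 1003, 1007, 1004, 1010]

for t in (0.0, 0.005, 0.01):
    c = confidence_speedup_at_least(new, master, t)
    print(f"speed-up >= {t:.1%}: confidence about {c:.3f}")
```

The largest t for which the reported confidence stays at or above 0.95 is the "maximum speed-up with 95% reliability" asked for in point 2.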
This is what I have cooked up myself:
Suppose we have N measures and we sum them:

SUM A = N * SA + sum(1..N of nai)    // where SA is the speed of A, nai is the noise added to the i-th measure
SUM B = N * SB + sum(1..N of nbi)

SUM A - SUM B = N * (SA - SB) + sum(1..N of nai - nbi)

Now we want to know when SUM A - SUM B > 0, assuming nai are Gaussian noises, so...

SUM A - SUM B > 0  =>  SA - SB > sum(1..N of nbi - nai) / N
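The decomposition above can be checked numerically. This is only a sketch: it assumes zero-mean Gaussian noise with the same standard deviation for both versions, and the values of SA, SB, SIGMA and N are invented. It simulates the N measures many times, confirms that "SUM A - SUM B > 0" and "SA - SB > sum(1..N of nbi - nai) / N" always agree, and counts how often the noisy sums rank the two versions correctly.

```python
import random


def simulate_sums(sa, sb, sigma, n, rng):
    """Simulate n noisy measures of A and B; return both sums and the noise term."""
    na = [rng.gauss(0.0, sigma) for _ in range(n)]  # noise on A's measures
    nb = [rng.gauss(0.0, sigma) for _ in range(n)]  # noise on B's measures
    sum_a = sum(sa + x for x in na)                 # N * SA + sum(nai)
    sum_b = sum(sb + x for x in nb)                 # N * SB + sum(nbi)
    noise_term = sum(b - a for a, b in zip(na, nb)) / n
    return sum_a, sum_b, noise_term


# Invented values: A is truly 0.5% faster than B, noise is about 1% of speed.
SA, SB, SIGMA, N = 1005.0, 1000.0, 10.0, 50
TRIALS = 2000

rng = random.Random(12345)  # fixed seed so the run is repeatable
agree = 0  # trials where both forms of the inequality give the same answer
wins = 0   # trials where SUM A - SUM B > 0, i.e. A is measured as faster
for _ in range(TRIALS):
    sum_a, sum_b, noise_term = simulate_sums(SA, SB, SIGMA, N, rng)
    measured_faster = sum_a - sum_b > 0
    if measured_faster == (SA - SB > noise_term):
        agree += 1
    if measured_faster:
        wins += 1

print(f"inequalities agree in {agree} of {TRIALS} trials")
print(f"A measured faster in {wins} of {TRIALS} trials")
```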