I guess the usual definition of a linear function is one where the output of a sum of the inputs is equal to the the output of each individual input.
Not quite. The above is the definition of an
additive function. However, you proceed to give the correct definition:
A function f is linear iff:
f(a+b) = f(a) + f(b).
f(c*a) = c*f(a)
The first of these equations says that the function is additive, and the second that it is homogeneous of degree 1. Together, they say that the function is linear.
At least to me, it is fairly obvious that every additive real number function is linear. Nevertheless, it is wrong, if you accept the axiom of choice. The proof is simple and beautiful:
Regard R (the set of real numbers) as a vector space over Q (the rational numbers). Pick two nonzero real numbers, one rational and one irrational. The numbers 1 and pi will do, and I'll use those for the rest of the discussion. Because 1 and pi are linearly independent over Q, we can construct a basis B for R over Q with 1 and pi as two of the basis elements (this is where we need the axiom of choice; the axiom of choice is equivalent to the existence of a basis for all vectorspaces).
Now construct a Q-linear map f: R -> R by f(1) = 0, and f(b) = 1 for all other basis vectors b. Because f is Q-linear, it is additive. However, it is not R-linear: We see that get f(pi) = 1, but pi * f(1) = 0.
Like all proofs using the axiom of choice, this proof is non-constructive: It doesn't help us find a particular real number function which is additive but non-linear. Indeed, I am fairly sure that noone will ever find such a function. This is a good example of why the axiom of choice is somewhat controversial: While the AC by itself looks intuitive and obvious, it can be used to give non-constructive proofs for a lot of very counterintuitive results.
Of course, in computer chess, most functions are not real number functions, but integer functions (modulo 2^32 or 2^64), and for such functions additivity and linearity are of course the same thing.
So I'd guess an evaluation function is linear if
f(feature1_a + feature1_b,feature2_a + feature2_b,feature3_a + feature3_a...) = f(feature1_a,feature2_a,feature3_a) + f(feature1_b,feature2_b,feature3_b) = f(feature1_a)+ f(feature2_a) + f(feature3_a)+f(feature1_b)+f(feature2_b)+f(feature3_b) ...
PSTs would be nonlinear because PST[sq1+sq2] != PST[sq1] + PST[sq2]
Simple material eval and mobility eval could be linear:
materialEval(onepawn + onepawn) = materialEval(onepawn) + materialEval(onepawn)
You can't define what it means that en evaluation function is linear without first defining what a "feature" means. For a program using a material + piece square table eval, it would be most natural to let the features be the set of all (piece, square) pairs on the board. Write this set as {(p_1, s_1), ..., (p_n, s_n)}. Then the evaluation function would look like this:
Code: Select all
eval = MaterialValue[p_1] + ... + MaterialValue[p_n] + PST[p_1][s_1] + ... + PST[p_n][s_n]
This function is clearly linear. However, as soon as you add something like a penalty for doubled pawns, the evaluation function is no longer linear, because eval({(WhitePawn, c3), (WhitePawn, c4)}) does not equal eval({(WhitePawn, c3)}) + eval({(WhitePawn, c4)}). Of course, you could make it linear again simply by extending your set of features, and regarding each doubled pawn on the board as an individual feature.
In other words, all evaluation functions can be regarded as linear, or as non-linear. It always depends on how you define the features. For the complex evaluation functions found in strong chess programs, I don't think it makes much sense to talk about linearity of the evaluation function as a whole, but discussing the linearity of some of the components of the evaluation function could still be interesting.