User:IssaRice/Shapley value

From Machinelearning

most expositions of the Shapley value SUCK BALLS because they try to sum over the subsets excluding the player in question (usually called "player i"). so here we go, here's a TRUE REDPILLED exposition of the shapley value!

first of all, what's the shapley value even trying to do? once we understand it in words, we can just convert our verbal understanding into symbols. and then we will be done.

...


So, the shapley value is an average. but what kind of average? an arithmetic average. well, an arithmetic average takes a specific form. it looks like this. if you're averaging the elements of some set $X$, then the arithmetic average is

$$\frac{1}{|X|} \sum_{x \in X} f(x)$$

We throw in the function f because the elements of X might not be numbers. or even if they are numbers, you might want to apply some weighting other than the default one (the identity function).
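to make the "average with an f thrown in" idea concrete, here is a minimal Python sketch. the name `average` and the example inputs are made up for illustration and are not from any library.

```python
def average(X, f=lambda x: x):
    """Arithmetic average of f(x) over the elements x of a finite collection X.

    f defaults to the identity function, which gives the plain average of numbers.
    """
    X = list(X)
    return sum(f(x) for x in X) / len(X)


# elements that are already numbers: the identity function is enough
plain = average([1, 2, 3])

# elements that are not numbers: f turns each one into a number first
by_length = average(["a", "bb", "ccc"], f=len)
```

the second call is the reason f is there at all: strings can't be summed into an average, but their lengths can.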


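Now, let's take the ugly-ass formula for the shapley value that you always see:

$$\phi_i(v) = \sum_{S \subseteq N \setminus \{i\}} \frac{|S|!\,(n - |S| - 1)!}{n!} \bigl(v(S \cup \{i\}) - v(S)\bigr)$$

(here $N$ is the set of $n$ players and $v$ is the function giving the value of each coalition.)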

how is that supposed to be an average? well first of all, we said above that the shapley value is averaging over all sequences of ways to add the n players. one way to formalize the concept of a "sequence" or "ordering" is to use permutations. a permutation is just a function that reorders the elements of a set. so each sequence corresponds to a permutation. given a permutation $\sigma$ of the players $\{1, \ldots, n\}$, we can recover the sequence as $(\sigma(1), \sigma(2), \ldots, \sigma(n))$.

So in what sense is the shapley value an average? if $N$ is the set of players, then we can define the set $\operatorname{Sym}(N)$ of all permutations on $N$. (This is also denoted as $S_n$ and called the "symmetric group of degree n" since $\{1, \ldots, n\}$ is the "default" set of size n.)

since the shapley value is an average and we are in particular averaging over all sequences, we want to rewrite the formula as something that looks like:

$$\frac{1}{|\operatorname{Sym}(N)|} \sum_{\sigma \in \operatorname{Sym}(N)} f(\sigma)$$

And in fact, at this point we know enough to convert our verbal understanding into a formula like the one above.

a relevant fact is that the size of $\operatorname{Sym}(N)$ is $n!$.
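a quick sanity check of that counting fact, as a small Python sketch. the three-player set is just an example; `itertools.permutations` and `math.factorial` are standard library functions.

```python
from itertools import permutations
from math import factorial

players = [1, 2, 3]

# every ordering of the players, each one as a tuple
orderings = list(permutations(players))

# there are n! orderings of n players
assert len(orderings) == factorial(len(players))
```

each tuple in `orderings` is one sequence in which the players can be added, so averaging over this list is averaging over all permutations.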


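the Shapley value is

$$\phi_i(v) = \frac{1}{n!} \sum_{\sigma \in \operatorname{Sym}(N)} \bigl(v(P_i^\sigma \cup \{i\}) - v(P_i^\sigma)\bigr)$$

where $\operatorname{Sym}(N)$ is the set of all $n!$ orderings of the players and $P_i^\sigma$ is the set of players that come before player $i$ in the ordering $\sigma$.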
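here is a hedged Python sketch that computes the shapley value both ways, as an average over orderings and as the usual weighted sum over subsets, so the two can be compared on a small example. the function names and the toy "glove game" (players 1 and 2 each hold a left glove, player 3 holds a right glove, and a coalition is worth 1 if it can form a pair) are assumptions made up for illustration. the value function is assumed to take a frozenset of players and return an integer.

```python
from fractions import Fraction
from itertools import combinations, permutations
from math import factorial


def shapley_by_orderings(players, v, i):
    """Average, over all orderings, of player i's marginal contribution."""
    total = 0
    count = 0
    for order in permutations(players):
        # players that were added before i in this ordering
        preceding = frozenset(order[:order.index(i)])
        total += v(preceding | {i}) - v(preceding)
        count += 1
    return Fraction(total, count)


def shapley_by_subsets(players, v, i):
    """The usual formula: weighted sum over subsets that exclude player i."""
    others = [p for p in players if p != i]
    n = len(others) + 1
    total = Fraction(0)
    for k in range(n):
        weight = Fraction(factorial(k) * factorial(n - k - 1), factorial(n))
        for S in combinations(others, k):
            S = frozenset(S)
            total += weight * (v(S | {i}) - v(S))
    return total


def glove_game(S):
    """Toy value function (hypothetical): 1 if S has a left and a right glove."""
    return 1 if 3 in S and (1 in S or 2 in S) else 0


players = [1, 2, 3]
values = {i: shapley_by_orderings(players, glove_game, i) for i in players}
```

on this toy game both functions agree for every player, and the values add up to the worth of the full coalition. the ordering version is easier to read but loops over n! orderings, so it is only practical for small n.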