- Part 1. Shapley Value
- Part 2. Shapley Value as Feature Contribution
- Part 3. KernelSHAP
- Part 4. TreeSHAP
- Part 5. Python Example
SHAP stands for SHapley Additive exPlanations.
The name “SHapley Additive exPlanations” gives us two clues:
(1) two key words, SHapley and Additive
(2) SHAP’s purpose is to explain something
So let’s start by understanding the two key words, then come back to the explanation purpose.
SHapley
Shapley refers to Shapley values. To understand them, I think the best way is to see how the Shapley value method answers the following question:
If a team of three employees (Allan, Bob, Cindy) together makes $160 profit, how do we fairly allocate the profit to each individual employee?
The challenge in this question is how to achieve fairness, right?
Let’s walk through the Shapley value calculation process to answer the question, and along the way the intuition will build up naturally.
Step 1 Suppose we know or can estimate each individual employee’s profitability (working alone) and each pair of employees’ profitability, as in the table below:

| Coalition | Profit |
| --- | --- |
| Allan | $40 |
| Bob | $50 |
| Cindy | $60 |
| Allan & Bob | $95 |
| Allan & Cindy | $110 |
| Bob & Cindy | $120 |
| Allan & Bob & Cindy | $160 |
Step 2 We need to understand what a marginal contribution is and how to calculate it. Let’s look at the following examples:
Allan & Bob can make $95, so Bob’s marginal contribution is (Allan & Bob) - Allan = $95 - $40 = $55
Cindy & Bob can make $120, so Bob’s marginal contribution is (Cindy & Bob) - Cindy = $120 - $60 = $60
Bob & Cindy can also make $120, but in this ordering Bob comes first, so Bob’s marginal contribution is (Bob & Nobody) - Nobody = $50 - $0 = $50 (because Bob is first in the sequence, the one(s) before Bob are Nobody, and Nobody makes no profit)
From the above examples we can derive the rules for calculating marginal contribution.
Note: in this part, let’s assume a combination is an ordered sequence. There is another term, coalition, which does not consider element order; a coalition is just like a set.
- Marginal contribution for X is (Other members + X) - (Other members).
- Combination: Bob gets different marginal contributions in different employee combinations (Allan & Bob vs Cindy & Bob).
- Combination order: the marginal calculation depends on the sequence order. In our example, Bob gets different marginal contributions in Cindy & Bob versus Bob & Cindy.
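These rules can be sketched in a few lines of Python, using the profitability numbers from Step 1 (a minimal sketch; `marginal_contribution` is a hypothetical helper name, not part of any library):

```python
# Profitability of each coalition from the Step 1 numbers.
# Keys are frozensets because a coalition's value does not depend on member order.
v = {
    frozenset(): 0,  # Nobody makes no profit
    frozenset({"Allan"}): 40,
    frozenset({"Bob"}): 50,
    frozenset({"Cindy"}): 60,
    frozenset({"Allan", "Bob"}): 95,
    frozenset({"Bob", "Cindy"}): 120,
}

def marginal_contribution(member, others):
    """Marginal contribution of `member` joining the members already present."""
    others = frozenset(others)
    return v[others | {member}] - v[others]

# Bob's marginal contribution depends on who came before him:
print(marginal_contribution("Bob", {"Allan"}))  # 55
print(marginal_contribution("Bob", {"Cindy"}))  # 60
print(marginal_contribution("Bob", set()))      # 50
```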
Step 3 Shapley value calculation. The Shapley value for a player (an employee in our example) is the weighted average of the player’s marginal contributions over all possible combinations, where the weight is the probability of the combination occurring. The table below shows the calculation; each of the 3! = 6 orderings occurs with probability 1/6:

| Ordering | Allan | Bob | Cindy |
| --- | --- | --- | --- |
| Allan, Bob, Cindy | $40 | $55 | $65 |
| Allan, Cindy, Bob | $40 | $50 | $70 |
| Bob, Allan, Cindy | $45 | $50 | $65 |
| Bob, Cindy, Allan | $40 | $50 | $70 |
| Cindy, Allan, Bob | $50 | $50 | $60 |
| Cindy, Bob, Allan | $40 | $60 | $60 |
| Shapley value (average) | $42.5 | $52.5 | $65 |
Additive
Based on the above calculation, the profit allocation by Shapley values is Allan $42.5, Bob $52.5, and Cindy $65. Note that the sum of the three employees’ Shapley values is 42.5 + 52.5 + 65 = 160. This demonstrates the Shapley value’s additive property: all the players’ Shapley values must add up to the total gain.
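The whole calculation can be reproduced with a short Python sketch that averages each employee’s marginal contribution over all 3! = 6 orderings (the $110 value for the Allan & Cindy pair is an assumption, chosen to be consistent with the Shapley values quoted above):

```python
from itertools import permutations

# Coalition profitability; the $110 for Allan & Cindy is the value consistent
# with the Shapley results quoted in the text.
v = {
    frozenset(): 0,
    frozenset({"Allan"}): 40,
    frozenset({"Bob"}): 50,
    frozenset({"Cindy"}): 60,
    frozenset({"Allan", "Bob"}): 95,
    frozenset({"Allan", "Cindy"}): 110,
    frozenset({"Bob", "Cindy"}): 120,
    frozenset({"Allan", "Bob", "Cindy"}): 160,
}

players = ["Allan", "Bob", "Cindy"]
totals = dict.fromkeys(players, 0)

# Sum each player's marginal contribution over every ordering...
for order in permutations(players):
    seen = frozenset()
    for p in order:
        totals[p] += v[seen | {p}] - v[seen]
        seen |= {p}

# ...then average: each of the 6 orderings occurs with probability 1/6.
shapley = {p: totals[p] / 6 for p in players}
print(shapley)                # {'Allan': 42.5, 'Bob': 52.5, 'Cindy': 65.0}
print(sum(shapley.values()))  # 160.0 -- the additive (efficiency) property
```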
Now, after explaining SHapley and Additive through an example, I hope we have the intuition for what Shapley values are and how to calculate them. Next, let’s summarize the Shapley value mathematical formula. Please be patient; I will try to explain the formula in detail to help you grasp it quickly :)
Understanding the Definition of the Shapley Value Formula
Below is the formal definition of the Shapley value from Wikipedia; the formula calculates the Shapley value for player i:

φᵢ(v) = Σ over all S ⊆ N \ {i} of [ |S|! (n - |S| - 1)! / n! ] × ( v(S ∪ {i}) - v(S) )
- n is the total number of players
- N is the set of all players; the sum runs over all subsets of N not containing player i (a subset does not consider element order, since a set has no order)
- S is one such subset
- |S| is the number of players in set S
- v(S) is the function that calculates the contribution of coalition S
Before discussing the formula, let’s review the Shapley value in the big picture:
The Shapley value is the weighted average of the marginal contributions from all possible coalitions, where the weight is the probability of the coalition occurring.
In the formula, for a specific subset (coalition) S, the factor |S|!(n - |S| - 1)!/n! is the coalition probability weight, and the term v(S ∪ {i}) - v(S) is the marginal contribution.
The marginal contribution term measures player i’s contribution for a given set S: the difference between the contribution of S with player i and the contribution of S without player i.
Probability Weight Explanation
The denominator n! is the total number of ordered sequences of n players. Imagine building an ordered sequence from n players: the first position can be filled by any of the n players, the second position by n-1 players, the third by n-2, the fourth by n-3, and so on, so the total number of ordered sequences is n*(n-1)*(n-2)*(n-3)*…*1 = n!
To understand the numerator |S|! * (n-|S|-1)!, we can divide the positions of a full ordered sequence into 3 parts:
[0, |S|), [|S|, |S|+1), [|S|+1, n)
The first part holds the members of the given set S, so it has |S|! different orderings; the second part holds only player i, so it has exactly one choice; the third part holds the remaining players, so it has (n-|S|-1)! different orderings.
Therefore the total number of ordered sequences derived from set S is
|S|! * 1 * (n-|S|-1)!
and the probability that a random ordering is one derived from the given set S is
|S|! * (n-|S|-1)! / n!
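As a sanity check, we can confirm numerically that these per-coalition probabilities add up to 1 over all subsets S not containing player i (a sketch using exact fractions to avoid float rounding; `weight` is a hypothetical helper name):

```python
from fractions import Fraction
from itertools import combinations
from math import factorial

def weight(s_size, n):
    """Probability that a random ordering of n players lines up as:
    the s_size members of S (in any order), then player i, then the rest."""
    return Fraction(factorial(s_size) * factorial(n - s_size - 1), factorial(n))

n = 4
others = range(n - 1)  # the n - 1 players other than player i
total = sum(weight(len(S), n) for k in range(n) for S in combinations(others, k))
print(total)  # 1 -- the weights form a probability distribution over subsets
```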
Shapley Value Properties or Axioms
Generally the Shapley value satisfies the following three major axioms:
- Efficiency (Additivity) — all players’ Shapley values must sum to the total gain. In our example above, the sum of the Shapley values of Allan, Bob, and Cindy equals the total gain ($160).
- Symmetry — two players should have the same Shapley value if they contribute equally to all possible coalitions.
- Dummy — a player’s Shapley value should be 0 if the player contributes nothing. The Symmetry and Dummy axioms are what justify the Shapley value’s fairness.
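We can verify all three axioms on a small made-up game by implementing the formula from the definition above (a sketch; the game, where Allan and Bob each add $10 to any coalition and a third “Dummy” player adds nothing, is hypothetical):

```python
from itertools import combinations
from math import factorial

def shapley_value(players, v):
    """Shapley value via the subset-sum definition: for each player i, sum
    |S|!(n-|S|-1)!/n! * (v(S with i) - v(S)) over subsets S that exclude i."""
    n = len(players)
    phi = {}
    for i in players:
        others = [p for p in players if p != i]
        total = 0.0
        for k in range(n):
            for S in combinations(others, k):
                w = factorial(k) * factorial(n - k - 1) / factorial(n)
                total += w * (v(set(S) | {i}) - v(set(S)))
        phi[i] = round(total, 9)  # round away float noise
    return phi

# Toy game: Allan and Bob each add $10 to any coalition; Dummy adds nothing.
def v(S):
    return 10 * len(S & {"Allan", "Bob"})

phi = shapley_value(["Allan", "Bob", "Dummy"], v)
print(phi)  # {'Allan': 10.0, 'Bob': 10.0, 'Dummy': 0.0}
# Efficiency: 10 + 10 + 0 equals v of the full coalition, 20
# Symmetry:   Allan and Bob contribute equally, so they get equal shares
# Dummy:      Dummy never changes any coalition's value, so it gets 0
```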
Conclusion
In this part, we explored the intuition behind the Shapley value and how to calculate it. In part 2, we will see how to apply the Shapley value to get feature contributions for a machine learning model.
REFERENCES
- Interpretable Machine Learning: https://christophm.github.io/interpretable-ml-book/shap.html
- A Unified Approach to Interpreting Model Prediction: https://arxiv.org/abs/1705.07874
- Consistent Individualized Feature Attribution for Tree Ensembles: https://arxiv.org/abs/1802.03888
- SHAP Part 3: Tree SHAP: https://medium.com/analytics-vidhya/shap-part-3-tree-shap-3af9bcd7cd9b
- PyData Tel Aviv Meetup: SHAP Values for ML Explainability — Adi Watzman: https://www.youtube.com/watch?v=0yXtdkIL3Xk
- The Science Behind InterpretML- SHAP: https://www.youtube.com/watch?v=-taOhqkiuIo
- Game Theory (Stanford) — 7.3 — The Shapley Value : https://www.youtube.com/watch?v=P46RKjbO1nQ
- Understanding SHAP for Interpretable Machine Learning: https://medium.com/ai-in-plain-english/understanding-shap-for-interpretable-machine-learning-35e8639d03db
- Kernel SHAP: https://www.telesens.co/2020/09/17/kernel-shap/
- Understanding the SHAP interpretation method: Kernel SHAP: https://data4thought.com/kernel_shap.html