Information Gain

We define information gain as $IG(X,Y) = H(Y) - H(Y\mid X)$.
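For reference, the two terms can be written out for discrete $X$ and $Y$. This assumes the usual Shannon entropy with base-2 logarithms (so the result is in bits); the text above does not fix a base, so treat that choice as an assumption.

$$
H(Y) = -\sum_{y} p(y)\,\log_2 p(y), \qquad
H(Y\mid X) = \sum_{x} p(x)\,H(Y\mid X=x) = -\sum_{x} p(x) \sum_{y} p(y\mid x)\,\log_2 p(y\mid x)
$$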

We interpret information gain as the amount of information we learn about the label $Y$ by observing a specific feature $X$.

Note that if the feature $X$ is independent of the label $Y$ (being merely uncorrelated is not enough in general), then $H(Y\mid X) = H(Y)$. Therefore $IG(X,Y) = 0$.
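The zero-gain case can be checked directly. A short derivation, assuming discrete variables and that $X$ and $Y$ are independent, so that $p(y\mid x) = p(y)$ for every $x$:

$$
H(Y\mid X) = -\sum_{x} p(x) \sum_{y} p(y\mid x)\,\log p(y\mid x)
= -\sum_{x} p(x) \sum_{y} p(y)\,\log p(y)
= H(Y) \sum_{x} p(x) = H(Y)
$$

Substituting into the definition gives $IG(X,Y) = H(Y) - H(Y) = 0$.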

As such, when we choose the next feature to split on, we should choose the one that maximizes information gain.
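The selection rule above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: it assumes discrete features and labels, estimates probabilities from counts, and uses base-2 logarithms. The function names (`entropy`, `conditional_entropy`, `information_gain`, `best_feature`) and the toy data are hypothetical, chosen only for this example.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Empirical Shannon entropy H(Y), in bits, of a sequence of discrete labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def conditional_entropy(feature, labels):
    """Empirical H(Y | X): entropy of the labels within each feature value,
    weighted by the fraction of samples that take that value."""
    n = len(labels)
    groups = {}
    for x, y in zip(feature, labels):
        groups.setdefault(x, []).append(y)
    return sum(len(ys) / n * entropy(ys) for ys in groups.values())


def information_gain(feature, labels):
    """IG(X, Y) = H(Y) - H(Y | X)."""
    return entropy(labels) - conditional_entropy(feature, labels)


def best_feature(features, labels):
    """Return the name of the feature with the largest information gain.
    `features` maps a feature name to its list of values (an assumed layout)."""
    return max(features, key=lambda name: information_gain(features[name], labels))


# Toy data (made up for illustration): "a" determines the label exactly,
# while "b" is empirically independent of it.
labels = [1, 1, 0, 0]
features = {
    "a": [1, 1, 0, 0],
    "b": [1, 0, 1, 0],
}

gain_a = information_gain(features["a"], labels)
gain_b = information_gain(features["b"], labels)
chosen = best_feature(features, labels)
```

On this toy data, feature `a` removes all uncertainty about the label, so its gain equals $H(Y)$, while feature `b` tells us nothing and its gain is zero; `best_feature` therefore picks `a`.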


Updated 2020-03-15

Tags

Data Science