SQL Server Gems

Sunday, March 12, 2006

Decision Trees (Part 3d) - Selecting the Split Attribute for Decision Trees

Using what we learnt about information gain earlier, we can compute the information gain for each attribute and choose the attribute with the highest gain as the split attribute.
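As a quick reminder, the standard definition of information gain can be written as below. The notation is an assumption on my part: S is the full set of samples, A is a candidate attribute, and S_v is the subset of S where A takes the value v.

$$
\mathrm{Gain}(S, A) = \mathrm{Entropy}(S) - \sum_{v \in \mathrm{Values}(A)} \frac{|S_v|}{|S|}\,\mathrm{Entropy}(S_v)
$$

Each term in the sum is the entropy of one subset, weighted by the fraction of samples that fall into it.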

For example, take two candidate attributes, PayBillOnTime and PropertyType:



Gain(S, PayBillOnTime)
= 0.940 - (7/14) × 0.985 - (7/14) × 0.592
= 0.151

Gain(S, PropertyType)
= 0.940 - (8/14) × 0.811 - (6/14) × 1.0
= 0.048

Note: PayBillOnTime provides more information gain (0.151) than PropertyType (0.048), so of these two attributes it is the better choice for the split.
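The two calculations above can be reproduced with a small Python sketch. It is only an illustration: the function name `information_gain` and the dictionary layout are my own choices, and the subset sizes and entropies are simply the figures quoted in this post (0.940 for the whole set, then one size and entropy pair per attribute value).

```python
def information_gain(parent_entropy, partitions):
    """Return parent entropy minus the size-weighted entropy of the subsets.

    partitions is a list of (subset_size, subset_entropy) pairs,
    one pair per value of the candidate attribute.
    """
    total = sum(size for size, _ in partitions)
    weighted = sum((size / total) * entropy for size, entropy in partitions)
    return parent_entropy - weighted


# Figures taken from the worked example in this post.
PARENT_ENTROPY = 0.940
candidates = {
    "PayBillOnTime": [(7, 0.985), (7, 0.592)],
    "PropertyType": [(8, 0.811), (6, 1.0)],
}

gains = {
    name: information_gain(PARENT_ENTROPY, parts)
    for name, parts in candidates.items()
}

# The split attribute is the candidate with the largest gain.
split_attribute = max(gains, key=gains.get)

for name, gain in gains.items():
    print(f"Gain(S, {name}) = {gain:.3f}")
print("Split attribute:", split_attribute)
```

With these inputs the sketch picks PayBillOnTime, which matches the hand calculation. In a full tree-building routine the same comparison would be repeated on each subset after the split.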

By choosing, at each step, the attribute that maximizes the information gain, we can construct a decision tree similar to the one in Part a.
