The goal of classification is to build rules or models and represent them in a simple readable form using past decisions. There are multiple techniques that can accomplish this task of gleaning intelligence from existing data –Neural Networks, SVMs , Decision Trees and so on. We’ll focus on Decision Trees in this article since they sit well with the Marketing use cases like predicting conversion rates, buy decisions etc. Sounds obtuse? We’ll demystify this in a moment.
Who will buy a computer???Let’s Look at a computer dealer who has data about his customers and purchases – Age group, Credit Status and Customer Status ( is he a student or not) . He is trying to figure out if a new set of customers will be interested in buying a computer- eventually to send them discounted offers. Given his constraints on the level of discounting he can provide, how does he make use of past data about other customers who have already bought a computer to answer this question?Simple: He uses a decision tree which takes in the age group, income, credit status and customer type and tries to predict an outcome based on past customer decisions around buying a computer. The goal is to get to a state where the entropy is minimum ie information is as unambiguous as possible like the areas marked in red on the diagrambelow . Some interesting observations emerge.
Eg: One rule could be Students with a Low Income group at an age group < 30 never buy a computer. The 100 indicates the total number in the segment and 0 the no. of people who actually bought a computer.Another possible rule could be Non –Students with Poor credit histories rarely buy a computer.This model built on past decisions can then be extrapolated to answer a similar question about a new customer / prospect. Will the new guy be interested in buying a computer or does it make no sense sending him an offer? One caveat is that some of these insights could really be no-brainers – something that is so logical, you wouldn’t need to go through all this hassle to find out- like blind customers don’t buy TV OR Turkey sales go up in Easter !!
We will get back to our behavioral story around purchase and engagement behavior aggregation using RFM scores–We scored customers on a scale of 1to 5, derived from a clustering algorithm. One way of categorizing the clustering output could use a combination of Purchase and Engagement behavior as shown below.
The figure above is a Classification model from a two year “simulated” data set of a grocery chain. The scores which are an aggregation of engagement & purchase behaviors are predicted based on the consumer’s location, gender and age. The decision tree spits out a set of rules that culminate in the pink boxes that denote the behavioral scores. Some rules that can be gleaned from this analysis are.
- Florida -> Female -> Age <= 63 ->Category 3 -> Highly Engaged – Loyal Purchasers.
- Tennesse -> Female -> Age >54 -> Category 4-> Semi- Engaged – Valuable Purchasers.
No comments:
Post a Comment