Friday, June 26, 2009

When Customers Start and End

In texts on credit scoring, some effort almost always goes into defining what is to be considered a "bad" credit. The Basel framework provides a rather precise definition of what is to be considered a default.

But I have rarely seen the same in predicting cross-sell, up-sell or churn. I do, however, remember attending an SPSS conference where churn of pre-paid cards was discussed. Churn, in that case, was defined as a number of consecutive periods where the number of calls fell below a certain level.
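A definition like that is easy to write down. The following is only a sketch: the function name, the threshold of 5 calls and the run of 3 periods are my own illustrative assumptions, not the values used in the conference talk.

```python
def first_churn_period(calls_per_period, min_calls=5, run_length=3):
    """Return the index of the period in which the customer first meets
    the churn definition, or None if the definition is never met.

    The definition: `run_length` consecutive periods in which the number
    of calls is below `min_calls`. Both defaults are illustrative only.
    """
    below = 0  # length of the current run of low-activity periods
    for i, calls in enumerate(calls_per_period):
        if calls < min_calls:
            below += 1
            if below == run_length:
                return i
        else:
            below = 0  # any period with enough calls resets the run
    return None


# Hypothetical monthly call counts for one pre-paid card.
history = [20, 18, 4, 3, 9, 2, 1, 0, 0]
churn_index = first_churn_period(history)
```

Note that the churn is only recognized at the end of the run, so the date the customer "really" left is up to `run_length` periods earlier than the date the definition fires.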

In the past, I've used start and end dates of contracts, as well as a simple increase (or decrease) in the number of products that a customer has over time as indicators of what to target.

I'd be really interested in hearing how you define and extract targets, be it in telecom, banking, cards or any other business where you use prediction. For instance, how would you go looking for customers that have churned? Or for that matter, customers where up-sell has been successful?

This may be too simple a question, but if there are standard methods that you use, I'd be really interested in learning about them.


This is not a simple question at all. Or rather, the simplest questions are often the most illuminating.

The place where I see the biggest issues in defining starts and stops is in survival data mining (obligatory plug for my book Data Analysis Using SQL and Excel, which has two chapters on the subject). For the start date, I try to use (or approximate as closely as possible) the date when two things have occurred: the company has agreed to provide a product or service, and the customer has agreed to pay for it. In the case of post-pay telecoms, this would be the activation date -- and there are similar dates in many other industries, as varied as credit cards, cable subscriptions, and health insurance.

The activation date is often well-defined because the number of active customers gets reported through some system tied to the financial systems. Even so, there are anomalies. I recently completed a project at a large newspaper, and used their service start date as the activation date. Alas, at times, customers with start dates did not actually receive the paper on that date -- often because the newspaper delivery person could not find the address.

The stop date is even more fraught with complication, because there are a variety of different dates to choose from. For voluntary churn, there is the date the customer requests termination of the service. There is also the date when the service is actually turned off. Which to use? It depends on the application. To count active customers, we want the service cut-off date. To plan for customer retention efforts, we want to know when they call in.
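As a rough illustration of keeping both dates and picking one per application, here is a small sketch. The `Subscription` record and its field names are hypothetical; real systems will store these dates differently.

```python
from datetime import date
from typing import NamedTuple, Optional


class Subscription(NamedTuple):
    # Hypothetical record layout; all field names are assumptions.
    start: date
    stop_requested: Optional[date]  # date the customer asked to terminate
    service_cutoff: Optional[date]  # date the service was actually turned off


def active_on(sub, as_of):
    """For counting active customers: use the service cut-off date."""
    started = sub.start <= as_of
    still_on = sub.service_cutoff is None or sub.service_cutoff > as_of
    return started and still_on


def retention_event_date(sub):
    """For retention planning: use the date the customer called in."""
    return sub.stop_requested


sub = Subscription(date(2008, 1, 15), date(2009, 5, 20), date(2009, 6, 10))
still_counted = active_on(sub, date(2009, 6, 1))  # after the request, before cut-off
```

In this example the customer has already asked to leave on June 1 but still counts as active, which is exactly the gap between the two definitions.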

Involuntary churn is also complicated, because there is a series of steps, often called the Dunning Process, that keeps track of customers who do not pay. At what point does a non-paying customer stop? When the service stops? When the bill is written off or settled? At some arbitrary point, such as 60 or 90 days of non-payment? To further confuse the situation, the business may change its rules over time. So, during some periods of time or for some customers, 60 days of non-payment results in service cutoff. For other periods or customers, 90 days might be the rule.
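One way to cope with rules that change over time is to keep the rule history explicit and look up the rule in force when the bill came due. This is a sketch only: the rule table, its dates, and the choice to key the rule on the due date of the first unpaid bill are all assumptions made for illustration.

```python
from datetime import date, timedelta

# Hypothetical rule history: (effective from, days of non-payment before stop).
# Must be listed in ascending order of effective date.
RULES = [
    (date(2007, 1, 1), 60),
    (date(2008, 7, 1), 90),
]


def threshold_days(due_date):
    """Return the non-payment limit in force on `due_date`."""
    days = None
    for effective_from, limit in RULES:
        if due_date >= effective_from:
            days = limit  # later rules override earlier ones
    if days is None:
        raise ValueError("no rule in force on %s" % due_date)
    return days


def involuntary_stop_date(first_unpaid_due, paid_on=None):
    """Return the stop date under the rule in force, or None if the
    customer paid on or before the limit."""
    limit = first_unpaid_due + timedelta(days=threshold_days(first_unpaid_due))
    if paid_on is not None and paid_on <= limit:
        return None
    return limit


old_rule_stop = involuntary_stop_date(date(2007, 3, 1))  # 60-day rule applies
new_rule_stop = involuntary_stop_date(date(2008, 8, 1))  # 90-day rule applies
```

If the rules differ by customer segment rather than by period, the same lookup could take a segment argument as well.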

Often, I find multiple time-to-event problems in this scenario. How long does it take a non-paying customer to stop, if ever? How long after customers sign up do they begin?
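For any of these time-to-event questions, the data ends up in the same shape: a duration plus a flag saying whether the event was actually observed by the cutoff date. A minimal sketch, with a hypothetical function name and made-up dates:

```python
from datetime import date


def tenure_and_flag(start, stop, as_of):
    """Return (tenure in days, event observed?) as of the cutoff date.

    A customer with no stop date, or a stop date after `as_of`, is
    censored: the tenure runs to `as_of` and the flag is False.
    """
    observed = stop is not None and stop <= as_of
    end = stop if observed else as_of
    return (end - start).days, observed


cutoff = date(2009, 6, 26)
stopped = tenure_and_flag(date(2009, 1, 1), date(2009, 3, 1), cutoff)
ongoing = tenure_and_flag(date(2009, 1, 1), None, cutoff)
```

The "if ever" in the question above is what the flag captures: customers who have not stopped yet still contribute their tenure so far.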

In your particular case, the contract start date is probably a good place to start. However, the contract end date might or might not be appropriate, since this might not be updated to reflect when a customer actually stops.



  1. Thank you. I really appreciate your comments on this, and your book will find its way here shortly. I recently heard a representative of IBM discussing their model of financial business. As a consequence, I now define the noteworthy events as indicators of advancement in the contract life cycle. The problem of start and end dates persists, just as you describe it, but I ended up with the arbitrary 90-day definition of significant increase or attrition. Let's hope I get away with this. Thank you again, and I'll make sure to check in again for other comments and new topics. // Ola

  2. Working in telecommunications, I try to use events controlled by IT or network mechanisms where possible. For example, when a good-paying mobile customer leaves, they can transfer their mobile phone number to a competitor. This is called number 'porting' and is a fairly reliable way of defining churn (customers leaving). Other definitions of churn might depend upon a call centre agent or someone entering the correct label into a CRM system, and can often be incorrect.

    Inactivity over a period of time is a common way to define customers leaving, but we treat this differently (with a different model) from sudden number transfer (porting).

    Bear in mind the loss of value rather than just the loss of the customer. Not all churn is worth the same. The issues Gordon mentions obviously impact what we might perceive as customer value (if you include the 90 days of inactivity, it will impact your average spend or value for the customer).



