
Using Data to Predict Subscription Cancellations

Jack Riewe

Obviously, everyone in life has to leave at some point. Sorry to start with a bummer of a sentence, but what I mean is that customers churning is just a part of business. Like an episode of Modern Love, churn teaches us about trust, loyalty, breaking points, and the out-of-our-control circumstances that lead to someone leaving. But what if, with machine learning, you could predict a customer leaving?

Customer Churn Definition

Real quick, let’s define what customer churn is:

Customer Churn: When a customer ends their relationship with a company, product, or service. This can take many forms, such as website churn, subscription churn, newsletter churn, content churn, etc.

Churn Rate: The percentage or ratio of users who churn within a given time period. This is important to measure because you can use it to identify trends and relationships between variables. Dissecting your churn rate is key to improving customer retention and deciding which customers to put your time and resources into. Churn rate impacts the bottom line and can make or break a business.
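To make the definition concrete, here is a minimal sketch of the calculation. It assumes one common convention, dividing the customers lost during a period by the customers you had at the start of that period. The function name and the numbers are made up for illustration.

```python
def churn_rate(customers_at_start: int, customers_lost: int) -> float:
    """Return the share of starting customers who churned during the period."""
    if customers_at_start <= 0:
        raise ValueError("customers_at_start must be positive")
    if not 0 <= customers_lost <= customers_at_start:
        raise ValueError("customers_lost must be between 0 and customers_at_start")
    return customers_lost / customers_at_start


# Made-up numbers: 1,000 subscribers on the 1st of the month, 50 cancelled by the end.
monthly_rate = churn_rate(customers_at_start=1000, customers_lost=50)
print(f"Monthly churn rate: {monthly_rate:.1%}")  # prints "Monthly churn rate: 5.0%"
```

Whichever convention you pick, keep the period and the denominator the same from month to month so the trend is comparable.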

What Kind of Data Do You Need to Predict Customer Churn?

The obvious answer is data, but what kind of data?

As mentioned, you want to measure churn rate, but also variables you can easily track that might contribute to churn.

In one of our first posts, an introduction to creative data predictions, we outlined the data prediction process. The first step is defining a business problem.

In this case, the business problem can be: How do we identify which high-value customers will churn and what attributes affect them churning?

Beneficial Variables to Collect for Churn

For this business problem, let’s list out valuable data points. 

We can broadly divide a churn dataset into five sections:

  1. Identifying Data
  2. Demographic Data
  3. Product Data
  4. Support Data
  5. Payment Data

You don't need all of them, but the more of them you have, the better. Here is a spreadsheet showing how they map onto a churn dataset from AT&T.

After exporting the data into a Google Sheet, you can see the inputs contributing to the "Exited-probability" percentage.

Charges are a great data point to record when creating a dataset. We wrote a post a couple of weeks ago on applying dynamic pricing with machine learning and discussed the relationship between price point and churn using telecom data. We found that “monthly charges” and “total charges” were directly proportional to churn.

Other attributes worth exploring for a relationship with churn are:

  • Overdue balances
  • Tenure as a customer
  • Geography
  • Number of products used
  • Age
  • Estimated salary
  • Type of plan or subscription
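One way to picture how these attributes slot into the five sections listed earlier is a single customer record. This is only an illustrative sketch: every field name is hypothetical, the support field is an assumption (the post does not name one), and the example values are made up.

```python
from dataclasses import dataclass, asdict


@dataclass
class CustomerRecord:
    # 1. Identifying data
    customer_id: str
    # 2. Demographic data
    age: int
    geography: str
    estimated_salary: float
    # 3. Product data
    plan_type: str
    num_products: int
    tenure_months: int
    # 4. Support data (hypothetical field, not named in the post)
    support_tickets: int
    # 5. Payment data
    monthly_charges: float
    total_charges: float
    overdue_balance: float
    # The outcome you want to predict
    exited: bool


# Made-up example row.
example = CustomerRecord(
    customer_id="C-001", age=34, geography="Spain", estimated_salary=52000.0,
    plan_type="monthly", num_products=2, tenure_months=18, support_tickets=1,
    monthly_charges=70.0, total_charges=1260.0, overdue_balance=0.0, exited=False,
)
print(asdict(example))
```

Each row of a churn dataset would then be one such record, with "exited" as the column the model learns to predict.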

What Kind of Questions Should You Ask Your Data?

Sometimes the conversations you have about your data are hard and can uncover some ugly truths.

In the technical machine learning realm, the process would be to prepare and clean the data, build a logistic regression model, train and test it, and then start making predictions. However, we’re going to take a simpler approach where you ask our platform a question in plain English.
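For readers curious about the technical route, here is a rough sketch of those four steps using only the Python standard library. It is not how the Obviously AI platform works internally. The ten records are made up, the scaling divisors, learning rate, and epoch count are assumptions, and in practice you would more likely reach for a library than hand-write the training loop.

```python
import math

# Toy, made-up records: (tenure in months, monthly charges in dollars, churned 1/0).
RECORDS = [
    (2, 95.0, 1), (5, 88.0, 1), (3, 79.0, 1), (8, 99.0, 1), (4, 92.0, 1),
    (60, 35.0, 0), (72, 25.0, 0), (48, 45.0, 0), (55, 20.0, 0), (66, 30.0, 0),
]


def prepare(record):
    """Step 1, prepare: scale both inputs to roughly 0-1 (divisors are assumptions)."""
    tenure_months, monthly_charges, churned = record
    return [tenure_months / 72.0, monthly_charges / 100.0], churned


def sigmoid(z):
    """Squash any real number into a probability between 0 and 1."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    exp_z = math.exp(z)
    return exp_z / (1.0 + exp_z)


def predict_probability(weights, bias, features):
    """Churn probability under the logistic regression model."""
    return sigmoid(sum(w * x for w, x in zip(weights, features)) + bias)


def train(rows, labels, learning_rate=0.5, epochs=2000):
    """Step 2, train: fit weights and bias with batch gradient descent on log loss."""
    weights, bias = [0.0] * len(rows[0]), 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(weights), 0.0
        for features, label in zip(rows, labels):
            error = predict_probability(weights, bias, features) - label
            grad_w = [g + error * x for g, x in zip(grad_w, features)]
            grad_b += error
        weights = [w - learning_rate * g / len(rows) for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b / len(rows)
    return weights, bias


prepared = [prepare(r) for r in RECORDS]
train_set = prepared[:4] + prepared[5:9]   # eight records to learn from
test_set = [prepared[4], prepared[9]]      # two held-out records

weights, bias = train([f for f, _ in train_set], [y for _, y in train_set])

# Step 3, test: accuracy on held-out records, calling p >= 0.5 a predicted churn.
correct = sum(
    (predict_probability(weights, bias, f) >= 0.5) == bool(y) for f, y in test_set
)
print(f"Held-out accuracy: {correct / len(test_set):.0%}")

# Step 4, predict: score a new, hypothetical customer (6 months, $90 per month).
new_customer, _ = prepare((6, 90.0, None))
print(f"Churn probability: {predict_probability(weights, bias, new_customer):.2f}")
```

With real data you would also handle missing values, encode categories such as geography or plan type, and hold out far more than two records before trusting the accuracy figure.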

Going back to the problem, input this question into the Obviously AI platform: "Which customers will churn and why?"

It makes sense to ask your data questions that give you valuable business insight you can act on. Using Obviously AI, you can literally ask the same question in plain English.

Some other good questions might be:

  1. “Which of my top 100 customers are likely to churn in the next month?”
  2. “What is the persona of the customer whose churn probability is the highest?”
  3. “What are the attributes directly proportional to churn?”
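As a rough illustration of what the first question boils down to once every customer has a churn score, the sketch below ranks customers by value, keeps the top N, and flags the ones above a probability cutoff. The field names, the 0.5 cutoff, and the sample customers are all assumptions made for this example. It does not show how Obviously AI answers the question.

```python
from dataclasses import dataclass


@dataclass
class ScoredCustomer:
    customer_id: str
    annual_value: float       # hypothetical measure of how valuable the customer is
    churn_probability: float  # output of whatever churn model you use


def at_risk_top_customers(customers, top_n=100, threshold=0.5):
    """Among the top_n most valuable customers, return those at or above the cutoff."""
    top = sorted(customers, key=lambda c: c.annual_value, reverse=True)[:top_n]
    flagged = [c for c in top if c.churn_probability >= threshold]
    # Highest risk first, so the list doubles as an outreach priority order.
    return sorted(flagged, key=lambda c: c.churn_probability, reverse=True)


# Made-up customers for illustration.
customers = [
    ScoredCustomer("A", 12000.0, 0.15),
    ScoredCustomer("B", 9500.0, 0.72),
    ScoredCustomer("C", 800.0, 0.91),
    ScoredCustomer("D", 7000.0, 0.55),
]
for customer in at_risk_top_customers(customers, top_n=3):
    print(customer.customer_id, customer.churn_probability)
```

Customer C has the highest churn probability here but is left out because it is not among the three most valuable customers, which is exactly the "high-value customers" framing of the business problem.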

From these answers, you can accurately predict which customers are most likely to churn and what factors in your business are causing them to churn.

Diving Deeper Into Why a Customer Churned

Because data is your most valuable resource for predictive analytics, you might as well get the most mileage out of it. Instead of just predicting a probability of churn for each customer, why not go a step further and learn more about why they churned?

We took data from a credit card company and plugged it into the Obviously AI platform to explore it attribute by attribute.

Here are the top attributes directly proportional to a customer exiting.

In this case, age, balance, geography, estimated salary, whether they have a credit card, tenure, number of products, and gender were, in that order, related to churn in some capacity.

We can take each of these attributes and look at the average exited percentage for each of its values.

For example, say you wanted to know if a customer using multiple accounts is beneficial to you. You can look at the number of products a customer is using and see how it relates to churn rate.

A customer who uses 2 products has a greater than 80% chance of churning, whereas a customer who uses 4 products has a less than 75% chance of churning.

With this information, you can see that a customer using more than 2 products is less likely to churn. This is great to know when, for example, planning marketing campaigns that promote opening another account.
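The attribute-by-attribute breakdown described above is essentially a group-by: bucket customers by one attribute and take the share who exited in each bucket. Here is a minimal sketch of that idea. The column names are assumptions, and the rows are made up for illustration, not taken from the credit card dataset.

```python
from collections import defaultdict


def exit_rate_by(rows, attribute):
    """Return {attribute value: share of customers with that value who exited}."""
    totals = defaultdict(int)
    exits = defaultdict(int)
    for row in rows:
        key = row[attribute]
        totals[key] += 1
        exits[key] += row["exited"]  # 1 if the customer left, 0 if they stayed
    return {key: exits[key] / totals[key] for key in sorted(totals)}


# Made-up rows for illustration only.
rows = [
    {"num_products": 1, "exited": 1},
    {"num_products": 1, "exited": 0},
    {"num_products": 2, "exited": 0},
    {"num_products": 2, "exited": 0},
    {"num_products": 2, "exited": 1},
    {"num_products": 2, "exited": 0},
    {"num_products": 3, "exited": 1},
]
print(exit_rate_by(rows, "num_products"))  # prints {1: 0.5, 2: 0.25, 3: 1.0}
```

The same helper works for any other column, such as geography or an age bracket, as long as each row carries that key.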

Let’s now look at age. Because age was directly proportional to customer churn, it would be smart to look deeper into this attribute.

This graph tells us that younger customers are more likely to churn, with those around 19 years old having about a 95% chance of churning. This can help marketing or sales teams target and build a persona around high-value customers who are less likely to churn.

We could go into each individual attribute, but I’m trying to keep this post as short and digestible as possible. If you have any other questions, feel free to reach out to us!

Some Customers Just Have To Go

It’s a hard fact of business that customers churn. You can’t always prevent it, but you can predict it with historical data. 

If you want to learn more on how to be creative with your data, sign up for our newsletter at the bottom of this post to get exclusive machine learning insights. Scroll down!
