Applying Accurate Predictive Retail Analytics - A Whitepaper

This article looks at advanced analytics in a retail setting. The scenario we are discussing in this document is a multi-store retail chain that has a loyalty program, i.e. some of the transactions are linked to individuals.

While we discuss retail loyalty in detail, a lot of this applies equally to other domains. We provide guidelines and best practices on how to create an advanced analytic framework. It is important to realise that none of this is difficult once the steps are followed in sequence.

You may have seen this diagram before. It shows the levels of analytical maturity for an organisation; this version is from Gartner.

[Figure: Gartner Value vs Difficulty of Analytics]

As the maturity level increases, the value to the organisation increases, but unfortunately so does the difficulty. The implication is that the most valuable types of analytics are the most difficult to produce. We will look at each of these areas in turn and detail the mistakes that we have seen companies make over and over again.

Descriptive Analytics

The lowest level of analytics is Descriptive Analytics, which is focused on looking at:

"What happened?"

Descriptive analytics in a retail loyalty scenario typically means Business Intelligence reports that focus on store sales and customer activity. Reports on sales might include breakdowns by date, department, etc. Customer activity might include reports on new sign-ups, visit patterns, etc.

Although these reports may not seem important, they are essential to the success of the loyalty program. You need a complete set of reports with the ability to see trends and use drill-downs. We all know the disclaimer on financial adverts - "past performance is no guarantee of future performance". That is not true for retail analytics - past performance is a very good guide to future performance! Only when you have a complete understanding of the current state and trends can you apply more advanced techniques successfully. If you only take one thing from this paper, this is it:

Applying predictive analytics without having a reliable and thorough understanding of what occurred in the past is a recipe for failure.

If you do not know where you are, you are likely to apply the wrong techniques. We have seen this mistake in retail analytics numerous times. The administrators of the system keep using the same techniques with steadily declining results. They use a specific technique (win-back campaigns, coupon offers) and get good results initially. They continue to apply it, but the pay-back steadily declines. They are then forced into either increasing the incentive (which has a direct effect on costs), or trying something else and hoping for success.

Therefore, in this article we describe how to set up a reporting structure for retail analytics before we move on to advanced techniques. These reports are not specific to any type of store. They are based on our experience in retail analytics across a number of domains, including grocery chains, coffee chains, fuel retailers, department stores and clothing stores. We will cover Dashboards and some advanced reports in this section.


Dashboards

A dashboard should provide a snapshot of the current status of the platform. It should show a number of metrics and also provide some indication of why a metric is over- or under-performing. There is an art to creating effective and useful dashboards; unfortunately, many dashboards are done really badly! If you want good guidelines on how to create dashboards, Stephen Few has written a number of books that are excellent references. Below is a good example of a dashboard that follows some of his principles.

The above example is for a fictional Telecoms provider, but can easily be adapted for retail platforms. We can see four separate primary goals based on key metrics (Subscriber Acquisition Costs, Average Revenue Per User, Churn and Executive Summary). For each of the four, we have a number of charts that show different viewpoints into the data. Colour is used sparingly and only to highlight the most important metrics.

A good dashboard provides early warning signs of declining metrics. We have experience in developing these types of dashboards and have found that they increase the executive buy-in for loyalty systems. Indeed, it is often the first thing that senior managers request every day!

Advanced Reports

There are two advanced reports specific to loyalty schemes that work across all domains. They are the BathTub Report and the RFM report. First we will look at the BathTub Report.

The BathTub report looks at how loyal your customers are. The water in the 'BathTub' is a metaphor for your customers: some will leave (leak from the BathTub), and you should ensure that these are replaced by either new or reactivated customers so that the level of customers does not fall. The report shows the trend over three months and, critically, the same time period in the previous year, so accurate comparisons can be made at a glance.

Although the BathTub report looks at past activity, it is very useful for seeing how many customers are active in your program. If new and reactivated customers together number fewer than the customers who became inactive, then engagement with the loyalty program is declining and action should be taken. The appropriate action could be a drive to entice more sign-ups or to reactivate lapsed customers by giving them special offers.
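The counts behind a BathTub report can be sketched in a few lines of Python. This is a minimal illustration under assumed definitions, not a description of any specific product: a customer is "new" if their first ever transaction falls in the month, "reactivated" if they are active this month but were not active last month, "retained" if active in both months, and "lapsed" if active last month but not this month. The function name `bathtub` and the (customer, month) input format are hypothetical.

```python
from collections import defaultdict


def bathtub(transactions, month, previous_month):
    """Count customer flows for one month of a BathTub report.

    transactions: iterable of (customer_id, "YYYY-MM") pairs.
    The category definitions are assumptions made for this sketch.
    """
    months_by_customer = defaultdict(set)
    for customer_id, tx_month in transactions:
        months_by_customer[customer_id].add(tx_month)

    counts = {"new": 0, "reactivated": 0, "retained": 0, "lapsed": 0}
    for months in months_by_customer.values():
        active_now = month in months
        active_before = previous_month in months
        first_month = min(months)  # "YYYY-MM" strings sort chronologically
        if active_now and first_month == month:
            counts["new"] += 1
        elif active_now and active_before:
            counts["retained"] += 1
        elif active_now:
            counts["reactivated"] += 1
        elif active_before:
            counts["lapsed"] += 1

    # Water flowing in minus water leaking out of the tub.
    counts["net_change"] = counts["new"] + counts["reactivated"] - counts["lapsed"]
    return counts


# Hypothetical transactions, for illustration only.
sample = [
    ("a", "2023-01"), ("a", "2023-02"), ("a", "2023-03"),
    ("b", "2023-01"), ("b", "2023-03"),
    ("c", "2023-03"),
    ("d", "2023-02"),
    ("e", "2023-01"), ("e", "2023-02"),
]
print(bathtub(sample, "2023-03", "2023-02"))
```

A negative `net_change` is the warning sign described above. Running the same function for the equivalent month in the previous year gives the year-on-year comparison.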

The RFM report is used to segment customers based solely on their purchasing activity. It splits the customers into five equal-sized buckets (1 worst, 5 best) on each of three metrics:

  • Recency: How long has it been since this customer has visited?

  • Frequency: How often has this customer visited?

  • Monetary Value: How much has this customer spent?

This creates 125 different segments (5 x 5 x 5), which seems to be a lot. In reality, only a few of these segments are really interesting.
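As a rough sketch of how RFM scores can be assigned, the Python below ranks customers on each metric and cuts the ranking into five equal-sized buckets. The function names, the input layout (days since last visit, number of visits, total spend) and the sample figures are assumptions made for illustration. Ties are broken arbitrarily by sort order, which a production report would probably want to handle more carefully.

```python
def quintile_scores(values, higher_is_better=True):
    """Map {customer_id: value} to {customer_id: score}, 1 (worst) to 5 (best)."""
    # Order customers from worst to best on this metric.
    ordered = sorted(values, key=values.get, reverse=not higher_is_better)
    n = len(ordered)
    # Position i (0 = worst) falls into bucket i * 5 // n, giving scores 1..5.
    return {cust: (i * 5) // n + 1 for i, cust in enumerate(ordered)}


def rfm_segments(customers):
    """customers: {customer_id: (days_since_last_visit, visits, total_spend)}."""
    recency = quintile_scores(
        {c: v[0] for c, v in customers.items()}, higher_is_better=False
    )  # fewer days since the last visit is better
    frequency = quintile_scores({c: v[1] for c, v in customers.items()})
    monetary = quintile_scores({c: v[2] for c, v in customers.items()})
    return {c: (recency[c], frequency[c], monetary[c]) for c in customers}


# Hypothetical customers, for illustration only.
customers = {
    "c1": (2, 20, 500),
    "c2": (10, 15, 100),
    "c3": (30, 10, 300),
    "c4": (60, 5, 200),
    "c5": (200, 1, 50),
}
for customer_id, segment in rfm_segments(customers).items():
    print(customer_id, segment)
```

A customer scoring (5, 5, 5) visited recently, visits often and spends a lot, while (1, 1, 1) is at the opposite extreme. Mixed scores such as a high recency and frequency with a low monetary value are the kind of segment that may merit a targeted offer.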

Diagnostic Analytics

While many business intelligence systems can answer "what happened", they often have problems answering "why it happened". Often the data exists in the reports, just not in a format that is easily understandable. This often leads to "shadow reporting" systems, where managers and executives task Data Analysts with running queries or multiple reports and then combining them in Excel. This is obviously error-prone and resource-intensive, and it causes a delay in getting the information. Any manager can relate to the story of "that piece of information is not available because it needs Mary to run a report"! The two most important considerations here are Trend Analysis and Drill-down Reports.

Trend Analysis

The most common complaint we have heard about business intelligence reports is that they do not provide the answers needed. However, the problem is not that the information is missing but that it is presented in the wrong format. The two most common problems we have seen are:

  • A large amount of data is presented in tabular format.

  • The absence of trend analysis in reporting.

In both cases, data is reported in absolute terms without a baseline to compare against. People are terrible at interpreting data in this format. This often leads to a shadow reporting set-up, where a data analyst is tasked with running multiple reports over different timeframes and combining them in Excel for senior managers and executives. While effective, this is error-prone and a waste of resources.
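One simple way to give every figure a baseline is to pair it with the same period from the previous year and show the percentage change. The Python sketch below is illustrative only; the function name `with_baseline` and the sample monthly figures are assumptions, not taken from any particular reporting tool.

```python
def with_baseline(current, previous_year):
    """Pair each period's sales with the same period last year.

    Both arguments map a period label (e.g. "Jan") to a sales figure.
    Returns a list of (period, sales, baseline, pct_change) tuples;
    pct_change is None when there is no usable baseline.
    """
    rows = []
    for period, sales in current.items():
        baseline = previous_year.get(period)
        if baseline:  # skips missing (None) and zero baselines
            pct_change = (sales - baseline) * 100 / baseline
        else:
            pct_change = None
        rows.append((period, sales, baseline, pct_change))
    return rows


# Hypothetical monthly sales figures, for illustration only.
this_year = {"Jan": 120_000, "Feb": 90_000, "Mar": 105_000}
last_year = {"Jan": 100_000, "Feb": 100_000}

for period, sales, baseline, pct in with_baseline(this_year, last_year):
    trend = "n/a" if pct is None else f"{pct:+.1f}% YoY"
    print(f"{period}: {sales:>8,} ({trend})")
```

Showing the change next to the absolute figure lets a reader see at a glance whether a number is good or bad, without asking an analyst to combine two reports.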

Drill-down Analysis

One of the key frustrations in any reporting system is not being able to drill down further into the data. Drill-down is the ability to dig further into the data and perform "what-if" analysis. For example, total sales is simply the product of:

[Average Revenue Per User] x [# Users]

If sales are declining, is it because we have fewer customers or because they are spending less? If it is fewer customers, do we have a problem with some groups (e.g. by age, geographic location, etc.)? If they are spending less, is it across all departments or just some? Without drill-down, this causes the same frustrations as mentioned above in trend analysis: a manager sees a figure but is unable to go further. They then task a data analyst with investigating the problem, which causes a delay in getting the information.
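The first drill-down question (fewer customers, or lower spend per customer?) can be answered directly from the formula. The sketch below splits a change in sales into a customer effect and a spend effect. The function name and the figures are hypothetical, and the split shown is one common convention: the change in customer numbers is valued at the old spend level, and the change in spend is valued at the new customer count, so the two effects add up exactly to the total change.

```python
def decompose_sales_change(users_before, arpu_before, users_after, arpu_after):
    """Split a change in sales (ARPU x users) into two additive effects."""
    # Effect of gaining or losing customers, valued at the old spend level.
    customer_effect = arpu_before * (users_after - users_before)
    # Effect of customers spending more or less, valued at the new customer count.
    spend_effect = users_after * (arpu_after - arpu_before)
    total_change = users_after * arpu_after - users_before * arpu_before
    return {
        "customer_effect": customer_effect,
        "spend_effect": spend_effect,
        "total_change": total_change,
    }


# Hypothetical figures: customers fall from 1,000 to 900,
# while average revenue per user rises from 50 to 52.
result = decompose_sales_change(1000, 50, 900, 52)
print(result)
```

In this made-up case sales fall overall even though each remaining customer spends more, which points the investigation towards customer numbers rather than basket size.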

We have experience in providing custom "self-service" business intelligence applications to our customers. These allow non-technical users to investigate what is happening in the system. They reduce the reliance on technical report writers. For one client, we shortened the development cycle for reports from 3 weeks to 3 hours by training them to use self-service business intelligence applications!

Self-service business intelligence applications of this kind include Tableau, PowerPivot and PowerView.

Predictive Analytics

Once you have the solid foundations of descriptive analytics (what happened) and diagnostic analytics (why it happened), you will have seen evidence of trends and / or issues and will have a good idea of which metrics you want to improve. This is where you can move on to advanced techniques. These include member segmentation, churn analysis, custom offers and gamification.

Prescriptive Analytics

You have made it! You have a solid reporting foundation and the ability to perform advanced techniques. You can now move on to the most advanced stage - Prescriptive Analytics, i.e. making it happen. The question is what do you want "it" to be? It really can be anything you want! Remember above we gave the simple formula for sales:

Sales = [Average Revenue Per User] x [# Users]

In almost all cases, it is difficult to influence either [Average Revenue Per User] or [# Users] directly. But you now have the tools to investigate the components that make up these metrics and to take specific actions. For example, here are some scenarios, the key metrics that relate to those scenarios (from reports described previously) and recommended actions:

  • Problem: Sales are down because we are not signing up enough new customers (from BathTub Report).
    Action: Create new customer acquisition program.

  • Problem: Sales are down because customers are not returning (from BathTub Report / Churn Analysis).
    Action: Create new customer retention program.

  • Problem: Sales are up YoY (year on year), but one department is underperforming (from Drill-down analysis).
    Action: Investigate cause (e.g. is it across all stores / all customer demographics).

  • Problem: Sales are static YoY (year on year), but this hides large differences at store level. Some stores are up 20% YoY, while one store is down 50% YoY (from Drill-down analysis).
    Action: Investigate why stores are performing differently.

For each of the problems described above, we can investigate further and take even more refined actions:

  • Is the issue with customers specific to gender and / or age?
    Then target the acquisition / retention program to that demographic.

  • Is there a problem in the item mix (certain lines do not sell) in some, but not all stores?
    Then maybe alter the store layout or products.

Now that you have the ability to see trends and to investigate issues independently, you can take targeted action. However, there is one important consideration - measurement. As a great man once said, "with great power comes great responsibility". It is important that you can calculate the effect of any action you take. This leads us to the discussion on measurement.


Measurement

Measurement is important for the obvious reason that you need to know what worked! A less obvious reason is that without measurement, you get caught in an endless cycle of "paralysis by analysis". You see potential problems or areas that can be improved but you do not take action because you are worried that it will have no effect. Or you do not want to take action because, if revenue declines, the decline will be attributed to the action even if it was caused by an unrelated issue.

We have seen this attitude take hold at different times of the year due to seasonality. Managers are reluctant to try new initiatives coming up to peak periods (e.g. before Christmas), which is somewhat understandable. However, they also do not want to try new ideas during troughs (e.g. post Christmas) because they are worried that, since sales will decline naturally, the initiative will be seen as a waste of resources. This is often the wrong course of action: the right targeted message sent in a quiet period can have dramatic effects.

The bad news is that measuring the precise uplift in sales due to one specific action is difficult and may require statistical knowledge.

The good news is that if you do not need 100% precision, there are shortcuts you can take.

When we have worked with companies in the past, we have enabled them to set up an experiment framework that allowed them to test campaigns in a controlled manner before launching them to the entire customer base. This framework also simplified the maths needed and provided an early-warning system for deciding whether to continue or cancel the initiative. It shortened the life-cycle of running campaigns and therefore allowed more campaigns to be run concurrently.

This approach is less precise for each individual action, but allows you to fail fast and more often and generally leads to higher revenues over the long-term. If you are interested in getting more details, please get in touch with us.
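To make the idea of a controlled test concrete, here is a minimal Python sketch of one common shortcut: hold back a random control group that does not receive the campaign, then compare average spend between the two groups. This is not the framework referred to above; the function names, the 10% control fraction and the spend figures are assumptions for illustration, and the sketch deliberately ignores statistical significance.

```python
import random
from statistics import mean


def split_test_control(customer_ids, control_fraction=0.1, seed=42):
    """Randomly hold back a fraction of customers as a control group."""
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    shuffled = list(customer_ids)
    rng.shuffle(shuffled)
    cutoff = int(len(shuffled) * control_fraction)
    control, test = shuffled[:cutoff], shuffled[cutoff:]
    return test, control


def uplift(test_spend, control_spend):
    """Compare average spend per customer between the two groups."""
    test_avg = mean(test_spend)
    control_avg = mean(control_spend)
    difference = test_avg - control_avg
    return {
        "test_avg": test_avg,
        "control_avg": control_avg,
        "uplift_per_customer": difference,
        "uplift_pct": difference * 100 / control_avg,
    }


# Hypothetical campaign: 100 customers, 10% held back as control.
test_group, control_group = split_test_control(range(100))
print(len(test_group), "customers receive the offer;", len(control_group), "do not")

# Hypothetical spend per customer during the campaign period.
print(uplift(test_spend=[12, 10, 14], control_spend=[10, 10, 10]))
```

Because both groups experience the same seasonality, a natural post-Christmas decline affects them equally, so the difference between them is a reasonable first estimate of what the campaign itself contributed.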


In this paper, we have gone through the basics of creating an advanced analytical platform for a particular domain - retail loyalty. The bad news is that, in our experience, very few companies are getting the best value from such platforms. The good news is that the data from such platforms can be an ongoing competitive advantage to these companies if used correctly.

As always, if this paper was of interest to you and you want to know more, please contact us for an informal conversation.
