This post may be of interest to executives of e-commerce companies who are hearing increasing chatter about predictive analytics, but are far from convinced that it is mission critical to their business.
You may even have heard of disparate bits of work being done using predictive analytics within your organization. If they are typical of many predictive analytics projects, they were relatively small scale and showed a healthy - albeit small in absolute terms - ROI, but failed to inspire you or your colleagues to do cartwheels around the boardroom table, and certainly didn't earn anyone a promotion. In fact, you are probably still wondering what all the fuss was about. Certainly there is no compelling feeling that predictive analytics is anything other than a bit part in the great production that is your e-commerce enterprise.
I am hoping that by the time you've finished reading this you may view things a little differently.
The first part of the problem is coming up with a reasonable definition of what predictive analytics is. Let's look at what Wikipedia says on that:
Predictive analytics encompasses a variety of statistical techniques from modeling, machine learning, data mining and game theory that analyze current and historical facts to make predictions about future events.
In business, predictive models exploit patterns found in historical and transactional data to identify risks and opportunities. Models capture relationships among many factors to allow assessment of risk or potential associated with a particular set of conditions, guiding decision making for candidate transactions.
Predictive analytics is used in actuarial science, marketing, financial services, insurance, telecommunications, retail, travel, healthcare, pharmaceuticals and other fields.
Well. If we didn't forgive you for not being excited about it prior to reading the definition, we can certainly forgive you after. Suddenly the small print in your liability insurance policy is compelling reading.
The two words themselves are quite pleasant. Predictive analytics is one of those terms that rolls off the tongue quite easily and falls into that whole category of corporate-speak designed to impress, inasmuch as it conjures up associations such as progressive, scientific, interesting, positive, and for sure something any switched-on operator would need to know about. It also meets the important criterion of being sufficiently vague that it cannot in and of itself be descriptive of anything specific. It is a relatively amorphous term that can bend to fit many different situations. Unfortunately - and I have the scars to prove it - this same characteristic means that if you don't have the problem for predictive analytics to solve squarely in your mind, then you are not really going to understand what it is without some reasonable description, usually by analogy.
As an introduction to what it means to you as an e-commerce executive, let's consider one hugely impactful area it could be deployed in:
That would be the area of personalization.
And I am not talking about "Welcome Tom" type personalization. I am talking about the mother of all personalization - a site that is a chameleon, one that is tailored uniquely to every user's every demographic, interest, and aspiration. One that allows one million users to have one million unique experiences.
To frame this discussion, it is useful to look at a couple of concepts from traditional retail that may or may not have successfully translated through to e-commerce:
User Experience - If retail is about Location, Location, Location, e-commerce is about User Experience, User Experience, User Experience. There are many dimensions we can measure user experience on, and many of them at this stage of the web's evolution can be considered hygiene factors. The challenges of user experience increase as a site gets larger. Incredible selection, which used to be a drawcard, now becomes a double-edged sword, inasmuch as people are overwhelmed by the sheer quantity of items available and have to spend more time on the site to find the item they actually seek. As anyone in e-commerce will tell you, the more time it takes a user to find what they want, the fewer credit card transactions tend to get rung up.
Floor Space Optimization - Another tenet of retail is to use your real estate as efficiently as you possibly can. In bricks and mortar retail, over a hundred years of evolution (albeit with marked acceleration in the last twenty years) has gone into working out how to yield the maximum revenue out of every square foot of floor space, indeed every square inch of shelf space. In e-commerce your floor space is your screen real estate: what your customers see when they visit your site. So clearly we want the most efficient allocation of screen space possible.
Interestingly, then, we see that enhancing user experience and optimizing screen space intersect squarely at one place: personalization.
This is about the point it gets mildly interesting. In a retail store we have neither the luxury nor the ability to reconfigure the store for every single customer that walks in the door. However, on an e-commerce site we have no such constraints - one million customers can have one million unique versions of the site served up to them. With an e-commerce site we have a massive advantage over every bricks and mortar location on the planet, namely: we can literally be something different to every person. Our site can be pink to girls, it can be blue to boys, it can be green to people who are environmentally aware, it can be red to firemen, blue to policemen, it can have grease on it for mechanics.
While those examples are very interesting, they are decorative, and nothing compared to the opportunity to create the perception that our mega e-commerce store finds a way to look like a personally curated shop for each and every individual customer. A shop that somehow anticipates what you are most likely to buy next, a shop that somehow anticipates your preferences. A shop that is like your own guided personal shopping session through Harrods. Now that would indeed be a magical shop.
This is about the point that, in most executives' minds, theory and reality tend to begin diverging. All easier said than done, you say. How could we possibly be expected to understand what every customer wants next and what is best to offer them?
This is the stage, my learned executive, where you may want to start thinking about predictive analytics a little harder than you have to this point. What predictive analytics is to you as the executive of an e-commerce site is the enabler of major personalization, personalization on the scale that is seldom seen.
So how do we do this? You probably don't want to understand it down to its gory details - but I find that without some fundamental understanding of how it works, it is easy to dismiss the whole concept as science fiction and revert back to considering more real things. Which incidentally is another way predictive analytics often shoots itself in the foot - a massive aura of mysticism veiled over the process. If Winston Churchill described Russia as 'a riddle, wrapped in a mystery, inside an enigma' then I don't know what that makes predictive analytics to any sane person.
So I will give you a conceptual view of how we would go about achieving this massively personalized site. Imagine it like this: you and every other serious e-commerce company on earth have a massive customer database. It may be Oracle, Teradata, Microsoft SQL Server, or something else - it doesn't matter much what - the point is you have a central repository of all your customer data and everything else you have going on relative to your business. This data is typically used for backward looking analytics. That is to say that you use it as a tool to find out what is happening, and what has happened, in your business. 'Who are we selling the most power tools to?', 'What time of the day is busiest?' and a thousand other ad-hoc and scheduled questions that may be of interest, and in fact may be vital, in running your business.
What we want to do is make a transformational leap in what the data warehouse can do - we want to convert the data warehouse into a forward looking analytics tool. Besides having a data warehouse full of data telling us what the customer has done in the past, we are now going to have a data warehouse that tells us what our customers are likely to do in the future. What is the likelihood of them buying Product A, Product B, etc.? The measuring unit for this will be something called a propensity score. This is best thought of as a probability; it is not quite a true probability, but close enough for our purposes.
Our game plan will be to assign a propensity score to every activity we think may be useful to us. Propensity for Tom to buy Product A is 0.95, to buy Product B is 0.75, Product C 0.85, etc. If we have 100,000 SKUs, imagine this is repeated 100,000 times with 100,000 propensities (100,000 new entries attached to each individual customer record). Under the above scenario, if we had screen real estate sufficient to suggest one complementary item, we would insert Product A as it has the highest propensity score. If we had space for two, we would insert Product A and Product C. You get the idea.
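For the technically curious, here is a minimal sketch of that selection step in Python. It only illustrates the idea: the dictionary of scores uses the example figures for Tom, and the function name top_items is my own invention, not part of any particular product.

```python
import heapq

# Hypothetical propensity scores for one customer (the example figures for Tom).
tom_scores = {"Product A": 0.95, "Product B": 0.75, "Product C": 0.85}

def top_items(scores, slots):
    """Return the `slots` highest-scoring items, best first."""
    return heapq.nlargest(slots, scores, key=scores.get)

one_slot = top_items(tom_scores, 1)   # room for one suggestion
two_slots = top_items(tom_scores, 2)  # room for two suggestions
print(one_slot)
print(two_slots)
```

With room for one suggestion this picks Product A; with room for two it picks Product A and then Product C, exactly as described. With 100,000 SKUs the logic is the same, just with a much longer list of scores per customer.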
So that sounds just grand, but how are propensity scores with any credibility actually developed?
We are now descending into the realms of advanced mathematics and machine learning algorithms that risk making you abruptly stop reading this post. However, if we avoid that and focus at a conceptual level, it is actually not that difficult to understand.
The propensity scores are developed in an evidence based manner, based upon likeness to previous occurrences. If we have one million customers exhibiting all sorts of different behaviours, some buying this, some buying that, some of this age, some of that age, some clicking here, some clicking there, we actually typically find patterns and over-represented relationships. For example (and to use a far too simple example) we may find that people buying a new bike tend to buy bike pumps at a far higher rate than the population as a whole. We synthesize all this information over the behaviour of a million customers and millions of transactions and this is what enables the development of the propensity scores - which are really just proxies for the relative likelihood of a customer to take a certain action. If a customer purchases a bike, their score for propensity to buy a bike pump is likely much higher than if they didn't. Note we did not have to have any prior knowledge of this fact, rather it is purely evidence based, based upon patterns and occurrences found by analysis of the data by algorithms. In that case all I had to know was that the customer purchased a bike. However there are far more subtle and complex relationships than this that algorithms will automatically find.
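To make the bike pump example concrete, here is a deliberately simplified sketch in Python. It is not how a production system builds propensity scores (that is the job of the machine learning algorithms mentioned above); it simply compares how often bike buyers purchase a pump with how often customers in general do. The customer data and the purchase_rate function are made up for illustration.

```python
# Hypothetical purchase histories: customer id -> set of products bought.
baskets = {
    "c1": {"bike", "bike pump"},
    "c2": {"bike", "bike pump", "helmet"},
    "c3": {"bike"},
    "c4": {"helmet"},
    "c5": {"tent"},
    "c6": {"tent", "bike pump"},
    "c7": {"shoes"},
    "c8": {"shoes", "helmet"},
}

def purchase_rate(baskets, item, given=None):
    """Share of customers who bought `item`, optionally only among buyers of `given`."""
    group = [b for b in baskets.values() if given is None or given in b]
    if not group:
        return 0.0
    return sum(item in b for b in group) / len(group)

baseline = purchase_rate(baskets, "bike pump")                         # all customers
among_bike_buyers = purchase_rate(baskets, "bike pump", given="bike")  # bike buyers only
lift = among_bike_buyers / baseline  # above 1 means bike buyers are over-represented

print(f"baseline={baseline:.3f} bike buyers={among_bike_buyers:.3f} lift={lift:.2f}")
```

In this toy data, 3 of 8 customers bought a pump, but 2 of the 3 bike buyers did, so the pattern shows up without anyone having to tell the system that bikes and pumps go together. Real algorithms look for the same kind of evidence across many more signals at once.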
Once we have assigned these scores, we use them as building blocks and lay some rules over the top. These rules help us decide what to offer at the point where we meet the customer: on the site. For example, if the customer has already bought the item with the highest score, and it is not a repeatable purchase, we do not offer that same item, but the next item down.
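As a rough sketch of how such a rule might sit on top of the scores, consider the Python below. The recommend function, the purchased set and the repeatable set are all assumptions made for this illustration; a real rule layer would carry many more business rules than this single one.

```python
def recommend(scores, purchased, repeatable, slots):
    """Rank items by propensity, skipping already-bought items that are not repeatable."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    eligible = [
        item for item in ranked
        if item not in purchased or item in repeatable
    ]
    return eligible[:slots]

# Hypothetical inputs: Tom's scores, and the fact that he already owns Product A.
tom_scores = {"Product A": 0.95, "Product B": 0.75, "Product C": 0.85}
already_bought = {"Product A"}

# Product A is a one-off purchase, so the next item down is offered instead.
one_off = recommend(tom_scores, already_bought, repeatable=set(), slots=1)

# If Product A were a consumable, it would stay at the top of the list.
consumable = recommend(tom_scores, already_bought, repeatable={"Product A"}, slots=1)

print(one_off)
print(consumable)
```

Keeping the rules separate from the scores means the business can change what is eligible to be shown without having to rebuild the underlying models.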
And with that I bring you full circle - by putting all these concepts together and using them as the drivers of what is served up on our site, we have created exactly what we described: a highly personalized and uniquely relevant site.
In my next post I will discuss how you move from a standing start to actualizing this scenario - with a predictive analytics road map.