Tuesday, July 31, 2012

Putting Predictive Analytics to Work


Excellent article by Robert Mitchell at Computerworld titled 'Putting predictive analytics to work - Contrary to popular opinion, you don't need a huge budget to get started.'

I couldn't agree more. It begins:

The Orlando Magic's analytics team spent two years honing its skills on the business side.
"Eighteen to 20 months ago, we knew virtually nothing about predictive analytics," says Anthony Perez, director of business strategy for the National Basketball Association franchise. While his team was in fact working on predictive analytics well before that, Perez added, their tools weren't powerful enough to give them insights they needed, and the group needed to scale up its efforts.

You can read the entire article here.

Thursday, July 12, 2012

Banks know all about you - and forget

Article in the Sydney Morning Herald by Michael Pascoe. It discusses how much banks should know about us, yet how little many of them actually do know. Most of the people I know who work in insight in banks are as frustrated as Michael is: they know they could be doing a lot more interesting things that really help both their customers and their bank, but the infrastructure on which the bank rests just does not enable it, and the kind of spend it takes to remedy that has to be decided upon at the very highest levels.

The comments down the bottom are worth reading too, for balance.

The article begins:

It's labour market lotto again tomorrow with the nation's market economists making apparently random guesses about how many jobs have been created or lost.
The strange thing is, the big four banks' economists should know.
That they don't is symptomatic of our banking cartel's inability to use the information they possess.
The big four collectively should know just about everything there is to know about us as we can barely sneeze without making an entry in one of their vast databases. (And even when we sneeze, we're likely to use a tissue that was purchased with a piece of plastic processed by a bank.)

You can read the full article here.

Wednesday, June 27, 2012

On Orbitz, Mac Users Steered to Pricier Hotels

Predictive Analytics related article in the Wall Street Journal. The article begins:

Orbitz Worldwide has found that people who use Apple Inc.'s Mac computers spend as much as 30% more a night on hotels, so the online travel agency is starting to show them different, and sometimes costlier, travel options than Windows visitors see.

The Orbitz effort, which is in its early stages, demonstrates how tracking people's online activities can use even seemingly innocuous information—in this case, the fact that customers are visiting Orbitz.com from a Mac—to start predicting their tastes and spending habits.

It's a short but thought-provoking article. The headline would lead one to believe that Mac users are being exploited; in my view this is far from the case. Rather, Mac users are being steered to the hotels they are proven to be more likely to book; and if we know anything, it is that getting somebody to what they actually want to buy faster on an e-commerce site is a win for everyone - both the buyer and the seller.

It is worth thinking more about the fact that too often we work on the assumption that price is the most important factor to the buyer - when clearly it is only one consideration.

You can read the full article here.

Thursday, June 21, 2012

Predictive Analytics and Personalization at e-commerce Sites

This post may be of interest to executives of e-commerce companies who are hearing increasing chatter about predictive analytics, but are far from convinced that it is mission critical to their business.

You may even have heard of disparate bits of work being done using predictive analytics within your organization. If they are typical of many predictive analytics projects, they were relatively small scale, showed a healthy - albeit small in absolute terms - ROI, but failed to inspire you or your colleagues to do cartwheels around the boardroom table, and certainly didn't earn anyone a promotion. In fact you are probably still wondering what all the fuss was about. Certainly there is no compelling feeling that predictive analytics is anything other than a bit part in the great production that is your e-commerce enterprise.

I am hoping that by the time you've finished reading this you may view things a little differently.

The first part of the problem is coming up with a reasonable definition of what predictive analytics is. Let's look at what Wikipedia says on that:

Predictive analytics encompasses a variety of statistical techniques from modeling, machine learning, data mining and game theory that analyze current and historical facts to make predictions about future events.
In business, predictive models exploit patterns found in historical and transactional data to identify risks and opportunities. Models capture relationships among many factors to allow assessment of risk or potential associated with a particular set of conditions, guiding decision making for candidate transactions.
Predictive analytics is used in actuarial science, marketing, financial services, insurance, telecommunications, retail, travel, healthcare, pharmaceuticals and other fields.

Well. If we didn't forgive you for not being excited about it prior to reading the definition, we can certainly forgive you after...suddenly the small print in your liability insurance policy is compelling reading.

The two words themselves are quite pleasant. Predictive analytics is one of those terms that rolls off the tongue quite easily and falls into that whole category of corporate-speak designed to impress, inasmuch as it conjures up associations such as progressive, scientific, interesting, positive and for sure something any switched-on operator would need to know about. It also meets the important criterion of being sufficiently vague that it cannot in and of itself be descriptive of anything specific. It is a term that is relatively amorphous and can be bent to fit many different situations. Unfortunately - and I have the scars to prove it - this same characteristic means that if you don't have the problem for predictive analytics to solve squarely in your mind, then you are not really going to understand what it is without some reasonable description, usually by analogy.

As an introduction to what it means to you as an e-commerce executive, let's consider one hugely impactful area it could be deployed in:

That would be the area of personalization.

And I am not talking about "Welcome Tom" type personalization. I am talking about the mother of all personalization - a site that is a chameleon, one that is tailored uniquely to every user's every demographic, interest, and aspiration. One that allows one million users to have one million unique experiences.

To frame this discussion, it is useful to look at a couple of concepts from traditional retail that may or may not have successfully translated through to e-commerce:

User Experience - If retail is about Location, Location, Location - e-commerce is about User Experience, User Experience, User Experience. There are many dimensions we can measure user experience on, and many of them at this stage of the web's evolution can be considered hygiene factors. The challenges of user experience increase as a site gets larger. Incredible selection, which used to be a drawcard, now becomes a double-edged sword, inasmuch as people are overwhelmed by the sheer quantity of items available and have to spend more time on the site to find the item they actually seek. As anyone in e-commerce will tell you, the more time it takes a user to find what they want, the fewer credit card transactions tend to get rung up.

Floor Space Optimization - Another tenet of retail is to use your real estate as efficiently as you possibly can. In bricks and mortar retail over a hundred years of evolution (albeit with marked acceleration in the last twenty years) has gone into working out how to most efficiently yield the maximum revenue out of every square foot of floor space, indeed every square inch of shelf space. In e-commerce your floor space is your screen real estate: what your customers see when they visit your site. So clearly we want the most efficient allocation of screen space possible.

Interestingly, enhancing user experience and optimizing screen space intersect squarely at one place: personalization.


This is about the point it gets mildly interesting. In a retail store we do not have the luxury, nor ability, to reconfigure the store for every single customer that walks in the door. However on an e-commerce site we have no such constraints - one million customers can have one million unique versions of the site served up to them. With an e-commerce site we have a massive advantage over every bricks and mortar location on the planet, namely: we can literally be something different to every person: our site can be pink to girls, it can be blue to boys, it can be green to people who are environmentally aware, it can be red to firemen, blue to policemen, it can have grease on it for mechanics.

While those examples are very interesting, they are decorative, and nothing compared to the opportunity to create the perception that our mega e-commerce store finds a way to look like a personally curated shop for each and every individual customer. A shop that somehow anticipates what you are most likely to buy next, a shop that somehow anticipates your preferences. A shop that was like your own guided personal shopping session through Harrods. Now that would indeed be a magical shop.

This is about the point that in most executives' minds theory and reality tend to begin diverging. All easier said than done, you say. How could we possibly be expected to understand what every customer wants next and what is best to offer them? This is the stage, my learned executive, where you may want to start thinking about predictive analytics a little harder than you have to this point. What predictive analytics is to you as the executive of an e-commerce site is the enabler of major personalization, personalization on a scale that is seldom seen.

So how do we do this? You probably don't want to understand it at the level of its gory details - but I find that without some fundamental understanding of how it works, it is easy to dismiss the whole concept as science fiction and revert to considering more real things. Which incidentally is another way predictive analytics often shoots itself in the foot - a massive aura of mysticism veiled over the process. If Winston Churchill described Russia as 'a riddle, wrapped in a mystery, inside an enigma' then I don't know what that makes predictive analytics to any sane person.

So I will give you a conceptual view of how we would go about achieving this massively personalized site. Imagine it like this: you and every other serious e-commerce company on earth have a massive customer database. It may be Oracle, Teradata, Microsoft SQL Server, or something else - it doesn't matter much which - the point is you have a central repository of all your customer data and everything else you have going on relative to your business. This data is typically used for backward-looking analytics. That is to say, you use it as a tool to find out what is happening, and what has happened, in your business: 'Who are we selling the most power tools to?', 'What time of the day is busiest?' and a thousand other ad hoc and scheduled questions that may be of interest, and in fact may be vital, in running your business.

What we want to do is make a transformational leap in what the data warehouse can do - we want to convert the data warehouse into a forward-looking analytics tool. Besides having a data warehouse full of data telling us what the customer has done in the past, we are now going to have a data warehouse that tells us what our customers are likely to do in the future: what is the likelihood of them buying product A, product B, etc.? The measuring unit for this will be something called a propensity score. This is best thought of as a probability; it is not quite a true probability, but close enough for our purposes.

Our game plan will be to assign a propensity score to every activity we can possibly think may be useful to us. Propensity for Tom to buy Product A is 0.95, to buy Product B is 0.75, Product C 0.85, etc. If we have 100,000 SKUs imagine this is repeated 100,000 times with 100,000 propensities (100,000 new entries attached to each individual customer record). Under the above scenario if we had screen real estate sufficient to suggest one complementary item we would insert Product A as it has the highest propensity score. If we had space for two, we would insert Product A and Product C. You get the idea.
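For readers who like to see an idea in code, here is a minimal Python sketch of the slot-filling step just described. It assumes Tom's scores already sit in a simple dictionary; the function name top_recommendations is hypothetical and not part of any particular product.

```python
import heapq

# Hypothetical propensity scores for one customer (Tom), using the values from the example above.
tom_scores = {"Product A": 0.95, "Product B": 0.75, "Product C": 0.85}

def top_recommendations(scores, slots):
    """Return the `slots` products with the highest propensity scores, best first."""
    return heapq.nlargest(slots, scores, key=scores.get)

one_slot = top_recommendations(tom_scores, 1)
two_slots = top_recommendations(tom_scores, 2)
print(one_slot)   # ['Product A']
print(two_slots)  # ['Product A', 'Product C']
```

With 100,000 SKUs the same call would still return only the handful of items the screen has room for, which is the whole point of the exercise.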

So that sounds just grand, but how are propensity scores with any credibility actually developed?

We are now descending into the realms of advanced mathematics and machine learning algorithms that risk making you abruptly stop reading this post. However, if we avoid that and focus at a conceptual level, it is actually not that difficult to understand. 

The propensity scores are developed in an evidence based manner, based upon likeness to previous occurrences. If we have one million customers exhibiting all sorts of different behaviours, some buying this, some buying that, some of this age, some of that age, some clicking here, some clicking there, we actually typically find patterns and over-represented relationships. For example (and to use a far too simple example) we may find that people buying a new bike tend to buy bike pumps at a far higher rate than the population as a whole. We synthesize all this information over the behaviour of a million customers and millions of transactions and this is what enables the development of the propensity scores - which are really just proxies for the relative likelihood of a customer to take a certain action. If a customer purchases a bike, their score for propensity to buy a bike pump is likely much higher than if they didn't. Note we did not have to have any prior knowledge of this fact, rather it is purely evidence based, based upon patterns and occurrences found by analysis of the data by algorithms. In that case all I had to know was that the customer purchased a bike. However there are far more subtle and complex relationships than this that algorithms will automatically find.
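To make the bike and bike pump example concrete, here is a small Python sketch that measures the over-representation directly by counting. The purchase histories are invented for illustration, and simple counting like this is only a stand-in for the machine learning algorithms mentioned above, which would pick up far subtler relationships.

```python
# Hypothetical purchase histories: one set of products per customer.
baskets = [
    {"bike", "bike pump"},
    {"bike", "bike pump", "helmet"},
    {"bike"},
    {"helmet"},
    {"bike pump"},
    {"tent"},
    {"tent", "helmet"},
    {"stove"},
]

def purchase_rate(baskets, item, given=None):
    """Share of customers who bought `item`, optionally only among buyers of `given`."""
    group = [b for b in baskets if given is None or given in b]
    if not group:
        return 0.0
    return sum(item in b for b in group) / len(group)

# Rate across the whole population: 3 of 8 customers bought a pump.
overall = purchase_rate(baskets, "bike pump")
# Rate among bike buyers only: 2 of 3 bike buyers bought a pump.
among_bike_buyers = purchase_rate(baskets, "bike pump", given="bike")
# A ratio above 1 means pumps are over-represented among bike buyers.
lift = among_bike_buyers / overall
print(overall, among_bike_buyers, lift)
```

In this toy data bike buyers purchase pumps at well above the population rate, and nobody had to tell the code in advance that bikes and pumps go together.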

Once we have assigned these scores, we use them as building blocks and lay some rules over the top - these rules help us to decide what to offer up at the point of meeting the customer - on the site. For example if the customer has already bought something with the highest score, and it is not a repeatable purchase, we do not offer that same item, but the next item down.
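A rule of this kind can be sketched in a few lines of Python. The function name next_offers and the idea of keeping a set of repeatable products are assumptions made for this illustration; a real rule layer would carry many more conditions.

```python
def next_offers(scores, purchased, repeatable, slots):
    """Rank products by propensity, skipping items already bought unless they are repeatable."""
    eligible = [
        product for product in scores
        if product not in purchased or product in repeatable
    ]
    eligible.sort(key=scores.get, reverse=True)
    return eligible[:slots]

# Hypothetical scores for one customer who has already bought Product A.
scores = {"Product A": 0.95, "Product B": 0.75, "Product C": 0.85}
offers = next_offers(scores, purchased={"Product A"}, repeatable=set(), slots=1)
# Product A is skipped because it is not repeatable, so the next item down is offered.
print(offers)  # ['Product C']
```

If Product A were something the customer buys again and again, adding it to the repeatable set would put it back at the top of the list.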

And with that I bring you full circle - by putting all these concepts together and using them as the drivers of what is served up on our site, we have created exactly what we described: a highly personalized and uniquely relevant site.

In my next post I will discuss how you move from a standing start to actualizing this scenario - with a predictive analytics road map.



Wednesday, March 14, 2012

Big Data and the Stalker Economy

Interesting Forbes article, which should be of interest to people both inside and outside the industry. As it quotes: “if you’re not paying for the service, you’re the product.”

A worthwhile read. The article begins:

Crack is being served in Silicon Valley. An enthusiastic crowd of geeks and suits — all of them “data scientists” — just spent three days at the O’Reilly Strata conference (#strataconf) in Santa Clara. All over the event’s menu is the crack cocaine of our day: big data.

The full article can be found here.


Saturday, February 18, 2012

How Companies Learn Your Secrets

A fascinating and in-depth article in the New York Times - covering predictive analytics, habit formation, new product marketing psychology, ethical considerations and scaring customers. Definitely worth a read for anyone interested in understanding how sophisticated the analytics initiatives are at some of the world's largest retailers. Beyond that it goes into a very interesting discussion on the psychology of habit which would be broadly interesting to anyone.

It begins... Andrew Pole had just started working as a statistician for Target in 2002, when two colleagues from the marketing department stopped by his desk to ask an odd question: “If we wanted to figure out if a customer is pregnant, even if she didn’t want us to know, can you do that? ”       

The full article can be found here.


Friday, February 17, 2012

Predicting Bounce Rates in Sponsored Search Advertisements

Here is an interesting paper on using predictive analytics to predict the likelihood of someone bouncing from an online ad (specifically, Google AdWords). The practical definition of a bounce is someone clicking on your ad, coming to your site, and leaving pretty much immediately. That is to say, they took one look and were not impressed (generally because they found it irrelevant to what they were looking for or did not like the look of your site).

The researchers found that predictive analytics could be used to improve the ability to predict the likelihood of someone bouncing. To quote an excerpt from the conclusion: "We have also shown that even in the absence of substantial clickthrough data, bounce rate may be estimated through machine learning when applied to features extracted from sponsored search advertisements and their landing pages."

The full paper written by D. Sculley, Robert Malkin, Sugato Basu and Roberto Bayardo can be found here.

Friday, January 20, 2012

Requirements for Advanced Analytics

Very good article by James Taylor published at Information Management.

Gathering requirements for advanced analytics, data mining and predictive analytics is an interesting topic. It is interesting not because interviewing techniques are different for advanced analytics nor because the various techniques for helping technical teams elicit real requirements from businesspeople are different. The “how” of requirements is the same for advanced analytics as it is for anything else. It is interesting because what you need to ask about - the “what” – is unique. Full article here.
