Thursday, December 23, 2010

Predicting Patients at Risk of Being Hospitalized

Fantastic initiative launched by Heritage Provider Network, a US managed care organization. 

They believe very strongly that predictive analytics can help them significantly reduce the number of unnecessary (or preventable) hospitalizations.  They are backing this belief to the tune of $3 million in prize money, offered to anyone in the world who can find and exploit the most useful patterns in health records and claims data to help predict who is most at risk of hospitalization. They believe $30 billion per year is being spent in the US alone on unnecessary hospital admissions, so in many ways the more obvious question is: why hasn't anyone done this before? I imagine that we will begin to see a lot more of this sort of thing, following in the footsteps of last year's Netflix Prize, where Netflix offered $1 million to the team or individual who could most improve its ability to predict which movies someone was likely to enjoy (again based upon patterns in their data).
 



More than 71 Million individuals in the United States are admitted to hospitals each year, according to the latest survey from the American Hospital Association.  Studies have concluded that in 2006 well over $30 billion was spent on unnecessary hospital admissions.  How many of those hospital admissions could have been avoided if only we had real-time information as to which patients were at risk for future hospitalization?  This is more than just an academic question:  every unnecessary admission to the hospital places the patient at risk and uses scarce medical resources unwisely.


The Heritage Provider Network (HPN) launched the $3 million Heritage Health Prize with one goal in mind: to develop a breakthrough algorithm that uses available patient data, including health records and claims data, to predict and prevent unnecessary hospitalizations.  Heritage believes that incentivized competition – one that includes the involvement of those with passionate minds that don’t know what can’t be done – is the best way to achieve the radical breakthroughs and innovations necessary to reform our health care system.  Sponsoring this prize is simply one way that Heritage believes it can help solve a societal problem.


The winning Team will create a predictive algorithm that can identify patients who are at risk for hospital admissions.  Once known, health care providers can develop new care plans and strategies to reach patients before emergencies occur, thereby reducing the number of unnecessary hospitalizations.  This will result in increasing the health of patients while decreasing the cost of care.  In short, a winning solution will change health care delivery as we know it – from an emphasis on caring for the individual after they get sick to a true HEALTH care system.

Monday, December 13, 2010

Predictive Analytics to Detect Missed Charges in a Hospital System

Hospitals are complex environments with numerous charges for a multitude of small items, all administered by humans, so in the industry it is all but taken as a given that some charges will slip through the cracks.  Karen Minich-Pourshadi recently wrote a very interesting article about a hospital in Washington that picked up one million dollars of revenue in just the first 90 days after implementing a predictive analytics system to detect which hospital bills were most likely to have been undercharged (charge recovery). There were two benefits: the transactions were detected and corrected prior to billing, and the system highlighted the areas and doctors most at risk of under-charging so that they could focus on improving in the future.

“For instance, the system flagged specific diagnosis which usually have lab tests associated with them if the lab test codes were missing. In doing so, they were able to capture all the charges associated with a diagnosis and then alert clinicians to be aware of their mistakes.”

This is a great example of searching for patterns in the screeds of data residing in an organization to deliver massive value. It applies, of course, not only to healthcare but to any complex billing system that is handled by humans. Conversely, you can imagine that the same approach is very useful for those who are paying the hospital bills (i.e. insurance companies); those parties are of course more interested in identifying over-billing than under-billing.

At 11Ants Analytics we have recently done some interesting work in this space. The specific application is unfortunately confidential at the customer's request, but it involves combing through millions of transactions, looking for anomalies that yield opportunities for major savings.

The advantage of a predictive analytics approach, which learns from the patterns in the data, over a rules-based approach is that a rules-based approach requires knowing every potential problem area before starting, whereas a predictive analytics approach makes no assumptions going in, and the patterns are ‘learned’ whatever they may be. The other benefit is that deployment of the solution is usually much faster, both on the development side and on the implementation side, as it is used every month.
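To make the contrast with hand-written rules a little more concrete, here is a minimal Python sketch of the "learned" idea. It is not the hospital's system and not how 11Ants software works; it simply counts which charge codes usually accompany each diagnosis in past bills, then flags a new bill that lacks one of them. The diagnoses, charge codes, the `min_support` threshold and both function names are made up for illustration.

```python
from collections import Counter, defaultdict

def learn_expected_charges(bills, min_support=0.75):
    """For each diagnosis, collect the charge codes that appear on at least
    min_support of that diagnosis's historical bills."""
    totals = Counter()
    code_counts = defaultdict(Counter)
    for diagnosis, codes in bills:
        totals[diagnosis] += 1
        for code in set(codes):
            code_counts[diagnosis][code] += 1
    return {
        dx: {c for c, n in code_counts[dx].items() if n / totals[dx] >= min_support}
        for dx in totals
    }

def flag_missing(bill, expected):
    """Return the charge codes usually seen with this diagnosis but absent here."""
    diagnosis, codes = bill
    return expected.get(diagnosis, set()) - set(codes)

# Hypothetical historical bills: (diagnosis, charge codes on the bill).
history = [
    ("diabetes", ["office_visit", "hba1c_lab"]),
    ("diabetes", ["office_visit", "hba1c_lab"]),
    ("diabetes", ["office_visit", "hba1c_lab", "foot_exam"]),
    ("diabetes", ["office_visit", "hba1c_lab"]),
    ("diabetes", ["office_visit"]),
    ("sprain", ["office_visit", "xray"]),
    ("sprain", ["office_visit", "xray"]),
]

expected = learn_expected_charges(history)
new_bill = ("diabetes", ["office_visit"])
missing = flag_missing(new_bill, expected)
print(missing)
```

Nobody wrote a rule saying "diabetes bills need a lab charge"; the association comes out of the historical data, which is the point being made above. A real system would of course need far more data and a more careful model than simple counting.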

Sunday, November 7, 2010

Predictive Analytics in Education - Predicting University Drop Outs

Why does the world need this blog?
 
I can’t be sure that it does. However, it is fitting that in this first post I describe the encounter which led me to think that it may have at least something to offer.
 
Recently I was sitting in the lounge at LAX, after attending Predictive Analytics World and some meetings in the US. I struck up a conversation with a nice couple sitting across from me. They were both specialists in education. Specialists would actually be an understatement: they were both academics with several degrees each, and well respected in their fields. We got onto the topic of some research one of them was interested in doing.

The research was to try to get further insight into what causes a student to abandon University study (otherwise known as dropping out...). The overall objective was to predict those who are at risk of dropping out at an early enough stage to intervene, and further to try to identify the factors most likely to cause the abandonment. This has broad implications both for the effectiveness of the education delivered and for the financial results of the Universities themselves. I asked him what data he had, and he said he had several million records from a number of Universities, which included full data about students, courses studied, etc., and whether they had dropped out or not. I asked him a little about the data, and then said to him that this should actually be quite easy.
 
At which point he looked at me like I had three heads!

Needless to say, I couldn’t help myself: I pulled out my laptop and showed him how software like 11Ants Model Builder could actually make a job like this quite trivial. It was something like watching someone have a religious experience - I could actually see excitement and enthusiasm cross his face as it dawned on him that he would be able to perform this data analysis himself, and without it taking months.

(Just briefly, to ensure that I deliver on the promise of the title of the blog: in simple terms, what you do is consolidate all the historical data by student. If you picture it in Excel format, each row would be an individual student and each column would be a data point about the student, which could include the specific classes they have taken, previous education, etc., even the brand of cell phone they carry (if you suspected this could be predictive). We refer to these as ‘input columns’. The final column would be what we refer to as a ‘target column’: it takes one of two values, “Abandoned” or “Completed”, and you tag every student with one or the other depending upon whether they abandoned or completed. Then you would use a tool like 11Ants Model Builder to analyze the data for patterns, trying to find a relationship between the input columns and the target column. If the patterns were sufficiently strong, the result would be a Predictive Model which could then be applied to unseen students to predict whether each falls into the category of Abandon or Complete. You can actually go further and build a propensity model, which ranks every student from most likely to abandon to least likely to abandon. This means that with limited resources you can just work your way down the list with intervention programs, knowing that your resources are focused on those most at risk. You can also get some concrete sense of which of the inputs are the most useful predictors of a student dropping out. This all sounds rather complex, but the reality is that it is not. If you want to see how something like this works, there are some quite good short videos with other data at www.11AntsAnalytics.com, or feel free to email me.)
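For readers who like to see the idea in code, here is a rough sketch of the input columns / target column / propensity ranking workflow described above. It uses a plain logistic regression written in standard-library Python, which is a stand-in I have chosen for illustration and not how 11Ants Model Builder works internally. The two input columns (`failed_courses`, `attendance_rate`), all the student rows, and the function names are invented.

```python
import math

# Hypothetical historical data, one row per student:
# input columns (failed_courses, attendance_rate) and the target column.
history = [
    ((0, 0.95), "Completed"),
    ((0, 0.90), "Completed"),
    ((1, 0.85), "Completed"),
    ((1, 0.80), "Completed"),
    ((0, 0.70), "Completed"),
    ((3, 0.50), "Abandoned"),
    ((2, 0.60), "Abandoned"),
    ((4, 0.40), "Abandoned"),
    ((3, 0.55), "Abandoned"),
    ((2, 0.45), "Abandoned"),
]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train(rows, epochs=2000, lr=0.1):
    """Fit a logistic regression by batch gradient descent."""
    n_inputs = len(rows[0][0])
    weights = [0.0] * n_inputs
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_inputs
        grad_b = 0.0
        for inputs, target in rows:
            y = 1.0 if target == "Abandoned" else 0.0
            p = sigmoid(bias + sum(w * x for w, x in zip(weights, inputs)))
            for i, x in enumerate(inputs):
                grad_w[i] += (p - y) * x
            grad_b += p - y
        weights = [w - lr * g / len(rows) for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / len(rows)
    return weights, bias

def propensity(inputs, weights, bias):
    """Estimated probability that a student with these inputs abandons."""
    return sigmoid(bias + sum(w * x for w, x in zip(weights, inputs)))

weights, bias = train(history)

# Unseen students (also hypothetical), ranked from most to least at risk.
unseen = {
    "student_A": (4, 0.30),
    "student_B": (0, 0.95),
    "student_C": (2, 0.65),
}
ranked = sorted(
    ((name, propensity(x, weights, bias)) for name, x in unseen.items()),
    key=lambda pair: pair[1],
    reverse=True,
)
for name, p in ranked:
    print(f"{name}: {p:.2f}")
```

The sorted list is the propensity ranking: an intervention program would start at the top and work down. The learned weights give only a rough hint of which inputs matter, and only if the inputs are on comparable scales; ten rows is obviously far too few for anything but a demonstration.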

So returning to the airport lounge party... suddenly there was a new linkage between two (until that point disjointed) areas of specialty – namely (1) his long-time interest and understanding of educational factors and data and now (2) the area of predictive analytics.

 His new-found understanding of what was possible (and more importantly accessible to him) with predictive analytics would be an example of where 2+2 equals significantly more than four.

So rather than living with the knowledge that students were dropping out of college because I wasn’t spending enough time in airport lounges, I thought it would be good to create a forum where people from business, science and government could learn more about predictive analytics and its applications.
 
It is my intention to continue posting examples of real world applications for predictive analytics.

You will probably find me referring to our software, 11Ants Model Builder, now and then, but that is only because I genuinely believe it is the easiest way on the planet for anyone to begin to understand and harness the power of predictive analytics - the software is used by everyone from complete beginners through to PhDs in data mining. However, first and foremost this is a blog about educating people as to real world applications for predictive analytics - I definitely welcome any questions, suggestions or requests.