Why does the world need this blog?
I can’t be sure that it does. However, it seems fitting that in this first post I describe the encounter that led me to think it may have at least something to offer.
Recently I was sitting in the lounge at LAX, after attending Predictive Analytics World and some meetings in the US. I struck up a conversation with a nice couple sitting across from me. They were both specialists in education. Specialists would actually be an understatement: they were both academics with several degrees each, and well respected in their fields. We got onto the topic of some research one of them was interested in doing.
The research aimed to gain further insight into what causes a student to abandon University study (otherwise known as dropping out). The overall objective was to predict which students are at risk of dropping out at an early enough stage to intervene, and further to identify the factors most likely to cause the abandonment. This has broad implications both for the effectiveness of the education delivered and for the financial results of the Universities themselves. I asked him what data he had, and he said he had several million records from a number of Universities, which included full data about students, courses studied, etc., and whether they had dropped out or not. I asked him a little about the data, and then told him that this should actually be quite easy.
At which point he looked at me like I had three heads!
Needless to say, I couldn’t help myself: I pulled out my laptop and showed him how software like 11Ants Model Builder could make a job like this quite trivial. It was something like watching someone have a religious experience. I could actually see excitement and enthusiasm cross his face as it dawned on him that he would be able to perform this data analysis himself, and without it taking months.
(Just briefly, to ensure that I deliver on the promise of the title of the blog: in simple terms, what you do is consolidate all the historical data by student. If you consider it in Excel format, each row would be an individual student, and each column would be a data point about the student, which could include the specific classes they have taken, previous education, etc., even the brand of cell phone they carry (if you suspected it could have some bearing on the outcome) – we refer to these as ‘input columns’. The final column would be what we refer to as a ‘target column’, holding one of two values, “Abandoned” or “Completed”, with which you would tag every student depending upon whether they abandoned or completed. Then you would use a tool like 11Ants Model Builder to analyze the data for patterns, trying to establish a relationship between the input columns and the target column. If the patterns were sufficiently strong, the result would be a Predictive Model which could then be applied to unseen students, and a prediction made as to whether each falls into the category of “Abandoned” or “Completed”. You can actually go further and build a propensity model, which would rank every student from most likely to abandon to least likely to abandon. This means that with limited resources you can just work your way down the list with intervention programs, knowing that your resources are focused on those most at risk. You can also get some concrete sense of which of the inputs are the most useful predictors of a student dropping out. This all sounds rather complex, but the reality is that it is not – if you want to see how something like this works, there are some quite good short videos with other data at www.11AntsAnalytics.com, or feel free to email me.)
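For readers who like to see the idea in code, here is a minimal sketch of the steps just described, written in Python using only the standard library. It is not how 11Ants Model Builder works internally; I have swapped in a plain logistic regression trained by stochastic gradient descent purely for illustration. The input column names, the eight toy student rows, and the three unseen students are all invented assumptions, not real university data.

```python
import math

# Hypothetical toy table: one row per student. Column names and values are
# invented for illustration only; they are not real university records.
INPUT_COLUMNS = ["failed_courses", "attendance_rate", "works_full_time"]
students = [
    # (failed_courses, attendance_rate, works_full_time, target column)
    (0, 0.95, 0, "Completed"),
    (1, 0.90, 0, "Completed"),
    (0, 0.85, 1, "Completed"),
    (1, 0.80, 0, "Completed"),
    (3, 0.50, 1, "Abandoned"),
    (2, 0.60, 1, "Abandoned"),
    (4, 0.40, 0, "Abandoned"),
    (2, 0.55, 1, "Abandoned"),
]


def sigmoid(z):
    """Squash a score into a value between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def train(rows, epochs=2000, lr=0.1):
    """Fit a logistic regression with per-row (stochastic) gradient steps."""
    weights = [0.0] * (len(rows[0]) - 1)
    bias = 0.0
    for _ in range(epochs):
        for *inputs, target in rows:
            y = 1.0 if target == "Abandoned" else 0.0
            p = sigmoid(bias + sum(w * x for w, x in zip(weights, inputs)))
            error = p - y  # gradient of the log loss with respect to the score
            bias -= lr * error
            weights = [w - lr * error * x for w, x in zip(weights, inputs)]
    return weights, bias


def propensity(model, inputs):
    """Estimated probability that a student with these inputs abandons."""
    weights, bias = model
    return sigmoid(bias + sum(w * x for w, x in zip(weights, inputs)))


model = train(students)

# Unseen students (again hypothetical): input columns only, no target yet.
unseen = {
    "student_A": (0, 0.92, 0),
    "student_B": (3, 0.45, 1),
    "student_C": (1, 0.70, 1),
}

# Propensity ranking: most likely to abandon first.
ranked = sorted(unseen, key=lambda name: propensity(model, unseen[name]),
                reverse=True)
for name in ranked:
    label = "Abandoned" if propensity(model, unseen[name]) > 0.5 else "Completed"
    print(name, round(propensity(model, unseen[name]), 2), label)
```

The printed list is the propensity ranking: you would start your intervention program at the top and work down as far as resources allow. With several million real records you would of course hold back some students to check how well the predictions stand up before trusting them.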
So, returning to the airport lounge party... suddenly there was a new link between two (until that point disjoint) areas of specialty: namely (1) his long-time interest in and understanding of educational factors and data, and now (2) the area of predictive analytics.
His new-found understanding of what was possible (and more importantly accessible to him) with predictive analytics would be an example of where 2+2 equals significantly more than four.
So rather than living with the knowledge that students were dropping out of college because I wasn’t spending enough time in airport lounges, I thought it would be good to create a forum where people from business, science and government could learn more about predictive analytics and its applications.
It is my intention to continue posting examples of real world applications for predictive analytics.