Tuesday, February 1, 2011

Predictive Analytics for Border Protection

Imagine that you’ve been given responsibility for protecting a nation’s borders. Your job may be any or all of the following: stop bad people from entering, stop illegal substances from entering, stop weapons from entering, or ensure that importers pay their fair share of customs duties.

Whatever the objective, you will face a similar problem: how to block the most bad guys with minimum disruption to the good guys. Minimizing disruption to the good guys is important, both for goodwill and for cost: no government in the world has unlimited resources. Blocking the bad guys is obviously also very important, and the two goals are linked: the more time you waste on good guys, the less likely you are to catch an extra bad guy.

Predictive analytics holds massive potential to help customs services, border protection agencies, and homeland security agencies sort more efficiently through the ever-increasing amount of noise thrown at them and isolate those individuals and shipments that should be given special attention.

Let’s look at an example: determining which incoming shipments are at risk of having under-declared customs duties. You can equally apply this to any other type of inspection task.

How is this done now? Depending upon where in the world you are, the answer is largely a combination of ‘gut feel’, rule-based instructions, and/or random checking. As staff gain experience, they develop an implicit understanding of which shipments are of greater risk. However, this experience is not evenly distributed: an inspector with 20 years of experience is likely to have a better ‘gut feel’ than one in her second week on the job.

Fortunately, what many customs services around the world do have is a massive database recording every shipment that has ever entered their jurisdiction, along with every known case of under-declared customs duties. Accordingly, they possess a very valuable (yet under-utilized) asset: 1) lots of examples of ‘events’ (in this case, under-declared customs duties) and 2) lots of potential inputs (or predictors) to help predict that event. The potential inputs are all the other information we have about the shipment; the bill of lading provides a plethora of such information (e.g. country of origin, ship name, port of loading, etc.).

Though not many people automatically think of it like this, if correctly analyzed, this database holds the cumulative knowledge of every single customs officer who has ever found an under-declared shipment in the history of data gathering within that organization. The problem is that this volume of data is impossible for a human to synthesize, and therefore to exploit. That is why not many people think of it in this way: the capability to analyze it is relatively new.

So imagine if we could take all this historical data, throw some sophisticated predictive machine learning algorithms at it, and start analyzing that data for patterns which can help predict which shipments are likely to be under-declared. The emergent patterns are described in a 'predictive model'.
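To make the idea of learning a 'predictive model' from historical data concrete, here is a minimal sketch in plain Python. It is not the algorithm used in any real system: it swaps in a simple naive Bayes-style model as a stand-in for the more sophisticated algorithms mentioned above, and the field names (`origin`, `port`), the function names, and the handful of historical records are all invented for illustration.

```python
import math
from collections import defaultdict


def train(history, fields):
    """Fit a naive Bayes-style model on categorical shipment fields.

    history: list of (shipment_dict, was_under_declared) pairs.
    Returns the prior log-odds and, per field and value, a log-likelihood ratio.
    """
    n_pos = sum(1 for _, label in history if label)
    n_neg = len(history) - n_pos
    # Laplace-smoothed prior log-odds of under-declaration.
    prior = math.log((n_pos + 1) / (n_neg + 1))

    # counts[field][value] = [times seen when compliant, times seen when under-declared]
    counts = {f: defaultdict(lambda: [0, 0]) for f in fields}
    for shipment, label in history:
        for f in fields:
            counts[f][shipment[f]][1 if label else 0] += 1

    weights = {}
    for f in fields:
        k = len(counts[f])  # number of distinct values seen for this field
        weights[f] = {
            value: math.log(((pos + 1) / (n_pos + k)) / ((neg + 1) / (n_neg + k)))
            for value, (neg, pos) in counts[f].items()
        }
    return prior, weights


def score(model, shipment):
    """Return the estimated probability that a shipment is under-declared."""
    prior, weights = model
    # Values never seen in the historical data contribute no evidence (weight 0).
    log_odds = prior + sum(weights[f].get(shipment[f], 0.0) for f in weights)
    return 1.0 / (1.0 + math.exp(-log_odds))


# Hypothetical history: (bill-of-lading fields, was the shipment under-declared?)
history = [
    ({"origin": "A", "port": "X"}, True),
    ({"origin": "A", "port": "Y"}, True),
    ({"origin": "B", "port": "X"}, False),
    ({"origin": "B", "port": "Y"}, False),
    ({"origin": "A", "port": "X"}, False),
    ({"origin": "B", "port": "X"}, False),
]
model = train(history, ["origin", "port"])
risk_a = score(model, {"origin": "A", "port": "X"})
risk_b = score(model, {"origin": "B", "port": "X"})
```

In this toy history, shipments from origin "A" were under-declared more often than those from origin "B", so `risk_a` comes out higher than `risk_b`. A real model would draw on far more fields and far more records, but the principle is the same.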

The next step is to run our database of today's incoming shipments against this model. The model will apply a score to every shipment. The score is a measure of risk: the higher the score, the more statistically likely the shipment is to have under-declared its customs duties. The output is a list sorted by score, ranking the incoming shipments from most at risk to least at risk. Our inspection officers start at the top of the list (not the bottom, or the middle) and work their way down. We can even statistically generate the optimal point in the list to stop.
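The ranking and the stopping point can be sketched in a few lines of Python. This is only one possible way to pick a stopping point, and it rests on assumptions that are mine rather than part of any real system: the scores are treated as probabilities, every successful inspection is assumed to recover the same amount of duty (`expected_recovery`), and every inspection is assumed to cost the same (`inspection_cost`). The shipment IDs and numbers are invented.

```python
def rank_shipments(scored):
    """Sort (shipment_id, risk_score) pairs from highest to lowest risk."""
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


def stopping_point(ranked, expected_recovery, inspection_cost):
    """Count how many shipments, taken from the top of the list, are worth inspecting.

    An inspection is treated as worthwhile while
    risk_score * expected_recovery exceeds inspection_cost.
    """
    count = 0
    for _, risk in ranked:
        if risk * expected_recovery <= inspection_cost:
            break  # the list is sorted, so every later shipment is worth even less
        count += 1
    return count


# Hypothetical scores for today's incoming shipments.
todays_scores = [("S1", 0.05), ("S2", 0.62), ("S3", 0.31), ("S4", 0.11)]
ranked = rank_shipments(todays_scores)
cutoff = stopping_point(ranked, expected_recovery=5000.0, inspection_cost=800.0)
to_inspect = [shipment_id for shipment_id, _ in ranked[:cutoff]]
```

With these made-up figures an inspection pays for itself only when the risk score is above 800 / 5000 = 0.16, so the officers would inspect S2 and S3 and then stop.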

This is not science fiction, but quite achievable. In some work I was involved with along these lines, at 11Ants Analytics, a computer-picked candidate for inspection was three times more likely to require inspection than a randomly selected one. The ultimate implementation of such a system would continue learning all the time, and continue having inspectors' knowledge fed into it.
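A figure like "three times more likely" is commonly called lift: the hit rate among the model's top picks divided by the hit rate across all shipments, which is what random selection would achieve on average. The sketch below shows the arithmetic only. The outcome list is invented to illustrate the calculation and is not data from the work described above; `lift_at_k` is a name I have made up.

```python
def lift_at_k(ranked_outcomes, k):
    """Hit rate among the top k picks divided by the overall hit rate.

    ranked_outcomes: 1 if inspection found a problem, else 0, ordered from
    the highest-scored shipment to the lowest.
    """
    total_hits = sum(ranked_outcomes)
    if total_hits == 0 or k <= 0:
        raise ValueError("need at least one hit and a positive k")
    top_hits = sum(ranked_outcomes[:k])
    # (top_hits / k) / (total_hits / n), rearranged to avoid rounding noise.
    return (top_hits * len(ranked_outcomes)) / (k * total_hits)


# Invented inspection outcomes for 20 shipments, ordered by model score.
outcomes = [1, 1, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
lift = lift_at_k(outcomes, k=5)
```

Here 3 of the top 5 picks are hits (60%) against 4 of 20 overall (20%), so `lift` is 3.0.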

We can equally apply this principle to profiling of individuals, dangerous shipments, or many other things. The opportunities are massive. All we require is: 1) an understanding of what would be useful for us to predict and 2) ancillary data relating to the examples, which we can interrogate for some form of correlation.

If you are in a customs agency anywhere in the world and interested in learning more about this type of application, please don’t hesitate to drop me an email.
