
Fraud Analytics That Adapt on the Fly

A major implication of Big Data, as I discussed in my last post, is that analytics must rely less on persistent (historical) data and instead adjust dynamically in the stream. This is particularly true for fraud analytics, given the ever-changing nature of fraud.

Here's an example. Development of the traditional neural network fraud model requires access to high-quality historical data. This presents a challenge in emerging markets, which tend to have volatile market dynamics, data availability restrictions, fraud tag reporting issues and constantly changing customer behaviors.

To counter this, we recently extended our patented self-calibrating fraud analytics technology into what we call Multi-Layered Self-Calibrating (MLSC) analytics. This innovation was specifically designed to further improve detection accuracy in the stream. MLSC is also a robust, adaptive modeling technology for changing data environments.

The diagram below illustrates our standard self-calibrating outlier technology. First, we determine normal behavior patterns by computing the distribution of each variable continuously, in real time. From there, we define the outliers to these normal patterns: we quantify the point(s) in each variable's distribution beyond which a value is considered an outlier. A formula then quantifies the size of the outlier, that is, how far the variable value lies along the red tail in the diagram.


[Diagram: MLSC-Analytics-Blog-450-px]
These self-calibrating models are designed to learn outlier values in real time while running in production. This is a huge advantage, allowing fraud detection to adapt on the fly as new fraud patterns emerge.
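To make the idea concrete, here is a minimal Python sketch of a self-calibrating outlier value for a single variable. It is an illustration under stated assumptions, not FICO's implementation: the class name `SelfCalibratingVariable`, the sliding window, the 95th/99th percentile choice, the cap of 6 and the warm-up length are all hypothetical. The outlier threshold is the lower of the two percentiles of recent values, and the size of the outlier is the excess over that threshold, scaled by the spread between the two percentiles and clipped to a fixed range.

```python
from collections import deque


class SelfCalibratingVariable:
    """Scores how far a new value sits in the upper tail of recent values.

    All parameters are illustrative assumptions, not values from the post.
    """

    def __init__(self, window=1000, low_p=0.95, high_p=0.99, cap=6.0, warmup=30):
        self.recent = deque(maxlen=window)  # oldest values drop out automatically
        self.low_p, self.high_p = low_p, high_p
        self.cap, self.warmup = cap, warmup

    def score(self, x):
        """Return an outlier value in [0, cap], then add x to the window."""
        ordered = sorted(self.recent)       # distribution before seeing x
        self.recent.append(x)
        n = len(ordered)
        if n < self.warmup:                 # too little history to judge
            return 0.0
        low = ordered[int(self.low_p * (n - 1))]    # outlier threshold
        high = ordered[int(self.high_p * (n - 1))]  # point further along the tail
        spread = max(high - low, 1e-9)      # guard against a degenerate tail
        # Zero below the threshold; grows along the tail; capped at the top.
        return min(max((x - low) / spread, 0.0), self.cap)
```

Because the percentiles are recomputed from recent values, a value that scores as an extreme outlier today can score as normal once the population has shifted, with no retraining step. Re-sorting a window on every transaction is only for readability here; a production system would more plausibly use a constant-memory streaming quantile estimate.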

The model architecture of FICO’s new streaming Multi-Layered Self-Calibrating analytics takes a best-of-both-worlds approach, merging the properties of neural network models with those of traditional self-calibrating models. Like a neural network, the MLSC model is built to detect nonlinear relationships between data variables, which is critical in fraud detection since fraud patterns are often not linear. Like a self-calibrating model, it continuously adjusts in real time based on the transaction stream.
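One way to picture this layering is the Python sketch below. It is a guess at the general shape, not the actual MLSC architecture, which this post does not spell out. The names `Unit` and `MultiLayerSelfCalibrating`, the variable groupings, the window size and the percentiles are all hypothetical. Each input variable is rescaled by its own self-calibrating unit, each hidden node sums the scaled values of a group of variables, and each hidden node's output is then self-calibrated again before the final sum. The clipping at zero and at the cap, applied at both layers, is what makes the final score a nonlinear function of the inputs.

```python
from collections import deque


class Unit:
    """Self-calibrating scaler: maps a raw value into [0, cap] using the
    95th/99th percentiles of a sliding window (an assumed, simplified estimator)."""

    def __init__(self, window=500, cap=6.0, warmup=30):
        self.recent, self.cap, self.warmup = deque(maxlen=window), cap, warmup

    def __call__(self, x):
        ordered = sorted(self.recent)
        self.recent.append(x)
        n = len(ordered)
        if n < self.warmup:                 # not enough history yet
            return 0.0
        low = ordered[int(0.95 * (n - 1))]
        high = ordered[int(0.99 * (n - 1))]
        return min(max((x - low) / max(high - low, 1e-9), 0.0), self.cap)


class MultiLayerSelfCalibrating:
    """Two layers of self-calibrating units (hypothetical arrangement)."""

    def __init__(self, groups):
        # groups: hidden-node name -> names of the variables it combines
        self.groups = groups
        names = {name for members in groups.values() for name in members}
        self.inputs = {name: Unit() for name in names}
        self.hidden = {node: Unit() for node in groups}

    def score(self, txn):
        # Layer 1: rescale every raw variable against its own recent history.
        scaled = {name: unit(txn[name]) for name, unit in self.inputs.items()}
        # Layer 2: each hidden node combines its group, then recalibrates.
        total = 0.0
        for node, members in self.groups.items():
            combined = sum(scaled[m] for m in members)
            total += self.hidden[node](combined)
        return total
```

A model would be built with something like `MultiLayerSelfCalibrating({"velocity": ["txn_count_1h", "amount"], "geo": ["distance_km", "amount"]})` and fed one transaction dictionary at a time. Those variable names are invented for illustration.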

Compared to a traditional neural net, the Multi-Layered Self-Calibrating model:

  • Is built to adapt in production, unlike neural nets where weights are fixed after initial training. 
  • Can be easily tuned to the needs of a specific market and is more robust to model degradation. This is due to the more flexible, adaptable design of its hidden layer nodes (each a self-calibrating mini-model).
  • Doesn't need tuning as frequently because of its adaptive nature.
  • Requires much less data during model development and has more tolerance for low-quality data.

This last bullet point is an important one. It means we can tune an MLSC model using roughly a week to a month of data, compared to about 18 months of historical data for a neural network model. Indeed, the MLSC models have demonstrated strong performance for clients without large amounts of data and with lower-quality fraud reporting. This makes MLSC the ideal fraud technology for emerging markets where there isn’t sufficient data to leverage neural networks.


Comments

Highstone Tower

Scott,

I can see your point. But (and this is a big but) do not underestimate the importance of human judgement. You could easily damage model transparency by running these self-calibrating models.

I discuss this topic here

http://www.highstonetower.com/?p=434

Scott Zoldi

Highstone Tower:
Thank you for your comment. I’d like to clarify that technologies like Self-Calibrating analytics do not change the variables, but rather the scaling of the variables, based on more accurate real-time population statistics rather than dated historical ones. This allows the model to continue to produce the expected results rather than prematurely entering a degradation cycle when the population's variable scalings move about. It is true that transparency is a key consideration for many types of models, for instance with credit risk models, in part to be able to explain decisions to banking regulators and customers. Of course, this is less of a concern for fraud models, and indeed the unique and dynamic nature of fraud demands a distinctive modeling approach in order to ensure fast detection.
