Fraud detection, a common use of AI, belongs to a more general class of problems — anomaly detection.

An anomaly is a generic concept, not a domain-specific one. It refers to any exceptional or unexpected event in the data: a mechanical part failure, an arrhythmic heartbeat, or a fraudulent transaction.

Identifying fraud therefore means identifying an anomaly within a set of otherwise legitimate transactions. As with all anomalies, you can never be sure what form a fraudulent transaction will take, so you need to account for all possible “unknown” forms.

Here’s an interesting article on doing anomaly/fraud detection with a neural autoencoder.

Using a training set of only legitimate transactions, we teach a machine learning algorithm to reproduce the feature vector of each transaction. Then we compare each reproduction with its original. If the distance between the original transaction and the reproduced transaction is below a given threshold, the transaction is considered legitimate; otherwise it is considered a fraud candidate (generative approach). This approach needs only a training set of “normal” transactions, and an anomaly is suspected from the distance value alone.
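The distance check can be sketched in a few lines of Python. To stay dependency-free, this sketch swaps the neural autoencoder for a much simpler linear stand-in (the feature means plus the leading principal direction, found by power iteration). The `(amount, fee)` features, the synthetic data, and the mean-plus-three-standard-deviations threshold are all assumptions made for illustration and do not come from the article.

```python
import math
import random


def fit_reconstructor(rows, iters=50):
    """Fit the stand-in model: feature means plus the leading principal direction."""
    n, d = len(rows), len(rows[0])
    mean = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - mean[j] for j in range(d)] for r in rows]
    w = [1.0 / math.sqrt(d)] * d  # initial guess, refined by power iteration
    for _ in range(iters):
        codes = [sum(c[j] * w[j] for j in range(d)) for c in centered]
        new_w = [sum(code * c[j] for code, c in zip(codes, centered)) for j in range(d)]
        norm = math.sqrt(sum(v * v for v in new_w))
        w = [v / norm for v in new_w]
    return mean, w


def reconstruction_distance(model, row):
    """Encode a transaction to one number, decode it, and measure the Euclidean gap."""
    mean, w = model
    centered = [x - m for x, m in zip(row, mean)]
    code = sum(c * wj for c, wj in zip(centered, w))
    rebuilt = [m + code * wj for m, wj in zip(mean, w)]
    return math.dist(row, rebuilt)


# Hypothetical legitimate transactions: (amount, fee), fee is roughly 10% of amount.
random.seed(0)
legit = []
for _ in range(200):
    amount = random.uniform(10, 100)
    legit.append((amount, 0.1 * amount + random.gauss(0, 0.2)))

# Train on legitimate data only, then derive the threshold from training distances.
model = fit_reconstructor(legit)
train_dist = [reconstruction_distance(model, r) for r in legit]
mu = sum(train_dist) / len(train_dist)
sigma = math.sqrt(sum((x - mu) ** 2 for x in train_dist) / len(train_dist))
threshold = mu + 3 * sigma  # assumed rule of thumb, tune on real data


def is_fraud_candidate(row):
    """A transaction is a fraud candidate when it reconstructs poorly."""
    return reconstruction_distance(model, row) > threshold
```

A transaction such as `(50.0, 5.0)` follows the learned pattern and reconstructs well, while `(50.0, 40.0)` does not and lands far above the threshold. With a real autoencoder only `fit_reconstructor` and `reconstruction_distance` would change; the thresholding logic stays the same.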

Alternatively, a threshold can be identified from the histograms or box plots of the input features. All transactions with input features beyond that threshold are declared fraud candidates (discriminative approach). This approach usually requires a number of both fraudulent and legitimate transaction examples to build the histograms or box plots.
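As a rough sketch of the box-plot variant, the snippet below derives a per-feature threshold from the whiskers of a box plot and flags transactions outside it. The 1.5 × IQR whisker rule is a common convention assumed here, not something the article specifies, and the `amount` feature, the sample values, and the function names are hypothetical.

```python
import statistics


def box_plot_bounds(values, whisker=1.5):
    """Return (lower, upper) whisker limits using the common 1.5 * IQR rule."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    return q1 - whisker * iqr, q3 + whisker * iqr


def fraud_candidates(transactions, bounds_by_feature):
    """Flag any transaction with at least one feature outside its bounds."""
    flagged = []
    for tx in transactions:
        for feature, (low, high) in bounds_by_feature.items():
            if not low <= tx[feature] <= high:
                flagged.append(tx)
                break
    return flagged


# Hypothetical historical amounts used to derive the threshold.
history = [10, 12, 13, 14, 15, 15, 16, 18, 20]
bounds = {"amount": box_plot_bounds(history)}

new_transactions = [{"id": 1, "amount": 15}, {"id": 2, "amount": 500}]
flagged = fraud_candidates(new_transactions, bounds)
```

Here only the transaction with `id` 2 is flagged. In practice the whisker multiplier would be tuned against labeled fraud and legitimate examples rather than fixed in advance.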
