What is Anomaly Detection, and How Can Generative Models Be Applied to It?

By Ixeia S.


When we talk about generative models nowadays, we immediately think of ChatGPT or DALL-E by OpenAI. Lately, these large models have made an immense impact on our daily lives. But generative models have been around for quite a while. The first boom came almost ten years ago with the famous Generative Adversarial Network, commonly known as GAN.

While generative models first became popular for producing art, images, or text, recent improvements have broadened their application to other areas as well. One of them is anomaly detection.

What is Anomaly Detection: A Closer Look

Anomaly detection is the process of identifying unusual events or items in a dataset that do not follow the normal pattern of behavior. It’s especially relevant in cybersecurity, fraud detection, or machine vision. 

A common example is the detection of dangerous items in X-ray scans at airports. In this case, anomalies such as guns or knives should be automatically detected.

Figure 1. Examples of X-ray images from airport security controls. Source: arxiv.org/abs/1805.06725

In anomaly detection tasks, datasets are highly biased towards one class (normal) due to a lack of examples from the other (abnormal). When this problem is approached as supervised classification, it can be quite challenging and time-consuming to obtain enough abnormal examples. 

So what can we define as an anomaly in this context? Definitions are generally broad, and there may not be enough examples to support them. This shortage of abnormal examples relative to normal ones is called class imbalance. A classification algorithm trained on labeled objects from such images tends to misclassify the underrepresented class; in this case, anomalies.
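To make the effect of class imbalance concrete, here is a minimal sketch. The 1% anomaly rate and the "always predict normal" classifier are illustrative assumptions, not figures from the paper.

```python
# Toy labels: 1 = anomaly, 0 = normal. The 1% anomaly rate is an assumption.
labels = [1] * 10 + [0] * 990

# A "classifier" that has effectively learned to ignore the rare class.
predictions = [0] * len(labels)

# Accuracy: share of all samples that are classified correctly.
correct = sum(p == y for p, y in zip(predictions, labels))
accuracy = correct / len(labels)

# Recall on anomalies: share of true anomalies that were actually detected.
detected = sum(p == 1 and y == 1 for p, y in zip(predictions, labels))
recall = detected / sum(labels)

print(f"accuracy = {accuracy:.2%}")           # high, because almost everything is normal
print(f"recall on anomalies = {recall:.2%}")  # zero, because every anomaly is missed
```

Accuracy alone looks excellent here even though the classifier is useless for the task, which is why imbalanced problems are usually judged on how the rare class is handled.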

The approach that generative models offer for anomaly detection is quite interesting: they attempt to identify abnormal samples without needing labels for them. In this article, we’ll dive deeper into the method described by Akcay et al. in 2018, one of the most important GAN approaches in this field: “GANomaly”.

GANomaly: A Generative Approach for Anomaly Detection

There are three parts to the model architecture:

Figure 2. GANomaly model architecture. Source: arxiv.org/abs/1805.06725

  1. Generator: an autoencoder, that is, an encoder followed by a decoder. The encoder part, GE(x), maps the input image x to a lower-dimensional vector, z, and the decoder part reconstructs the output image x’ from z. The hypothesis is that z is the most compact representation that still captures the essential content of the image x.
  2. Discriminator: this network tries to distinguish between the real images, x, and the fake images, x’, that the generator has created. This is the well-known adversarial setup of GANs, in which the generator tries to fool the discriminator by producing fake images that closely resemble real ones.
  3. Encoder: the novel component of this architecture. It has the same architecture as the encoder part of the Generator, GE(x), and maps the fake image x’ to a lower-dimensional vector z’, which has the same size as z. The distance between z and z’ is used as one of the loss terms that optimize the model.

The main result of the process is the "anomaly score", which measures the distance between z and z'. The lower the score, the more similar they are.
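The path from x to z, x' and z' can be sketched in a few lines. This is only a toy illustration: the three networks are replaced by untrained random linear maps, the dimensions are hypothetical, the discriminator is left out because it is only needed during training, and the L1 distance is our reading of how the paper compares z and z'.

```python
import random

random.seed(0)  # reproducible toy weights

def linear(weights, vector):
    """Multiply a weight matrix (a list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, vector)) for row in weights]

def random_matrix(rows, cols):
    """Build a rows x cols matrix of random weights in [-1, 1]."""
    return [[random.uniform(-1, 1) for _ in range(cols)] for _ in range(rows)]

IMAGE_DIM, LATENT_DIM = 8, 2  # hypothetical sizes, far smaller than real images

# Untrained stand-ins for the sub-networks that produce z, x' and z'.
generator_encoder = random_matrix(LATENT_DIM, IMAGE_DIM)  # GE: x  -> z
generator_decoder = random_matrix(IMAGE_DIM, LATENT_DIM)  # decoder: z -> x'
second_encoder = random_matrix(LATENT_DIM, IMAGE_DIM)     # E:  x' -> z'

def anomaly_score(x):
    """Distance between the latent vector of x and that of its reconstruction."""
    z = linear(generator_encoder, x)
    x_fake = linear(generator_decoder, z)
    z_fake = linear(second_encoder, x_fake)
    # L1 distance between the two latent vectors: 0 means they are identical.
    return sum(abs(a - b) for a, b in zip(z, z_fake))

x = [random.random() for _ in range(IMAGE_DIM)]  # a flattened toy "image"
score = anomaly_score(x)
print(f"anomaly score: {score:.3f}")
```

With trained networks, the same computation would yield small scores for normal images and larger ones for images the generator cannot reconstruct.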

After training, the generator has learned to create images that follow the data distribution for the normal samples. In other words, as long as the input image does not include any abnormal items, it should be reconstructed without errors.

However, when an abnormal item shows up, it does not fit the distribution the model has learned from normal samples. Reproducing the abnormal item in the fake image x' is therefore very difficult, so z and z' end up far apart and the anomaly score is higher. In short, the fewer the similarities, the higher the score.

The following image comparison makes this clearer.

Figure 3. Comparison of the input images x (a) and the fake reconstructed images x’ (b). Source: arxiv.org/abs/1805.06725

Figure 3 shows nine different images. In (a), we can see the original images, whereas (b) shows the model's fake reconstruction of each one. When there’s an abnormal item, a gun, for instance, the model cannot reconstruct it. So for images 1 to 4, the anomaly score is higher.
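Turning scores into a decision usually means rescaling them and applying a threshold. The sketch below assumes min-max scaling to the [0, 1] range, which is our understanding of what the paper does across a test set; the raw scores and the 0.5 threshold are invented for illustration.

```python
def min_max_scale(scores):
    """Rescale raw scores to the [0, 1] range."""
    low, high = min(scores), max(scores)
    if high == low:  # avoid division by zero when all scores are equal
        return [0.0 for _ in scores]
    return [(s - low) / (high - low) for s in scores]

# Hypothetical raw scores for nine images; the first four contain an abnormal item.
raw_scores = [4.1, 3.8, 4.6, 3.5, 0.9, 1.2, 0.7, 1.0, 1.1]
THRESHOLD = 0.5  # illustrative; in practice it would be tuned on validation data

scaled = min_max_scale(raw_scores)

# Report image numbers (starting at 1) whose scaled score exceeds the threshold.
flagged = [i + 1 for i, s in enumerate(scaled) if s > THRESHOLD]
print(flagged)  # [1, 2, 3, 4]
```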

The GANomaly model's ability to detect anomalies in a series of images shows just how effective this approach is, especially when dealing with complex datasets. And it can be particularly valuable in real-world scenarios where anomaly detection is crucial, such as cybersecurity, healthcare, financial fraud detection, and even manufacturing. 

What Are the Pros and Cons of Using Generative Models in Anomaly Detection?

Let’s take a look at another example.

A chemical manufacturing company that produces a type of polymer (e.g., polypropylene) used in the automotive industry (e.g., for car batteries, bumpers, cable insulation) relies on a complex production process that involves managing raw materials with sensitive parameters, such as temperature, pressure, and reaction times. 

Often, the quality control (QC) process only begins once a few samples are already available. So by the time quality control is complete, an entire poor-quality batch has already been produced.

By using generative models like GANs for anomaly detection, the company would train the models using data from previous production runs and successful batches, including the raw materials’ parameters and the characteristics of a final high-quality polymer. And once trained, the models would monitor the production process and flag in real-time any deviation (e.g., higher temperature, lower pressure) from the expected values.
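The "learn from good batches, then flag deviations" loop can be sketched without any neural network. The example below swaps the GAN for a plain statistical baseline (mean and standard deviation per parameter) purely to show the workflow; the readings, units, and the 3-sigma limit are all hypothetical.

```python
from statistics import mean, stdev

# Hypothetical readings from successful batches (temperature in °C, pressure in bar).
good_batches = {
    "temperature": [70.1, 69.8, 70.4, 70.0, 69.9, 70.2],
    "pressure": [30.2, 29.9, 30.1, 30.0, 29.8, 30.0],
}

# "Training": learn the expected value and spread of each parameter.
baseline = {name: (mean(vals), stdev(vals)) for name, vals in good_batches.items()}

def deviations(reading, max_sigma=3.0):
    """Return parameters lying more than max_sigma standard deviations from normal."""
    flagged = {}
    for name, value in reading.items():
        mu, sigma = baseline[name]
        z = (value - mu) / sigma
        if abs(z) > max_sigma:
            flagged[name] = round(z, 1)  # signed: positive = too high, negative = too low
    return flagged

print(deviations({"temperature": 70.1, "pressure": 30.0}))  # reading within the usual range
print(deviations({"temperature": 73.5, "pressure": 28.9}))  # hot, low-pressure reading
```

A generative model would replace the per-parameter statistics with a learned representation of the whole process, which lets it catch unusual combinations of values that each look normal on their own.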

What are the advantages of this model?

  • Overcomes the dataset’s label limitations, which saves time and broadens the model's scope. A generative model doesn’t require a labeled example of every possible anomaly and can detect previously unknown deviations.
  • Performs a thorough analysis based on the data it is trained on. As long as it performs well, it can become a tool that can assist humans and help them overcome fatigue or lack of focus.
  • It’s scalable. Once trained, the model can be implemented across the entire organization.
  • Identifies anomalies in real-time or nearly real-time, which helps to avoid costly delays and errors.
  • As the model learns, it can start predicting potential anomalies, preventing unnecessary disruptions.

While using generative models for anomaly detection has many advantages, there are also a few limitations to be aware of:

  • Generative models might be less accurate, since the assumptions they rely on can introduce bias. There's also the risk of false positives, which trigger unnecessary alarms, and false negatives, which leave anomalies undetected.
  • Interpretability is difficult to control when the objective is for the model to learn on its own. The "black-box" nature of these models makes it difficult to identify what is causing the model to flag an anomaly.
  • The models are highly dependent on the quality and the quantity of the training data, and abnormal data (in medical applications, for example) are usually scarce compared with normal data.
  • Generative models, especially GANs, are complex and require a lot of computing power and expertise.
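The trade-off between false positives and false negatives depends heavily on where the decision threshold is set. The sketch below uses invented scores and labels to show how moving the threshold shifts errors from one kind to the other.

```python
def confusion_counts(scores, labels, threshold):
    """Count outcomes when every score above the threshold is flagged as an anomaly."""
    tp = sum(s > threshold and y == 1 for s, y in zip(scores, labels))   # caught anomalies
    fp = sum(s > threshold and y == 0 for s, y in zip(scores, labels))   # false alarms
    fn = sum(s <= threshold and y == 1 for s, y in zip(scores, labels))  # missed anomalies
    return tp, fp, fn

# Hypothetical scaled anomaly scores with ground-truth labels (1 = anomaly).
scores = [0.95, 0.80, 0.45, 0.60, 0.30, 0.20, 0.15, 0.10]
labels = [1, 1, 1, 0, 0, 0, 0, 0]

for threshold in (0.25, 0.50, 0.70):
    tp, fp, fn = confusion_counts(scores, labels, threshold)
    print(f"threshold={threshold:.2f}  caught={tp}  false alarms={fp}  missed={fn}")
```

In this toy data, the lowest threshold catches every anomaly at the cost of two false alarms, while the highest removes the false alarms but misses one anomaly. Which side to favor depends on the cost of each error in the application.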

The Role of Bias and Interpretability in Generative Models

As data becomes the key source of information for any generative model, bias, interpretability, and fairness become increasingly important.

Let’s take the example of airport X-ray scans we previously explored. The input images contain some abnormalities, but we don’t have any labels, and we don’t know what the model considers abnormal at this point. For X-ray images, guns are anomalies, but the model will also fail to reconstruct other uncommon items, whether or not they are dangerous.

Now imagine similar scenarios involving medical equipment for a rare disease or unique cultural items. Generative models can be helpful but are far from being autonomous. To correct potential errors made by these models, we still need human supervision. 

As with any other technology, generative models are not without their limitations. The complexity of these models often makes it difficult to understand how they arrive at their outputs, raising questions about accountability and transparency. Additionally, because generative models are trained on large datasets which may contain biases, these biases can be replicated and amplified in the models' outputs.

Researchers and practitioners are actively working to address these issues and improve the transparency and fairness of generative models. As technology continues to evolve, we can expect to see more advanced and more interpretable models that are better equipped to tackle real-world problems and serve the needs of society.

A Practical Approach to Implementing Generative Models for Anomaly Detection

In the meantime, to make the most out of applying generative models to anomaly detection, here is what we recommend:

  • Ensure access to the right expertise: skilled data scientists who recognize the nuances of training and interpreting generative models. Explainability tools and techniques can also help in understanding why these models flag certain anomalies.
  • Because generative models learn from and rely on quality data for accurate detection, invest time in collecting and curating high-quality data.
  • Improve the performance and reliability of anomaly detection by using generative models alongside other methods, such as supervised learning techniques or traditional statistical approaches. And continuously monitor the performance of the model by refining the training system and integrating new data.

At Visium, we developed a system that detects anomalies for Nestlé, using sound-based machine analytics to prevent downtime. You can find a full breakdown of the solution in this video. The AI model we designed for Nestlé relies on one denoising autoencoder, which is a more straightforward architecture than the one used by GANomaly. On the other hand, the method Visium developed is far more complex than merely comparing anomaly scores. It's important to note how theory and practical tools are combined in AI models to apply them to real-life scenarios and tailor each solution to the problem at hand.

And as we continue to explore and enhance generative models, we can expect a deeper understanding and more efficient application in anomaly detection.
