What is Anomaly Detection, and How Can Generative Models Be Applied to It?
By Ixeia S.
When we talk about generative models nowadays, we immediately think of ChatGPT or DALL-E by OpenAI. Lately, these large models have made an immense impact on our daily lives. But generative models have been around for quite a while. The first boom came almost ten years ago with the famous Generative Adversarial Network, commonly known as GAN.
While generative models first became popular for producing art, images, or text, recent improvements have broadened their application to other areas as well. One of them is anomaly detection.
What is Anomaly Detection: A Closer Look
Anomaly detection is the process of identifying unusual events or items in a dataset that do not follow the normal pattern of behavior. It’s especially relevant in cybersecurity, fraud detection, or machine vision.
A common example is the detection of dangerous items in X-ray scans at airports. In this case, anomalies such as guns or knives should be automatically detected.
In anomaly detection tasks, datasets are highly biased towards one class (normal) due to a lack of examples from the other (abnormal). When this problem is approached as supervised classification, it can be quite challenging and time-consuming to obtain enough abnormal examples.
So what counts as an anomaly in this context? Definitions are generally broad, and there are rarely enough examples to cover them. This is known as class imbalance. A classification algorithm trained on the labeled objects in such images tends to misclassify the under-represented class, which in this case is the anomalies.
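To make the imbalance problem concrete, here is a small sketch with made-up numbers (990 normal scans and 10 anomalous ones, which are an assumption for illustration only). A "classifier" that always answers "normal" looks excellent on accuracy while missing every anomaly.

```python
# Hypothetical dataset: 990 normal scans and 10 scans containing an anomaly.
labels = ["normal"] * 990 + ["anomaly"] * 10

# A degenerate "classifier" that ignores its input and always answers "normal".
predictions = ["normal"] * len(labels)

# Accuracy: share of all scans that were labeled correctly.
correct = sum(p == y for p, y in zip(predictions, labels))
accuracy = correct / len(labels)

# Recall on the anomaly class: detected anomalies / all true anomalies.
true_anomalies = sum(y == "anomaly" for y in labels)
detected = sum(
    p == "anomaly" and y == "anomaly" for p, y in zip(predictions, labels)
)
anomaly_recall = detected / true_anomalies

print(f"accuracy: {accuracy:.2%}")
print(f"anomaly recall: {anomaly_recall:.2%}")
```

This is why accuracy alone is a misleading metric here, and why an approach that learns only from normal samples is attractive.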
The approach that generative models offer for anomaly detection is quite interesting: it attempts to identify abnormal samples without relying on labels for them. In this article, we’ll dive deeper into one of the most important GAN approaches in this field, “GANomaly”, described by Akcay et al. in 2018.
GANomaly: A Generative Approach for Anomaly Detection
There are three parts to the model architecture:
The main result of the process is the "anomaly score", which measures the distance between z and z', the latent representations of the input image x and of its reconstruction x'. The lower the score, the more similar they are.
After training, the generator has learned to create images that follow the data distribution for the normal samples. In other words, as long as the input image does not include any abnormal items, it should be reconstructed without errors.
However, when an abnormal item shows up, it does not fit the normal distribution. Reproducing the abnormal item in the fake image x' is then very difficult, so the reconstruction differs noticeably from the input and the anomaly score is higher. In short, the fewer the similarities, the higher the score.
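The scoring step can be sketched in a few lines. This is a minimal illustration, assuming the score is an L1 distance between the two latent vectors followed by min-max scaling to [0, 1], which is how we read the original paper. The function names and the latent vectors below are made up; in practice z and z' would come from the trained networks.

```python
from typing import List, Sequence


def l1_distance(z: Sequence[float], z_prime: Sequence[float]) -> float:
    """Sum of absolute differences between two latent vectors."""
    if len(z) != len(z_prime):
        raise ValueError("latent vectors must have the same length")
    return sum(abs(a - b) for a, b in zip(z, z_prime))


def min_max_scale(scores: Sequence[float]) -> List[float]:
    """Rescale raw scores to [0, 1] so they are comparable across a test set."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [0.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]


# Made-up (z, z') pairs for three images; real values come from the encoders.
pairs = [
    ([0.2, 0.5, 0.1], [0.2, 0.4, 0.1]),  # normal image: z and z' nearly equal
    ([0.3, 0.1, 0.7], [0.3, 0.1, 0.6]),  # normal image
    ([0.9, 0.2, 0.4], [0.1, 0.8, 0.9]),  # abnormal image: z and z' far apart
]

raw_scores = [l1_distance(z, z_prime) for z, z_prime in pairs]
scaled_scores = min_max_scale(raw_scores)

for raw, scaled in zip(raw_scores, scaled_scores):
    print(f"raw={raw:.2f} scaled={scaled:.2f}")
```

The third pair stands in for an image containing an abnormal item: its latent vectors disagree the most, so it receives the highest scaled score.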
This statement is clearer in the following image comparison.
Figure 3 shows nine different images. In (a) we see the original images, whereas (b) shows the fake reconstruction that the model produced for each one. When there’s an abnormal item, a gun for instance, the model cannot reconstruct it. So for images 1 to 4, the anomaly score is higher.
The GANomaly model's ability to detect anomalies in a series of images shows just how effective this approach is, especially when dealing with complex datasets. And it can be particularly valuable in real-world scenarios where anomaly detection is crucial, such as cybersecurity, healthcare, financial fraud detection, and even manufacturing.
What Are the Pros and Cons of Using Generative Models in Anomaly Detection?
Let’s take a look at another example.
A chemical manufacturing company that produces a type of polymer (e.g., polypropylene) used in the automotive industry (e.g., for car batteries, bumpers, cable insulation) relies on a complex production process that involves managing raw materials with sensitive parameters, such as temperature, pressure, and reaction times.
Often, the quality control (QC) process begins only once a few samples are available. So by the time quality control is complete, an entire poor-quality batch may already have been produced.
By using generative models like GANs for anomaly detection, the company would train the models on data from previous production runs and successful batches, including the raw materials’ parameters and the characteristics of a final high-quality polymer. Once trained, the models would monitor the production process and flag any deviation from the expected values (e.g., higher temperature, lower pressure) in real time.
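The monitoring logic can be sketched independently of the model. In the example below, a simple z-score against historical good batches stands in for the generative model's anomaly score, so this is not a GAN. The readings, the `score` and `is_anomalous` helpers, and the 1.5 safety margin are all hypothetical. What carries over is the pattern: calibrate a threshold on normal data only, then flag live readings that exceed it.

```python
from statistics import mean, stdev

# Hypothetical readings from successful batches: (temperature, pressure).
good_batches = [
    (70.1, 30.2), (69.8, 29.9), (70.3, 30.1),
    (69.9, 30.0), (70.0, 29.8), (70.2, 30.3),
]

# Per-parameter statistics learned from normal data only.
columns = list(zip(*good_batches))
means = [mean(c) for c in columns]
stds = [stdev(c) for c in columns]


def score(reading):
    """Stand-in anomaly score: largest absolute z-score across parameters."""
    return max(abs(x - m) / s for x, m, s in zip(reading, means, stds))


# Threshold: worst score seen on good batches, times an assumed safety margin.
threshold = 1.5 * max(score(batch) for batch in good_batches)


def is_anomalous(reading):
    """Flag a reading whose score exceeds the calibrated threshold."""
    return score(reading) > threshold


# Hypothetical live readings: in range, too hot, pressure too low.
live_readings = [(70.0, 30.1), (74.5, 30.0), (70.1, 27.0)]
flags = [is_anomalous(r) for r in live_readings]

for reading, flagged in zip(live_readings, flags):
    print(reading, "ANOMALY" if flagged else "ok")
```

With a trained generative model, `score` would be replaced by the model's anomaly score, while the threshold calibration and flagging would stay the same.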
What are the advantages of this model?
While using generative models for anomaly detection has many advantages, there are also a few limitations to be aware of:
The Role of Bias and Interpretability in Generative Models
Because data is the key source of information for any generative model, bias, interpretability, and fairness become increasingly important.
Let’s take the example of airport X-ray scans we previously explored. The input images contain some abnormalities, but we don’t have any labels, and we don’t know what the model considers abnormal at this point. For X-ray images, guns are the anomalies we care about, but the model will also fail to reproduce other uncommon items.
Now imagine similar scenarios involving medical equipment for a rare disease or unique cultural items. Generative models can be helpful but are far from being autonomous. To correct potential errors made by these models, we still need human supervision.
As with any other technology, generative models are not without their limitations. The complexity of these models often makes it difficult to understand how they arrive at their outputs, raising questions about accountability and transparency. Additionally, because generative models are trained on large datasets which may contain biases, these biases can be replicated and amplified in the models' outputs.
Researchers and practitioners are actively working to address these issues and improve the transparency and fairness of generative models. As technology continues to evolve, we can expect to see more advanced and more interpretable models that are better equipped to tackle real-world problems and serve the needs of society.
A Practical Approach to Implementing Generative Models for Anomaly Detection
In the meantime, to make the most out of applying generative models to anomaly detection, here is what we recommend:
At Visium, we developed a system that detects anomalies for Nestlé, using sound-based machine analytics to prevent downtime. You can find a full breakdown of the solution in this video. The AI model we designed for Nestlé relies on one denoising autoencoder, which is a more straightforward architecture than the one used by GANomaly. On the other hand, the method Visium developed is far more complex than merely comparing anomaly scores. It's important to note how theory and practical tools are combined in AI models to apply them to real-life scenarios and tailor each solution to the problem at hand.
And as we continue to explore and enhance generative models, we can expect a deeper understanding and more efficient application in anomaly detection.