Financial Crime Detection Can Now Be Fully Based on AI
Some major financial institutions now use artificial intelligence (AI) and machine learning (ML) because of their promising capacity for self-learning and for detecting not only known cases or typologies but also unknown unknowns. Applying AI/ML within a financial institution enables significant cost savings and improved accuracy, particularly where the volume of false positives is so large that it demands a labour-intensive approach. That is the case in most financial institutions worldwide, despite their efforts to fine-tune existing, supposedly proven detection platforms.
A disruptive approach will add significant value:
AI has now proven that the compliance industry can benefit from it in the fight against financial crime risk. Until now, anti-money laundering has been limited by the rule-based screening and transaction monitoring approach provided by legacy vendors. This has had negative consequences: intensive manual case investigation and a challenging decision-making process that is not always reliable, because it depends on a few isolated suspicious transactions rather than on a broader, holistic understanding of the supervised entity. That entity should be the customer of the financial institution, or a network of individuals or companies treated as a single logical entity to be risk scored.
In the majority of cases, those rule-based approaches, despite heavy spending on fine-tuning, are deemed not to be cost-effective and unable to manage risk in a fast-changing regulatory environment.
In my recent research on transaction monitoring performance, drawing on work with a number of customers over my last 20 years of experience in that domain, I have made the following observations and identified a number of issues:
· The false-positive rate is on average around 96 percent: 99 percent in the worst cases and 90 percent in the best cases, after intensive fine-tuning efforts.
· 90 minutes or more is spent investigating an alert. This can become boring, frustrating, and repetitive work for many investigators, and in the majority of cases it ends in finding that the alert is a false positive. Even more non-productive time is needed to consolidate the alert into a case when it is labelled as suspicious enough, without complete certainty of its suspiciousness.
· The CDD review process is painful, and any enhanced due diligence process is a real challenge, not to mention that customer onboarding may take many days, unless we are considering a digital channel, in which case the friction can be high and frustrating for the customer.
These are some of the reasons why, alongside the difficulty of hiring and retaining employees in those investigative roles, and not to mention the risk exposure created by false negatives, most financial institutions are turning to new and somewhat disruptive, although already proven, technologies such as artificial intelligence (AI) and machine learning (ML). Their aims are to fill compliance gaps, reduce costs, improve operational effectiveness, and ultimately reinforce their capabilities in fighting financial crime risk (FCR) in a context of emerging threats that are growing in complexity.
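To make the scale of the problem concrete, the false-positive rates and the investigation time observed above can be turned into a rough workload estimate. This is only a sketch: the volume of 1,000 alerts per month is a hypothetical input chosen for illustration, and 90 minutes is treated as a lower bound per alert.

```python
def wasted_hours(alerts: int, false_positive_rate: float, minutes_per_alert: float) -> float:
    """Analyst hours spent on alerts that turn out to be false positives."""
    false_positives = alerts * false_positive_rate
    return false_positives * minutes_per_alert / 60


# Hypothetical volume: 1,000 alerts per month (an assumption, not a measured figure).
MONTHLY_ALERTS = 1000

# Worst case, average, and best case rates taken from the observations above.
for rate in (0.99, 0.96, 0.90):
    hours = wasted_hours(MONTHLY_ALERTS, rate, 90)
    print(f"false-positive rate {rate:.0%}: about {hours:,.0f} analyst hours per month")
```

Even in the best case, the bulk of the investigative effort under these assumptions goes into alerts that lead nowhere.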
About Artificial Intelligence & Machine Learning
Machine learning is an application of AI that processes data and allows computers to learn on their own without constant supervision. It is nonetheless always wise to check how relevant the outcomes are right from the first iteration, involving subject matter experts in the domain being monitored. Based on my own experience, I recommend a particular focus on Trade Based Money Laundering, Correspondent Banking, and Terrorism, which are still insufficiently explored territories.
Another factor to take into consideration is that AI systems are purpose-built. This requires a prior understanding of the risk factors relevant to a given domain; no one-size-fits-all approach applies.
Most AI systems have three major capability building blocks:
Unsupervised learning
Unsupervised learning, as commonly accepted and formulated, is a type of machine learning in which the algorithm is not provided with any pre-assigned labels or scores for the training data. As a result, unsupervised learning algorithms must first discover for themselves any naturally occurring patterns in the training data set. Common examples include clustering, where the algorithm automatically groups its training examples into categories with similar features, and principal component analysis, where the algorithm compresses the training data set by keeping the combinations of features that capture most of the variation between training examples and discarding the rest. This contrasts with supervised learning, in which the training data include pre-assigned category labels. But here lies the challenge with supervised learning: the labels have to be created by users or investigators who lack complete certainty about the suspiciousness of the alerts, and there are not enough true positives to feed algorithms that need a significant amount of data.
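As an illustration of clustering without labels, here is a minimal k-means sketch in plain Python. It is not the method of any particular vendor: the two features per customer (already scaled to the 0 to 1 range) and the customer values are hypothetical, and starting from the first k points is a simplification made to keep the example deterministic.

```python
import math


def kmeans(points, k, iterations=20):
    """Group points into k clusters without any labels (Lloyd's algorithm)."""
    # Simplification: start from the first k points instead of a random draw.
    centroids = list(points[:k])
    assignment = [0] * len(points)
    for _ in range(iterations):
        # Assign every point to its nearest centroid.
        assignment = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Move each centroid to the mean of the points assigned to it.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                centroids[c] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return assignment, centroids


# Hypothetical customers: (scaled monthly transaction count, scaled average amount).
customers = [
    (0.10, 0.20), (0.15, 0.25), (0.20, 0.10),
    (0.90, 0.80), (0.85, 0.90), (0.95, 0.85),
]

assignment, centroids = kmeans(customers, k=2)
print(assignment)
```

No investigator had to tag a single record: the two behavioural groups emerge from the data alone, which is the property that keeps the preparation workload low.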
Advantages of unsupervised learning
Under that scenario, a major benefit is the minimal workload needed to prepare and audit the training set. The supervised learning approach, by contrast, requires significant involvement from investigators, who have to assign and validate the initial tags and need enough experience to make that work reliable, which is not always easy within financial institutions. Additionally, despite your expertise as an investigator or auditor, you might not spot some suspicious patterns, which would therefore remain undetected by your experts.
Underlying logic & context of an unsupervised approach
A greater amount of training data may be required, which makes it challenging for the algorithms to reach an acceptable performance level.
However, the previous observation is only valid when loading unstructured data, and we are all aware that in the majority of cases there will be no need for automated classification to interpret a supposed mass of unstructured data and convert it into meaningful, structured data. The financial institution will already own a good amount of structured data with clearly defined data types, and there will be no need to load unstructured data such as audio, video, images, and social media postings in the first stage of the project. This data can, however, be added opportunistically to the investigator's environment as a complementary forensic source of information taken from outside public or private sources or from a KYC feed. Customer due diligence (CDD) and know your customer (KYC) should in any case be used in conjunction with transaction monitoring and SAR reporting, in order to better justify the selection of suspicious cases and safeguard the financial industry from financial crime risk (FCR). Given that CDD is the process of obtaining information and documentation to ensure a reasonable understanding of the customer's true identity, business context, and expected activity, it is legitimate for the financial institution to complement the findings of the AI-based detection process with such information.
Why consider the Use Case of Trade Based Money Laundering or Trade Finance
As global trade activity increases, global FIs face a major challenge in handling high-value transactions and the related trade documents. Trade finance teams can be called on to facilitate documentary trade running into the hundreds of millions for customers every year. This is where an AI platform comes into play, enabling the teams to gain sufficient understanding of the loaded data in conjunction with the trade-related documents.
Given that one of the challenges is volume, we already know that current transaction monitoring systems, which rely on simplistic rule-based monitoring to detect anomalies, will contaminate our investigation space with too many false alerts. They have a weak understanding of the context of the trade and struggle to identify the ultimate originator and beneficiary of the trades. Because of their rule-based nature, these systems can incorporate a number of data points but do not exploit them properly for profiling, so they fail to generate sufficiently valid alerts. Unfortunately, my experience and observation in the field have led me to understand that an investigator needs even more data points: in some scenarios, more than 100 can be required to complete an investigation. Any well-structured, true AI-based platform will be able to handle that amount of data, in contrast with rule-based systems, which have strict formatting requirements at the loading stage and almost zero tolerance for incomplete or wrongly formatted data.
With the help of AI/ML-based transaction monitoring, almost any kind of data can be considered, whether from internal databases or from externally available sources, to build the context of the trade that investigators so badly need. That context gives a quick and easy understanding of what can be considered legitimate and what cannot, for example when foreign payers and receivers turn out to be associated with the same legal entity that initiated a transaction. What better way to get rid of unnecessary false positives?
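The payer and receiver example can be sketched as a simple context lookup. Everything here is hypothetical: the company names, the `PARENT_ENTITY` mapping (which in practice might come from a KYC feed or a corporate registry), and the function name. The result is meant as context for the investigator, not as an automatic suppression rule.

```python
# Hypothetical mapping from counterparty to its ultimate parent legal entity.
PARENT_ENTITY = {
    "Acme Trading GmbH": "Acme Group",
    "Acme Logistics Ltd": "Acme Group",
    "Borealis Imports SA": "Borealis Holdings",
}


def same_legal_entity(payer: str, receiver: str, parents=PARENT_ENTITY) -> bool:
    """True when both parties are known and share the same parent entity."""
    payer_parent = parents.get(payer)
    # Unknown counterparties never match: missing context is not evidence of legitimacy.
    return payer_parent is not None and payer_parent == parents.get(receiver)


print(same_legal_entity("Acme Trading GmbH", "Acme Logistics Ltd"))
print(same_legal_entity("Acme Trading GmbH", "Borealis Imports SA"))
```

A cross-border payment between two subsidiaries of the same group then arrives on the investigator's desk already marked as intra-group, instead of looking like an unexplained foreign transfer.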
Financial Crime Risk Factors Assessment
This is much needed, and vital as a preliminary step to any AI-based approach. The financial institution should assess and, in certain cases, review or challenge a number of risk factors. Some of them may prove less relevant over time; others might not have been considered and should not be missed.
In any case, firms have to demonstrate that they understand the strategy behind the algorithms, the limitations of how data is used, and how their risk factors are leveraged. The FI also has to explain how those risk factors are correlated with one another and weighted in its AI system in order to achieve reliable detection. It is also key to avoid piling up successive layers of risk indicators, which only leads to opacity and confusion.
Additionally, once the risk and compliance controls have been validated, a delicate balance must be struck between risk coverage and noise generation, so as to ensure that AI-related risks can be effectively identified and managed within the limits set by the firms and the regulators.
A rules-based approach is usually based on aggregating various risk factors covering country risk, channel risk, legal-entity risk for corporates, and product risk. Assessments to date have tended to assume that quantity is key: the more risk factors you have, the better. Best practice in this domain now regards that approach as rigid and unconstructive, much like an arms race, rather than a route to an accurate understanding of the real and prevalent key risks that you face as a financial institution. Know your institution (KYI) is therefore going to be key to determining your risk map. Often, bringing in outside domain experts to challenge the established risk map goes together with a review and rationalisation of that same risk map.
Under a proper and standardised risk-based approach (RBA), a risk rating determines whether enhanced due diligence is required and how frequently periodic reviews should take place. It should rest not on a backward-looking risk scoring logic but on a predictive assessment of whether the customer is likely to commit financial crime in the future. In certain other domains of financial crime prevention, such as tax fraud for government agencies, this is termed "very early detection".
But again, all of the above will perform only as long as the identified risk factors have been translated into appropriate ML features. These need not match the risk factors on a one-to-one basis, given that some of those z-score-based features may well take several risk factors into account in a combined way.
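A z-score feature measures how far a customer sits from its peer segment, in units of that segment's standard deviation. The sketch below shows one way two risk factors could feed a single feature. The peer segment, the two risk factors, and the choice of a plain average to combine them are all assumptions made for illustration.

```python
from statistics import mean, pstdev


def z_scores(values):
    """Standardise values against their peer group: (x - mean) / standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # Every peer behaves identically, so nobody deviates.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


# Hypothetical peer segment of five customers.
monthly_volume = [10_000, 12_000, 11_000, 9_000, 58_000]
high_risk_country_share = [0.02, 0.01, 0.03, 0.02, 0.42]

volume_z = z_scores(monthly_volume)
country_z = z_scores(high_risk_country_share)

# One feature covering two risk factors (a plain average, chosen for simplicity).
combined = [(v + c) / 2 for v, c in zip(volume_z, country_z)]

most_unusual = combined.index(max(combined))
print(f"customer {most_unusual} deviates most from its peers")
```

Because the feature is relative to the peer segment, the same raw volume can be unremarkable in one segment and highly anomalous in another.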
Algorithms & Detection Performance Considerations
To avoid algorithm bias, it is key to avoid building inconsistent data sets that reflect groups which are not coherent in terms of peer-grouping logic. Hence the importance, at the start of any project, of setting up an unsupervised segmentation process. It generates customer segments that are consistent enough in terms of behaviour and ML detection, and results in an excellent and reliable profiling process.
Algorithms can have built-in biases and preferences; none of them should be considered perfect or fully reliable.
This is exactly why, in order to eliminate potential biases, a well-balanced AI approach should not only run several algorithms in parallel but also apply an orchestration logic that moderates the outputs of each of them and provides a balanced, more reliable overall anomaly score, one the investigator can trust when reviewing its logic. It is therefore key to understand the limitations and shortcomings of any algorithm and to embrace a holistic approach to processing the data. The performance of those algorithms should also be reviewed regularly, so that the subject matter experts of the monitored domain can decide, following their manual review, whether the AI outcome is still reliable enough, that is, whether it produces enough detection-worthy anomalies. Those anomalies are then converted into valuable alerts, under the approval and review of the investigator, and submitted to further detailed operational investigation. If investigators trust the AI, adoption among them can be massive.
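One simple form of orchestration is to put each algorithm's output on a common scale and then blend them. The sketch below uses rank normalisation followed by a weighted average. The three detector names, their raw scores, and the equal default weights are hypothetical, and a production orchestration layer would likely be more elaborate.

```python
def rank_normalise(scores):
    """Map raw scores to [0, 1] by rank so detectors on different scales are comparable."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    for position, index in enumerate(order):
        # Ties are broken by input order, a simplification for this sketch.
        ranks[index] = position / (len(scores) - 1) if len(scores) > 1 else 0.0
    return ranks


def orchestrate(detector_outputs, weights=None):
    """Blend several detectors into one anomaly score per entity."""
    names = list(detector_outputs)
    weights = weights or {name: 1.0 for name in names}
    total_weight = sum(weights[name] for name in names)
    normalised = {name: rank_normalise(detector_outputs[name]) for name in names}
    entity_count = len(detector_outputs[names[0]])
    return [
        sum(weights[name] * normalised[name][i] for name in names) / total_weight
        for i in range(entity_count)
    ]


# Hypothetical raw outputs of three detectors for the same four entities.
outputs = {
    "distance_based": [0.2, 0.3, 0.1, 5.0],
    "density_based": [12, 10, 11, 40],
    "tree_based": [0.51, 0.48, 0.50, 0.49],
}

overall = orchestrate(outputs)
print([round(score, 3) for score in overall])
```

In this made-up data, the third detector disagrees with the other two about the last entity, yet that entity still receives the highest overall score: no single algorithm's blind spot decides the outcome on its own.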
Transparency and Explainability
Until very recently, the opacity of AI solutions was the challenge perceived by regulators. However, some disruptive AI players have demonstrated to them the appropriateness and accuracy of their AI-based outputs, to such a degree that some regulators have themselves adopted AI-based approaches on their own initiative, with some extraordinary results. They have done so without adding more data scientists to their teams, which they see as a great benefit given the recruitment and talent-retention challenges most of them face. This move among regulators has added pressure on FIs, which now find themselves having to adopt AI/ML solutions in turn and without further delay. That requires them to develop processes and tools to manage the residual risk. As a best practice, AI ends up embedded in the existing risk management framework, without necessarily requiring new processes for dealing with AI, which in turn helps overcome the challenges of AI transparency and auditability that were until now perceived as major concerns.
Conclusions
AI/ML technologies are increasingly being adopted as core components of financial institutions' strategies, allowing them to achieve reliable financial crime detection and operational efficiency. This is now fully proven and does not require massive implementation services!
Additionally, not only have prevention and detection improved significantly, but those same financial institutions have also been able to drastically reduce their operational costs, either by reducing their investigation workforces or by redeploying them into higher-value investigation roles. This applies at all stages, from the early stage of elaborating meaningful customer profiles and segments to drastically improving the justification when raising a SAR.
Let us now be confident that, once the cognitive features associated with the algorithms have been properly defined to reflect the risk factors and business context of the financial institution in its own market, AI can contribute very significantly to eliminating false positives and help prevent false negatives from going under the radar.
It can now be said that adopting AI/ML is almost like buying off-the-shelf software, since it is proven, scalable, and industrialised in a growing number of implemented sites, regardless of whether the institution is large or not. Among AI customers, a number of Payment Service Providers have already adopted such an approach, despite their small size and limited resources.