In the process of helping its customers deal with the new variety and volatility of data sources, CRIF has come across some common concerns that are worth addressing before starting a new ML classification project. Here are some of them:
This is the most frequently asked question, and the answer is not straightforward: it depends on many factors, and the fact that the data handled by the categorisation process is highly regulated makes things even more difficult.
In general, to create a categorisation engine, the data sample must be “representative”:
To verify that the data is promising, an initial statistical check is recommended to confirm that these requirements are met, so that the engine performs as expected in every possible scenario.
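As an illustration only, the sketch below shows the kind of initial check that can be run on a labelled sample before training. The column names (description, amount, category), the minimum-count threshold and the pandas-based setup are assumptions for the example, not a description of CRIF's actual tooling.

```python
# Minimal sketch of an initial representativeness check on a labelled sample.
# Column names and the threshold are illustrative assumptions.
import pandas as pd

def check_sample(df: pd.DataFrame, min_per_category: int = 500) -> pd.DataFrame:
    """Summarise how well each category is represented in the sample."""
    summary = (
        df.groupby("category")
          .agg(n_transactions=("description", "size"),
               n_unique_descriptions=("description", "nunique"),
               median_amount=("amount", "median"))
          .assign(share=lambda s: s["n_transactions"] / len(df))
          .sort_values("n_transactions")
    )
    # Flag categories that are too rare to train and evaluate reliably.
    summary["under_represented"] = summary["n_transactions"] < min_per_category
    return summary

# Example usage:
# sample = pd.read_csv("labelled_transactions.csv")
# print(check_sample(sample))
```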
Evaluating performance is essential for the continuous improvement of the categorisation engine. CRIF has therefore put considerable effort into studying and defining state-of-the-art metrics that inspect every corner of the system's algorithm, presenting a summary of the most important metrics for multiclass classification problems to the scientific community (for more information, see the CRIF paper Metrics for Multiclass Classification: an Overview) and developing accountability tools to study the algorithm.
Among all the metrics, the two most important KPIs from a business perspective are Coverage and Accuracy:
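The precise definitions of the two KPIs are not spelled out here, so the sketch below uses a common interpretation as a working assumption: Coverage as the share of transactions the engine categorises with sufficient confidence, and Accuracy as the share of those covered transactions that receive the correct category.

```python
# Hedged sketch of the two KPIs under an assumed (common) interpretation:
# coverage = share of transactions the engine commits to with enough confidence,
# accuracy = share of those covered transactions categorised correctly.
import numpy as np

def coverage_and_accuracy(y_true, y_pred, confidence, threshold=0.5):
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    confidence = np.asarray(confidence)

    covered = confidence >= threshold          # engine commits to a category
    coverage = covered.mean()
    accuracy = (y_pred[covered] == y_true[covered]).mean() if covered.any() else float("nan")
    return coverage, accuracy

# Example:
# cov, acc = coverage_and_accuracy(labels, predictions, model_confidences, threshold=0.7)
```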
Transaction data, by its very nature, is constantly evolving, with new merchants entering the market every day and spending habits that can change dramatically (think of the impact of the pandemic on food deliveries and, more generally, online shopping). Likewise, the categorisation engine should not be thought of as a static model, but as a product that needs to be constantly tuned and maintained to keep a high level of performance. CRIF models are frequently monitored and fine-tuned: this constant evolution keeps the algorithms used by the categorisation engine at the cutting edge of technology.
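Purely as an illustration of what ongoing monitoring can look like, the sketch below recomputes the two KPIs on each new batch of labelled feedback and surfaces the periods where they drop; the column names and thresholds are hypothetical, not CRIF's actual monitoring setup.

```python
# Illustrative monitoring report (assumed setup): recompute coverage and
# accuracy per month and flag the engine for re-tuning when either KPI falls
# below an agreed threshold. Column names are hypothetical.
import pandas as pd

def monthly_kpi_report(feedback: pd.DataFrame, threshold: float = 0.5) -> pd.DataFrame:
    rows = []
    for month, grp in feedback.groupby("month"):
        covered = grp["confidence"] >= threshold
        coverage = covered.mean()
        accuracy = (grp.loc[covered, "y_pred"] == grp.loc[covered, "y_true"]).mean()
        rows.append({"month": month, "coverage": coverage, "accuracy": accuracy})
    return pd.DataFrame(rows)

# A sustained drop in either KPI for recent months is the usual trigger
# for fine-tuning or retraining the model.
```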
At first glance, rule-based classification systems seem more effective: you have absolute certainty about the results and full explainability. In practice, defining these rules and their hierarchy is not an easy task: if a rule maps the keyword “tax” to the “taxes” category, a transaction containing “taxi” could be wrongly categorised as taxes instead of transport. A rule-based system also raises performance issues, since rules must be processed one by one until a match is found, and of course the more rules there are, the longer the computation takes.
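A toy example of the ordering problem just described, with made-up keywords and categories: a naive substring rule for “tax” also fires on “taxi”, so the more specific rule has to be placed first, and every lookup scans the rules one by one.

```python
# Toy illustration of the rule-ordering pitfall. Keywords and categories
# are illustrative, not an actual rule set.
RULES = [
    ("taxi", "transport"),   # the more specific rule must come first ...
    ("tax", "taxes"),        # ... otherwise "TAXI 42 LONDON" would match "tax"
]

def rule_based_category(description: str) -> str | None:
    text = description.lower()
    for keyword, category in RULES:   # rules are scanned one by one
        if keyword in text:
            return category
    return None                       # no rule matched

print(rule_based_category("TAXI 42 LONDON"))    # transport
print(rule_based_category("council tax bill"))  # taxes
```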
A machine learning model can differentiate between ambiguous cases by using the other elements of the transaction, such as the description and the amount, so better classification results can be achieved even when the available structured data is limited. In addition, artificial intelligence allows the solution to be automated and scaled, with continuous learning over hundreds of millions of transactions, which would be impossible with human-defined rules alone.
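As a hedged sketch of how description and amount can be combined in a single model (a generic scikit-learn pipeline, not CRIF's actual engine):

```python
# Sketch of a transaction classifier that combines the free-text description
# with the amount, using scikit-learn. Column names are assumptions.
from sklearn.compose import ColumnTransformer
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

features = ColumnTransformer([
    # Character n-grams are robust to the abbreviations typical of bank descriptions.
    ("text", TfidfVectorizer(analyzer="char_wb", ngram_range=(2, 4)), "description"),
    ("amount", StandardScaler(), ["amount"]),
])

model = Pipeline([
    ("features", features),
    ("classifier", LogisticRegression(max_iter=1000)),
])

# model.fit(train_df[["description", "amount"]], train_df["category"])
# predictions = model.predict(new_df[["description", "amount"]])
```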
Finally, CRIF’s experience over the past few years suggests that the most effective approach is a hybrid one: rules work best when rich metadata is available and can be used to uniquely associate a category with a specific value of a variable, while machine learning excels when only sparse, unstructured information is available.
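A minimal sketch of that hybrid idea, assuming a merchant category code as the decisive piece of metadata and the ML pipeline sketched above as the fallback; both the field name and the rule source are hypothetical.

```python
# Hybrid dispatch: deterministic rule when the metadata is decisive,
# ML fallback otherwise. Field names and rule source are illustrative.
import pandas as pd

def hybrid_category(transaction: dict, mcc_rules: dict, ml_model) -> str:
    """mcc_rules: merchant category code -> category (assumed rule source)."""
    mcc = transaction.get("merchant_category_code")
    if mcc in mcc_rules:                 # the rule wins when metadata is decisive
        return mcc_rules[mcc]
    # Otherwise fall back to the ML model (e.g. the pipeline sketched above).
    row = pd.DataFrame([{"description": transaction["description"],
                         "amount": transaction["amount"]}])
    return ml_model.predict(row)[0]
```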
The CRIF Categorisation Engine is made up of two separate components: