Case Study
Synthetic data for advanced analytics and testing with a leading international bank
About the client
Our customer is a multinational banking and financial services corporation. Its primary focus areas are retail banking, commercial banking, investment banking, wholesale banking, private banking, asset management, and insurance services. It has more than 50 million clients in more than 30 countries. The bank ranks in the top 100 of the World’s 1000 Largest Banks.
The situation
Navigating the data landscape within the banking sector poses challenges. The bank faced difficulties in accessing and using data because storage was fragmented across disparate databases and subject to compliance regulations. Furthermore, anonymizing data to protect privacy often caused machine learning models to underperform due to the loss of valuable contextual information.
The bank’s commitment to stringent data privacy measures further complicated seamless data sharing and collaboration. These obstacles interfered with leveraging data-driven insights for decision-making, innovation, and efficient fraud detection, and with realizing its ambition toward open banking.
The solution
Deploying Syntho’s AI-synthetic data generation platform within the bank offers a transformative approach to address complex data challenges. By generating privacy-compliant realistic datasets, synthetic data empowers accurate machine learning model training, elevating fraud detection and risk assessment capabilities. This approach not only speeds up development cycles and enhances model performance, but also allows secure data collaboration among institutions.
The benefits
Upsampling minority groups
Synthetic data offers a powerful way to upsample minority groups within datasets, thereby providing more balanced and representative input for machine learning models. This approach is used, for example, in fraud detection and anti-money laundering, where data is often scarce.
Privacy-by-design
By using synthetic data, banks can adhere to strict data privacy regulations while still achieving accurate results and innovative advancements. By ensuring that sensitive customer information remains protected, this bank is now able to realize data-driven innovation in a privacy-preserving manner. Innovative models are used, for example, in predicting defaults, marketing optimization, and KYC.
KYC: combating fraud, anti-money laundering and anti-terrorist financing
Synthetic data sharing emerges as a strategic advantage in the fight against financial crime within the banking sector, enabling collaborative efforts without compromising sensitive real-world data. It facilitates secure data sharing and analysis among financial institutions, regulatory bodies, and law enforcement. In addition, upsampling the often scarce financial crime data (e.g. limited fraud data) allowed the bank to experiment with AI-based upsampling techniques and compare them with traditional techniques such as interpolation and SMOTE.
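To make the traditional baseline concrete, the following is a minimal sketch of SMOTE-style upsampling: each new minority row is interpolated between an existing minority row and one of its nearest minority neighbours. It is not the Syntho platform's method or the bank's implementation; the function name `smote_like_upsample`, its parameters, and the toy `fraud_rows` values are hypothetical and chosen only for illustration.

```python
import random


def smote_like_upsample(minority, n_new, k=3, seed=0):
    """Create n_new synthetic rows by interpolating between a minority row
    and one of its k nearest minority neighbours (the core idea of SMOTE)."""
    if len(minority) < 2:
        raise ValueError("need at least two minority rows to interpolate")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority rows by squared Euclidean distance to base.
        others = [j for j in range(len(minority)) if j != i]
        others.sort(
            key=lambda j: sum((a - b) ** 2 for a, b in zip(base, minority[j]))
        )
        neighbour = minority[rng.choice(others[:k])]
        t = rng.random()  # interpolation weight in [0, 1)
        synthetic.append([a + t * (b - a) for a, b in zip(base, neighbour)])
    return synthetic


# Hypothetical toy data: [amount, hour_of_day] for a handful of fraud rows.
fraud_rows = [[120.0, 2.0], [135.0, 3.0], [980.0, 23.0], [1010.0, 22.0]]
new_rows = smote_like_upsample(fraud_rows, n_new=6)
```

Because every synthetic row lies on a line segment between two real minority rows, this kind of interpolation cannot produce values outside the range already observed, which is one reason to compare it against generative, AI-based upsampling.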
Keeping data value and quality
Legacy anonymization often destroys much of the data’s value and requires domain knowledge to apply. Synthetic data not only mimics the real data, but also keeps the original format, even for complex data structures such as time-series data (in transactions) and location data.
Organization: Leading International Dutch Bank
Location: The Netherlands
Industry: Finance
Size: 60,000+ employees
Use case: Analytics
Target data: Financial transaction data
Website: On request