At Erasmus MC, one of the leading hospitals, it is possible to request synthetic data generated by the Syntho Engine. The Smart Health Tech Center (SHTC) at Erasmus MC organised the official kick-off last Thursday, 30 March, at which Robert Veen (Research Suite) and Wim Kees Janssen (Syntho) answered three questions: ‘What is synthetic data?’, ‘Why do we do this?’ and ‘How does this work within Erasmus MC?’.
What is AI Generated Synthetic Data?
Real data is collected by obtaining information about real patients, employees and internal business processes. Synthetic data, on the other hand, is generated by an algorithm that creates completely new and fictitious data points, so the records no longer correspond to real individuals.
An important difference is the use of artificial intelligence to mimic and reproduce characteristics, patterns and properties of the real data in the synthetic data.
The result: AI Generated Synthetic Data that is as accurate as the real data. Consequently, it can even be used for analytics as if it were real data.
That is why Syntho calls it a “Synthetic Data Twin”: the data is as-good-as-real, but can be used without the privacy challenges.
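To make the idea concrete, here is a minimal toy sketch in Python. It is not the Syntho Engine, whose internals are not described in this article. It assumes a deliberately simple stand-in model (a bivariate normal distribution) instead of generative AI, and the "real" data, column meanings and function names (`fit_two_columns`, `sample_synthetic`) are all made up for illustration. The sketch learns the means, spreads and correlation of two columns and then draws entirely new records that follow the same pattern.

```python
import math
import random


def fit_two_columns(rows):
    """Estimate means, standard deviations and correlation of two numeric columns."""
    n = len(rows)
    mean_x = sum(x for x, _ in rows) / n
    mean_y = sum(y for _, y in rows) / n
    var_x = sum((x - mean_x) ** 2 for x, _ in rows) / (n - 1)
    var_y = sum((y - mean_y) ** 2 for _, y in rows) / (n - 1)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in rows) / (n - 1)
    sd_x, sd_y = math.sqrt(var_x), math.sqrt(var_y)
    return mean_x, mean_y, sd_x, sd_y, cov / (sd_x * sd_y)


def sample_synthetic(params, n, rng):
    """Draw n new (x, y) records from a bivariate normal with the fitted parameters."""
    mean_x, mean_y, sd_x, sd_y, rho = params
    residual = math.sqrt(max(0.0, 1.0 - rho * rho))
    records = []
    for _ in range(n):
        z1, z2 = rng.gauss(0, 1), rng.gauss(0, 1)
        x = mean_x + sd_x * z1
        # Mixing z1 into y reproduces the fitted correlation between the columns.
        y = mean_y + sd_y * (rho * z1 + residual * z2)
        records.append((x, y))
    return records


rng = random.Random(42)

# Made-up stand-in for "real" data: (age, systolic blood pressure) pairs.
real = []
for _ in range(2000):
    age = rng.gauss(55, 12)
    real.append((age, 100 + 0.5 * age + rng.gauss(0, 8)))

params = fit_two_columns(real)
synthetic = sample_synthetic(params, 2000, rng)

# Compare the correlation between the two columns in both datasets.
print("real correlation:     ", round(params[4], 2))
print("synthetic correlation:", round(fit_two_columns(synthetic)[4], 2))
```

The two printed correlations should be close, which is the sense in which the pattern of the real data carries over. A model this simple only captures linear relationships between numeric columns and offers no formal privacy guarantee by itself; it only illustrates the principle of learning patterns and sampling new records.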
Why do we do this?
Unlock data and reduce the “Time-to-Data”
By using synthetic data instead of real data, we as an organization can reduce the number of risk assessments and the time-consuming processes associated with them. This allows us to unlock additional datasets. We can also handle data access requests faster and so reduce the “time-to-data”. With this, Erasmus MC is building a strong foundation to accelerate data-driven innovation.
Representative data for testing purposes
Testing and development with representative test data is essential to deliver state-of-the-art tech solutions. A synthetic data twin based on the production data results in data that can be used as test data. The result: production-like data with privacy by design, in a solution that is easy, fast and scalable. In addition, by making smart use of generative AI in the creation of synthetic data, it is also possible to enlarge and simulate datasets. This can be a solution, for example, when there is insufficient data (data scarcity) or when you want to up-sample edge cases.
Analytics with AI Generated Synthetic Data
AI is applied to model the synthetic data so that the statistical patterns, relationships and characteristics of the real data are preserved, to the point that the synthetic data can even be used for analyses. Especially in the development phase of models, we will prefer the use of synthetic data and always challenge users of data: “why use real data when you can also use synthetic data?”
How does this work at Erasmus MC?
Do you want to use a synthetic dataset? Or do you want to receive more information about the possibilities? Please contact the Research Suite of Erasmus MC.
Interested in AI Generated Synthetic Data and want to dive deeper into the possibilities? Contact our experts or request a demo.