The project, co-funded by the European Commission, has already been joined by 100 organizations from all over Europe, making it one of the leading projects of its kind on a global scale. Since January 2021, over 70,000 data sets have been provided; these are then analysed by AI algorithms on the so-called HARMONY Big Data Platform.
Researchers focus on finding answers to four questions: How can we diagnose patients faster and more precisely? What are the best practices in clinical management that could help doctors make decisions related to treatment? How can we fulfil the neglected needs of patients suffering from blood cancers? How can we accelerate progress in the development of new drugs?
The answers could be hidden in the data. However, the task is challenging: data uploaded to the Platform comes from different sources and is stored in various formats. First, it has to be harmonized to make it comparable and analysable.
The structure and processes on the HARMONY Big Data Platform.[/caption]
This is a time-consuming and technically demanding process, so researchers from the HARMONY Alliance used some of the most advanced techniques developed in recent years. One of them is the Observational Medical Outcomes Partnership (OMOP) common data model, which relies on international standard medical terminologies, such as SNOMED and LOINC, and so makes it possible to unify the information semantically. Another tool is Apache Spark, an analytics engine used to process data on a large scale.
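To make the idea of harmonization more concrete, here is a minimal sketch in plain Python of what mapping records from two sources onto one shared vocabulary and one unit might look like. It is not the HARMONY Alliance's actual pipeline: the hospital names, local codes, concept label, and sample values are all made up for illustration, and a real OMOP mapping would point to SNOMED or LOINC concepts rather than a placeholder label.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical mapping from (source, local code) to a shared concept label.
# In a real OMOP setup the target would be a standard SNOMED or LOINC concept.
CONCEPT_MAP = {
    ("hospital_a", "HB"): "hemoglobin",
    ("hospital_b", "Hgb_blood"): "hemoglobin",
}

# Factors that convert a source unit into the common target unit, g/dL.
UNIT_TO_G_PER_DL = {"g/dL": 1.0, "g/L": 0.1}


@dataclass
class Measurement:
    patient_id: str
    concept: str
    value: float
    unit: str


def harmonize(source: str, record: dict) -> Optional[Measurement]:
    """Map one raw record to the shared vocabulary and unit, or return None."""
    concept = CONCEPT_MAP.get((source, record["code"]))
    factor = UNIT_TO_G_PER_DL.get(record["unit"])
    if concept is None or factor is None:
        # Unknown code or unit: leave the record out rather than guess.
        return None
    value = round(record["value"] * factor, 2)
    return Measurement(record["patient_id"], concept, value, "g/dL")


# Made-up sample records from two sources with different codes and units.
rows = [
    ("hospital_a", {"patient_id": "p1", "code": "HB", "value": 13.5, "unit": "g/dL"}),
    ("hospital_b", {"patient_id": "p2", "code": "Hgb_blood", "value": 128.0, "unit": "g/L"}),
    ("hospital_b", {"patient_id": "p3", "code": "XYZ", "value": 1.0, "unit": "g/L"}),
]

harmonized = [m for m in (harmonize(s, r) for s, r in rows) if m is not None]
for m in harmonized:
    print(m)
```

After this step the two hemoglobin readings share one concept label and one unit, so they can be compared directly, while the record with an unrecognized code is set aside for review. On a large scale the same per-record logic could be applied in parallel by an engine such as Spark.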
One of the Alliance’s priorities was also to develop data security standards and make sure that procedures comply with the GDPR. Moreover, the HARMONY Alliance has created an internal code of ethics that governs responsibility and transparency in research work.
A new approach towards medical research
Researchers took a closer look at seven types of blood cancer, which are among the most complex cancers. To better understand the characteristics of the disease, they analyse data from electronic medical records and clinical trials, results from laboratory and imaging tests, and demographic data. Naturally, all of this data is anonymous. This heterogeneous information, which is not directly comparable even if it may appear so at first sight, needs to be harmonized first. In data laboratories, data engineers analyse thousands of individual pieces of data, assess their quality and usefulness, and then standardize them. As a result, all inputs are turned into a coherent database. Only then can the AI algorithms do their job. [caption id="attachment_27839" align="aligncenter" width="640"]