Semantic Chemical Data Management – From FAIR to IDMP Compliance
Applying the FAIR principles [1,2] to every data asset generated and used in the pharmaceutical drug development pipeline, so that it is findable, accessible, interoperable, and reusable beyond the context for which it was created, is now a major driver for rethinking data management and data usage, moving from traditional data silos to data fabrics. While this transformation is broadly embraced, it has proven difficult to apply these principles consistently to the data entities of foremost interest to the pharmaceutical industry: the chemical and biological entities that later become a medication’s active pharmaceutical ingredient [3].
Chemantics presents a semantic chemistry infrastructure, offered as Software as a Service and Platform as a Service, that ensures conformity across all ingested chemical and biological entities. This conformity is achieved through the consistent application of rigorous business rules for analyzing and representing these entities, and it overcomes known data challenges such as inconsistent chemical drawing conventions and differing business rules across data sources. The harmonized, semantic chemistry data is the foundation for opening up data analytics across data sets, regardless of the original quality and origin of the data.
This has wide implications for collaborations in private/public and private/private partnerships. Participants can open their data to machine learning, artificial intelligence, and graph analytics while obfuscating the data as little or as much as necessary, whether the goal is to support early research activities or to augment the existing Identification of Medicinal Products (IDMP) framework as defined in ISO 11615:2012 [4].
Because the Chemantics infrastructure is data-driven and data-centric, it is application-agnostic and can be integrated into existing and emerging systems non-intrusively. The system thereby enables chemistry-centric data generation that applies FAIR principles, alongside a structure-based cheminformatics analytics approach, from early idea generation (i.e., substance registration) through IDMP-compliant submissions of medications to the regulatory authorities.
References
[1] FAIR data. Wikipedia.
[2] The FAIR Guiding Principles for scientific data management and stewardship. PMC (nih.gov).
[3] Chemical Data in Life Sciences R&D and the FAIR Principles. Zenodo.
[4] ISO 11615:2012. Health informatics — Identification of medicinal products — Data elements and structures for the unique identification and exchange of regulated medicinal product information. ISO.
