EpiMap: Predicting Epigenetic Targets with Triage Queries and KG Embedding Models

Epigenetics is of primary interest in the study of disease mechanisms in oncology. Because epigenetic modifications are reversible, they are attractive candidates for the development of next-generation cancer therapies. With the field generating a rapidly growing volume of literature (more than 12,000 new publications annually), researchers face a critical need for automated, scalable means of extracting knowledge from text, integrating it with their internal data, and analysing it for insights. The EpiMap project constructed a comprehensive Knowledge Graph (KG) of literature-mined epigenetic relationships across diseases. In collaboration with AstraZeneca, the KG was applied to practical use-cases. It was shown to support the identification of key context-specific epigenetic regulations associated with disease segments in oncology, and the prediction of potential novel disease regulatory connections. Two tools made this possible: i) querying the EpiMap KG for multi-hop paths, which surfaced patterns that help researchers build explainable hypotheses (for example, drug resistance mechanism paths) that can then be tested in the lab; and ii) using machine learning (ML) to learn latent signals in the existing graph and predict new links pointing to potentially novel therapeutic targets. In this talk we i) present the EpiMap KG and discuss the opportunities provided by such literature-mined KGs; ii) present use-cases that illustrate how the KG supported scientific discovery; and iii) discuss the techniques and challenges of building KG embedding (KGE) models and performing link prediction to identify novel therapeutic targets.
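
To make the idea of a multi-hop path query concrete, the following is a minimal sketch in plain Python, not the EpiMap implementation. The triples, entity names (DrugD, EnzymeE, GeneG and so on), relation labels, and the helper functions `build_index` and `find_paths` are all hypothetical placeholders invented for illustration; none of them come from the EpiMap KG.

```python
from collections import defaultdict

# Toy (head, relation, tail) triples. Placeholder names only, NOT EpiMap data.
TRIPLES = [
    ("DrugD", "inhibits", "EnzymeE"),
    ("EnzymeE", "methylates", "GeneG"),
    ("GeneG", "associated_with", "DiseaseX"),
    ("DrugD", "targets", "GeneH"),
    ("GeneH", "associated_with", "DiseaseY"),
]


def build_index(triples):
    """Map each head entity to its outgoing (relation, tail) edges."""
    index = defaultdict(list)
    for head, rel, tail in triples:
        index[head].append((rel, tail))
    return index


def find_paths(index, start, goal, max_hops=3):
    """Enumerate directed paths from start to goal using at most max_hops edges.

    Each path is a flat list: [entity, relation, entity, relation, entity, ...].
    """
    results = []
    stack = [(start, [start])]
    while stack:
        node, path = stack.pop()
        if node == goal and len(path) > 1:
            results.append(path)
            continue
        hops = len(path) // 2  # number of edges already on this path
        if hops >= max_hops:
            continue
        for rel, nxt in index.get(node, []):
            if nxt in path[::2]:  # skip entities already visited (no cycles)
                continue
            stack.append((nxt, path + [rel, nxt]))
    return results


if __name__ == "__main__":
    idx = build_index(TRIPLES)
    for p in find_paths(idx, "DrugD", "DiseaseX"):
        print(" -> ".join(p))
```

Each returned path keeps its relation labels, which is what makes a path-based hypothesis readable and explainable. A production KG would more likely be queried through a graph database, but the traversal logic is the same in spirit.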


Presentation Slides

Payal Mitra

Senior Data Scientist

Elsevier