MultiGML | multimodal graph machine learning for drug target prioritization | scapos AG

MultiGML combines relevant knowledge and experimental data (e.g., gene expression data, microscopy images, protein sequences) in a comprehensive, structured, and unbiased manner.

Initial situation and context

Finding the right drug target
Identifying a good target is key for the effectiveness and safety of a drug candidate in pharmaceutical research. Traditionally, the identification of targets is based on human knowledge and understanding of fundamental disease mechanisms. However, given the growing flood of scientific literature, this manual approach is prone to overlooking relevant data and information, which may lead to sub-optimal choices.

Figure 1: MultiGML Knowledge graph generation

Adverse drug events as a risk for clinical trials

An adverse drug event (ADE) is an injury resulting from the use of a drug. This includes harm caused by the drug itself (adverse drug reactions and overdoses) and harm arising from how the drug is used (including dose reductions and discontinuations of drug therapy). The occurrence of an ADE can be at least partially associated with the choice of the primary target protein or with properties of the drug's chemical structure. Experimental approaches to address potential ADEs (e.g., liver toxicity) based on animal and tissue models are well established. Yet results obtained in such model systems may not always reflect the situation in humans, and there are ethical concerns regarding the use of animal models. Furthermore, reliable model systems do not exist for all ADEs and indication areas.

Limitations of human genetics

One possible way to address the above-mentioned concerns is to check whether genetic variants in a candidate drug target have been associated with an unfavorable phenotype. This approach has been reported to be highly effective in cases where such an association could be identified. However, limited statistical power means there is a high risk of missing relevant associations.

Category: AI Technologies

Developed by: Fraunhofer SCAI

Your contact person

I will be happy to provide you with information about our software products.

Bettina Keller, Product sales

+49-2241-14-4407

Learning from comprehensive knowledge and experimental data in an unbiased manner

Our solution combines relevant knowledge and experimental data (e.g., gene expression data, microscopy images, protein sequences) in a comprehensive, structured, and unbiased manner. Technically, this is realized via a semantically harmonized knowledge graph, which we compiled from 14 curated biological databases, resulting in around 400,000 relations between proteins, drugs, and phenotypes, including ADEs (Figure 1). Based on this wealth of information, we trained a multimodal Graph Neural Network architecture capable of accurately predicting new associations between compounds and phenotypes or between protein targets and phenotypes. Unlike alternative solutions, MultiGML can also deliver explanations for its model predictions (Figure 2).
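To make the idea of a knowledge graph with typed relations between drugs, proteins, and phenotypes more concrete, here is a minimal sketch in Python. It is not the MultiGML data model: the triple layout, the relation names, and all identifiers (drug:D1, protein:P1, phenotype:ADE1, and so on) are placeholders chosen purely for illustration.

```python
from collections import Counter, defaultdict

# Toy triples of the form (head, relation, tail).
# All node identifiers and relation names are hypothetical placeholders,
# not entries from the databases behind MultiGML.
TRIPLES = [
    ("drug:D1", "targets", "protein:P1"),
    ("drug:D1", "causes", "phenotype:ADE1"),
    ("drug:D2", "targets", "protein:P1"),
    ("protein:P1", "associated_with", "phenotype:ADE1"),
    ("protein:P1", "interacts_with", "protein:P2"),
]


def node_type(node):
    """Return the type prefix of a node identifier such as 'drug:D1'."""
    return node.split(":", 1)[0]


def relation_counts(triples):
    """Count how many triples exist per relation type."""
    return Counter(relation for _, relation, _ in triples)


def typed_adjacency(triples):
    """Map each head node to its outgoing (relation, tail) pairs."""
    adjacency = defaultdict(list)
    for head, relation, tail in triples:
        adjacency[head].append((relation, tail))
    return adjacency


counts = relation_counts(TRIPLES)
adjacency = typed_adjacency(TRIPLES)
node_types = sorted({node_type(n) for h, _, t in TRIPLES for n in (h, t)})

print(counts)
print(node_types)
```

In such a representation, predicting a new association amounts to scoring a triple that is not yet in the list, for example a hypothetical ("drug:D2", "causes", "phenotype:ADE1").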

We offer MultiGML in two variants:

MultiGML_Model: Our fully trained MultiGML model, available in both RGCN and RGAT variants (PyTorch version 1.9.1) and ready for use, with instructions for installing the environment and applying the pre-trained model.
MultiGML_Code: The code for running MultiGML with a custom knowledge graph. It includes the scripts we used in our previous publication to generate the node features of the knowledge graph, as well as scripts to explain predictions post hoc using the Integrated Gradients method. The node feature generation scripts cover the following node-type features:
1. Drugs: molecular fingerprint, gene expression signature, morphological fingerprint
2. Proteins: sequence embedding, gene ontology fingerprint
3. Phenotypes: medical concept embedding

Instructions for the installation of environments and the usage of the command-line interface are included. This will give you full flexibility, including the possibility to enrich your knowledge graph according to your wishes, train your customized MultiGML instance, and subsequently explain your predictions.
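The Integrated Gradients method mentioned above attributes a prediction to individual input features by accumulating gradients along a straight path from a baseline input to the actual input. The following is a minimal sketch of that idea in plain Python, not the MultiGML scripts: the scoring function toy_score, the all-zero baseline, and the use of numerical gradients in place of automatic differentiation are all assumptions made to keep the example self-contained.

```python
def numerical_gradient(f, x, eps=1e-6):
    """Approximate the gradient of f at x with central differences."""
    grad = []
    for i in range(len(x)):
        up = list(x)
        down = list(x)
        up[i] += eps
        down[i] -= eps
        grad.append((f(up) - f(down)) / (2 * eps))
    return grad


def integrated_gradients(f, x, baseline, steps=200):
    """Approximate Integrated Gradients with a midpoint Riemann sum."""
    total = [0.0] * len(x)
    for k in range(steps):
        alpha = (k + 0.5) / steps  # midpoint of the k-th path segment
        point = [b + alpha * (xi - b) for xi, b in zip(x, baseline)]
        grad = numerical_gradient(f, point)
        total = [t + g for t, g in zip(total, grad)]
    # Scale the averaged gradient by the distance from the baseline.
    return [(xi - b) * t / steps for xi, b, t in zip(x, baseline, total)]


def toy_score(features):
    """Hypothetical stand-in for a model's prediction score."""
    a, b, c = features
    return a * b + 2.0 * c ** 2


x = [1.0, 2.0, 0.5]
baseline = [0.0, 0.0, 0.0]  # assumed baseline; other choices are possible
attributions = integrated_gradients(toy_score, x, baseline)

print(attributions)
print(sum(attributions), toy_score(x) - toy_score(baseline))
```

A useful sanity check is the completeness property: the attributions should sum to the difference between the score at the input and the score at the baseline, which is what the last line prints for comparison.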

Figure 2: Example of an explanation of a model prediction provided by MultiGML. MultiGML predicted an association between the compound karamycin and the phenotype paralysis. The graph depicts the first-order neighborhood of the compound and the phenotype of interest and highlights the links that MultiGML considered most relevant to make this prediction. The edges' thickness indicates the attention MultiGML gave to individual relations to make the prediction.
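The kind of view described in Figure 2 can be approximated in a few lines: restrict the graph to edges that touch the compound or the phenotype of interest, then rank those edges by a relevance score. The sketch below assumes such scores are already available; the edge list, node names, and score values are invented for illustration and do not come from MultiGML.

```python
# Each edge is (node_a, node_b, relevance_score).
# Nodes and scores are hypothetical placeholders.
EDGES = [
    ("compound:C", "protein:P1", 0.60),
    ("compound:C", "protein:P2", 0.10),
    ("protein:P1", "phenotype:Y", 0.25),
    ("protein:P3", "protein:P4", 0.90),  # touches neither query node
    ("compound:C2", "phenotype:Y", 0.05),
]


def first_order_edges(edges, query_nodes):
    """Keep only edges with at least one endpoint among the query nodes."""
    return [e for e in edges if e[0] in query_nodes or e[1] in query_nodes]


def top_edges(edges, k):
    """Return the k edges with the highest relevance score."""
    return sorted(edges, key=lambda e: e[2], reverse=True)[:k]


neighborhood = first_order_edges(EDGES, {"compound:C", "phenotype:Y"})
highlighted = top_edges(neighborhood, 2)

for node_a, node_b, score in highlighted:
    print(f"{node_a} -- {node_b}: {score:.2f}")
```

In a plot, the score of each retained edge could be mapped to line thickness, which corresponds to how the figure conveys relevance.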
