Stockholm University
April 2019
Daniel Morgan
Why Networks?
Influential/Physical Binding
Intervention: Therapeutic Targets
Measure
Directed Perturbation
Hecker 2009
More
Smet 2010
Protein-Protein Interaction
Metabolic pathways
Signaling pathways
Transcription factors / Gene regulatory networks
Pathway diagrams
Protein-compound interactions
Protein sequence focused
Genetic interaction networks
Guthke 2015
>35 "top" methods
Hypothesis
Simulation
Learn network parameters
Experiment
Analysis
Infer Network
Knowledge & Hypothesis
inspired by Bellot 2015
YES or NO
conditional probabilities per link > DAGs
I (A; E)
I (B; D | A,E)
I (C; A,D,E | B)
I (D; B,C,E | A)
I (E; A, D)
Friedman 2000
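The independence statements above match a DAG in which each node is conditioned only on its parents. As a worked form, assuming the five-variable example of Friedman 2000 (A and E as parents of B, B as parent of C, A as parent of D; the parent sets are read off the statements above, not stated on the slide), the joint distribution factorizes as:

```latex
P(A,B,C,D,E) = P(A)\,P(B \mid A,E)\,P(C \mid B)\,P(D \mid A)\,P(E)
```

Each factor is one conditional probability table, which is why the number of parameters grows with the number of parents per node rather than with the full joint.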
coupled ODEs relate the [mRNA] of each gene to that of all other genes
Penfold 2011
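A minimal sketch of the coupled-ODE idea, assuming a linear model dx/dt = A x + p. The matrix `A`, input `p`, step size, and function name are illustrative choices, not taken from Penfold 2011:

```python
def simulate_linear_grn(A, p, x0, dt=0.01, steps=5000):
    """Forward-Euler integration of dx/dt = A x + p (hypothetical toy model)."""
    x = list(x0)
    n = len(x)
    for _ in range(steps):
        # Rate of change of each gene depends on the levels of all genes.
        dx = [sum(A[i][j] * x[j] for j in range(n)) + p[i] for i in range(n)]
        x = [x[i] + dt * dx[i] for i in range(n)]
    return x


# Two genes: each degrades itself (-1 on the diagonal); gene 0 activates gene 1.
A = [[-1.0, 0.0],
     [0.5, -1.0]]
p = [1.0, 0.0]  # constant input to gene 0 only
steady = simulate_linear_grn(A, p, x0=[0.0, 0.0])
```

At steady state the levels satisfy A x = -p, so gene 1 settles at half the level of gene 0 in this toy network.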
(x) hidden layers connecting input and output
supervised
unsupervised
identify direct interactions
capture information flow to understand the control system
*(not necessarily measured or direct)
Two General, Distinct Aims
Gardner 2005
Gardner 2005
GRN
LASSO (Glmnet), (T)LSCO, RNI, ARACNe, Genie3, CLR
Perturbation (si/shRNA)
SNR, IAA, Rank
MCC, AUROC, wRSS
inferred network
inference methods
experimental/ data collection
data properties
scoring measures
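As an illustration of one scoring measure, the sketch below computes the Matthews correlation coefficient (MCC) between a true and an inferred network. It assumes both are given as square adjacency matrices where any nonzero entry counts as a link; the function names and example matrices are hypothetical:

```python
import math


def confusion_counts(true_adj, inferred_adj):
    """Count link/non-link agreement, treating any nonzero entry as a link."""
    tp = fp = tn = fn = 0
    for true_row, inferred_row in zip(true_adj, inferred_adj):
        for t, e in zip(true_row, inferred_row):
            t_link, e_link = t != 0, e != 0
            if t_link and e_link:
                tp += 1
            elif e_link:
                fp += 1
            elif t_link:
                fn += 1
            else:
                tn += 1
    return tp, fp, tn, fn


def mcc(true_adj, inferred_adj):
    """Matthews correlation coefficient; returns 0.0 when undefined."""
    tp, fp, tn, fn = confusion_counts(true_adj, inferred_adj)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


true_net = [[-1.0, 0.0, 0.0],
            [0.5, -1.0, 0.0],
            [0.0, 0.7, -1.0]]
inferred_net = [[-0.9, 0.0, 0.2],
                [0.4, -1.1, 0.0],
                [0.0, 0.0, -0.8]]
score = mcc(true_net, inferred_net)
```

Here the inferred network has four true positives, one false positive, one false negative, and three true negatives. MCC uses all four counts, which makes it less sensitive to sparsity than plain accuracy.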
Generation and Simulation Package for Informative Data ExploRation
via some 200 networks & 600 expression sets
consisting of 4 different topologies
with varied SNR,
IAA degrees,
and sizes
Robust Network Inference (RNI): decouples the model selection problem from parameter estimation; very harsh, but among the best methods when noise is low
focuses on mutual information link by link rather than on the entire system as a whole; also disregards self-regulating elements
Least Absolute Shrinkage & Selection Operator (LASSO): minimizes RSS plus a penalty on the absolute value of each coefficient rather than its square, so coefficients can shrink to exactly zero (the harshest penalty)
Fit cases to a regression line, minimizing the difference along both the X and Y axes
with self loops ∴ null ARACNe
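To make the "zeros possible" property of the LASSO concrete, here is a small coordinate-descent sketch in pure Python. It assumes the objective (1/2n)·RSS + lam·Σ|β|. This is an illustration of the penalty, not the Glmnet implementation, and the data are made up:

```python
def soft_threshold(value, lam):
    """Shrink value toward zero by lam; values within [-lam, lam] become 0."""
    if value > lam:
        return value - lam
    if value < -lam:
        return value + lam
    return 0.0


def lasso_coordinate_descent(X, y, lam, n_iter=200):
    """Minimize (1/2n) * RSS + lam * sum(|beta_j|) one coefficient at a time."""
    n, m = len(X), len(X[0])
    beta = [0.0] * m
    for _ in range(n_iter):
        for j in range(m):
            rho = 0.0  # correlation of feature j with the partial residual
            z = 0.0    # squared norm of feature j
            for i in range(n):
                pred_without_j = sum(X[i][k] * beta[k] for k in range(m) if k != j)
                rho += X[i][j] * (y[i] - pred_without_j)
                z += X[i][j] ** 2
            if z == 0.0:
                continue  # an all-zero feature carries no information
            beta[j] = soft_threshold(rho / n, lam) / (z / n)
    return beta


# Toy data: y depends strongly on feature 0 and only weakly on feature 1.
X = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0], [0.0, 1.0]]
y = [2.0, 0.1, 2.0, 0.1]
beta_lasso = lasso_coordinate_descent(X, y, lam=0.1)
beta_ols = lasso_coordinate_descent(X, y, lam=0.0)
```

With `lam=0.1` the weak coefficient is set exactly to zero while the strong one is shrunk. With `lam=0.0` the same routine returns the ordinary least-squares fit.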
A Generalized Framework for Controlling FDR in Gene Regulatory Network Inference
???
LSCO / TLSCO
Glmnet / LASSO
RNI
Genie3,
ARACNe, CLR
5%
https://gitlab.com/Xparx/scikit-grni.git
MATLAB -> Python
utilize defined perturbation design to map expression to network topologies
uses a tree ensemble approach drawing on relationships between input gene expression patterns to predict those of target genes, building trees based on bootstrapped samples to return one inferred network with ranked link strengths
applies normal distribution statistics to mutual information scores in order to identify network links
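The z-score step described above can be sketched as follows, assuming a precomputed symmetric mutual-information matrix. Excluding the diagonal and using the population standard deviation are choices made for this example, and the matrix values are made up:

```python
import math
import statistics


def clr_scores(mi):
    """Score each link by combining the z-scores of its MI value
    relative to the MI background of both genes involved."""
    n = len(mi)
    means, stds = [], []
    for i in range(n):
        background = [mi[i][j] for j in range(n) if j != i]
        means.append(statistics.mean(background))
        stds.append(statistics.pstdev(background))
    scores = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            # Negative z-scores are clipped to zero; a flat background gives zero.
            z_i = max(0.0, (mi[i][j] - means[i]) / stds[i]) if stds[i] > 0 else 0.0
            z_j = max(0.0, (mi[i][j] - means[j]) / stds[j]) if stds[j] > 0 else 0.0
            scores[i][j] = math.sqrt(z_i ** 2 + z_j ** 2)
    return scores


mi = [[0.0, 0.9, 0.1, 0.1],
      [0.9, 0.0, 0.1, 0.2],
      [0.1, 0.1, 0.0, 0.1],
      [0.1, 0.2, 0.1, 0.0]]
scores = clr_scores(mi)
```

The pair (0, 1) stands out against the background of both genes and receives a positive score, while pair (0, 2) sits below gene 0's mean and scores zero.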
Perturbation-based gene regulatory network inference to reliably predict oncogenic mechanisms
qPCR of 40 genes, singly & doubly knocked down via siRNA
= gene fold change & variance of expression measures
Y: expression data
A: network
P: perturbation matrix
E: input noise estimate
F: output noise estimate
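One way to read the symbols above is sketched below, under the assumption of a noise-free linear steady-state relation A·Y = −P. The slide does not state the equation, so this form is an assumption, and the noise terms E and F are left out. With a square, invertible Y, a least-squares estimate with a cut-off reduces to A = −P·Y⁻¹ followed by thresholding. All helper names and matrices are hypothetical:

```python
def matmul(A, B):
    """Plain matrix product of two list-of-lists matrices."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def invert(M):
    """Gauss-Jordan inverse with partial pivoting."""
    n = len(M)
    aug = [[float(v) for v in row] + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(M)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pivot_value = aug[col][col]
        aug[col] = [v / pivot_value for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def infer_network(Y, P, cutoff=0.0):
    """Estimate A from A·Y = -P (assumed model), zeroing entries below cutoff."""
    A_hat = [[-v for v in row] for row in matmul(P, invert(Y))]
    return [[v if abs(v) > cutoff else 0.0 for v in row] for row in A_hat]


# Toy 3-gene network and one single-gene knockdown per experiment (illustrative).
A_true = [[-1.0, 0.0, 0.0],
          [0.5, -1.0, 0.0],
          [0.0, 0.7, -1.0]]
P = [[-1.0, 0.0, 0.0],
     [0.0, -1.0, 0.0],
     [0.0, 0.0, -1.0]]
Y = [[-v for v in row] for row in matmul(invert(A_true), P)]  # simulated response
A_hat = infer_network(Y, P, cutoff=1e-6)
```

Without noise the true network is recovered exactly. With real data the cut-off decides which weak entries are treated as absent links.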
JQ1 inhibits BRD4
known oncogene: regulates transcription of various targets
encodes G2/M transition regulatory protein
targets chromatin during mitosis
expansion/proliferation proceed via activation of MYC, not just through MYC-mediated protein expression
https://dcolin.shinyapps.io/NestBoot-Viz/
comparison to random networks
ie Making sense of Landmark 978 (L1000)
Subramanian et al. 2017
HEATMAP=OK
PCA=OK
Degree=OK ...ish
Normalizing Melanoma Data (columns: Data, Network)
SNR=.0076
SNR=.005
SNR=.008
SNR=.009
SNR=.01
SNR=.05
SNR=.02
Subset Selection for Noisy Data
Inference Accuracy on Less-Noisy Subsets
Melanoma Network Degree
Current Lab Members:
Team Members:
Past Lab Members:
Erik Sonnhammer
Next Step
Data
ie Making sense of Landmark 978 (L1000)
Subramanian et al. 2017
Overlap
Comparison
subset selection method
4 RUV methods' performance quantified by 7 endpoint measures compared to 4 standard normalization methods
(1) MAD - mean absolute deviation from zero (for reference)
heatmap patterns
(2) SlopeVerti & (3) SlopeHoriz
knockdown controls
(4) AdistKS - Kolmogorov-Smirnov distance between 2 subsets
(5) Q3P - third quartile of p-values differentiating targeted knockdowns from zero
p-values
(6) UnifKS - Kolmogorov-Smirnov distance between P>0.001 subsets
(7) Lambda - inflation of median p-value
aim: high AdistKS; low Lambda, UnifKS, SlopeHoriz, SlopeVerti & MAD
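Two of the endpoint measures above are Kolmogorov-Smirnov distances. A minimal sketch of the plain two-sample KS statistic is given below. How the subsets are chosen for AdistKS and UnifKS is not stated here, so the inputs are placeholders:

```python
import bisect


def ks_distance(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic: the largest gap between
    the empirical cumulative distribution functions of the two samples."""
    a, b = sorted(sample_a), sorted(sample_b)
    largest_gap = 0.0
    for value in sorted(set(a) | set(b)):
        cdf_a = bisect.bisect_right(a, value) / len(a)
        cdf_b = bisect.bisect_right(b, value) / len(b)
        largest_gap = max(largest_gap, abs(cdf_a - cdf_b))
    return largest_gap


identical = ks_distance([1, 2, 3], [1, 2, 3])
disjoint = ks_distance([1, 2, 3], [4, 5, 6])
overlapping = ks_distance([1, 2, 3, 4], [3, 4, 5, 6])
```

A distance near 1 means the two subsets are well separated and a distance near 0 means they are indistinguishable, which is why a high AdistKS and a low UnifKS are both desirable.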
calculating fold change, increasing SNR
Platform