An oversight in how linear ODE models infer GRNs has led to scaling issues that introduce mega-hubs, i.e. would-be mega-regulators that have not been observed in nature and are very probably an artifact of the inference method.
02 Perturbation-based gene regulatory network inference to unravel oncogenic mechanisms bioRxiv,
Dataset and Code for use in this ShinyApp
Motivation: Cancer is known to stem from multiple, independent mutations, the effects of which aggregate to drive the cell into a cancerous state. To understand the complex interplay between affected genes, their gene regulatory network (GRN) needs to be uncovered, revealing detailed insights into regulatory mechanisms. We therefore decided to infer a reliable GRN from perturbation responses of 40 genes that are known or suspected to have a role in human cancers but whose regulatory interactions are poorly known.
Results: siRNA knockdown experiments of each gene were done in a human squamous carcinoma cell line, after which the transcriptomic response was measured. From these data GRNs were inferred using several methods, and the false discovery rate was controlled by the NestBoot framework. The best GRN was shown to be significantly more predictive than the null model, both in cross-validated benchmarks and for an independent dataset of the same genes but subjected to double perturbations. It agrees with many known links in addition to predicting a large number of novel interactions, a subset of which were experimentally validated. The inferred GRN captures regulatory interactions central to cancer-relevant processes and thus provides mechanistic insights that are useful for future cancer research.
03 NestBoot: A Generalized Framework for Controlling FDR in Gene Regulatory Network Inference Publication and Code
Motivation: Inference of Gene Regulatory Networks (GRNs) from perturbation data can give detailed mechanistic insights into a biological system. Many GRN inference methods exist, but the topology of their estimates tends to be sensitive to changes in method-specific parameters. Even though the inferred network is optimal given the parameters, it has been shown that many links are wrong or missing if the data is not informative. To make GRN inference reliable, a method is needed to estimate the support of each predicted link as the method parameters are varied.
Results: To achieve this we have developed a method called nested bootstrapping, which applies a bootstrapping protocol to GRN inference, and by repeated bootstrap runs assesses the stability of the estimated support values. To translate bootstrap support values to false discovery rates we run the same pipeline with shuffled data as input. This provides a general method to control the false discovery rate of GRN inference that can be applied to any setting of inference parameters, noise level, or data property. We evaluated nested bootstrapping on a simulated dataset spanning a range of such properties, using the LASSO, Least Squares, and RNI inference methods. An improved inference accuracy was observed in almost all situations. The method is part of the GeneSPIDER package, which was also used for generating the simulated networks and data, as well as running and analyzing the inferences.
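The idea of nested bootstrapping can be illustrated with a minimal Python sketch. This is not the GeneSPIDER implementation (which is a MATLAB package) and does not reproduce its API. Everything here is an assumption made for illustration: the function names, the toy 5-gene network, the thresholded least-squares estimator based on a linear steady-state model Y = -A⁻¹P, the shuffling scheme (permuting each experiment's responses across genes), and the simple ratio used as an FDR estimate.

```python
import numpy as np

def make_toy_data(n_replicates=3, noise_sd=0.05, seed=0):
    """Hypothetical 5-gene network with single-gene knockdowns (illustration only)."""
    rng = np.random.default_rng(seed)
    A = -np.eye(5)                                  # negative self-regulation
    A[0, 1], A[2, 3], A[4, 0] = 0.5, -0.4, 0.6      # three assumed true links
    P = np.tile(-np.eye(5), (1, n_replicates))      # one knockdown per gene, replicated
    Y = -np.linalg.solve(A, P) + noise_sd * rng.normal(size=P.shape)
    return Y, P

def infer_links(Y, P, threshold):
    """Thresholded least-squares estimate of A under Y = -inv(A) @ P."""
    A_hat = -P @ np.linalg.pinv(Y)
    return np.abs(A_hat) > threshold

def bootstrap_support(Y, P, threshold, n_boot, rng):
    """Fraction of bootstrap resamples (over experiments) in which each link appears."""
    n_genes, n_samples = Y.shape
    counts = np.zeros((n_genes, n_genes))
    for _ in range(n_boot):
        idx = rng.integers(0, n_samples, size=n_samples)  # sample with replacement
        counts += infer_links(Y[:, idx], P[:, idx], threshold)
    return counts / n_boot

def nested_bootstrap(Y, P, threshold, n_outer=5, n_inner=20, seed=0):
    """Repeat the bootstrap n_outer times to see how stable the support values are."""
    rng = np.random.default_rng(seed)
    return np.stack([bootstrap_support(Y, P, threshold, n_inner, rng)
                     for _ in range(n_outer)])

def shuffle_responses(Y, rng):
    """Assumed null model: permute each experiment's responses across genes."""
    return np.column_stack([rng.permutation(Y[:, j]) for j in range(Y.shape[1])])

def estimated_fdr(support_data, support_null, cutoff):
    """Assumed estimate: links passing the cutoff in shuffled vs. measured data."""
    n_data = np.sum(support_data >= cutoff)
    n_null = np.sum(support_null >= cutoff)
    if n_data == 0:
        return 0.0
    return float(min(1.0, n_null / n_data))

if __name__ == "__main__":
    Y, P = make_toy_data()
    runs = nested_bootstrap(Y, P, threshold=0.2)
    null_runs = nested_bootstrap(shuffle_responses(Y, np.random.default_rng(1)),
                                 P, threshold=0.2)
    print("mean support:\n", runs.mean(axis=0).round(2))
    print("support spread across outer runs:\n", runs.std(axis=0).round(2))
    print("estimated FDR at support >= 0.9:",
          estimated_fdr(runs.mean(axis=0), null_runs.mean(axis=0), 0.9))
```

The outer loop is what makes the bootstrap "nested": the spread of support values across outer runs indicates how trustworthy a single bootstrap estimate is, while the shuffled-data run gives a reference for how much support arises by chance.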
04 GeneSPIDER: gene regulatory network inference benchmarking with controlled network and data properties Publication and Code
I have collaborated with other students, notably
Andreas Tjärnberg, on the GeneSPIDER package for MATLAB, which aims to tackle a few
key issues in modern network inference.
Inference of gene regulatory networks (GRNs) is a central goal in systems biology. It is therefore important to evaluate the accuracy of GRN
inference methods in the light of network and data properties. Although several packages are available for modelling, simulating, and analysing GRN inference,
they offer limited control of network topology together with system dynamics, experimental design, data properties, and noise characteristics. Independent
control of these properties in simulations is key to drawing conclusions about which inference method to use in a given condition and what performance to
expect from it, as well as to obtain properties representative of real biological systems.
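
What independent control of such properties can look like is sketched below in Python. This is only an illustration under stated assumptions, not GeneSPIDER's MATLAB code or its definitions: the function names are hypothetical, the network is made stable by forcing diagonal dominance, the data follow a linear steady-state model Y = -A⁻¹P, and the signal-to-noise ratio is defined here simply as the ratio of the Frobenius norms of the noise-free response and the noise.

```python
import numpy as np

def random_network(n_genes, sparsity, rng):
    """Random sparse interaction matrix with a chosen link density.

    The negative diagonal is set to dominate each row, so by the Gershgorin
    circle theorem all eigenvalues have negative real part (a stable system).
    """
    mask = rng.random((n_genes, n_genes)) < sparsity
    A = rng.normal(size=(n_genes, n_genes)) * mask
    np.fill_diagonal(A, 0.0)
    np.fill_diagonal(A, -(np.abs(A).sum(axis=1) + 1.0))
    return A

def simulate_data(A, n_replicates, snr, rng):
    """Steady-state responses to single-gene knockdowns at a chosen noise level."""
    n_genes = A.shape[0]
    P = np.tile(-np.eye(n_genes), (1, n_replicates))   # experimental design
    Y_clean = -np.linalg.solve(A, P)                   # noise-free response
    noise = rng.normal(size=Y_clean.shape)
    # Rescale the noise so that ||Y_clean|| / ||noise|| equals the requested SNR
    noise *= np.linalg.norm(Y_clean) / (snr * np.linalg.norm(noise))
    return Y_clean + noise, P, Y_clean

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    A = random_network(n_genes=10, sparsity=0.2, rng=rng)
    Y, P, Y_clean = simulate_data(A, n_replicates=2, snr=10.0, rng=rng)
    print("links (off-diagonal):", int(np.count_nonzero(A)) - A.shape[0])
    print("max real part of eigenvalues:", np.linalg.eigvals(A).real.max())
    print("achieved SNR:", np.linalg.norm(Y_clean) / np.linalg.norm(Y - Y_clean))
```

Because sparsity, design, and noise level are separate arguments, each can be varied while the others are held fixed, which is the kind of controlled comparison the paragraph above calls for.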