Learning Generative Causal Models from Sparse Temporal Observations during Cellular Reprogramming


Project Description

Recent work on stem cells and on mature specialized cells in different systems and organs (neurons, blood cells) has revealed a stunning plasticity and capacity for cellular reprogramming. For example, mature cells can be reprogrammed into pluripotent stem cells, and exciting work on the engineered design of tissues and organs (organoids) is underway. Since the sequencing of the human genome, the community has produced very efficient tools to read off the molecular events accompanying the reprogramming and engineering of cells. More recently, the discovery of CRISPR techniques has given us unprecedented opportunities for precise writing and editing of genomes. These developments in fundamental biology and biotechnology are opening new tools and perspectives of vital significance for drug development, regenerative medicine, synthetic biology, and personalized medicine. Yet all these efforts require, and would be greatly facilitated by, an advance from correlative data analysis to predictive discovery of which interventions (edits, engineering) produce which effects. We thus face the fundamental problem of how to discover causal relations from data; in other words, can we derive quantitative predictive laws from data?

We offer internships for several highly motivated bachelor (B.Sc.) or master (M.Sc.) students who will explore this fundamental question primarily from a computational standpoint. This includes high-performance simulations of dynamical models and the design of algorithms in a controlled in-silico environment. Example directions are to (a) identify efficient algorithms for generating ensembles of dynamical models, (b) use supervised deep learning algorithms for pattern discovery in large-scale simulation data sets, (c) perform deep data-driven analysis of computational models in biology, and (d) investigate transfer entropy and related techniques for system identification. These tools will be tested on rich and recent molecular data on cellular reprogramming.
Program - BioScience
Division - Biological and Environmental Sciences and Engineering
Field of Study - computer science, mathematical modeling, machine learning, systems biology, bioscience

About the

Jesper Tegnér

Professor, Bioscience

Following his PhD (09/1997), Jesper Tegnér was appointed assistant professor in Computer Science (Dept. of Computer Science and Numerical Analysis, Engineering School, 06/1998). He took a leave of absence for two postdoctoral positions (08/1998-07/2001) at the Sloan Center for Computational Neuroscience and the Center for Biodynamics, Dept. of Biomedical Engineering (Boston, US). He was awarded a Swedish Wennergren Foundation Fellowship, a 5-year visiting scientist position with a faculty position upon return (first of its kind), and a 3-year Alfred P. Sloan Fellowship in Computational Science (US). Upon his return to Sweden (08/2001), he was awarded a new assistant professor position in Computer Science with special reference to Bioinformatics (Stockholm Center for Bioinformatics), and was then awarded a new chaired full professorship in Computational Biology (Dept. of Physics, Engineering School, 02/2002), the first of its kind in Sweden. In 10/2009, he was specially recruited as a strategic chaired full professor in Computational Medicine and appointed Director of the Computational Medicine Division at the Dept. of Medicine, Karolinska Institutet, and the Division of Clinical Epidemiology, Karolinska Hospital. In 06/2014, he was named Faculty at the Science for Life Laboratory (SciLifeLab, National Center for Molecular Biosciences, Stockholm). Since 08/2016 he has been Professor in Bioscience (BESE) and Professor in Computer Science (CEMSE) at KAUST. He is an ERC co-investigator (2013-) on causal discovery, was ranked as outstanding (highest distinction among faculty, ERA 2012) at Karolinska Institutet, won the international DREAM competition (2008) on network inference, and has founded two BioIT companies; in 2005 he won the national award for founding the most promising start-up company of the year.

He serves on several editorial boards: he is Associate Editor of Frontiers in Big Data – Medicine and Public Health (joint section with Machine Intelligence and Artificial Intelligence), Acting Section Editor on Clinical and Translational Systems Biology in Current Opinion in Systems Biology, Senior Editor of Progress in Preventive Medicine, and a member of the editorial boards of Complex Systems (first in the field, founded 1987 by Stephen Wolfram), BMC Systems Biology, and Neurology: Neuroinflammation & Neurodegeneration.

His research targets the circuit architecture and algorithms enabling learning and adaptation in living systems and synthetic machines. Since cells are the fundamental building blocks of all living matter (cf. atoms in the periodic table), we interrogate their intrinsic circuitry, i.e. networks, by exploiting experimental single-cell genomics techniques for temporal multi-molecular profiling, deep imaging, live-cell imaging, and molecular interventions using genome editing techniques. Such high-dimensional, multi-modal data are deciphered by means of advanced bioinformatics, mathematical modeling, and machine learning techniques to uncover the fundamental dynamical equations governing cellular decisions, differentiation, reprogramming, and learning. Theory and algorithms for designing causal discovery machines are developed by cross-pollinating algorithmic information theory, dynamical systems, inverse modeling, and data-driven machine learning techniques, including deep learning architectures. The applications of this program are threefold: engineered cellular control (reprogramming of stem cells, immune cells, and neurons), software development (data management, bioinformatics software, causal discovery and machine learning algorithms), and clinical translation (currently Melanoma, Breast Cancer, Multiple Sclerosis, Alzheimer's disease, Frontal Dementia, and Retinal diseases). At the core of our program we posit that such fundamental (causal) dynamical equations drive the "breath of life" from matter. Since living systems can learn, represent, predict, and by extension understand their local environments across several orders of magnitude of spatio-temporal scales, we believe that the formal deconstruction and reconstruction of such generative mechanisms, evolved over billions of years, will guide the design of algorithmic autonomous learning machines.

Desired Project Deliverables

Individual projects will be tailored and narrowly designed from the above palette according to the student's interests, technical proficiency, and level of study. The project is suitable for candidates fascinated by dynamical causal systems, whether computational or found in the natural world, i.e. living cells. We expect you to (a) bring enthusiasm, creativity, and hard work, (b) give lab seminars on your work, and (c) produce a final written report. In return, this will strengthen your critical thinking, presentation skills, and scientific writing. Your research, in collaboration with and supported by team members, may lead to scientific publications. We publish avidly in both bioscience and computational sciences, not for the fame but as steps motivated by our quest to ask fundamental questions of relevance to human nature and to discover transformative intelligent technologies inspired by nature. You will get a good hands-on perspective on the frontiers of dynamical systems and bioscience using state-of-the-art simulation and machine learning tools.