Deep Learning and Machine Intelligence for Single Cell Genomics


Project Description

Single cell biology, and single cell genomics in particular, is currently transforming the biosciences. Single cell RNA sequencing (scRNAseq), named Method of the Year 2013 by Nature Methods, has now matured, and large amounts of scRNAseq data are available. These data characterize living systems at an unprecedented level of resolution and promise to set the stage for a fundamental quantitative understanding of living systems, with special reference to genomic regulation and collective computation. Yet there are a number of open problems concerning how to think about these data and how to analyze them pragmatically.

In parallel, we have witnessed rapid development in machine learning. The rise of computation, such as supercomputers (Shaheen at KAUST) and GPU-based techniques, in conjunction with the data explosion (often referred to as big data), has fuelled the development of new techniques aiming for machine intelligence. In particular, techniques inspired by living systems, such as deep convolutional networks, are currently experiencing a renaissance. Driving forces include not only data and computation but also the availability of a suite of open source platforms (e.g. Theano, Caffe, Torch7, TensorFlow) supporting machine learning algorithms. These algorithms represent the industry standard for processing images, speech, and text, and run on the majority of services and devices provided by Google, Amazon, and Facebook, to name a few big players, as well as numerous startups.

We offer internships for several highly motivated bachelor (B.Sc.) or master (M.Sc.) students who will (a) identify appropriate supervised deep learning architectures and training algorithms for scRNAseq data, and (b) explore generative adversarial network (GAN) techniques for estimating high-dimensional data distributions in the single cell gene expression space. This work will be used to develop new techniques and to address open problems in single cell genomics such as pseudo-temporal ordering of single cell data, clustering of data, the investigation of representations, transfer learning, and unsupervised feature discovery.
Program - BioScience
Division - Biological and Environmental Sciences and Engineering
Field of Study - computer science, bioscience, machine learning, systems biology, artificial intelligence

About the

Jesper Tegnér

Professor, Bioscience

Following his PhD (09/1997), he was appointed assistant professor in Computer Science (Dept. of Computer Science and Numerical Analysis, Engineering School, 06/1998). He took a leave of absence for two postdocs (08/1998-07/2001) at the Sloan Center for Computational Neuroscience and the Center for Biodynamics, Dept. of Biomedical Engineering (Boston, US). He was awarded a Swedish Wennergren Foundation Fellowship, a 5-year visiting scientist position and faculty position upon return (the first of its kind), and a 3-year Alfred P. Sloan Fellowship in Computational Science (US). Upon his return to Sweden (08/2001), he was awarded a new assistant professor position in Computer Science with special reference to Bioinformatics (Stockholm Center for Bioinformatics), and was then awarded a new chaired full professorship in Computational Biology (Dept. of Physics, Engineering School, 02/2002), the first of its kind in Sweden. In 10/2009, he was specially recruited as a strategic chaired full professor in Computational Medicine and appointed Director of the Computational Medicine Division at the Dept. of Medicine, Karolinska Institutet, and the Division of Clinical Epidemiology, Karolinska Hospital. In 06/2014, he was named Faculty at the Science for Life Laboratory (SciLifeLab, National Center for Molecular Biosciences, Stockholm). Since 08/2016 he has been Professor in Bioscience (BESE) and Professor in Computer Science (CEMSE) at KAUST. He is an ERC co-investigator (2013-) on causal discovery, was ranked as outstanding (highest distinction among faculty, ERA 2012) at Karolinska Institutet, won the international DREAM competition (2008) on network inference, and has founded two BioIT companies; in 2005 he won the national award for founding the most promising start-up company of the year.

He serves on several editorial boards, including as Associate Editor of Frontiers in Big Data – Medicine and Public Health (joint section with Machine Intelligence and Artificial Intelligence), Acting Section Editor on Clinical and Translational Systems Biology in Current Opinion in Systems Biology, member of the Editorial Board of Complex Systems (first in the field, founded 1987 by Stephen Wolfram), member of the Editorial Board of BMC Systems Biology, Senior Editor of Progress in Preventive Medicine, and member of the Editorial Board of Neurology: Neuroinflammation & Neurodegeneration.

His research targets the circuit architecture and algorithms enabling learning and adaptation in living systems and synthetic machines. Since cells are the fundamental building blocks (cf. atoms in the periodic table) of all living matter, we interrogate their intrinsic circuitry, i.e. networks, by exploiting experimental single cell genomics techniques for temporal multi-molecular profiling, deep imaging, live-cell imaging, and molecular interventions using genomic editing techniques. Such high-dimensional and multi-dimensional data are deciphered by means of advanced bioinformatics, mathematical modeling, and machine learning techniques to uncover the fundamental dynamical equations governing cellular decisions, differentiation, reprogramming, and learning. Theory and algorithms for designing causal discovery machines are developed by cross-pollinating algorithmic information theory, dynamical systems, inverse modeling, and data-driven machine learning techniques, including deep learning architectures. The applications of this program are threefold: engineered cellular control (reprogramming of stem cells, immune cells, and neurons), software development (data management, bioinformatics software, causal discovery and machine learning algorithms), and clinical translation (currently Melanoma, Breast Cancer, Multiple Sclerosis, Alzheimer, Frontal Dementia, and Retinal diseases). At the core of our program we posit that such fundamental (causal) dynamical equations drive the "breath of life" from matter. Since living systems can learn, represent, predict, and by extension understand their local environments across several orders of magnitude of spatial-temporal scales, we believe that the formal deconstruction and reconstruction of such generative mechanisms, evolved over billions of years, will guide the design of algorithmic autonomous learning machines.

Desired Project Deliverables

Individual projects will be tailored and narrowly designed from the above palette according to the student's interests, technical proficiency, and level of study. The project is suitable for candidates who are fascinated by living systems and interested in cutting-edge bioscience and in artificial intelligence for science rather than for discovering cats on YouTube. We expect you to (a) bring enthusiasm, creativity, and hard work, (b) give lab seminars on your work, and (c) produce a final written report. In return, this develops your critical thinking, presentation skills, and scientific writing. Your research, in collaboration with and with the support of team members, may lead to scientific publications. We publish avidly in both bioscience and computational sciences, not for the fame but as steps motivated both by our quest to ask fundamental questions of relevance to human nature and by the discovery of transformative intelligent technologies inspired by nature. You will also gain a good hands-on perspective at the frontier of bioscience and machine intelligence in an interdisciplinary research group and environment.