Statistics and Machine Learning
Dr. Sach Mukherjee
Group Leader
Venusberg-Campus 1
Gebäude 99
53127  Bonn
+49 228 43302-853

Areas of investigation/research focus

We work at the interface between the hard data sciences (computational statistics and machine learning) and biomedicine. Our long-term goal is to deeply integrate the hard data sciences with systems approaches to disease biology and medicine. To this end, we have spent several years studying key statistical issues that arise in complex, high-dimensional biomedical data. At the DZNE, our focus is on prediction, stratification and systems analyses for neurodegenerative diseases. Working with colleagues across the DZNE (in fundamental, population and clinical research) and internationally, we are developing and applying approaches at the statistical frontier to help realize the promise of next-generation biomedicine.

Advances in high-throughput molecular assays and deep phenotyping, coupled with systems-level analyses, have the potential to transform biomedical research. Such approaches can inform stratification into disease subtypes, allow prediction of disease state and help elucidate relevant biology at a systems level. Our efforts are directed towards developing high-dimensional statistical and machine learning methods to realize this potential. This involves developing novel methods motivated by, and applied to, specific applications, as well as working with colleagues across research areas to clarify conceptual issues that arise in moving towards truly scalable and data-intensive approaches. In the area of systems biology, we are currently working on principled yet highly scalable approaches to build and test global molecular networks that are specific to biological or disease context. Furthermore, we are working on methods for prediction and stratification of neurodegenerative diseases, with an emphasis on integrative analyses using diverse high-dimensional data types.

Key Publications

Hill SM, Oates CJ, Blythe D, Mukherjee S. Causal Learning via Manifold Regularization. JMLR. 2019 Aug 01; 20 doi: 10.17863/CAM.44718
Steven M. Hill, Nicole K. Nesser, Katie Johnson-Camacho, Mara Jeffress, Aimee Johnson, Chris Boniface, Simon E.F. Spencer, Yiling Lu, Laura M. Heiser, Yancey Lawrence, Nupur T. Pande, James E. Korkola, Joe W. Gray, Gordon B. Mills, Sach Mukherjee, Paul T. Spellman. Context Specificity in Causal Signaling Networks Revealed by Phosphoprotein Profiling. Cell Systems. 2017 Jan 24; 4:73-83.e10. doi: 10.1016/j.cels.2016.11.013
Städler N and Mukherjee S. Two-sample testing in high dimensions. J. R. Stat. Soc. Series B. 2017 Jan 01; 79 doi: 10.1111/rssb.12173
Hill SM, Heiser LM, Cokelaer T, Unger M, Nesser NK, Carlin DE, Zhang Y, Sokolov A, Paull EO, Wong CK, Graim K, Bivol A, Wang H, Zhu F, Afsari B, Danilova LV, Favorov AV, Lee WS, Taylor D, Hu CW, Long BL, Noren DP, Bisberg AJ; HPN-DREAM Consortium, Mills GB, Gray JW, Kellen M, Norman T, Friend S, Qutub AA, Fertig EJ, Guan Y, Song M, Stuart JM, Spellman PT, Koeppl H, Stolovitzky G, Saez-Rodriguez J, Mukherjee S. Inferring causal molecular networks: Empirical assessment through a community-based effort. Nature Methods. 2016 Mar 29; 13:310-322. doi: 10.1038/nmeth.3773
Robert J. B. Goudie, Sach Mukherjee. A Gibbs sampler for learning DAGs. Journal of Machine Learning Research. 2016 Mar 31; 17

