Biomedical Data Science
Research Theme: Epidemiology & Clinical Trials
Dr Catalina Vallejos
Reader / Group Leader

Contact details
Email: catalina.vallejos@ed.ac.uk
Web: Vallejos Group

Research in a Nutshell

While biomedical data sometimes qualifies as "big data" (where the number of samples and/or variables is large), complexity is its most prominent feature. This complexity arises from several sources of heterogeneity: heterogeneity across individuals in a population (e.g. response to treatment), heterogeneity in the type of data we collect (e.g. health records & genomics) and heterogeneity introduced by the data collection process (e.g. measurement error).

We focus on the development of novel statistical methodology to address and study these sources of heterogeneity. This is a highly multidisciplinary task: from the understanding of complex biomedical problems and technologies, to the development of new methodology and the implementation of open-source analysis tools.

Our current research focuses on two areas of application. Firstly, single-cell RNA-sequencing, a cutting-edge experimental technique that allows genome-wide quantification of gene expression on a cell-by-cell basis. Secondly, electronic health records research, to develop predictive models based on observational data that is routinely collected by health providers (e.g. NHS). Developing computational tools that can take full advantage of the rich information provided by these data sources should improve our understanding of health and disease, playing an important role in precision medicine initiatives.

People

Name: Role
Dr Catalina Vallejos: Group Leader
Begoña Bolos: CRUK PhD student (co-supervised)
Veronica Finazzi: EpiCrossBorders PhD student (co-supervised; based in Munich)
Yipeng Cheng: Edinburgh Helsinki Program in Human Genomics PhD student (co-supervised)
Louis Chislett: HDRUK/Turing Wellcome programme in Health Data Science PhD student
Dr Nathan Constantine-Cooke: Postdoctoral Research Associate
Dr Karla Monterrubio-Gomez: Postdoctoral Research Associate
Linda Nguyen: MRC Precision Medicine PhD student (co-supervised)
Emma Yang: MRC HGU PhD student (co-supervised)

Key Publications

Liley, J., Emerson, S. R., Mateen, B. A., Vallejos, C. A., Aslett, L. J. M., & Vollmer, S. J. (2021). Model updating after interventions paradoxically introduces bias. Paper presented at 24th International Conference on Artificial Intelligence and Statistics.

Kapourani, A., Argelaguet, R., Sanguinetti, G., & Vallejos, C. A. (2021). scMET: Bayesian modelling of DNA methylation heterogeneity at single-cell resolution. Genome Biology. doi: 10.1186/s13059-021-02329-8

Lähnemann, D., Köster, J., Szczurek, E., McCarthy, D. J., Hicks, S. C., Robinson, M. D., Vallejos, C. A., Campbell, K. R., Beerenwinkel, N., Mahfouz, A., Pinello, L., Skums, P., Stamatakis, A., Attolini, C. S-O., Aparicio, S., Baaijens, J., Balvert, M., Barbanson, B. D., Cappuccio, A., ... Schönhuth, A. (2020). Eleven grand challenges in single-cell data science. Genome Biology, 21(1), 31. doi: 10.1186/s13059-020-1926-6

Richter, M. L., Deligiannis, I. K., Yin, K., Danese, A., Lleshi, E., Coupland, P., Vallejos, C. A., Matchett, K. P., Henderson, N. C., Colome-Tatche, M., & Martinez-Jimenez, C. P. (2021). Single-nucleus RNA-seq2 reveals a functional crosstalk between liver zonation and ploidy. Nature Communications. doi: 10.1038/s41467-021-24543-5

Maniatis, C., Vallejos, C. A., & Sanguinetti, G. (2022). SCRaPL: A Bayesian hierarchical framework for detecting technical associates in single cell multiomics data. PLoS Computational Biology, 18(6), e1010163. doi: 10.1371/journal.pcbi.1010163. PMID: 35727848; PMCID: PMC9249169.

Full publication list can be found on Research Explorer: Catalina Vallejos Meneses (University of Edinburgh Research Explorer)

Partners and Funders

The Alan Turing Institute
British Heart Foundation

Scientific Themes

Statistical genomics, single cell sequencing, risk prediction, electronic health records

This article was published on 2024-09-23