Beyond loss-of-function: developing missense-aware gene constraint models

Supervisor: Professor Joe Marsh

Gene constraint measures how strongly damaging variants are depleted in the population by purifying selection, providing valuable insight into disease mechanisms and helping to prioritise drug targets. Current models focus almost entirely on predicted loss-of-function (pLOF) variants such as stop-gains and frameshifts. While these variants are relatively straightforward to interpret, they are rare and do not capture genes where disease results from other mechanisms, such as gain-of-function or dominant-negative effects driven by missense variants.

This PhD project will develop a new generation of gene constraint models that account for the selective depletion of damaging missense variants. Building on recent advances in variant effect prediction, you will integrate features from protein structure, sequence conservation, and deep learning-based variant effect predictors to estimate the impact of missense changes at scale. Using large population genome datasets, you will evaluate how well these models detect functionally important genes and protein domains overlooked by traditional pLOF-based approaches.
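
As a purely illustrative sketch of what "missense-aware" constraint could mean, the toy example below compares a plain observed/expected ratio with one in which each possible missense variant is weighted by a predicted damage score. It is not the project's method: the `MissenseSite` class, the `weighted_oe` function, the damage scores, and the expected counts are all hypothetical, and a real model would need a calibrated mutation model and proper uncertainty estimates.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class MissenseSite:
    """One possible missense variant in a gene (all fields are hypothetical inputs)."""
    expected: float  # expected count under a neutral mutation model
    observed: int    # 1 if the variant is seen in the population dataset, else 0
    damage: float    # predicted damage score in [0, 1] from a variant effect predictor


def plain_oe(sites: Iterable[MissenseSite]) -> float:
    """Unweighted observed/expected ratio across all sites."""
    sites = list(sites)
    expected = sum(s.expected for s in sites)
    if expected == 0:
        raise ValueError("expected count is zero; ratio is undefined")
    return sum(s.observed for s in sites) / expected


def weighted_oe(sites: Iterable[MissenseSite]) -> float:
    """Observed/expected ratio with each site weighted by its damage score."""
    sites = list(sites)
    expected = sum(s.damage * s.expected for s in sites)
    if expected == 0:
        raise ValueError("weighted expected count is zero; ratio is undefined")
    return sum(s.damage * s.observed for s in sites) / expected


# Toy gene: the two variants predicted to be damaging are absent from the
# population, while the two predicted to be benign are present.
example_sites = [
    MissenseSite(expected=0.5, observed=0, damage=0.9),
    MissenseSite(expected=0.5, observed=1, damage=0.1),
    MissenseSite(expected=0.5, observed=1, damage=0.2),
    MissenseSite(expected=0.5, observed=0, damage=0.8),
]

if __name__ == "__main__":
    print("plain o/e:   ", plain_oe(example_sites))
    print("weighted o/e:", weighted_oe(example_sites))
```

In this toy gene the plain ratio is 1.0, suggesting no depletion, while the damage-weighted ratio is approximately 0.3, because the variants predicted to be damaging are the ones missing from the population. This is the kind of signal that a count of all missense variants, or of pLOF variants alone, could overlook.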

The project will suit a student with an interest in computational biology, genetics, or statistical genomics, and will involve data analysis, model development, and collaboration with leading researchers in variant interpretation and population genetics.
