Researchers have used advanced artificial intelligence (AI) technology, developed by the popular image-sharing platform Pinterest, to link genetic markers to disease.

In recent years, genome-wide association studies (GWAS) have successfully uncovered many associations between genetic variants and disease. However, finding new associations now often requires extremely large sample sizes, making such studies hugely expensive and logistically extremely challenging to run.

In a study published in PLOS One, researchers at the Institute of Genetics and Cancer used PinSage, a graph convolutional neural network-based recommender system previously used at Pinterest, to identify additional disease associations by using information contained in the genetic associations of molecular traits.

Just as a recommender system suggests potential interests to site users, models trained on molecular data can suggest new genetic variants that might be linked to a disease when given those already known to be disease-associated, with prompts such as: “Your genome-wide association study highlighted that genetic variant, what about these?”

Using this framework, the researchers first demonstrated that a model trained on genetic association data involving one type of molecular trait (DNA methylation) could find over half of the genetic associations identified in a separate study involving a different molecular trait (RNA expression). They then showed that a model trained only on molecular data could predict height- and disease-associated genetic variants.

On a set of 64 disease outcomes measured in the UK Biobank, they identified 143 independent new disease associations, with at least one additional association for 64% of the disease outcomes examined. Excluding the MHC, a complex genomic region, this was a total uplift of over 8%. Finally, the researchers successfully replicated 38% of their new disease associations in an independent sample.
“For many GWAS, attaining such an enhancement by simply increasing sample size would be associated with significant additional costs and may be prohibitively expensive, or impossible, depending on disease prevalence. Here, we begin to address this problem by identifying disease associations based on information contained in the genetic associations of molecular traits.”

Dr Andrew Bretherick, Senior Clinical Research Fellow, MRC Human Genetics Unit

Publication date: 12 Jun, 2025