New guidelines aim to improve transparency and trust in genetic prediction tools

Researchers from the Institute of Genetics and Cancer have been working with the Atlas of Variant Effects Alliance to provide practical guidelines for releasing new computational tools known as variant effect predictors (VEPs).

Every person carries thousands of genetic variants, but only a few have important effects on health. VEPs are used to identify which ones matter by estimating whether a genetic change might be harmful, such as by damaging a protein or increasing disease risk.

However, VEPs vary greatly in how they work, what they predict, and how easily they can be used or evaluated – for example, many are evaluated using data that overlaps with the data they were trained on, leading to overly optimistic results.

This can confuse researchers and slow progress in using genetic information to guide healthcare.

The Alliance brings together researchers working to systematically map how genetic variants affect biological function.

In the paper ‘Guidelines for releasing a variant effect predictor’ published in Genome Biology, researchers recommend making code and data openly available, clearly explaining predictions, and ensuring results are easy to access and interpret. 

The paper also stresses the importance of fair benchmarking, avoiding 'circularity', where tools are evaluated using the same data they were trained on.

By following these principles, developers can create more transparent and trustworthy tools. This will help scientists, clinicians and patients make better use of genetic data to understand disease and tailor treatments.

Alongside these guidelines, the group also tested how well existing tools work in practice using a different type of data from lab experiments known as deep mutational scanning (DMS), which tests the effects of thousands of genetic variants in a controlled setting. 

They compared nearly 100 different prediction tools against 36 DMS studies. They found that the most accurate tools were not those trained using human genetic data, but rather newer methods that rely on powerful artificial intelligence, including approaches inspired by language models and three-dimensional protein structures. 
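For readers curious about what such a comparison involves, the core idea can be sketched in a few lines of code. A rank correlation is a common way to ask whether a predictor orders variants the same way a DMS experiment does; the variant scores below are invented purely for illustration, and real benchmarks use far larger datasets and more careful statistics.

```python
def rank(values):
    # Assign 1-based ranks, averaging ranks across tied values
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average rank for the tied run
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman(xs, ys):
    # Spearman correlation: Pearson correlation of the rank vectors
    rx, ry = rank(xs), rank(ys)
    n = len(xs)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx) ** 0.5
    vy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (vx * vy)

# Made-up scores for five variants of one protein:
predicted_damage = [0.9, 0.1, 0.7, 0.3, 0.5]   # VEP: higher = more damaging
dms_fitness      = [0.2, 0.95, 0.4, 0.8, 0.6]  # DMS: lower = more damaging

# A strongly negative value means the predictor's ranking agrees
# with the experiment (damaging variants have low measured fitness).
print(spearman(predicted_damage, dms_fitness))
```

A predictor that perfectly reverses the experimental fitness ranking, as in this toy example, scores -1; one with no relationship to the data scores near 0.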

These cutting-edge tools were not only better at matching experimental results, but were also highly reliable when classifying real patient variants, suggesting that these lab-based tests can be a sound way to judge VEP quality.

This work provides a clearer picture of which tools can best support clinicians and researchers in interpreting genetic variants, while highlighting the need for transparency and careful evaluation in tool development.
