New protein-folding AI predicts the structures of 1 billion proteins

Researchers at the Chan Zuckerberg Initiative’s Biohub in San Francisco have unveiled the ESM Atlas, an AI-generated database of more than 1.1 billion predicted protein structures, alongside sequences for 6.8 billion proteins. It is larger than the AlphaFold Database by more than 800 million entries. ESMFold2, the model behind the atlas, is reported to outperform AlphaFold3 at predicting protein complexes, including those involving antibodies, and has been used to design functional proteins for medical applications such as cancer and immunological treatments. The atlas is freely accessible to scientists worldwide.

Researchers at the Chan Zuckerberg Initiative’s Biohub in San Francisco have released the ESM Atlas, an AI-generated database containing more than 1.1 billion predicted protein structures, along with sequences for 6.8 billion proteins. The database was built with ESMFold2, an open-source AI model. It exceeds the AlphaFold Database by more than 800 million entries and the previous version of the ESM Atlas by 300 million.

ESMFold2 was trained on billions of proteins from diverse environments, including metagenomic sequences from soil and ocean samples, which are not included in AlphaFold’s database. According to the Biohub team, led by science head Alex Rives, the model outperforms AlphaFold3 at predicting complex protein structures, particularly antibody-antigen interactions.

The researchers used ESMFold2 to design new antibodies and proteins targeting molecules linked to cancers and immunological conditions, with lab tests confirming high success rates. The atlas also revealed structural similarities between CRISPR microbial defense proteins and a gene-editing protein discovered in a soil fungus in 2023.

The ESM Atlas is fully open-source and freely accessible. It aims to bridge gaps in protein biology by providing insights into poorly characterized metagenomic sequences, and Biohub researchers hope it will accelerate discoveries in protein design and fundamental biology. Computational biologists, including Gemma Atkinson of Lund University and Christine Orengo of University College London, praised the atlas as a groundbreaking resource for understanding protein functions and structures.

ESMFold2’s accuracy on complex interactions sets it apart from competing models, and its open-source nature and inclusion of diverse protein sequences could drive advances in medical research and biotechnology. The release marks a significant step in using AI to explore the unknown aspects of protein biology, potentially unlocking new therapeutic and scientific opportunities.

This content was automatically generated and/or translated by AI. It may contain inaccuracies. Please refer to the original sources for verification.
