AI Can Now Create Proteins That Don’t Exist in Nature

Researchers from EvolutionaryScale, a company founded by former Meta scientists, have introduced a new artificial intelligence model called ESM3 that can design proteins from scratch. The model works by predicting sequences, much as ChatGPT generates text. Their research was posted as a preprint on bioRxiv on July 2.

New Opportunities in Synthetic Biology

The ESM3 model enables the development of proteins that do not exist in nature, opening up vast possibilities for synthetic biology. Notably, the scientists created a new fluorescent protein that shares only 58% of its sequence with the closest known fluorescent protein. This protein glows in a new shade of green and has been named “esmGFP.”
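
Percentages like the one quoted above come from comparing two aligned protein sequences position by position. The sketch below is only an illustration of that arithmetic: the helper name percent_identity and the two short sequences are made up for this example and are not taken from the ESM3 work, and real comparisons would first align the sequences with a dedicated tool.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of aligned columns in which both sequences carry the same residue.

    Hypothetical helper for illustration; assumes the inputs are already aligned
    (same length, with `gap` marking insertions or deletions).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    # Ignore columns where both sequences have a gap.
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == gap and b == gap)]
    if not columns:
        raise ValueError("no comparable positions")
    # A column counts as a match only if both residues are equal and not a gap.
    matches = sum(1 for a, b in columns if a == b and a != gap)
    return 100.0 * matches / len(columns)


# Made-up 10-residue sequences: 6 of the 10 positions match.
print(percent_identity("MSKGEELFTG", "MSRGAELYSG"))
```

A protein at 58% identity would match its nearest relative at 58 of every 100 aligned positions and differ at the remaining 42.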

Comparison with ChatGPT

ESM3 is a large language model similar to OpenAI’s GPT-4 and was trained on 2.78 billion proteins. During training, the model takes in information about protein sequences, structures, and functions with parts hidden, then learns to predict the missing fragments. As a result, ESM3 can not only predict the properties of existing proteins but also generate entirely new ones with specific functions.
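
To make the "predict the missing fragments" idea concrete, here is a deliberately tiny sketch. It is not how ESM3 works internally: instead of a neural network it uses simple per-position counts over a handful of made-up sequences. The names MASK, fit_position_counts, and fill_masks are hypothetical and exist only for this illustration.

```python
from collections import Counter

MASK = "_"  # placeholder for a hidden residue (an assumption of this sketch)


def fit_position_counts(sequences):
    """Count which residue appears at each position across equal-length sequences."""
    length = len(sequences[0])
    if any(len(s) != length for s in sequences):
        raise ValueError("all training sequences must have the same length")
    return [Counter(s[i] for s in sequences) for i in range(length)]


def fill_masks(masked, counts):
    """Replace each masked position with the residue seen most often at that position."""
    if len(masked) != len(counts):
        raise ValueError("masked sequence length does not match the training data")
    return "".join(
        counts[i].most_common(1)[0][0] if ch == MASK else ch
        for i, ch in enumerate(masked)
    )


# Made-up training set; a real model would learn from billions of proteins.
training = ["MKTAY", "MKSAY", "MKTAF"]
counts = fit_position_counts(training)
print(fill_masks("MK_A_", counts))
```

A language model replaces the counting step with a learned network that considers the whole surrounding context, which is what lets it propose plausible residues for sequences it has never seen.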

Breakthroughs in Protein Research

In 2022, a team from Meta introduced ESMFold, a predecessor to ESM3, which predicted the structures of microbial proteins. That same year, DeepMind announced that its AlphaFold model had predicted the structures of about 200 million proteins. However, these models had limitations, and their predictions required further validation.

Advantages of ESM3

ESM3 was trained on 771 billion unique tokens drawn from those 2.78 billion proteins and can generate proteins with specialized functions. This dramatically speeds up the process of discovering and creating protein structures, which would otherwise be slow and expensive.

Example: Creating a New Protein

During their research, the scientists tasked the model with creating a new fluorescent protein. ESM3 generated 96 protein variants, from which the most distinct was selected. Although this protein was 50 times less bright than its natural analogs, subsequent iterations led to the creation of a brighter protein, “esmGFP.”
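
The generate, select, and iterate pattern described above can be sketched as a simple loop. Everything in this example is a stand-in: random point substitutions replace the model's proposals, and a made-up scoring function (toy_score, which counts matches to a hidden target string) replaces the laboratory brightness measurement. Only the batch size of 96 is taken from the article.

```python
import random

ALPHABET = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids


def propose_variants(parent, n, rng):
    """Stand-in for the model: n variants, each with one random substitution."""
    variants = []
    for _ in range(n):
        pos = rng.randrange(len(parent))
        variants.append(parent[:pos] + rng.choice(ALPHABET) + parent[pos + 1:])
    return variants


def toy_score(seq, target):
    """Stand-in for a lab measurement: number of positions matching a hidden target."""
    return sum(a == b for a, b in zip(seq, target))


def iterate_design(start, target, rounds=5, batch=96, seed=0):
    """Each round: propose a batch, keep the best candidate, start again from it."""
    rng = random.Random(seed)
    best = start
    history = [toy_score(best, target)]
    for _ in range(rounds):
        # Keep the current best in the pool so the score can never get worse.
        candidates = propose_variants(best, batch, rng) + [best]
        best = max(candidates, key=lambda s: toy_score(s, target))
        history.append(toy_score(best, target))
    return best, history


best, history = iterate_design("AAAAAAAAAA", "MSKGEELFTG")
print(best, history)
```

The point of the sketch is the shape of the loop: each round starts from the best candidate found so far, which is how a dim first result can lead to a brighter one in later rounds.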

Potential Across Industries

The ESM3 technology could be applied in various fields, from developing new medicines to creating chemicals that break down plastics. A smaller version of the model is already available under a non-commercial license, while the larger version will be offered to commercial researchers.

Conclusion

The development of ESM3 marks a significant step forward in synthetic biology and demonstrates the potential of artificial intelligence in creating new biological structures. This breakthrough could lead to important discoveries and innovations across scientific and industrial sectors.
