

Frontier AI for understanding, imagining, and creating proteins.

EvolutionaryScale's core product, ESM3, is a family of generative AI models for proteins. It draws on a dataset of 2.78 billion natural proteins and 771 billion unique tokens, and it reasons jointly over sequence, structure, and function, so users can prompt it with any mix of the three and explore diverse candidate designs. The models are available in small, medium, and large sizes through an API, and the ESM3-open model is released with weights and source code under a non-commercial license. Use cases include designing novel proteins, enzymes that break down plastic, and new medicines. With chain-of-thought prompting, ESM3 can generate proteins far from anything found in nature; one example is esmGFP, a fluorescent protein that departs substantially from naturally occurring ones. ESM3 can be accessed through Forge, AWS SageMaker, the Omics platform, AWS Bedrock, BioNeMo, and NVIDIA NIM microservices, which allows deployment across varied scientific environments.
ESM3 can generate proteins that depart widely from natural sequences, reaching designs that natural evolution would take far longer to produce.
It reasons over sequence, structure, and function simultaneously, so one model can address all three aspects of a protein design.
It creates new protein sequences with desired properties that do not appear in known biological data.
It was trained on 2.78 billion natural proteins, giving it broad coverage of known protein diversity.
It is available through Forge and through platforms such as AWS and NVIDIA, so it can be integrated into existing workflows.
1. Apply for access to the ESM3 API via Forge (closed beta).
2. Configure your development environment with the necessary libraries and SDKs.
3. Authenticate your API requests using the provided credentials.
4. Input protein sequence, structure, or function data to prompt the model.
5. Process the model's output, which includes new protein sequences, structural predictions, or functional insights.
6. Iterate and refine your inputs based on the results to optimize protein design.
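The steps above can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not an official client. It assumes the open-source `esm` package exposes `esm.sdk.client`, `ESMProtein`, and `GenerationConfig` as in its public README, and that `_` marks a masked residue in a sequence prompt. The model name and the `ESM_FORGE_TOKEN` environment variable are placeholders. `mask_positions`, `refine`, and `fill_with_alanine` are hypothetical helpers introduced only for illustration.

```python
import os
from typing import Callable, Iterable, Tuple

MASK = "_"  # assumption: the esm SDK reads "_" as a masked residue


def mask_positions(sequence: str, start: int, end: int) -> str:
    """Hypothetical helper: mask residues in [start, end) for redesign (step 4)."""
    if not 0 <= start <= end <= len(sequence):
        raise ValueError("mask range must lie inside the sequence")
    return sequence[:start] + MASK * (end - start) + sequence[end:]


def forge_generate(prompt: str) -> str:
    """Steps 2, 3 and 5: authenticate, send the prompt to Forge, return the sequence.

    Assumes the `esm` package is installed and a token is set in ESM_FORGE_TOKEN.
    """
    # Imported lazily so the rest of the sketch runs without the SDK installed.
    from esm.sdk import client
    from esm.sdk.api import ESMProtein, GenerationConfig

    model = client(
        model="esm3-medium-2024-08",  # placeholder; use a model your account can access
        url="https://forge.evolutionaryscale.ai",
        token=os.environ["ESM_FORGE_TOKEN"],
    )
    result = model.generate(
        ESMProtein(sequence=prompt),
        GenerationConfig(track="sequence", num_steps=8, temperature=0.7),
    )
    if not isinstance(result, ESMProtein):
        raise RuntimeError(f"generation failed: {result}")
    return result.sequence


def fill_with_alanine(prompt: str) -> str:
    """Offline stand-in for the model: replaces every masked residue with alanine."""
    return prompt.replace(MASK, "A")


def refine(
    sequence: str,
    regions: Iterable[Tuple[int, int]],
    generate: Callable[[str], str],
) -> str:
    """Step 6: redesign one region at a time, feeding each result into the next round."""
    for start, end in regions:
        candidate = generate(mask_positions(sequence, start, end))
        if len(candidate) != len(sequence) or MASK in candidate:
            raise ValueError("generator returned an incomplete sequence")
        sequence = candidate
    return sequence


if __name__ == "__main__":
    # Use Forge only when a token is configured; otherwise fall back to the stub.
    generate = forge_generate if "ESM_FORGE_TOKEN" in os.environ else fill_with_alanine
    print(refine("MKTAYIAK", [(2, 5), (6, 8)], generate))
```

Passing the generator in as a function keeps the iteration logic testable offline: the stub can be swapped for `forge_generate` once beta access and credentials are in place.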
Verified feedback from other users.
"Users praise the tool's ability to generate novel proteins and accelerate research, but some note that the API access requires beta program enrollment."
