
Using ESMFold

This tutorial shows how to use the ESMFold model to predict the structure of a protein sequence of interest and obtain the result as a PDB file. We recommend ESMFold for single-chain sequences. For multi-chain sequences, see Using AlphaFold2.

What you need before getting started

Choose the sequence whose structure you want to predict. The example used here is interleukin 2:

[1]:
import openprotein

# Login to your session
session = openprotein.connect()

sequence = "MYRMQLLSCIALSLALVTNSAPTSSSTKKTQLQLEHLLLDLQMILNGINNYKNPKLTRMLTFKFYMPKKATELKHLQCLEEELKPLEEVLNLAQSKNFHLRPRDLISNINVIVLELKGMYRMQLLSCIALSLALVTNSAPTSSSTKKTQLQLEHLLLDLQMILNGINNYKNPKLTRMLTFKFYMPKKATELKHLQCLEEELKPLEEVLNLAQSKNFHLRPRDLISNINVIVLELKGSEP"
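
Before submitting a job, it can help to sanity-check the sequence locally. The sketch below assumes that only the 20 standard amino-acid letters are expected, which this tutorial does not state; `find_invalid_residues` is a hypothetical helper and not part of the openprotein package.

```python
# Hypothetical helper for illustration; not part of openprotein.
# Assumption: only the 20 standard amino-acid letters are expected.
STANDARD_AMINO_ACIDS = frozenset("ACDEFGHIKLMNPQRSTVWY")


def find_invalid_residues(seq: str) -> list[tuple[int, str]]:
    """Return (0-based position, character) pairs outside the standard alphabet."""
    return [
        (i, ch)
        for i, ch in enumerate(seq.upper())
        if ch not in STANDARD_AMINO_ACIDS
    ]


# First 20 residues of the example sequence used in this tutorial
example = "MYRMQLLSCIALSLALVTNS"
problems = find_invalid_residues(example)
if problems:
    print("Unexpected characters:", problems)
else:
    print(f"{len(example)} residues, all standard")
```

An empty list means every character is a standard residue letter; whether the service also accepts other characters would need to be checked separately.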

Predicting your sequence

Call ESMFold on your sequence. The num_recycles hyperparameter controls how many times the model refines its prediction, using each cycle’s output structure as the next cycle’s input. It accepts integers between 1 and 48.
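
A small local check can catch an out-of-range value before a job is submitted. This is a minimal sketch based only on the 1 to 48 range stated above and on the `int | None` type in the `fold` signature shown in this tutorial; `check_num_recycles` is a hypothetical helper, and the bounds are assumed to be inclusive.

```python
# Hypothetical helper for illustration; not part of openprotein.
# Assumption: the documented range of 1 to 48 is inclusive.
MIN_RECYCLES = 1
MAX_RECYCLES = 48


def check_num_recycles(num_recycles):
    """Return the value unchanged if it is None or an int in range; raise otherwise."""
    if num_recycles is None:
        # None is allowed by the signature; it presumably leaves the default in place.
        return None
    # bool is a subclass of int, so reject it explicitly
    if isinstance(num_recycles, bool) or not isinstance(num_recycles, int):
        raise TypeError("num_recycles must be an int or None")
    if not MIN_RECYCLES <= num_recycles <= MAX_RECYCLES:
        raise ValueError(
            f"num_recycles must be between {MIN_RECYCLES} and {MAX_RECYCLES}"
        )
    return num_recycles


print(check_num_recycles(4))
```

The checked value could then be passed along, for example `esmfoldmodel.fold([sequence.encode()], num_recycles=check_num_recycles(4))`.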

Create the model object for ESMFold:

[2]:
esmfoldmodel = session.fold.get_model('esmfold')
esmfoldmodel.fold?
Signature:
esmfoldmodel.fold(
    sequences: collections.abc.Sequence[bytes | str],
    num_recycles: int | None = None,
) -> openprotein.fold.future.FoldResultFuture
Docstring:
Fold sequences using this model.

Parameters
----------
sequences : Sequence[bytes | str]
    sequences to fold
num_recycles : int | None
    number of times to recycle models
Returns
-------
    FoldResultFuture
File:      ~/Projects/openprotein/openprotein-python-private/openprotein/fold/esmfold.py
Type:      method

Send the sequence of interest to ESMFold for folding:

[3]:
esm = esmfoldmodel.fold([sequence.encode()], num_recycles=1)

esm
[3]:
FoldJob(num_records=1, job_id='184e52a3-7eb5-4105-890e-9dcf41525382', job_type=<JobType.embeddings_fold: '/embeddings/fold'>, status=<JobStatus.PENDING: 'PENDING'>, created_date=datetime.datetime(2025, 8, 21, 8, 59, 51, 384873, tzinfo=TzInfo(UTC)), start_date=None, end_date=None, prerequisite_job_id=None, progress_message=None, progress_counter=0, sequence_length=None)

Wait for the job to complete with wait_until_done():

[4]:
esm.wait_until_done(verbose=True, timeout=300)
Waiting: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████| 100/100 [04:57<00:00,  2.98s/it, status=SUCCESS]
[4]:
True

Fetch the results with get():

The results contain one entry per input sequence. Each entry is a tuple of the query sequence and the contents of the predicted structure file, which ESMFold returns in PDB format. The cell below prints a few lines of that file:

[5]:
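result = esm.get()
# Take the PDB contents (bytes) for the first, and here only, sequence
result = result[0][1]
# Print lines 10-19 of the PDB file as a quick look at the atom records
print("\n".join(result.decode().splitlines()[10:20]))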
ATOM     10  CA  TYR A   2      -0.479 -21.837 -10.591  1.00 69.89           C
ATOM     11  C   TYR A   2      -1.672 -21.177  -9.910  1.00 51.02           C
ATOM     12  CB  TYR A   2       0.209 -20.836 -11.525  1.00 53.86           C
ATOM     13  O   TYR A   2      -1.519 -20.513  -8.882  1.00 48.75           O
ATOM     14  CG  TYR A   2       1.660 -21.154 -11.794  1.00 49.22           C
ATOM     15  CD1 TYR A   2       2.645 -20.852 -10.856  1.00 49.69           C
ATOM     16  CD2 TYR A   2       2.048 -21.757 -12.985  1.00 51.99           C
ATOM     17  CE1 TYR A   2       3.983 -21.141 -11.101  1.00 50.78           C
ATOM     18  CE2 TYR A   2       3.384 -22.051 -13.240  1.00 47.59           C
ATOM     19  OH  TYR A   2       5.666 -22.029 -12.540  1.00 42.47           O
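
The ATOM records above use the fixed-column PDB layout, so individual fields can be sliced out by position. ESMFold typically stores its per-residue confidence (pLDDT) in the B-factor column; assuming that also holds for the file returned here, the sketch below collects the B-factor of each C-alpha atom. `ca_bfactors` is a hypothetical helper, and it assumes a single chain so that residue numbers are unique.

```python
# Hypothetical helper for illustration; not part of openprotein.
# Assumptions: standard fixed-column PDB ATOM records and a single chain.
def ca_bfactors(pdb_text: str) -> dict[int, float]:
    """Map residue number to the B-factor of its C-alpha atom."""
    values = {}
    for line in pdb_text.splitlines():
        # Columns 13-16 hold the atom name, 23-26 the residue number,
        # and 61-66 the B-factor (1-based, per the PDB format).
        if line.startswith("ATOM") and line[12:16].strip() == "CA":
            values[int(line[22:26])] = float(line[60:66])
    return values


# Two records copied from the output above
sample = "\n".join([
    "ATOM     10  CA  TYR A   2      -0.479 -21.837 -10.591  1.00 69.89           C",
    "ATOM     11  C   TYR A   2      -1.672 -21.177  -9.910  1.00 51.02           C",
])
print(ca_bfactors(sample))
```

With the variable from this tutorial, the call would be `ca_bfactors(result.decode())`.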

Visualize the structure using molviewspec:

[6]:
%pip install molviewspec
Requirement already satisfied: molviewspec in /home/jmage/Projects/openprotein/openprotein-python-private/.pixi/envs/dev/lib/python3.12/site-packages (1.6.0)
Requirement already satisfied: pydantic<3,>=1 in /home/jmage/Projects/openprotein/openprotein-python-private/.pixi/envs/dev/lib/python3.12/site-packages (from molviewspec) (2.11.4)
Requirement already satisfied: annotated-types>=0.6.0 in /home/jmage/Projects/openprotein/openprotein-python-private/.pixi/envs/dev/lib/python3.12/site-packages (from pydantic<3,>=1->molviewspec) (0.7.0)
Requirement already satisfied: pydantic-core==2.33.2 in /home/jmage/Projects/openprotein/openprotein-python-private/.pixi/envs/dev/lib/python3.12/site-packages (from pydantic<3,>=1->molviewspec) (2.33.2)
Requirement already satisfied: typing-extensions>=4.12.2 in /home/jmage/Projects/openprotein/openprotein-python-private/.pixi/envs/dev/lib/python3.12/site-packages (from pydantic<3,>=1->molviewspec) (4.13.2)
Requirement already satisfied: typing-inspection>=0.4.0 in /home/jmage/Projects/openprotein/openprotein-python-private/.pixi/envs/dev/lib/python3.12/site-packages (from pydantic<3,>=1->molviewspec) (0.4.0)
Note: you may need to restart the kernel to use updated packages.
[7]:
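from molviewspec import create_builder

builder = create_builder()
# Describe the view: load the structure by name, parse it as PDB,
# and color the default representation blue
structure = builder.download(url="mystructure.pdb")\
    .parse(format="pdb")\
    .model_structure()\
    .component()\
    .representation()\
    .color(color="blue")
# Pass the PDB bytes in memory under the same name used in download()
builder.molstar_notebook(data={'mystructure.pdb': result}, width=500, height=400)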

Next steps

Compare the predicted structure with a reference structure, try another structure predictor such as AlphaFold2, or save the structure for future use:

[8]:
with open("mystructure.pdb", "wb") as f:
    f.write(result)
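
After saving, a quick read-back can confirm that the file contains atom records. The following is a minimal sketch; `save_pdb` is a hypothetical helper, and the demo bytes are a made-up placeholder standing in for the `result` bytes from this tutorial.

```python
import tempfile
from pathlib import Path


# Hypothetical helper for illustration; not part of openprotein.
def save_pdb(pdb_bytes: bytes, path) -> int:
    """Write PDB bytes to path and return the number of ATOM records read back."""
    path = Path(path)
    path.write_bytes(pdb_bytes)
    lines = path.read_text(encoding="utf-8").splitlines()
    return sum(line.startswith("ATOM") for line in lines)


# Placeholder content, not a real prediction
demo_bytes = b"HEADER    demo\nATOM      1  N   MET A   1\nEND\n"
with tempfile.TemporaryDirectory() as tmp:
    n_atoms = save_pdb(demo_bytes, Path(tmp) / "mystructure.pdb")
    print(n_atoms)
```

In the notebook, `save_pdb(result, "mystructure.pdb")` would write the predicted structure and report how many ATOM records it contains.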