
Using Boltz#

This tutorial demonstrates how to use the Boltz-2 model to predict the structure of a molecular complex, including proteins and ligands. We will also show how to request and retrieve predicted binding affinities and other quality metrics.

What you need before getting started#

First, ensure you have an active OpenProtein session. Then, import the necessary classes for defining the components of your complex.

[1]:
import openprotein
from openprotein.protein import Protein
from openprotein.chains import Ligand

# Login to your session
session = openprotein.connect()

Defining the Molecules#

Boltz-2 can model various molecule types, including proteins, ligands, DNA, and RNA. For this example, we’ll predict the structure of a protein dimer in complex with a ligand.

We will define one protein sequence and one ligand. With Boltz models, a Protein is treated as an oligomer when its chain_id is a list of several IDs, one per copy of the chain. Here chain_id is ["A", "B"], so the protein is modeled as a dimer of two identical chains.

Note that for affinity prediction, the binding ligand must have a single string as its chain_id, and that ID must be unique.

[2]:
# Define the proteins
proteins = [
    Protein(sequence="MVTPEGNVSLVDESLLVGVTDEDRAVRSAHQFYERLIGLWAPAVMEAAHELGVFAALAEAPADSGELARRLDCDARAMRVLLDALYAYDVIDRIHDTNGFRYLLSAEARECLLPGTLFSLVGKFMHDINVAWPAWRNLAEVVRHGARDTSGAESPNGIAQEDYESLVGGINFWAPPIVTTLSRKLRASGRSGDATASVLDVGCGTGLYSQLLLREFPRWTATGLDVERIATLANAQALRLGVEERFATRAGDFWRGGWGTGYDLVLFANIFHLQTPASAVRLMRHAAACLAPDGLVAVVDQIVDADREPKTPQDRFALLFAASMTNTGGGDAYTFQEYEEWFTAAGLQRIETLDTPMHRILLARRATEPSAVPEGQASENLYFQ"),
]
proteins[0].chain_id = ["A", "B"]

# You can also mark a protein as cyclic by setting its cyclic property:
# proteins[0].cyclic = True

# Define the ligand
# We use the three-letter CCD (Chemical Component Dictionary) code for S-adenosyl-L-homocysteine (SAH)
# The chain_id 'C' is the "binder" we will reference later.
ligands = [
    Ligand(ccd="SAH", chain_id="C")
]

Predicting the Complex Structure and Affinity#

Now, we can call the fold method on the Boltz-2 model.

The key steps are:

  1. Access the model via session.fold.boltz2.

  2. Pass the defined proteins and ligands.

  3. To request binding affinity prediction, include the properties argument. This argument takes a list of dictionaries. For affinity, you specify the binder, which must match the chain_id of a ligand you defined.
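The constraint in step 3 can be checked locally before a job is submitted. The following is a minimal plain-Python sketch: `build_affinity_properties` is a hypothetical helper written for this illustration, not part of the openprotein client, and it only reproduces the dictionary layout used in this tutorial.

```python
def build_affinity_properties(binder, ligand_chain_ids):
    """Build the `properties` payload for an affinity request.

    Hypothetical helper for illustration only; it is not part of the
    openprotein client.
    """
    # The binder must be one chain ID string, not a list of IDs.
    if not isinstance(binder, str):
        raise TypeError("binder must be a single chain_id string")
    # The binder must match exactly one ligand in the complex.
    matches = list(ligand_chain_ids).count(binder)
    if matches != 1:
        raise ValueError(
            f"binder {binder!r} matches {matches} ligand chain IDs, expected 1"
        )
    return [{"affinity": {"binder": binder}}]


# Same chain ID as the SAH ligand defined earlier in this tutorial.
properties = build_affinity_properties("C", ligand_chain_ids=["C"])
print(properties)
```

The returned list has the same shape as the `properties` argument passed to the fold call, so a mistyped binder fails locally instead of after submission.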

[4]:
# Request the fold, including an affinity prediction for our ligand.
fold_job = session.fold.boltz2.fold(
    proteins=proteins,
    ligands=ligands,
    properties=[{"affinity": {"binder": "C"}}]
)
fold_job
[4]:
FoldJob(num_records=1, job_id='ab5f4f6f-6acc-4dcb-a008-523fdea02c5b', job_type=<JobType.embeddings_fold: '/embeddings/fold'>, status=<JobStatus.PENDING: 'PENDING'>, created_date=datetime.datetime(2025, 8, 21, 14, 15, 1, 358781, tzinfo=TzInfo(UTC)), start_date=None, end_date=None, prerequisite_job_id=None, progress_message=None, progress_counter=0, sequence_length=None)

The call returns a FoldComplexResultFuture object immediately. This is a reference to your job running on the OpenProtein platform. You can monitor its status or wait for it to complete.

[5]:
# Wait for the job to finish
fold_job.wait_until_done(verbose=True)
Waiting: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████| 100/100 [09:26<00:00,  5.66s/it, status=SUCCESS]
[5]:
True

Retrieving the Results#

Once the job is complete, you can retrieve the various outputs from the future object.

Getting the Structure File
The primary result is the predicted structure, which you can retrieve as an mmCIF file. Note that mmCIF is currently the only output format implemented for Boltz.
[6]:
# Get the result as an mmCIF bytestring
result = fold_job.get()

print('\n'.join(result.decode().splitlines()[500:510])) # Print a few lines
ATOM 46 O O . ASN A 1 7 ? -9.37756 -6.15156 -7.90108 1 68.851 ? 7 A 1
ATOM 47 C CB . ASN A 1 7 ? -8.77594 -5.10971 -10.70351 1 68.851 ? 7 A 1
ATOM 48 C CG . ASN A 1 7 ? -8.48088 -3.63107 -10.55621 1 68.851 ? 7 A 1
ATOM 49 O OD1 . ASN A 1 7 ? -8.00682 -3.16609 -9.515 1 68.851 ? 7 A 1
ATOM 50 N ND2 . ASN A 1 7 ? -8.77729 -2.86805 -11.60447 1 68.851 ? 7 A 1
ATOM 51 N N . VAL A 1 8 ? -7.36867 -5.19251 -7.56931 1 78.911 ? 8 A 1
ATOM 52 C CA . VAL A 1 8 ? -7.60041 -4.94319 -6.14561 1 78.911 ? 8 A 1
ATOM 53 C C . VAL A 1 8 ? -7.8049 -3.45969 -5.84513 1 78.911 ? 8 A 1
ATOM 54 O O . VAL A 1 8 ? -7.64426 -3.02562 -4.70211 1 78.911 ? 8 A 1
ATOM 55 C CB . VAL A 1 8 ? -6.46385 -5.52776 -5.2703 1 78.911 ? 8 A 1
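To make the atom records above easier to read, here is a small sketch that splits two of them into named fields. The column positions are an assumption inferred from the printed lines, since the mmCIF column header is not shown here. For real work, a dedicated mmCIF parser is a safer choice than splitting on whitespace.

```python
# Two atom records copied from the output above.
sample = """\
ATOM 46 O O . ASN A 1 7 ? -9.37756 -6.15156 -7.90108 1 68.851 ? 7 A 1
ATOM 51 N N . VAL A 1 8 ? -7.36867 -5.19251 -7.56931 1 78.911 ? 8 A 1
"""


def parse_atom_line(line):
    """Split one atom record into named fields.

    Column positions are assumed from the sample lines, not read from
    the file's own column header.
    """
    fields = line.split()
    return {
        "atom_name": fields[3],
        "residue": fields[5],
        "chain": fields[6],
        "seq_id": int(fields[8]),
        "xyz": tuple(float(v) for v in fields[10:13]),
        "b_factor": float(fields[14]),
    }


atoms = [parse_atom_line(line) for line in sample.splitlines()]
for atom in atoms:
    print(atom["chain"], atom["residue"], atom["seq_id"], atom["b_factor"])
```

In this run, the value in the B-factor position looks like the pLDDT on a 0-100 scale: 68.851 for residue 7 and 78.911 for residue 8 match 0.6885 and 0.7891 in the pLDDT output printed later in this tutorial.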

Visualize the structure using molviewspec

[7]:
from molviewspec import create_builder
builder = create_builder()
structure = builder.download(url="mystructure.cif")\
    .parse(format="mmcif")\
    .model_structure()\
    .component()\
    .representation()\
    .color(color="blue")
builder.molstar_notebook(data={'mystructure.cif': result}, width=500, height=400)
Getting Confidence Metrics (pLDDT and PAE)
Boltz models provide confidence metrics compatible with AlphaFold.
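  • pLDDT (predicted Local Distance Difference Test) gives a per-residue confidence score. It is conventionally reported on a 0-100 scale; the array returned here uses a 0-1 scale, as the output below shows.

  • PAE (Predicted Aligned Error) gives the expected error between every pair of residues, returned as an array of shape (1, N, N).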
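As a reading aid for pLDDT, the sketch below maps scores to the confidence bands commonly used for AlphaFold models (above 90, 70 to 90, 50 to 70, below 50). `plddt_band` is a hypothetical helper for illustration, and the thresholds are the AlphaFold convention, not something the Boltz output defines. The sample values are copied from the pLDDT output of this tutorial's run, where they are fractions between 0 and 1, so they are multiplied by 100 first.

```python
def plddt_band(score):
    """Return a confidence label for a pLDDT score on the 0-100 scale."""
    if score > 90:
        return "very high"
    if score > 70:
        return "confident"
    if score > 50:
        return "low"
    return "very low"


# Values taken from this tutorial's run, reported as fractions (0-1).
sample_fractions = [0.53493166, 0.6206771, 0.7891059, 0.95369035]

# Rescale to 0-100 before applying the thresholds.
bands = [plddt_band(100 * s) for s in sample_fractions]
print(bands)
# → ['low', 'low', 'confident', 'very high']
```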

[8]:
# Retrieve the pLDDT scores
plddt_scores = fold_job.plddt
print("pLDDT scores shape:", plddt_scores.shape)
print("First 10 scores:", plddt_scores[0, :10])

# Retrieve the PAE matrix
pae_matrix = fold_job.pae
print("\nPAE matrix shape:", pae_matrix.shape)

pLDDT scores shape: (1, 794)
First 10 scores: [0.53493166 0.57479316 0.6206771  0.6189912  0.65894026 0.6582905
 0.68851155 0.7891059  0.873299   0.95369035]

PAE matrix shape: (1, 794, 794)
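The length 794 can be accounted for if each protein residue and each ligand heavy atom gets one entry. This is an inference from the numbers in this run, not something the output states, so treat the sketch below as a plausibility check and not as a description of the model's internals. It assumes the protein sequence defined earlier has 384 residues and that SAH has the formula C14H20N6O5S.

```python
# Assumed inputs for the plausibility check.
protein_length = 384  # len() of the protein sequence defined earlier
num_copies = 2        # chain_id = ["A", "B"], two copies of the chain

# Heavy (non-hydrogen) atoms in S-adenosyl-L-homocysteine, C14H20N6O5S.
sah_heavy_atom_counts = {"C": 14, "N": 6, "O": 5, "S": 1}
sah_heavy_atoms = sum(sah_heavy_atom_counts.values())

# One entry per residue in each chain, plus one per ligand heavy atom.
total_entries = protein_length * num_copies + sah_heavy_atoms
print(total_entries)
# → 794
```

If this reading is right, the last 26 positions of the pLDDT array and of each PAE axis belong to the ligand atoms, not to protein residues.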
Getting Predicted Binding Affinity
Since we requested it, we can now retrieve the predicted binding affinity. The result is a BoltzAffinity object containing detailed predictions.
[9]:
# Retrieve the affinity prediction
affinity_data = fold_job.affinity

print("Affinity for binder 'C':")
print(f"  predicted: {affinity_data.affinity_pred_value}")
print(f"  probability: {affinity_data.affinity_probability_binary}")
print(f"  per model: {affinity_data.per_model}")

Affinity for binder 'C':
  predicted: -1.828442931175232
  probability: 0.9927777051925659
  per model: {'affinity_pred_value1': -2.108689308166504, 'affinity_probability_binary1': 0.9957777261734009, 'affinity_pred_value2': -1.54819655418396, 'affinity_probability_binary2': 0.9897776246070862}
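The sketch below shows how the numbers in this output relate to each other. The top-level values match the mean of the two per-model values. The unit conversion is an assumption: upstream Boltz-2 documents `affinity_pred_value` as log10(IC50) with IC50 in μM, and this sketch assumes the OpenProtein output follows the same convention.

```python
# Per-model values copied from the output above.
per_model = {
    "affinity_pred_value1": -2.108689308166504,
    "affinity_probability_binary1": 0.9957777261734009,
    "affinity_pred_value2": -1.54819655418396,
    "affinity_probability_binary2": 0.9897776246070862,
}

# Average the two ensemble members.
mean_value = (
    per_model["affinity_pred_value1"] + per_model["affinity_pred_value2"]
) / 2
mean_probability = (
    per_model["affinity_probability_binary1"]
    + per_model["affinity_probability_binary2"]
) / 2

# Assumption: the value is log10(IC50) with IC50 expressed in micromolar.
ic50_um = 10 ** mean_value
ic50_nm = ic50_um * 1000

print(f"mean predicted value: {mean_value:.4f}")
print(f"mean binder probability: {mean_probability:.4f}")
print(f"IC50 under the log10(uM) assumption: {ic50_nm:.1f} nM")
```

Under that assumption, the prediction corresponds to an IC50 of roughly 15 nM. Lower (more negative) values indicate stronger predicted binding.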

Next Steps#

You can now examine the predicted structure, or use it for binder design with RFdiffusion on our platform. To save the predicted structure to disk:

[10]:
with open("mystructure.cif", "wb") as f:
    f.write(result)