openprotein.fold#

Create PDBs of your protein sequences via our folding models!

Note that for AlphaFold2 models, you will also need to use our align workflow.

Endpoints#

class openprotein.fold.FoldAPI[source]#

Fold API provides a high level interface for making protein structure predictions.

esmfold: ESMFoldModel#
alphafold2: AlphaFold2Model#
af2: AlphaFold2Model#
boltz_1: Boltz1Model#
boltz1: Boltz1Model#
boltz_1x: Boltz1xModel#
boltz1x: Boltz1xModel#
boltz_2: Boltz2Model#
boltz2: Boltz2Model#
__init__(session)[source]#
Parameters:

session (APISession)

list_models()[source]#

List the models available for creating folds of your sequences.

Return type:

list[FoldModel]

get_model(model_id)[source]#

Get model by model_id.

FoldModel supports the usual job manipulation, e.g. making POST and GET requests for this model specifically.

Parameters:

model_id (str) – the model identifier

Returns:

The model

Return type:

FoldModel

Raises:

HTTPError – If the GET request does not succeed.

get_results(job)[source]#

Retrieves the results of a fold job.

Parameters:

job (Job) – The fold job whose results are to be retrieved.

Returns:

An instance of FoldResultFuture

Return type:

FoldResultFuture
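The attribute shortcuts and `get_model` lookup above can be combined roughly as follows. This is a minimal sketch, not library code: `resolve_fold_model` is a hypothetical helper, and `fold_api` is assumed to be an already constructed FoldAPI instance. It tries the documented shortcut attributes first and falls back to `get_model(model_id)`.

```python
# Shortcut attributes listed on FoldAPI in this reference.
SHORTCUTS = frozenset({
    "esmfold", "alphafold2", "af2",
    "boltz_1", "boltz1", "boltz_1x", "boltz1x", "boltz_2", "boltz2",
})


def resolve_fold_model(fold_api, name):
    """Return a fold model by shortcut attribute or, failing that, by model ID.

    Hypothetical helper: `fold_api` is assumed to be a FoldAPI instance.
    """
    if name in SHORTCUTS:
        model = getattr(fold_api, name, None)
        if model is not None:
            return model
    # Per the reference, get_model raises HTTPError if the GET request fails.
    return fold_api.get_model(name)
```

Restricting the attribute lookup to the known shortcut names avoids accidentally returning an unrelated attribute such as a method.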

Models#

class openprotein.fold.Boltz2Model[source]#

Class providing inference endpoints for the Boltz-2 structure prediction model, which jointly models complex structures and binding affinities.

fold(proteins=None, dnas=None, rnas=None, ligands=None, diffusion_samples=1, recycling_steps=3, sampling_steps=200, step_scale=1.638, use_potentials=False, constraints=None, templates=None, properties=None, method=None)[source]#

Post sequences to Boltz-2 model.

Parameters:
  • proteins (List[Protein] | MSAFuture | None) – List of protein sequences to include in folded output. Protein objects must be tagged with an msa, which can be Protein.single_sequence_mode for single sequence mode. Alternatively, supply an MSAFuture to use all query sequences as a multimer.

  • dnas (List[DNA] | None) – List of DNA sequences to include in folded output.

  • rnas (List[RNA] | None) – List of RNA sequences to include in folded output.

  • ligands (List[Ligand] | None) – List of ligands to include in folded output.

  • diffusion_samples (int) – Number of diffusion samples to use.

  • recycling_steps (int) – Number of recycling steps to use.

  • sampling_steps (int) – Number of sampling steps to use.

  • step_scale (float) – Scaling factor for diffusion steps.

  • use_potentials (bool) – Whether or not to use potentials. Defaults to False.

  • constraints (list[dict] | None) – List of constraints. Defaults to None.

  • templates (list[dict] | None) – List of templates to use for structure prediction. Defaults to None.

  • properties (list[dict] | None) – List of additional properties to predict. Should match BoltzProperties. Defaults to None.

  • method (str | None) – The experimental method or supervision source used for the prediction. Defaults to None. Supported values (case-insensitive) include: ‘MD’, ‘X-RAY DIFFRACTION’, ‘ELECTRON MICROSCOPY’, ‘SOLUTION NMR’, ‘SOLID-STATE NMR’, ‘NEUTRON DIFFRACTION’, ‘ELECTRON CRYSTALLOGRAPHY’, ‘FIBER DIFFRACTION’, ‘POWDER DIFFRACTION’, ‘INFRARED SPECTROSCOPY’, ‘FLUORESCENCE TRANSFER’, ‘EPR’, ‘THEORETICAL MODEL’, ‘SOLUTION SCATTERING’, ‘OTHER’, ‘AFDB’, ‘BOLTZ-1’. View the documentation on Boltz for upstream details.

Returns:

Future for the folding result.

Return type:

FoldComplexResultFuture
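Because `method` accepts a fixed vocabulary, it can help to check the value before submitting. The sketch below is illustrative only: `normalize_method` and `fold_with_boltz2` are hypothetical helpers, `model` is assumed to be a Boltz2Model, and `proteins` is assumed to be a list of Protein objects already tagged with an MSA (or an MSAFuture). The reference introduces the values with "include", so the set below may not be exhaustive.

```python
# Values listed for `method` in the Boltz-2 reference (possibly not exhaustive).
SUPPORTED_METHODS = frozenset({
    "MD", "X-RAY DIFFRACTION", "ELECTRON MICROSCOPY", "SOLUTION NMR",
    "SOLID-STATE NMR", "NEUTRON DIFFRACTION", "ELECTRON CRYSTALLOGRAPHY",
    "FIBER DIFFRACTION", "POWDER DIFFRACTION", "INFRARED SPECTROSCOPY",
    "FLUORESCENCE TRANSFER", "EPR", "THEORETICAL MODEL",
    "SOLUTION SCATTERING", "OTHER", "AFDB", "BOLTZ-1",
})


def normalize_method(method):
    """Upper-case `method` and check it against the documented values."""
    if method is None:
        return None
    candidate = method.strip().upper()
    if candidate not in SUPPORTED_METHODS:
        raise ValueError(f"unsupported method: {method!r}")
    return candidate


def fold_with_boltz2(model, proteins, ligands=None, diffusion_samples=1, method=None):
    """Submit a Boltz-2 job using only parameters named in the reference.

    Hypothetical helper: `model` is assumed to be a Boltz2Model.
    """
    return model.fold(
        proteins=proteins,
        ligands=ligands,
        diffusion_samples=diffusion_samples,
        method=normalize_method(method),
    )
```

Since matching is documented as case-insensitive, passing the upper-cased value should be equivalent to passing the original string.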

__init__(session, model_id, metadata=None)#
Parameters:
  • session (APISession)

  • model_id (str)

  • metadata (ModelMetadata | None)

classmethod create(session, model_id, default=None, **kwargs)#

Create and return an instance of the appropriate FoldModel subclass based on the model_id.

Parameters:
  • session (APISession) – The API session to use.

  • model_id (str) – The model identifier.

  • default (type[FoldModel] or None, optional) – Default FoldModel subclass to use if no match is found.

  • **kwargs (dict, optional) – Additional keyword arguments to pass to the model constructor.

Returns:

An instance of the appropriate FoldModel subclass.

Return type:

FoldModel

Raises:

ValueError – If no suitable FoldModel subclass is found and no default is provided.

get_metadata()#

Get model metadata for this model.

Returns:

The metadata associated with this model.

Return type:

ModelMetadata

classmethod get_model()#

Get the model_id(s) for this FoldModel subclass.

Returns:

List of model_id strings associated with this class.

Return type:

list of str

property metadata#

Model metadata for this model.

Type:

ModelMetadata

class openprotein.fold.Boltz1xModel[source]#

Class providing inference endpoints for the Boltz-1x open-source structure prediction model, which adds the use of inference potentials to improve performance.

fold(proteins=None, dnas=None, rnas=None, ligands=None, diffusion_samples=1, recycling_steps=3, sampling_steps=200, step_scale=1.638, constraints=None)[source]#

Post sequences to Boltz-1x model. Uses potentials with Boltz-1 model.

Parameters:
  • proteins (List[Protein] | MSAFuture | None) – List of protein sequences to include in folded output. Protein objects must be tagged with an msa, which can be Protein.single_sequence_mode for single sequence mode. Alternatively, supply an MSAFuture to use all query sequences as a multimer.

  • dnas (List[DNA] | None) – List of DNA sequences to include in folded output.

  • rnas (List[RNA] | None) – List of RNA sequences to include in folded output.

  • ligands (List[Ligand] | None) – List of ligands to include in folded output.

  • diffusion_samples (int) – Number of diffusion samples to use.

  • recycling_steps (int) – Number of recycling steps to use.

  • sampling_steps (int) – Number of sampling steps to use.

  • step_scale (float) – Scaling factor for diffusion steps.

  • constraints (Optional[List[dict]]) – List of constraints.

Returns:

Future for the folding complex result.

Return type:

FoldComplexResultFuture
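One way to choose between the Boltz-1 and Boltz-1x endpoints is sketched below. It is an illustration under assumptions: `fold_with_boltz1` is a hypothetical helper, `fold_api` is assumed to be a FoldAPI instance exposing the documented `boltz_1` and `boltz_1x` attributes, and any extra keyword arguments are forwarded unchanged to `fold`.

```python
def fold_with_boltz1(fold_api, proteins, with_potentials=False, **fold_kwargs):
    """Fold with Boltz-1x when potentials are wanted, otherwise with Boltz-1.

    Hypothetical helper: `fold_api` is assumed to be a FoldAPI instance.
    """
    # Boltz-1x is described as Boltz-1 plus inference potentials.
    model = fold_api.boltz_1x if with_potentials else fold_api.boltz_1
    return model.fold(proteins=proteins, **fold_kwargs)
```

Keyword arguments such as `diffusion_samples` or `recycling_steps` pass straight through, so the caller keeps control of the sampling settings.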

__init__(session, model_id, metadata=None)#
Parameters:
  • session (APISession)

  • model_id (str)

  • metadata (ModelMetadata | None)

classmethod create(session, model_id, default=None, **kwargs)#

Create and return an instance of the appropriate FoldModel subclass based on the model_id.

Parameters:
  • session (APISession) – The API session to use.

  • model_id (str) – The model identifier.

  • default (type[FoldModel] or None, optional) – Default FoldModel subclass to use if no match is found.

  • **kwargs (dict, optional) – Additional keyword arguments to pass to the model constructor.

Returns:

An instance of the appropriate FoldModel subclass.

Return type:

FoldModel

Raises:

ValueError – If no suitable FoldModel subclass is found and no default is provided.

get_metadata()#

Get model metadata for this model.

Returns:

The metadata associated with this model.

Return type:

ModelMetadata

classmethod get_model()#

Get the model_id(s) for this FoldModel subclass.

Returns:

List of model_id strings associated with this class.

Return type:

list of str

property metadata#

Model metadata for this model.

Type:

ModelMetadata

class openprotein.fold.Boltz1Model[source]#

Class providing inference endpoints for the Boltz-1 open-source structure prediction model.

fold(proteins=None, dnas=None, rnas=None, ligands=None, diffusion_samples=1, recycling_steps=3, sampling_steps=200, step_scale=1.638, use_potentials=False, constraints=None)[source]#

Post sequences to Boltz-1 model.

Parameters:
  • proteins (List[Protein] | MSAFuture | None) – List of protein sequences to include in folded output. Protein objects must be tagged with an msa, which can be Protein.single_sequence_mode for single sequence mode. Alternatively, supply an MSAFuture to use all query sequences as a multimer.

  • dnas (List[DNA] | None) – List of DNA sequences to include in folded output.

  • rnas (List[RNA] | None) – List of RNA sequences to include in folded output.

  • ligands (List[Ligand] | None) – List of ligands to include in folded output.

  • diffusion_samples (int) – Number of diffusion samples to use.

  • recycling_steps (int) – Number of recycling steps to use.

  • sampling_steps (int) – Number of sampling steps to use.

  • step_scale (float) – Scaling factor for diffusion steps.

  • use_potentials (bool) – Whether or not to use potentials. Defaults to False.

  • constraints (Optional[List[dict]]) – List of constraints.

Returns:

Future for the folding complex result.

Return type:

FoldComplexResultFuture

__init__(session, model_id, metadata=None)#
Parameters:
  • session (APISession)

  • model_id (str)

  • metadata (ModelMetadata | None)

classmethod create(session, model_id, default=None, **kwargs)#

Create and return an instance of the appropriate FoldModel subclass based on the model_id.

Parameters:
  • session (APISession) – The API session to use.

  • model_id (str) – The model identifier.

  • default (type[FoldModel] or None, optional) – Default FoldModel subclass to use if no match is found.

  • **kwargs (dict, optional) – Additional keyword arguments to pass to the model constructor.

Returns:

An instance of the appropriate FoldModel subclass.

Return type:

FoldModel

Raises:

ValueError – If no suitable FoldModel subclass is found and no default is provided.

get_metadata()#

Get model metadata for this model.

Returns:

The metadata associated with this model.

Return type:

ModelMetadata

classmethod get_model()#

Get the model_id(s) for this FoldModel subclass.

Returns:

List of model_id strings associated with this class.

Return type:

list of str

property metadata#

Model metadata for this model.

Type:

ModelMetadata

class openprotein.fold.AlphaFold2Model[source]#

Class providing inference endpoints for AlphaFold2 structure prediction models, based on the implementation by ColabFold.

__init__(session, model_id, metadata=None)[source]#
Parameters:
  • session (APISession)

  • model_id (str)

  • metadata (ModelMetadata | None)

fold(proteins=None, num_recycles=None, num_models=1, num_relax=0, **kwargs)[source]#

Post sequences to the AlphaFold2 model.

Parameters:
  • proteins (List[Protein] | MSAFuture) – List of protein sequences to fold. Protein objects must be tagged with an msa. Alternatively, supply an MSAFuture to use all query sequences as a multimer.

  • num_recycles (int) – Number of times to recycle models.

  • num_models (int) – Number of models to run; the best model will be used.

  • max_msa (Union[str, int]) – maximum number of sequences in the msa to use.

  • relax_max_iterations (int) – maximum number of iterations

  • num_relax (int)

Returns:

job

Return type:

Job
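The submission and retrieval steps for AlphaFold2 can be sketched as below. This follows the documented return types literally (`fold` returns a Job, and `FoldAPI.get_results(job)` turns it into a result future); `fold_with_alphafold2` is a hypothetical helper, `fold_api` is assumed to be a FoldAPI instance, and `msa` is assumed to be an MSAFuture produced by the align workflow.

```python
def fold_with_alphafold2(fold_api, msa, num_recycles=None, num_models=1, num_relax=0):
    """Submit an AlphaFold2 job for `msa` and return a result future.

    Hypothetical helper: `msa` is assumed to be an MSAFuture from the
    align workflow, and `fold_api` a FoldAPI instance.
    """
    job = fold_api.alphafold2.fold(
        proteins=msa,
        num_recycles=num_recycles,
        num_models=num_models,
        num_relax=num_relax,
    )
    # Per the reference, get_results(job) retrieves the results of a fold job.
    return fold_api.get_results(job)
```

If the installed client already returns a future from `fold`, the `get_results` step may be unnecessary.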

classmethod create(session, model_id, default=None, **kwargs)#

Create and return an instance of the appropriate FoldModel subclass based on the model_id.

Parameters:
  • session (APISession) – The API session to use.

  • model_id (str) – The model identifier.

  • default (type[FoldModel] or None, optional) – Default FoldModel subclass to use if no match is found.

  • **kwargs (dict, optional) – Additional keyword arguments to pass to the model constructor.

Returns:

An instance of the appropriate FoldModel subclass.

Return type:

FoldModel

Raises:

ValueError – If no suitable FoldModel subclass is found and no default is provided.

get_metadata()#

Get model metadata for this model.

Returns:

The metadata associated with this model.

Return type:

ModelMetadata

classmethod get_model()#

Get the model_id(s) for this FoldModel subclass.

Returns:

List of model_id strings associated with this class.

Return type:

list of str

property metadata#

Model metadata for this model.

Type:

ModelMetadata

class openprotein.fold.ESMFoldModel[source]#

Class providing inference endpoints for Facebook’s ESMFold structure prediction models.

model_id: str = 'esmfold'#
__init__(session, model_id, metadata=None)[source]#
Parameters:
  • session (APISession)

  • model_id (str)

  • metadata (ModelMetadata | None)

fold(sequences, num_recycles=None)[source]#

Fold sequences using this model.

Parameters:
  • sequences (Sequence[bytes | str]) – sequences to fold

  • num_recycles (int | None) – number of times to recycle models

Return type:

FoldResultFuture
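Since `fold` accepts either `bytes` or `str`, inputs can be normalized before submission. The following is a minimal sketch with assumed names: `fold_with_esmfold` is a hypothetical helper and `model` is assumed to be an ESMFoldModel. It strips whitespace, drops empty and duplicate entries, and submits the sequences as bytes.

```python
def fold_with_esmfold(model, sequences, num_recycles=None):
    """Clean, de-duplicate and submit `sequences` to an ESMFold model.

    Hypothetical helper: `model` is assumed to be an ESMFoldModel.
    """
    cleaned = []
    seen = set()
    for seq in sequences:
        # Accept str or bytes, as the reference allows both.
        text = seq.decode() if isinstance(seq, bytes) else seq
        text = text.strip()
        if text and text not in seen:
            seen.add(text)
            cleaned.append(text.encode())
    if not cleaned:
        raise ValueError("no sequences to fold")
    return model.fold(cleaned, num_recycles=num_recycles)
```

De-duplicating first keeps the order of first appearance and avoids folding the same sequence twice.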

classmethod create(session, model_id, default=None, **kwargs)#

Create and return an instance of the appropriate FoldModel subclass based on the model_id.

Parameters:
  • session (APISession) – The API session to use.

  • model_id (str) – The model identifier.

  • default (type[FoldModel] or None, optional) – Default FoldModel subclass to use if no match is found.

  • **kwargs (dict, optional) – Additional keyword arguments to pass to the model constructor.

Returns:

An instance of the appropriate FoldModel subclass.

Return type:

FoldModel

Raises:

ValueError – If no suitable FoldModel subclass is found and no default is provided.

get_metadata()#

Get model metadata for this model.

Returns:

The metadata associated with this model.

Return type:

ModelMetadata

classmethod get_model()#

Get the model_id(s) for this FoldModel subclass.

Returns:

List of model_id strings associated with this class.

Return type:

list of str

property metadata#

Model metadata for this model.

Type:

ModelMetadata

Results#

class openprotein.fold.FoldResultFuture[source]#

Fold results represented as a future.

job#

The fold job associated with this future.

Type:

FoldJob

__init__(session, job, sequences=None, max_workers=10)[source]#

Initialize a FoldResultFuture instance.

Parameters:
  • session (APISession) – The API session to use for requests.

  • job (FoldJob) – The fold job associated with this future.

  • sequences (list[bytes], optional) – List of sequences submitted for the fold request. If None, sequences will be fetched.

  • max_workers (int, optional) – Maximum number of concurrent workers. Default is config.MAX_CONCURRENT_WORKERS.

classmethod create(session, job, **kwargs)[source]#

Factory method to create a FoldResultFuture or FoldComplexResultFuture.

Parameters:
  • session (APISession) – The API session to use for requests.

  • job (FoldJob) – The fold job associated with this future.

  • **kwargs – Additional keyword arguments.

Returns:

An instance of FoldResultFuture or FoldComplexResultFuture depending on the model.

Return type:

FoldResultFuture or FoldComplexResultFuture

property sequences: list[bytes]#

Get the sequences submitted for the fold request.

Returns:

List of sequences.

Return type:

list[bytes]

property model_id: str#

Get the model ID used for the fold request.

Returns:

Model ID.

Return type:

str

property id#

Get the ID of the fold request.

Returns:

Fold job ID.

Return type:

str

keys()[source]#

Get the list of sequences submitted for the fold request.

Returns:

List of sequences.

Return type:

list[bytes]

get(verbose=False)[source]#

Retrieve the fold results as a list of tuples mapping sequence to PDB-encoded string.

Parameters:

verbose (bool, optional) – If True, print verbose output. Default is False.

Returns:

List of tuples mapping sequence to PDB-encoded string.

Return type:

list[tuple[str, str]]

get_item(sequence)[source]#

Get fold results for a specified sequence.

Parameters:

sequence (bytes) – Sequence to fetch results for.

Returns:

Fold result for the specified sequence.

Return type:

bytes
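The pairs returned by `get()` can be written to disk along these lines. This is a sketch under assumptions: `save_fold_results` and the `fold_NNNN.pdb` naming scheme are hypothetical, and `future` is assumed to be a completed FoldResultFuture.

```python
from pathlib import Path


def save_fold_results(future, out_dir):
    """Write each (sequence, structure) pair from `future.get()` to a .pdb file.

    Hypothetical helper: `future` is assumed to be a completed FoldResultFuture.
    """
    out_path = Path(out_dir)
    out_path.mkdir(parents=True, exist_ok=True)
    written = []
    for index, (_sequence, pdb) in enumerate(future.get()):
        # Structures are documented as str, but tolerate bytes as well.
        text = pdb.decode() if isinstance(pdb, bytes) else pdb
        target = out_path / f"fold_{index:04d}.pdb"
        target.write_text(text)
        written.append(target)
    return written
```

Files are numbered by position rather than named after the sequence, because sequences can be too long to use as file names.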

cancelled()#

Check if job is cancelled.

Return type:

bool

done()#

Check if job is complete

Return type:

bool

refresh()#

Refresh job status.

stream()#

Retrieve results for this job as a stream.

wait(interval=5, timeout=None, verbose=False)#

Wait for job to complete, then fetch results.

Parameters:
  • interval (int, optional) – time between polling. Defaults to config.POLLING_INTERVAL.

  • timeout (int, optional) – max time to wait. Defaults to None.

  • verbose (bool, optional) – verbosity flag. Defaults to False.

Returns:

results of job

Return type:

results

wait_until_done(interval=5, timeout=None, verbose=False)#

Wait for job to complete. Do not fetch results (unlike wait())

Parameters:
  • interval (int, optional) – time between polling. Defaults to config.POLLING_INTERVAL.

  • timeout (int, optional) – max time to wait. Defaults to None.

  • verbose (bool, optional) – verbosity flag. Defaults to False.

Returns:

results of job

Return type:

results
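When more control is needed than `wait()` or `wait_until_done()` offer, the status methods can be combined into a manual polling loop. The sketch below is an assumption-laden illustration: `poll_until_done` is a hypothetical helper, `future` is assumed to expose the documented `done()`, `cancelled()` and `refresh()` methods, and the `sleep` parameter exists only so the delay can be swapped out in tests.

```python
import time


def poll_until_done(future, interval=5.0, timeout=None, sleep=time.sleep):
    """Poll `future` until it finishes; return False if it was cancelled.

    Hypothetical helper. Raises TimeoutError if `timeout` seconds elapse first.
    """
    deadline = None if timeout is None else time.monotonic() + timeout
    while not (future.done() or future.cancelled()):
        if deadline is not None and time.monotonic() >= deadline:
            raise TimeoutError("fold job did not finish in time")
        sleep(interval)
        # Ask the server for the latest job status before checking again.
        future.refresh()
    return not future.cancelled()
```

The loop checks `cancelled()` separately because this reference does not state whether `done()` also reports cancelled jobs.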

class openprotein.fold.FoldComplexResultFuture[source]#

Future for manipulating results of a fold complex request.

job#

The fold job associated with this future.

Type:

FoldJob

__init__(session, job, model_id=None, proteins=None, ligands=None, dnas=None, rnas=None)[source]#

Initialize a FoldComplexResultFuture instance.

Parameters:
  • session (APISession) – The API session to use for requests.

  • job (FoldJob) – The fold job associated with this future.

  • model_id (str, optional) – Model ID used for the fold request.

  • proteins (list[Protein], optional) – List of proteins submitted for fold request.

  • ligands (list[Ligand], optional) – List of ligands submitted for fold request.

  • dnas (list[DNA], optional) – List of DNAs submitted for fold request.

  • rnas (list[RNA], optional) – List of RNAs submitted for fold request.

property model_id: str#

Get the model ID used for the fold request.

Returns:

Model ID.

Return type:

str

property proteins: list[Protein] | None#

Get the proteins submitted for the fold request.

Returns:

List of Protein objects or None.

Return type:

list[Protein] or None

property dnas: list[DNA] | None#

Get the DNAs submitted for the fold request.

Returns:

List of DNA objects or None.

Return type:

list[DNA] or None

property rnas: list[RNA] | None#

Get the RNAs submitted for the fold request.

Returns:

List of RNA objects or None.

Return type:

list[RNA] or None

property ligands: list[Ligand] | None#

Get the ligands submitted for the fold request.

Returns:

List of Ligand objects or None.

Return type:

list[Ligand] or None

property pae: ndarray#

Get the Predicted Aligned Error (PAE) matrix.

Returns:

PAE matrix.

Return type:

np.ndarray

Raises:

AttributeError – If PAE is not supported for the model.

property pde: ndarray#

Get the Predicted Distance Error (PDE) matrix.

Returns:

PDE matrix.

Return type:

np.ndarray

Raises:

AttributeError – If PDE is not supported for the model.

property plddt: ndarray#

Get the Predicted Local Distance Difference Test (pLDDT) scores.

Returns:

pLDDT scores.

Return type:

np.ndarray

Raises:

AttributeError – If pLDDT is not supported for the model.

property confidence: list[BoltzConfidence]#

Retrieve the confidences of the structure prediction.

Note

This is only currently supported for Boltz models.

Returns:

List of BoltzConfidence objects.

Return type:

list[BoltzConfidence]

Raises:

AttributeError – If confidence is not supported for the model.

property affinity: BoltzAffinity#

Retrieve the predicted binding affinities.

Note

This is only currently supported for Boltz models.

Returns:

BoltzAffinity object containing the predicted affinities.

Return type:

BoltzAffinity

Raises:

AttributeError – If affinity is not supported for the model.
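Because each of these properties raises AttributeError when the model does not support it, code that handles several models can probe them defensively. A minimal sketch, assuming `future` is a completed FoldComplexResultFuture; `available_metrics` is a hypothetical helper, not part of the library.

```python
def available_metrics(future, names=("plddt", "pae", "pde")):
    """Return {name: value} for the metrics the underlying model supports.

    Hypothetical helper: `future` is assumed to be a completed
    FoldComplexResultFuture.
    """
    metrics = {}
    for name in names:
        try:
            metrics[name] = getattr(future, name)
        except AttributeError:
            # Documented behaviour when the model does not provide the metric.
            continue
    return metrics
```

The same pattern applies to `confidence` and `affinity` by passing those names in `names`.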

property id#

Get the ID of the fold request.

Returns:

Fold job ID.

Return type:

str

get(format='mmcif', verbose=False)[source]#

Retrieve the fold results as a single bytestring.

Defaults to mmCIF for complexes. Additional predicted properties like plddt and pae should be accessed from their respective properties, i.e. .plddt and .pae.

Parameters:
  • format ({'pdb', 'mmcif'}, optional) – Output format. Default is ‘mmcif’.

  • verbose (bool, optional) – If True, print verbose output. Default is False.

Returns:

Fold result as a bytestring.

Return type:

bytes
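Saving the returned bytestring could look like the sketch below. `save_complex` is a hypothetical helper and `future` is assumed to be a completed FoldComplexResultFuture; the `.cif` and `.pdb` suffixes are the conventional extensions for the two documented formats.

```python
from pathlib import Path

# Conventional file suffixes for the two documented output formats.
_SUFFIXES = {"mmcif": ".cif", "pdb": ".pdb"}


def save_complex(future, stem, format="mmcif"):
    """Fetch the folded complex in `format` and write it to `stem` + suffix.

    Hypothetical helper: `future` is assumed to be a completed
    FoldComplexResultFuture.
    """
    if format not in _SUFFIXES:
        raise ValueError(f"format must be one of {sorted(_SUFFIXES)}")
    data = future.get(format=format)
    target = Path(str(stem) + _SUFFIXES[format])
    target.parent.mkdir(parents=True, exist_ok=True)
    # get() is documented to return bytes, so write in binary mode.
    target.write_bytes(data)
    return target
```

Validating `format` locally gives an immediate error instead of a failed request.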

cancelled()#

Check if job is cancelled.

Return type:

bool

done()#

Check if job is complete

Return type:

bool

refresh()#

Refresh job status.

wait(interval=5, timeout=None, verbose=False)#

Wait for job to complete, then fetch results.

Parameters:
  • interval (int, optional) – time between polling. Defaults to config.POLLING_INTERVAL.

  • timeout (int, optional) – max time to wait. Defaults to None.

  • verbose (bool, optional) – verbosity flag. Defaults to False.

Returns:

results of job

Return type:

results

wait_until_done(interval=5, timeout=None, verbose=False)#

Wait for job to complete. Do not fetch results (unlike wait())

Parameters:
  • interval (int, optional) – time between polling. Defaults to config.POLLING_INTERVAL.

  • timeout (int, optional) – max time to wait. Defaults to None.

  • verbose (bool, optional) – verbosity flag. Defaults to False.

Returns:

results of job

Return type:

results