openprotein.fold#
Create PDB structures from your protein sequences with our folding models.
Note that AlphaFold2 models also require an MSA, which you create with our align workflow.
Endpoints#
- class openprotein.fold.FoldAPI[source]#
Fold API provides a high level interface for making protein structure predictions.
- esmfold: ESMFoldModel#
- alphafold2: AlphaFold2Model#
- af2: AlphaFold2Model#
- boltz_1: Boltz1Model#
- boltz1: Boltz1Model#
- boltz_1x: Boltz1xModel#
- boltz1x: Boltz1xModel#
- boltz_2: Boltz2Model#
- boltz2: Boltz2Model#
- __init__(session)[source]#
- Parameters:
session (APISession)
- list_models()[source]#
List the models available for folding your sequences.
- Return type:
list[FoldModel]
- get_model(model_id)[source]#
Get model by model_id.
FoldModel supports the usual job operations, e.g. making POST and GET requests for this specific model.
- Parameters:
model_id (str) – the model identifier
- Returns:
The model
- Return type:
FoldModel
- Raises:
HTTPError – If the GET request does not succeed.
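A minimal sketch of looking up a fold model, assuming `fold_api` is an already constructed FoldAPI instance (creating the session is outside this section). The helper name `resolve_model` and the stub class used to exercise it are hypothetical; only the attribute names and `get_model` come from the listing above.

```python
from typing import Any

# Attribute names documented on FoldAPI above.
KNOWN_ATTRIBUTES = (
    "esmfold", "alphafold2", "af2",
    "boltz_1", "boltz1", "boltz_1x", "boltz1x", "boltz_2", "boltz2",
)


def resolve_model(fold_api: Any, name: str) -> Any:
    """Return a model by documented attribute name, else by model_id."""
    if name in KNOWN_ATTRIBUTES:
        return getattr(fold_api, name)
    # Fall back to an explicit lookup; this may raise if the request fails.
    return fold_api.get_model(name)


class _StubFoldAPI:
    """Hypothetical stand-in so the sketch runs without a live session."""

    esmfold = "esmfold-model"

    def get_model(self, model_id: str) -> str:
        return f"model:{model_id}"


stub = _StubFoldAPI()
by_attribute = resolve_model(stub, "esmfold")
by_id = resolve_model(stub, "some-model-id")
```

With a real FoldAPI instance the stub would be dropped and the returned object would be a FoldModel subclass.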
Models#
- class openprotein.fold.Boltz2Model[source]#
Class providing inference endpoints for the Boltz-2 structure prediction model, which jointly models complex structures and binding affinities.
- fold(proteins=None, dnas=None, rnas=None, ligands=None, diffusion_samples=1, recycling_steps=3, sampling_steps=200, step_scale=1.638, use_potentials=False, constraints=None, templates=None, properties=None, method=None)[source]#
Post sequences to the Boltz-2 model.
- Parameters:
proteins (List[Protein] | MSAFuture | None) – List of protein sequences to include in the folded output. Protein objects must be tagged with an msa, which can be Protein.single_sequence_mode for single-sequence mode. Alternatively, supply an MSAFuture to use all query sequences as a multimer.
dnas (List[DNA] | None) – List of DNA sequences to include in the folded output.
rnas (List[RNA] | None) – List of RNA sequences to include in the folded output.
ligands (List[Ligand] | None) – List of ligands to include in the folded output.
diffusion_samples (int) – Number of diffusion samples to use.
recycling_steps (int) – Number of recycling steps to use.
sampling_steps (int) – Number of sampling steps to use.
step_scale (float) – Scaling factor for diffusion steps.
use_potentials (bool) – Whether or not to use potentials. Defaults to False.
constraints (list[dict] | None) – List of constraints. Defaults to None.
templates (list[dict] | None) – List of templates to use for structure prediction. Defaults to None.
properties (list[dict] | None) – List of additional properties to predict. Should match the BoltzProperties
method (str | None) – The experimental method or supervision source used for the prediction. Defaults to None. Supported values (case-insensitive) include: ‘MD’, ‘X-RAY DIFFRACTION’, ‘ELECTRON MICROSCOPY’, ‘SOLUTION NMR’, ‘SOLID-STATE NMR’, ‘NEUTRON DIFFRACTION’, ‘ELECTRON CRYSTALLOGRAPHY’, ‘FIBER DIFFRACTION’, ‘POWDER DIFFRACTION’, ‘INFRARED SPECTROSCOPY’, ‘FLUORESCENCE TRANSFER’, ‘EPR’, ‘THEORETICAL MODEL’, ‘SOLUTION SCATTERING’, ‘OTHER’, ‘AFDB’, ‘BOLTZ-1’. See the Boltz documentation for upstream details.
- Returns:
Future for the folding result.
- Return type:
FoldComplexResultFuture
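The `method` values listed above can be checked locally before a request is sent. The following is a sketch under stated assumptions: `check_method` and `boltz2_kwargs` are hypothetical helpers, not part of the client, and the client or server may perform its own validation.

```python
from typing import Optional

# Supported values copied from the `method` parameter description above.
SUPPORTED_METHODS = frozenset({
    "MD", "X-RAY DIFFRACTION", "ELECTRON MICROSCOPY", "SOLUTION NMR",
    "SOLID-STATE NMR", "NEUTRON DIFFRACTION", "ELECTRON CRYSTALLOGRAPHY",
    "FIBER DIFFRACTION", "POWDER DIFFRACTION", "INFRARED SPECTROSCOPY",
    "FLUORESCENCE TRANSFER", "EPR", "THEORETICAL MODEL",
    "SOLUTION SCATTERING", "OTHER", "AFDB", "BOLTZ-1",
})


def check_method(method: Optional[str]) -> Optional[str]:
    """Validate `method` case-insensitively; None means 'not specified'."""
    if method is None:
        return None
    if method.upper() not in SUPPORTED_METHODS:
        raise ValueError(f"unsupported method: {method!r}")
    return method


def boltz2_kwargs(method: Optional[str] = None, diffusion_samples: int = 1,
                  use_potentials: bool = False) -> dict:
    """Collect keyword arguments using names from the fold() signature."""
    return {
        "method": check_method(method),
        "diffusion_samples": diffusion_samples,
        "use_potentials": use_potentials,
    }


kwargs = boltz2_kwargs(method="x-ray diffraction", diffusion_samples=2)
```

The resulting dictionary could then be unpacked into a Boltz2Model.fold call alongside the proteins argument.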
- __init__(session, model_id, metadata=None)#
- Parameters:
session (APISession)
model_id (str)
metadata (ModelMetadata | None)
- classmethod create(session, model_id, default=None, **kwargs)#
Create and return an instance of the appropriate FoldModel subclass based on the model_id.
- Parameters:
session (APISession) – The API session to use.
model_id (str) – The model identifier.
default (type[FoldModel] or None, optional) – Default FoldModel subclass to use if no match is found.
**kwargs (dict, optional) – Additional keyword arguments to pass to the model constructor.
- Returns:
An instance of the appropriate FoldModel subclass.
- Return type:
FoldModel
- Raises:
ValueError – If no suitable FoldModel subclass is found and no default is provided.
- get_metadata()#
Get model metadata for this model.
- Returns:
The metadata associated with this model.
- Return type:
ModelMetadata
- classmethod get_model()#
Get the model_id(s) for this FoldModel subclass.
- Returns:
List of model_id strings associated with this class.
- Return type:
list of str
- property metadata#
Model metadata for this model.
- Type:
ModelMetadata
- class openprotein.fold.Boltz1xModel[source]#
Class providing inference endpoints for the Boltz-1x open-source structure prediction model, which adds inference potentials to improve performance.
- fold(proteins=None, dnas=None, rnas=None, ligands=None, diffusion_samples=1, recycling_steps=3, sampling_steps=200, step_scale=1.638, constraints=None)[source]#
Post sequences to the Boltz-1x model, which runs the Boltz-1 model with potentials.
- Parameters:
proteins (List[Protein] | MSAFuture | None) – List of protein sequences to include in the folded output. Protein objects must be tagged with an msa, which can be Protein.single_sequence_mode for single-sequence mode. Alternatively, supply an MSAFuture to use all query sequences as a multimer.
dnas (List[DNA] | None) – List of DNA sequences to include in the folded output.
rnas (List[RNA] | None) – List of RNA sequences to include in the folded output.
ligands (List[Ligand] | None) – List of ligands to include in the folded output.
diffusion_samples (int) – Number of diffusion samples to use.
recycling_steps (int) – Number of recycling steps to use.
sampling_steps (int) – Number of sampling steps to use.
step_scale (float) – Scaling factor for diffusion steps.
constraints (Optional[List[dict]]) – List of constraints.
- Returns:
Future for the folding complex result.
- Return type:
FoldComplexResultFuture
- __init__(session, model_id, metadata=None)#
- Parameters:
session (APISession)
model_id (str)
metadata (ModelMetadata | None)
- classmethod create(session, model_id, default=None, **kwargs)#
Create and return an instance of the appropriate FoldModel subclass based on the model_id.
- Parameters:
session (APISession) – The API session to use.
model_id (str) – The model identifier.
default (type[FoldModel] or None, optional) – Default FoldModel subclass to use if no match is found.
**kwargs (dict, optional) – Additional keyword arguments to pass to the model constructor.
- Returns:
An instance of the appropriate FoldModel subclass.
- Return type:
FoldModel
- Raises:
ValueError – If no suitable FoldModel subclass is found and no default is provided.
- get_metadata()#
Get model metadata for this model.
- Returns:
The metadata associated with this model.
- Return type:
ModelMetadata
- classmethod get_model()#
Get the model_id(s) for this FoldModel subclass.
- Returns:
List of model_id strings associated with this class.
- Return type:
list of str
- property metadata#
Model metadata for this model.
- Type:
ModelMetadata
- class openprotein.fold.Boltz1Model[source]#
Class providing inference endpoints for the Boltz-1 open-source structure prediction model.
- fold(proteins=None, dnas=None, rnas=None, ligands=None, diffusion_samples=1, recycling_steps=3, sampling_steps=200, step_scale=1.638, use_potentials=False, constraints=None)[source]#
Post sequences to the Boltz-1 model.
- Parameters:
proteins (List[Protein] | MSAFuture | None) – List of protein sequences to include in the folded output. Protein objects must be tagged with an msa, which can be Protein.single_sequence_mode for single-sequence mode. Alternatively, supply an MSAFuture to use all query sequences as a multimer.
dnas (List[DNA] | None) – List of DNA sequences to include in the folded output.
rnas (List[RNA] | None) – List of RNA sequences to include in the folded output.
ligands (List[Ligand] | None) – List of ligands to include in the folded output.
diffusion_samples (int) – Number of diffusion samples to use.
recycling_steps (int) – Number of recycling steps to use.
sampling_steps (int) – Number of sampling steps to use.
step_scale (float) – Scaling factor for diffusion steps.
use_potentials (bool) – Whether or not to use potentials. Defaults to False.
constraints (Optional[List[dict]]) – List of constraints.
- Returns:
Future for the folding complex result.
- Return type:
FoldComplexResultFuture
- __init__(session, model_id, metadata=None)#
- Parameters:
session (APISession)
model_id (str)
metadata (ModelMetadata | None)
- classmethod create(session, model_id, default=None, **kwargs)#
Create and return an instance of the appropriate FoldModel subclass based on the model_id.
- Parameters:
session (APISession) – The API session to use.
model_id (str) – The model identifier.
default (type[FoldModel] or None, optional) – Default FoldModel subclass to use if no match is found.
**kwargs (dict, optional) – Additional keyword arguments to pass to the model constructor.
- Returns:
An instance of the appropriate FoldModel subclass.
- Return type:
FoldModel
- Raises:
ValueError – If no suitable FoldModel subclass is found and no default is provided.
- get_metadata()#
Get model metadata for this model.
- Returns:
The metadata associated with this model.
- Return type:
ModelMetadata
- classmethod get_model()#
Get the model_id(s) for this FoldModel subclass.
- Returns:
List of model_id strings associated with this class.
- Return type:
list of str
- property metadata#
Model metadata for this model.
- Type:
ModelMetadata
- class openprotein.fold.AlphaFold2Model[source]#
Class providing inference endpoints for AlphaFold2 structure prediction models, based on the implementation by ColabFold.
- __init__(session, model_id, metadata=None)[source]#
- Parameters:
session (APISession)
model_id (str)
metadata (ModelMetadata | None)
- fold(proteins=None, num_recycles=None, num_models=1, num_relax=0, **kwargs)[source]#
Post sequences to the AlphaFold2 model.
- Parameters:
proteins (List[Protein] | MSAFuture) – List of protein sequences to fold. Protein objects must be tagged with an msa. Alternatively, supply an MSAFuture to use all query sequences as a multimer.
num_recycles (int) – Number of times to recycle models.
num_models (int) – Number of models to run; the best model will be used.
max_msa (Union[str, int]) – Maximum number of sequences in the MSA to use.
relax_max_iterations (int) – Maximum number of relaxation iterations.
num_relax (int)
- Returns:
job
- Return type:
Job
- classmethod create(session, model_id, default=None, **kwargs)#
Create and return an instance of the appropriate FoldModel subclass based on the model_id.
- Parameters:
session (APISession) – The API session to use.
model_id (str) – The model identifier.
default (type[FoldModel] or None, optional) – Default FoldModel subclass to use if no match is found.
**kwargs (dict, optional) – Additional keyword arguments to pass to the model constructor.
- Returns:
An instance of the appropriate FoldModel subclass.
- Return type:
FoldModel
- Raises:
ValueError – If no suitable FoldModel subclass is found and no default is provided.
- get_metadata()#
Get model metadata for this model.
- Returns:
The metadata associated with this model.
- Return type:
ModelMetadata
- classmethod get_model()#
Get the model_id(s) for this FoldModel subclass.
- Returns:
List of model_id strings associated with this class.
- Return type:
list of str
- property metadata#
Model metadata for this model.
- Type:
ModelMetadata
- class openprotein.fold.ESMFoldModel[source]#
Class providing inference endpoints for Facebook’s ESMFold structure prediction models.
- model_id: str = 'esmfold'#
- __init__(session, model_id, metadata=None)[source]#
- Parameters:
session (APISession)
model_id (str)
metadata (ModelMetadata | None)
- fold(sequences, num_recycles=None)[source]#
Fold sequences using this model.
- Parameters:
sequences (Sequence[bytes | str]) – sequences to fold
num_recycles (int | None) – number of times to recycle models
- Return type:
FoldResultFuture
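Because `fold` accepts both `bytes` and `str` sequences, it can help to normalize input before submission. This is a sketch only: `clean_sequences` is a hypothetical helper, and the whitespace stripping, upper-casing, and de-duplication are assumptions about what is convenient, not requirements stated by the API.

```python
from typing import Iterable, List, Union


def clean_sequences(raw: Iterable[Union[str, bytes]]) -> List[str]:
    """Strip whitespace, upper-case, and drop duplicates, keeping order."""
    seen = set()
    cleaned = []
    for seq in raw:
        text = seq.decode() if isinstance(seq, bytes) else seq
        text = "".join(text.split()).upper()
        if text and text not in seen:
            seen.add(text)
            cleaned.append(text)
    return cleaned


sequences = clean_sequences(["mkt ay", b"MKTAY", "GGS\n"])
```

The cleaned list could then be passed as the `sequences` argument of ESMFoldModel.fold, optionally with `num_recycles`.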
- classmethod create(session, model_id, default=None, **kwargs)#
Create and return an instance of the appropriate FoldModel subclass based on the model_id.
- Parameters:
session (APISession) – The API session to use.
model_id (str) – The model identifier.
default (type[FoldModel] or None, optional) – Default FoldModel subclass to use if no match is found.
**kwargs (dict, optional) – Additional keyword arguments to pass to the model constructor.
- Returns:
An instance of the appropriate FoldModel subclass.
- Return type:
FoldModel
- Raises:
ValueError – If no suitable FoldModel subclass is found and no default is provided.
- get_metadata()#
Get model metadata for this model.
- Returns:
The metadata associated with this model.
- Return type:
ModelMetadata
- classmethod get_model()#
Get the model_id(s) for this FoldModel subclass.
- Returns:
List of model_id strings associated with this class.
- Return type:
list of str
- property metadata#
Model metadata for this model.
- Type:
ModelMetadata
Results#
- class openprotein.fold.FoldResultFuture[source]#
Fold results represented as a future.
- job#
The fold job associated with this future.
- Type:
FoldJob
- __init__(session, job, sequences=None, max_workers=10)[source]#
Initialize a FoldResultFuture instance.
- Parameters:
session (APISession) – The API session to use for requests.
job (FoldJob) – The fold job associated with this future.
sequences (list[bytes], optional) – List of sequences submitted for the fold request. If None, sequences will be fetched.
max_workers (int, optional) – Maximum number of concurrent workers. Default is config.MAX_CONCURRENT_WORKERS.
- classmethod create(session, job, **kwargs)[source]#
Factory method to create a FoldResultFuture or FoldComplexResultFuture.
- Parameters:
session (APISession) – The API session to use for requests.
job (FoldJob) – The fold job associated with this future.
**kwargs – Additional keyword arguments.
- Returns:
An instance of FoldResultFuture or FoldComplexResultFuture depending on the model.
- Return type:
FoldResultFuture | FoldComplexResultFuture
- property sequences: list[bytes]#
Get the sequences submitted for the fold request.
- Returns:
List of sequences.
- Return type:
list[bytes]
- property model_id: str#
Get the model ID used for the fold request.
- Returns:
Model ID.
- Return type:
str
- property id#
Get the ID of the fold request.
- Returns:
Fold job ID.
- Return type:
str
- keys()[source]#
Get the list of sequences submitted for the fold request.
- Returns:
List of sequences.
- Return type:
list[bytes]
- get(verbose=False)[source]#
Retrieve the fold results as a list of tuples mapping sequence to PDB-encoded string.
- Parameters:
verbose (bool, optional) – If True, print verbose output. Default is False.
- Returns:
List of tuples mapping sequence to PDB-encoded string.
- Return type:
list[tuple[str, str]]
- get_item(sequence)[source]#
Get fold results for a specified sequence.
- Parameters:
sequence (bytes) – Sequence to fetch results for.
- Returns:
Fold result for the specified sequence.
- Return type:
bytes
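The list of (sequence, PDB string) tuples returned by `get()` can be written to disk one file per sequence. The sketch below assumes that tuple layout as documented above; `write_pdbs`, the file naming scheme, and the fake result used to run it are hypothetical.

```python
import tempfile
from pathlib import Path
from typing import Dict, Iterable, Tuple, Union

Text = Union[str, bytes]


def _as_text(value: Text) -> str:
    """Accept either str or bytes, since both appear in the signatures."""
    return value.decode() if isinstance(value, bytes) else value


def write_pdbs(results: Iterable[Tuple[Text, Text]],
               out_dir: Union[str, Path]) -> Dict[str, Path]:
    """Write each PDB string to its own file; map sequence to path."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = {}
    for index, (sequence, pdb) in enumerate(results):
        path = out / f"fold_{index:04d}.pdb"
        path.write_text(_as_text(pdb))
        written[_as_text(sequence)] = path
    return written


# Stand-in for the value returned by `future.get()`.
fake_results = [("MKTAY", "ATOM      1  N   MET A   1\nEND\n")]
with tempfile.TemporaryDirectory() as tmp:
    paths = write_pdbs(fake_results, tmp)
    first_line = paths["MKTAY"].read_text().splitlines()[0]
```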
- cancelled()#
Check if job is cancelled
- Return type:
bool
- done()#
Check if job is complete
- Return type:
bool
- refresh()#
Refresh job status.
- stream()#
Retrieve results for this job as a stream.
- wait(interval=5, timeout=None, verbose=False)#
Wait for job to complete, then fetch results.
- Parameters:
interval (int, optional) – time between polling. Defaults to config.POLLING_INTERVAL.
timeout (int, optional) – max time to wait. Defaults to None.
verbose (bool, optional) – verbosity flag. Defaults to False.
- Returns:
results of job
- Return type:
results
- wait_until_done(interval=5, timeout=None, verbose=False)#
Wait for job to complete. Do not fetch results (unlike wait())
- Parameters:
interval (int, optional) – time between polling. Defaults to config.POLLING_INTERVAL.
timeout (int, optional) – max time to wait. Defaults to None.
verbose (bool, optional) – verbosity flag. Defaults to False.
- Returns:
results of job
- Return type:
results
- class openprotein.fold.FoldComplexResultFuture[source]#
Future for manipulating results of a fold complex request.
- job#
The fold job associated with this future.
- Type:
FoldJob
- __init__(session, job, model_id=None, proteins=None, ligands=None, dnas=None, rnas=None)[source]#
Initialize a FoldComplexResultFuture instance.
- Parameters:
session (APISession) – The API session to use for requests.
job (FoldJob) – The fold job associated with this future.
model_id (str, optional) – Model ID used for the fold request.
proteins (list[Protein], optional) – List of proteins submitted for fold request.
ligands (list[Ligand], optional) – List of ligands submitted for fold request.
dnas (list[DNA], optional) – List of DNAs submitted for fold request.
rnas (list[RNA], optional) – List of RNAs submitted for fold request.
- property model_id: str#
Get the model ID used for the fold request.
- Returns:
Model ID.
- Return type:
str
- property proteins: list[Protein] | None#
Get the proteins submitted for the fold request.
- Returns:
List of Protein objects or None.
- Return type:
list[Protein] or None
- property dnas: list[DNA] | None#
Get the DNAs submitted for the fold request.
- Returns:
List of DNA objects or None.
- Return type:
list[DNA] or None
- property rnas: list[RNA] | None#
Get the RNAs submitted for the fold request.
- Returns:
List of RNA objects or None.
- Return type:
list[RNA] or None
- property ligands: list[Ligand] | None#
Get the ligands submitted for the fold request.
- Returns:
List of Ligand objects or None.
- Return type:
list[Ligand] or None
- property pae: ndarray#
Get the Predicted Aligned Error (PAE) matrix.
- Returns:
PAE matrix.
- Return type:
np.ndarray
- Raises:
AttributeError – If PAE is not supported for the model.
- property pde: ndarray#
Get the Predicted Distance Error (PDE) matrix.
- Returns:
PDE matrix.
- Return type:
np.ndarray
- Raises:
AttributeError – If PDE is not supported for the model.
- property plddt: ndarray#
Get the Predicted Local Distance Difference Test (pLDDT) scores.
- Returns:
pLDDT scores.
- Return type:
np.ndarray
- Raises:
AttributeError – If pLDDT is not supported for the model.
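A short sketch of summarizing pLDDT scores once they are available. It assumes a one-dimensional sequence of per-residue scores; the shape and scale of the array returned by `.plddt` are not stated here, so the threshold is left to the caller. `summarize_plddt` is a hypothetical helper.

```python
from statistics import mean
from typing import Iterable, List, Tuple


def summarize_plddt(scores: Iterable[float],
                    threshold: float) -> Tuple[float, List[int]]:
    """Return the mean score and the indices scoring below `threshold`."""
    values = [float(s) for s in scores]
    if not values:
        raise ValueError("no pLDDT scores supplied")
    low = [i for i, v in enumerate(values) if v < threshold]
    return mean(values), low


# Made-up scores for illustration; real values come from `future.plddt`.
average, low_confidence = summarize_plddt([0.9, 0.4, 0.8, 0.3], threshold=0.5)
```

The same function accepts a one-dimensional NumPy array, since it only iterates over the values.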
- property confidence: list[BoltzConfidence]#
Retrieve the confidences of the structure prediction.
Note
This is only currently supported for Boltz models.
- Returns:
List of BoltzConfidence objects.
- Return type:
list[BoltzConfidence]
- Raises:
AttributeError – If confidence is not supported for the model.
- property affinity: BoltzAffinity#
Retrieve the predicted binding affinities.
Note
This is only currently supported for Boltz models.
- Returns:
BoltzAffinity object containing the predicted affinities.
- Return type:
BoltzAffinity
- Raises:
AttributeError – If affinity is not supported for the model.
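Since each of these properties raises AttributeError when the model does not support it, a caller can probe for what is available. The sketch below shows one way to do that; `available_metrics` and the stub future are hypothetical, and only the property names come from the listing above.

```python
from typing import Any, Dict

# Property names documented on FoldComplexResultFuture above.
METRIC_NAMES = ("pae", "pde", "plddt", "confidence", "affinity")


def available_metrics(future: Any) -> Dict[str, Any]:
    """Collect the metrics the model supports, skipping unsupported ones."""
    found = {}
    for name in METRIC_NAMES:
        try:
            found[name] = getattr(future, name)
        except AttributeError:
            # Documented behaviour when a metric is not supported.
            continue
    return found


class _StubFuture:
    """Hypothetical future that supports pLDDT but not affinity."""

    plddt = [0.9, 0.8]

    @property
    def affinity(self):
        raise AttributeError("affinity is not supported for this model")


metrics = available_metrics(_StubFuture())
```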
- property id#
Get the ID of the fold request.
- Returns:
Fold job ID.
- Return type:
str
- get(format='mmcif', verbose=False)[source]#
Retrieve the fold results as a single bytestring.
Defaults to mmCIF for complexes. Additional predicted properties like plddt and pae should be accessed from their respective properties, i.e. .plddt and .pae.
- Parameters:
format ({'pdb', 'mmcif'}, optional) – Output format. Default is ‘mmcif’.
verbose (bool, optional) – If True, print verbose output. Default is False.
- Returns:
Fold result as a bytestring.
- Return type:
bytes
- cancelled()#
Check if job is cancelled
- Return type:
bool
- done()#
Check if job is complete
- Return type:
bool
- refresh()#
Refresh job status.
- wait(interval=5, timeout=None, verbose=False)#
Wait for job to complete, then fetch results.
- Parameters:
interval (int, optional) – time between polling. Defaults to config.POLLING_INTERVAL.
timeout (int, optional) – max time to wait. Defaults to None.
verbose (bool, optional) – verbosity flag. Defaults to False.
- Returns:
results of job
- Return type:
results
- wait_until_done(interval=5, timeout=None, verbose=False)#
Wait for job to complete. Do not fetch results (unlike wait())
- Parameters:
interval (int, optional) – time between polling. Defaults to config.POLLING_INTERVAL.
timeout (int, optional) – max time to wait. Defaults to None.
verbose (bool, optional) – verbosity flag. Defaults to False.
- Returns:
results of job
- Return type:
results