openprotein.prompt#

Create prompts for use with PoET models, along with queries, which open up use cases such as inverse folding with PoET-2.

Interface#

class openprotein.prompt.PromptAPI(session)[source]#

Prompt API providing the interface to create prompts for use with PoET models.

create_prompt(context, name=None, description=None)[source]#

Create a prompt.

Parameters:
  • context (Context | Sequence[Context]) – Context or list of contexts. Each context is a sequence of entries where each entry is a raw sequence (bytes/str, optionally with : chain breaks for multichain), Protein, or Complex. Currently only protein chains are accepted; passing a Complex with DNA, RNA, or Ligand chains raises InvalidParameterError. This restriction may be relaxed in the future.

  • name (Optional[str]) – Name of the prompt.

  • description (Optional[str]) – Description of the prompt.

Returns:

The created prompt.

Return type:

Prompt
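
As a minimal sketch of how a context might be assembled before calling create_prompt(): the helper names join_chains and create_prompt_from_entries are hypothetical and not part of openprotein, and prompt_api is assumed to be a PromptAPI instance from a connected session. Only the create_prompt(context, name=..., description=...) call and the : chain-break convention are taken from the parameter description above.

```python
from typing import Any, Optional, Sequence


def join_chains(chains: Sequence[str]) -> str:
    """Join the chains of one multichain entry with ':' chain breaks."""
    if not chains or any(not chain for chain in chains):
        raise ValueError("each entry needs at least one non-empty chain")
    return ":".join(chains)


def create_prompt_from_entries(
    prompt_api: Any,  # assumed: a PromptAPI instance
    entries: Sequence[Sequence[str]],
    name: Optional[str] = None,
    description: Optional[str] = None,
) -> Any:
    """Build a single context from raw sequences and create a prompt.

    `entries` holds one list of chain sequences per context entry.
    """
    # Each context entry becomes one raw string; multichain entries use ':'.
    context = [join_chains(chains) for chains in entries]
    return prompt_api.create_prompt(context, name=name, description=description)
```

Under these assumptions, create_prompt_from_entries(prompt_api, [["MKTAYIAK"], ["MKV", "GSH"]], name="demo") would pass the context ["MKTAYIAK", "MKV:GSH"] to create_prompt().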

get_prompt(prompt_id)[source]#

Get the prompt for a given prompt ID.

Parameters:

prompt_id (str) – The prompt ID.

Returns:

The prompt.

Return type:

Prompt

list_prompts()[source]#

List all prompts.

Returns:

List of prompts.

Return type:

List[Prompt]
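
Since list_prompts() returns every prompt, a lookup by name can be layered on top of it. The sketch below is illustrative only: find_prompts_by_name is a hypothetical helper, and prompt_api is assumed to be a PromptAPI instance. It relies only on list_prompts() and the name property of Prompt documented on this page.

```python
from typing import Any, List


def find_prompts_by_name(prompt_api: Any, name: str) -> List[Any]:
    """Return every prompt whose `name` property equals `name`.

    Names are not assumed to be unique, so a list is returned.
    """
    return [prompt for prompt in prompt_api.list_prompts() if prompt.name == name]
```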

create_query(query, force_structure=False)[source]#

Create a query.

Parameters:
  • query (bytes or str or Protein or Complex) – A query protein or complex. Raw bytes/str inputs may include : chain breaks to denote a multichain protein. Currently only protein chains are accepted; passing a Complex with DNA, RNA, or Ligand chains raises InvalidParameterError. This restriction may be relaxed in the future.

  • force_structure (bool, optional) – Force the query to be interpreted as having a structure, even when none is provided. Useful for creating structure prediction queries, which may have no structure. Defaults to False.

Returns:

The created query.

Return type:

Query

get_query(query_id)[source]#

Get the query for a given query ID.

Parameters:

query_id (str) – The query ID.

Returns:

The query.

Return type:

Query
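
The two query methods above can be combined into a round trip: create a query from raw chains, then fetch it again by ID. This is a sketch under stated assumptions, not the library's recommended workflow: create_and_refetch_query is a hypothetical helper and prompt_api is assumed to be a PromptAPI instance. The calls create_query(query, force_structure=...), get_query(query_id) and the id property of Query are the ones documented on this page.

```python
from typing import Any, Sequence


def create_and_refetch_query(
    prompt_api: Any,  # assumed: a PromptAPI instance
    chains: Sequence[str],
    force_structure: bool = False,
) -> Any:
    """Create a query from raw chain sequences, then fetch it by ID."""
    # Multichain inputs are written as one string with ':' chain breaks.
    sequence = ":".join(chains)
    query = prompt_api.create_query(sequence, force_structure=force_structure)
    # Re-fetching by ID shows how a stored query ID can be reused later.
    return prompt_api.get_query(query.id)
```

Passing force_structure=True here corresponds to the structure prediction use case described for create_query(), where the input carries a sequence but no structure.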

Classes#

class openprotein.prompt.Prompt(session, job=None, metadata=None, num_replicates=None)[source]#

Prompt which contains a set of sequences and/or structures used to condition the PoET models.

get_as_complexes()[source]#

Retrieve the prompt context with every entry as a Complex.

Single-chain entries are wrapped as Complex({"A": protein}) so the return type is uniform regardless of chain count.

get_as_proteins()[source]#

Retrieve the prompt context with every entry as a Protein.

Raises InvalidParameterError if any entry is multichain — use get_as_complexes() instead when multichain entries may be present.
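
When it is not known in advance whether a prompt contains multichain entries, one option is to try get_as_proteins() first and fall back to get_as_complexes(). The sketch below is a hypothetical helper, not part of openprotein. The exception class is passed in by the caller because this page does not state where InvalidParameterError is imported from; in practice it would be that class.

```python
from typing import Any, Tuple, Type


def get_prompt_context(
    prompt: Any,  # assumed: a Prompt instance
    multichain_error: Type[Exception],  # assumed: InvalidParameterError
) -> Tuple[str, Any]:
    """Return ("proteins", ...) if possible, else ("complexes", ...)."""
    try:
        # Succeeds only when every entry in the context is single-chain.
        return "proteins", prompt.get_as_proteins()
    except multichain_error:
        # At least one entry is multichain; use the uniform Complex form.
        return "complexes", prompt.get_as_complexes()
```

If a uniform return type matters more than avoiding the wrapper, calling get_as_complexes() directly is simpler, since single-chain entries are wrapped rather than rejected.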

property id#

The unique identifier of the prompt.

property name#

The name of the prompt.

property description#

The description of the prompt.

property created_date#

The timestamp when the prompt was created.

property num_replicates#

The number of replicates in the prompt, if it is an ensemble prompt.

property status#

The status of the prompt if sampling from an MSAFuture.

property args: dict[str, Any]#

The registered job arguments.

cancelled()#

Check if the job has been cancelled.

Returns:

True if the job is cancelled, False otherwise.

Return type:

bool

done()#

Check if the job has completed.

Returns:

True if the job is done, False otherwise.

Return type:

bool

property end_date: datetime | None#

The end timestamp of the job.

property job_id: str#

The unique identifier of the job.

property job_type: str#

The type of the job.

property progress_counter: int#

The progress counter of the job.

refresh()#

Refresh the job status and internal job object.

property start_date: datetime | None#

The start timestamp of the job.

wait(interval=5, timeout=None, verbose=False)#

Wait for the job to complete, then fetch results.

Parameters:
  • interval (int, optional) – Time in seconds between polling. Defaults to config.POLLING_INTERVAL.

  • timeout (int | None, optional) – Maximum time in seconds to wait. Defaults to None.

  • verbose (bool, optional) – Verbosity flag. Defaults to False.

Returns:

The results of the job.

Return type:

Any

wait_until_done(interval=5, timeout=None, verbose=False)#

Wait for the job to complete.

Parameters:
  • interval (float, optional) – Time in seconds between polling. Defaults to config.POLLING_INTERVAL.

  • timeout (int, optional) – Maximum time in seconds to wait. Defaults to None.

  • verbose (bool, optional) – Verbosity flag. Defaults to False.

Returns:

True if the job completed successfully.

Return type:

bool

Notes

This method does not fetch the job results, unlike wait().
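
To illustrate what waiting without fetching results involves, here is a hand-written polling loop built only from done() and refresh(). It is a sketch under stated assumptions, not the library's implementation: poll_until_done is a hypothetical helper, and returning False on timeout is an assumption, since this page does not say how wait_until_done() behaves when the timeout elapses.

```python
import time
from typing import Any, Optional


def poll_until_done(
    job: Any,  # assumed: an object exposing done() and refresh()
    interval: float = 5.0,
    timeout: Optional[float] = None,
) -> bool:
    """Poll `job` until done() is True; return False if `timeout` elapses."""
    deadline = None if timeout is None else time.monotonic() + timeout
    while not job.done():
        if deadline is not None and time.monotonic() >= deadline:
            return False  # assumption: report the timeout instead of raising
        time.sleep(interval)
        job.refresh()  # update the job status before checking again
    return True
```

In ordinary use, wait_until_done() covers this case directly, and wait() should be used instead when the results are needed as well.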

class openprotein.prompt.Query(session, metadata)[source]#

Query containing a sequence and/or structure used to query the design models, enabling additional design workflows.

Create a query with a masked sequence using mask_sequence_at() for PoET2Model to run inverse folding.

Create a query with a masked structure using mask_structure_at() for RFdiffusionModel to run structure generation.

get()[source]#

Retrieve the query as a Protein or Complex.

Single-chain queries collapse to Protein; multichain queries are returned as Complex. For a uniform return type, see get_as_complex() or get_as_protein().

Returns:

Protein or Complex representing the query

Return type:

Protein | Complex

get_as_complex()[source]#

Retrieve the query as a Complex.

A single-chain Protein result is wrapped as Complex({"A": protein}) so the return type is uniform.

get_as_protein()[source]#

Retrieve the query as a Protein.

Raises InvalidParameterError if the query is multichain — use get_as_complex() instead when multichain queries may be present.

property id#

The unique identifier of the query.

property created_date#

The timestamp when the query was created.

class openprotein.prompt.PromptMetadata(*, id, name, description=None, created_date, num_replicates, job_id=None, status)[source]#

Metadata about a prompt.

class openprotein.prompt.QueryMetadata(*, id, created_date)[source]#

Metadata about a query.

class openprotein.prompt.PromptJob(*, job_id, job_type, status, created_date, start_date=None, end_date=None, prerequisite_job_id=None, progress_message=None, progress_counter=None, sequence_length=None, failure_message=None, **extra_data)[source]#

A representation of a prompt job.

property msa_id#

ID of the underlying MSA.

property prompt_id#

Prompt ID.