starling.generate
- generate(user_input, conformations=400, ionic_strength=150, device=None, steps=30, sampler='ddim', return_structures=False, batch_size=100, num_cpus_mds=2, num_mds_init=4, output_directory=None, output_name=None, return_data=True, verbose=False, show_progress_bar=True, show_per_step_progress_bar=True, pdb_trajectory=False, return_single_ensemble=False, constraint=None, encoder_path=None, ddpm_path=None)
Generate STARLING ensembles and distance maps for one or more sequences.
This is the primary high-level interface for STARLING ensemble generation. It normalizes the provided sequences, runs the diffusion sampler, optionally performs MDS refinement, and returns ensemble objects or writes them to disk.
- Parameters:
user_input (str or Sequence[str] or Mapping[str, str]) –
Input sequences to process. Supported forms include:
Path to a FASTA, TSV, or seq.in file containing name/sequence rows.
Raw amino-acid sequence string.
Iterable of sequence strings.
Mapping of sequence names to amino-acid sequences.
Non-canonical residues trigger a ValueError.
conformations (int, default=configs.DEFAULT_NUMBER_CONFS) – Number of conformations to sample per sequence.
ionic_strength (float, default=configs.DEFAULT_IONIC_STRENGTH) – Ionic strength (mM) supplied to the generative model.
device (str or None, default=None) – Device identifier ('cuda', 'mps', or 'cpu'). None selects the best available accelerator.
steps (int, default=configs.DEFAULT_STEPS) – Number of denoising diffusion steps.
sampler (str, default=configs.DEFAULT_SAMPLER) – Sampler backend registered in starling.configs.
return_structures (bool, default=False) – When True, include 3D coordinate ensembles in the results.
batch_size (int, default=configs.DEFAULT_BATCH_SIZE) – Batch size used for sampling iterations.
num_cpus_mds (int, default=configs.DEFAULT_CPU_COUNT_MDS) – Number of CPU workers allocated to the MDS refinement stage.
num_mds_init (int, default=configs.DEFAULT_MDS_NUM_INIT) – Number of independent MDS initializations to run per sequence.
output_directory (str or os.PathLike or None, default=None) – Directory where generated outputs are written. When None, nothing is saved.
output_name (str or None, default=None) – Override the generated sequence key when a single sequence string is provided.
return_data (bool, default=True) – When True, return ensembles; otherwise the function returns None.
verbose (bool, default=False) – Emit status messages during generation.
show_progress_bar (bool, default=True) – Display a global diffusion progress bar.
show_per_step_progress_bar (bool, default=True) – Display an inner progress bar for per-step diffusion updates.
pdb_trajectory (bool, default=False) – When True, write PDB trajectories alongside XTC files. Only applies when return_structures is True or an output_directory is provided.
return_single_ensemble (bool, default=False) – When True and exactly one sequence is processed, return a single starling.structure.ensemble.Ensemble. Raises ValueError if multiple sequences are supplied.
constraint (Optional[starling.inference.constraints.Constraint], default=None) – Constraint object applied during sampling.
encoder_path (str or os.PathLike or None, default=None) – Custom encoder checkpoint path overriding the configured default.
ddpm_path (str or os.PathLike or None, default=None) – Custom diffusion model checkpoint path overriding the configured default.
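Because non-canonical residues trigger a ValueError, it can help to screen a mapping of sequences before passing it as user_input. The following is a minimal sketch, not part of STARLING: it assumes "canonical" means the 20 standard one-letter amino-acid codes in uppercase (STARLING's own check may differ), and the helper names noncanonical_residues, split_valid, and generate_valid are hypothetical.

```python
# Assumption: "canonical" means the 20 standard one-letter amino-acid codes,
# uppercase only. STARLING's internal validation may differ.
CANONICAL_RESIDUES = frozenset("ACDEFGHIKLMNPQRSTVWY")


def noncanonical_residues(sequence):
    """Return the sorted characters in `sequence` outside the assumed alphabet."""
    return sorted(set(sequence) - CANONICAL_RESIDUES)


def split_valid(sequences):
    """Partition a name -> sequence mapping into (valid, rejected).

    `rejected` maps each failing name to its offending characters.
    """
    valid, rejected = {}, {}
    for name, sequence in sequences.items():
        bad = noncanonical_residues(sequence)
        if bad:
            rejected[name] = bad
        else:
            valid[name] = sequence
    return valid, rejected


def generate_valid(sequences, **kwargs):
    """Hypothetical wrapper: call generate() only on sequences that pass the pre-check."""
    # Imported lazily so the helpers above work without STARLING installed.
    from starling import generate

    valid, rejected = split_valid(sequences)
    ensembles = generate(valid, **kwargs) if valid else {}
    return ensembles, rejected
```

A call such as generate_valid(my_sequences, conformations=200, ionic_strength=150) would forward the documented keyword arguments unchanged and hand back the rejected names separately, so one bad entry does not abort the whole batch.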
- Returns:
Dictionary of ensembles keyed by sequence name when return_data is True. A single ensemble object is returned when return_single_ensemble is True. Returns None when return_data is False.
- Return type:
dict[str, starling.structure.ensemble.Ensemble] or starling.structure.ensemble.Ensemble or None
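Since the return value can be a dictionary, a single ensemble, or None, downstream code may want one uniform shape. The sketch below shows one way to normalize it; as_ensemble_dict and its single_name argument are hypothetical and not part of STARLING, and the helper only inspects the outer container, never the ensemble objects themselves.

```python
from collections.abc import Mapping


def as_ensemble_dict(result, single_name="ensemble"):
    """Normalize the three documented return shapes to a plain dict.

    None            -> {}
    mapping         -> shallow copy as dict
    anything else   -> {single_name: result}  (assumed to be a single ensemble)
    """
    if result is None:
        return {}
    if isinstance(result, Mapping):
        return dict(result)
    return {single_name: result}
```

With this in place, a loop such as `for name, ensemble in as_ensemble_dict(result).items()` behaves the same whether or not return_single_ensemble or return_data was changed from its default.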
- Raises:
FileNotFoundError – If the input path or output directory cannot be located.
ValueError – If sequences contain non-canonical residues or argument combinations are invalid.
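The two documented exceptions can be handled separately when generation runs inside a larger pipeline. This is a minimal sketch under stated assumptions: try_generate is a hypothetical helper, and the generate function is passed in as generate_fn so the wrapper can be exercised with a stub. The error-message prefixes are illustrative only.

```python
def try_generate(generate_fn, user_input, **kwargs):
    """Call generate_fn and return (result, error_message).

    Exactly one element of the pair is meaningful: on success the message
    is None; on a documented failure the result is None.
    """
    try:
        return generate_fn(user_input, **kwargs), None
    except FileNotFoundError as exc:
        # Input path or output directory could not be located.
        return None, f"missing path: {exc}"
    except ValueError as exc:
        # Non-canonical residues or an invalid argument combination.
        return None, f"invalid input: {exc}"
```

In real use, generate_fn would be the generate function documented here; any other exception type is deliberately left to propagate.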