starling.frontend.ensemble_generation.generate

generate(user_input, conformations=400, ionic_strength=150, device=None, steps=30, sampler='ddim', return_structures=False, batch_size=100, num_cpus_mds=2, num_mds_init=4, output_directory=None, output_name=None, return_data=True, verbose=False, show_progress_bar=True, show_per_step_progress_bar=True, pdb_trajectory=False, return_single_ensemble=False, constraint=None, encoder_path=None, ddpm_path=None)

Generate STARLING ensembles and distance maps for one or more sequences.

This is the primary high-level interface for STARLING ensemble generation. It normalizes the provided sequences, runs the diffusion sampler, optionally performs MDS refinement, and then returns the ensemble objects, writes them to disk, or both.
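
A minimal call might look like the sketch below. The wrapper name build_ensembles and its generate_fn parameter are illustrative and not part of STARLING; generate_fn exists only so the sketch can be exercised without the package installed. The keyword arguments are the ones documented in the signature above, and the values chosen are arbitrary.

```python
def build_ensembles(sequences, generate_fn=None, **overrides):
    """Call generate() with a handful of documented keyword arguments.

    build_ensembles and generate_fn are hypothetical names for this sketch.
    """
    if generate_fn is None:
        # Imported lazily so this module loads even without STARLING installed.
        from starling.frontend.ensemble_generation import generate as generate_fn

    options = {
        "conformations": 200,       # conformations sampled per sequence
        "ionic_strength": 150,      # mM
        "return_structures": True,  # include 3D coordinate ensembles
        "verbose": True,
    }
    options.update(overrides)       # caller-supplied values win
    return generate_fn(sequences, **options)
```

With return_data left at its default, a call such as build_ensembles({"my_protein": sequence}) would be expected to return a dictionary keyed by "my_protein".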

Parameters:
  • user_input (str or Sequence[str] or Mapping[str, str]) –

    Input sequences to process. Supported forms include:

    • Path to a FASTA, TSV, or seq.in file containing name/sequence rows.

    • Raw amino-acid sequence string.

    • Iterable of sequence strings.

    • Mapping of sequence names to amino-acid sequences.

    Non-canonical residues trigger a ValueError.

  • conformations (int, default=configs.DEFAULT_NUMBER_CONFS) – Number of conformations to sample per sequence.

  • ionic_strength (float, default=configs.DEFAULT_IONIC_STRENGTH) – Ionic strength (mM) supplied to the generative model.

  • device (str or None, default=None) – Device identifier ('cuda', 'mps', or 'cpu'). None selects the best available accelerator.

  • steps (int, default=configs.DEFAULT_STEPS) – Number of denoising diffusion steps.

  • sampler (str, default=configs.DEFAULT_SAMPLER) – Sampler backend registered in starling.configs.

  • return_structures (bool, default=False) – When True, include 3D coordinate ensembles in the results.

  • batch_size (int, default=configs.DEFAULT_BATCH_SIZE) – Batch size used for sampling iterations.

  • num_cpus_mds (int, default=configs.DEFAULT_CPU_COUNT_MDS) – Number of CPU workers allocated to the MDS refinement stage.

  • num_mds_init (int, default=configs.DEFAULT_MDS_NUM_INIT) – Number of independent MDS initializations to run per sequence.

  • output_directory (str or os.PathLike or None, default=None) – Directory where generated outputs are written. When None, nothing is saved.

  • output_name (str or None, default=None) – Override the generated sequence key when a single sequence string is provided.

  • return_data (bool, default=True) – When True return ensembles; otherwise the function returns None.

  • verbose (bool, default=False) – Emit status messages during generation.

  • show_progress_bar (bool, default=True) – Display a global diffusion progress bar.

  • show_per_step_progress_bar (bool, default=True) – Display an inner progress bar for per-step diffusion updates.

  • pdb_trajectory (bool, default=False) – When True, write PDB trajectories alongside XTC files. Applies only when return_structures is True or an output_directory is provided.

  • return_single_ensemble (bool, default=False) – When True and exactly one sequence is processed, return a single starling.structure.ensemble.Ensemble. Raises ValueError if multiple sequences are supplied.

  • constraint (Optional[starling.inference.constraints.Constraint], default=None) – Constraint object applied during sampling.

  • encoder_path (str or os.PathLike or None, default=None) – Custom encoder checkpoint path overriding the configured default.

  • ddpm_path (str or os.PathLike or None, default=None) – Custom diffusion model checkpoint path overriding the configured default.
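
The in-memory forms of user_input (a single string, an iterable of strings, or a mapping) can all be passed directly. If predictable result keys matter, one option is to convert everything to a mapping first, as in the sketch below. The helper name_sequences and the "seq_N" naming scheme are assumptions made for illustration; this page does not state how STARLING names sequences that arrive without one, and the helper does not handle file paths.

```python
from collections.abc import Mapping


def name_sequences(user_input, prefix="seq"):
    """Return a name -> sequence dict for in-memory inputs.

    Hypothetical helper: the "seq_N" names are this sketch's own choice,
    not STARLING's naming convention.
    """
    if isinstance(user_input, str):
        # A single raw amino-acid sequence.
        return {f"{prefix}_1": user_input}
    if isinstance(user_input, Mapping):
        # Already named; copy so the caller's mapping is not mutated later.
        return dict(user_input)
    # Any other iterable of sequence strings.
    return {f"{prefix}_{i}": seq for i, seq in enumerate(user_input, start=1)}
```

The resulting dictionary can then be supplied as user_input, so the keys of the returned dictionary are known in advance.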

Returns:

When return_data is True, a dictionary of ensembles keyed by sequence name, or a single Ensemble object when return_single_ensemble is True. None when return_data is False.

Return type:

dict[str, starling.structure.ensemble.Ensemble] or starling.structure.ensemble.Ensemble or None
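
Because the return value can take three shapes, calling code that toggles return_data or return_single_ensemble may want to normalize it before use. The sketch below shows one way to do that; as_ensemble_dict and its single_name parameter are hypothetical names, and the helper treats any non-mapping, non-None value as a single ensemble.

```python
from collections.abc import Mapping


def as_ensemble_dict(result, single_name="ensemble"):
    """Normalize a generate() return value to a name -> ensemble dict.

    Hypothetical helper for illustration only.
    """
    if result is None:
        # return_data=False: nothing was returned.
        return {}
    if isinstance(result, Mapping):
        # Default case: dictionary keyed by sequence name.
        return dict(result)
    # return_single_ensemble=True: wrap the lone ensemble under a chosen key.
    return {single_name: result}
```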

Raises:
  • FileNotFoundError – If the input path or output directory cannot be located.

  • ValueError – If sequences contain non-canonical residues or argument combinations are invalid (for example, return_single_ensemble=True with multiple sequences).
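
Both exceptions can often be avoided with a quick check before a long sampling run. The sketch below is one possible pre-flight step, not STARLING's own validation: preflight and CANONICAL_RESIDUES are hypothetical names, the residue set is assumed to be the 20 standard amino acids in upper-case one-letter code, and creating the output directory up front is a choice made here rather than documented behavior.

```python
from pathlib import Path

# Assumption: "canonical" means the 20 standard amino acids, upper case.
CANONICAL_RESIDUES = frozenset("ACDEFGHIKLMNPQRSTVWY")


def preflight(sequences, output_directory=None):
    """Check a name -> sequence mapping and prepare the output directory.

    Hypothetical helper; raises ValueError on the first offending sequence.
    """
    for name, sequence in sequences.items():
        unexpected = sorted(set(sequence) - CANONICAL_RESIDUES)
        if unexpected:
            raise ValueError(f"{name}: non-canonical residues {unexpected}")
    if output_directory is not None:
        # Create the directory (and any parents) so it can be located later.
        Path(output_directory).mkdir(parents=True, exist_ok=True)
```

Running preflight(sequences, output_directory) immediately before generate fails fast on a bad sequence instead of partway through a batch.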