Installation

STARLING is available on GitHub (bleeding edge) and on PyPI (stable).

Creating an Environment

We recommend creating a fresh conda environment for STARLING:

conda create -n starling python=3.11 -y
conda activate starling

Installation Options

Install from PyPI (Recommended)

You can install STARLING from PyPI using pip:

pip install idptools-starling

Install from GitHub (Development)

Alternatively, clone the repository and install the bleeding-edge version from GitHub:

git clone git@github.com:idptools/starling.git
cd starling
pip install .

GPU Installation (CUDA)

For GPU-accelerated search with FAISS, install PyTorch and FAISS-GPU via conda so that both match your CUDA version. As of October 14, 2025, faiss-gpu is not available as a pip package, so conda is required for the GPU components. There is a roadmap to bring faiss-gpu wheels back to PyPI; see the following GitHub issue for details.

Step 1: Create Environment

conda create -y -n starling python=3.11
conda activate starling

Step 2: Install PyTorch with CUDA Support

Install a PyTorch build that matches your CUDA version (the example below targets CUDA 12.4):

conda install -y -c pytorch -c nvidia pytorch pytorch-cuda=12.4

For other CUDA versions, visit the PyTorch installation page.
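
If you are unsure which CUDA version to target, the NVIDIA driver reports the highest version it supports in the nvidia-smi banner. The following is a minimal sketch that reads that field. It assumes nvidia-smi is on your PATH and prints its usual "CUDA Version: X.Y" header; the helper names are illustrative and not part of STARLING.

```python
import re
import shutil
import subprocess


def parse_cuda_version(smi_output):
    """Return the 'CUDA Version: X.Y' value from nvidia-smi output, or None."""
    match = re.search(r"CUDA Version:\s*(\d+\.\d+)", smi_output)
    return match.group(1) if match else None


def driver_cuda_version():
    """Run nvidia-smi and return the CUDA version reported by the driver."""
    if shutil.which("nvidia-smi") is None:
        return None  # no NVIDIA driver tools found on PATH
    result = subprocess.run(
        ["nvidia-smi"], capture_output=True, text=True, check=False
    )
    return parse_cuda_version(result.stdout)


if __name__ == "__main__":
    print(f"Driver supports CUDA up to: {driver_cuda_version()}")
```

Choose a pytorch-cuda value no higher than the version reported here.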

Step 3: Install FAISS-GPU

Install FAISS-GPU matching your CUDA version:

conda install -y -c pytorch "faiss-gpu=1.8.*" cuda-version=12.4

Step 4: Install Other Dependencies

Install the remaining dependencies via conda (preferred) or pip:

conda install -y -c conda-forge lightning numpy scipy cython matplotlib \
  jupyter ipython scikit-learn einops tqdm hdf5plugin mdtraj

Step 5: Install Pure-Python Packages

Install packages not available on conda-forge:

pip install protfasta soursop "metapredict>=3.0"

Step 6: Install STARLING

Finally, install STARLING with --no-deps so that pip does not reinstall or replace the conda-installed PyTorch and FAISS builds:

# From PyPI:
pip install --no-deps idptools-starling

# Or from source:
cd /path/to/starling
pip install --no-deps .
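
Because --no-deps skips dependency resolution, pip will not warn you about a package that was missed in the earlier steps. The sketch below checks that the expected modules can be found. The import names are assumptions derived from the packages named in Steps 2-5 (for example, scikit-learn imports as sklearn), and the list is not necessarily STARLING's complete requirement set.

```python
import importlib.util

# Import names assumed for the packages installed in Steps 2-5.
REQUIRED_MODULES = [
    "torch", "faiss", "lightning", "numpy", "scipy", "Cython", "matplotlib",
    "IPython", "sklearn", "einops", "tqdm", "hdf5plugin", "mdtraj",
    "protfasta", "soursop", "metapredict",
]


def missing_modules(names):
    """Return the top-level module names that cannot be located."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print(f"Missing: {', '.join(missing)}")
    else:
        print("All listed dependencies were found.")
```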

GPU Verification

Verify that GPU support is working:

python -c "import faiss; print(f'FAISS GPUs available: {faiss.get_num_gpus()}')"
python -c "import torch; print(f'PyTorch CUDA available: {torch.cuda.is_available()}')"
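
The two one-liners above can also be combined into a single script that does not crash when a package is missing. This is a minimal sketch; gpu_report is a hypothetical helper name, and treating a FAISS build without get_num_gpus as CPU-only is an assumption.

```python
def gpu_report():
    """Collect GPU visibility for PyTorch and FAISS; None means not installed."""
    report = {}
    try:
        import torch
        report["torch_cuda"] = torch.cuda.is_available()
    except ImportError:
        report["torch_cuda"] = None  # PyTorch is not installed
    try:
        import faiss
        # Assumption: a build lacking get_num_gpus is a CPU-only FAISS build.
        if hasattr(faiss, "get_num_gpus"):
            report["faiss_gpus"] = faiss.get_num_gpus()
        else:
            report["faiss_gpus"] = 0
    except ImportError:
        report["faiss_gpus"] = None  # FAISS is not installed
    return report


if __name__ == "__main__":
    for key, value in gpu_report().items():
        print(f"{key}: {value}")
```

A working GPU setup should report torch_cuda as True and faiss_gpus as at least 1.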

Verification

To verify that STARLING has installed correctly, run:

starling --help
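
The same check can be done from Python, which is convenient in automated setups. The sketch below looks up the installed distribution version and the location of the command-line tool. It assumes the distribution is registered as idptools-starling (the name used with pip above) and that the CLI is named starling; the function name is illustrative.

```python
import shutil
from importlib import metadata


def starling_install_status(dist_name="idptools-starling", cli_name="starling"):
    """Return the installed version and CLI path, using None for anything missing."""
    try:
        version = metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        version = None  # distribution is not installed in this environment
    return {"version": version, "cli_path": shutil.which(cli_name)}


if __name__ == "__main__":
    print(starling_install_status())
```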

Docker

STARLING ships with a Dockerfile that produces a self-contained image with Python 3.11, CUDA 12.4, all dependencies, pre-downloaded model weights, and pre-built FAISS search artifacts. This is the easiest way to run STARLING on GPU-equipped compute infrastructure without managing local environments.

Note

The Docker image is Linux-based, so running it on macOS will not use MPS acceleration and will fall back to CPU.

Building the image

The Dockerfile lives in the docker/ directory and uses the repository root as the build context. From the repository root, run:

cd docker/
docker build -f Dockerfile -t starling ..

The build is a multi-stage process:

Builder stage: installs Python, PyTorch (CUDA 12.4), and STARLING; downloads model weights and search artifacts.
Runtime stage: copies only the virtual environment, cached weights, and search artifacts into a slim CUDA runtime image.

The first build downloads model weights and search artifacts (~2.4 GB) and may take a while. Subsequent builds use Docker layer caching and are much faster unless the source code under starling/ changes.

Running the container

The image uses starling as its entrypoint, so CLI arguments are passed directly:

# Print version
docker run --rm starling --version

# Print help
docker run --rm starling --help

To use GPU acceleration, pass the --gpus flag (requires the NVIDIA Container Toolkit):

docker run --rm --gpus all starling --help

Generating ensembles

Output files are written to /work inside the container. Mount a local directory to retrieve them:

# Single sequence
docker run --rm --gpus all \
  -v $(pwd)/output:/work \
  starling MQDRVKRPMNAFIVWSRDQRRKMALENPRMRNSEISKQLGYQWKMLTEK \
  -c 200 \
  --ionic_strength 150

# With 3D structures (PDB + XTC)
docker run --rm --gpus all \
  -v $(pwd)/output:/work \
  starling MQDRVKRPMNAFIVWSRDQRRKMALENPRMRNSEISKQLGYQWKMLTEK \
  -c 200 \
  --return_structures \
  --ionic_strength 150

# From a FASTA file
docker run --rm --gpus all \
  -v $(pwd)/input:/input:ro \
  -v $(pwd)/output:/work \
  starling /input/sequences.fasta \
  -c 500 \
  --return_structures \
  --output_directory /work
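
When scripting many runs, assembling the docker arguments as a list avoids shell quoting problems with paths that contain spaces. This is a minimal sketch that uses only the flags shown above; build_docker_command is a hypothetical helper, and the image name starling assumes the tag used in the build step.

```python
import os
import shlex


def build_docker_command(target, output_dir, conformations=200, gpu=True,
                         return_structures=False, image="starling"):
    """Build a `docker run` argument list for a sequence or FASTA path."""
    cmd = ["docker", "run", "--rm"]
    if gpu:
        cmd += ["--gpus", "all"]
    # Mount the host output directory at /work, where the container writes.
    cmd += ["-v", f"{os.path.abspath(output_dir)}:/work"]
    cmd += [image, target, "-c", str(conformations)]
    if return_structures:
        cmd.append("--return_structures")
    return cmd


if __name__ == "__main__":
    cmd = build_docker_command("MQDRVKRPMNAFIVWSRDQRRK", "output", conformations=50)
    # Print the command for inspection; pass the list to subprocess.run to execute it.
    print(shlex.join(cmd))
```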

Conversion utilities

Because the image's entrypoint is starling, override it with --entrypoint to run the other bundled commands:

# Convert to PDB
docker run --rm -v $(pwd)/output:/work \
  --entrypoint starling2pdb \
  starling /work/my_ensemble.starling -o /work

# Convert to XTC (topology + trajectory)
docker run --rm -v $(pwd)/output:/work \
  --entrypoint starling2xtc \
  starling /work/my_ensemble.starling -o /work

# Convert to NumPy
docker run --rm -v $(pwd)/output:/work \
  --entrypoint starling2numpy \
  starling /work/my_ensemble.starling -o /work

# Print sequence or metadata
docker run --rm -v $(pwd)/output:/work \
  --entrypoint starling2info \
  starling /work/my_ensemble.starling

Sequence search

Search artifacts are baked into the image, so FAISS search works out of the box:

docker run --rm --gpus all \
  -v $(pwd)/output:/work \
  --entrypoint starling-search \
  starling query \
  --seq MQDRVKRPMNAFIVWSRDQRRKMALENPRMRNSEISKQLGYQWKMLTEK \
  --k 20 \
  --nprobe 128 \
  --exclude-exact \
  --out /work/search_results

CPU-only usage

If no GPU is available, omit the --gpus flag. The container falls back to CPU automatically:

docker run --rm \
  -v $(pwd)/output:/work \
  starling MQDRVKRPMNAFIVWSRDQRRKMALENPRMRNSEISKQLGYQWKMLTEK \
  -c 50 \
  --device cpu

Docker tips

Volume mounts are required to access output files. The container’s working directory is /work.
Input files (FASTA, TSV) must also be mounted into the container. Use a read-only mount (:ro) for inputs.
GPU memory: For long sequences or large batch sizes, reduce -b / --batch_size to avoid out-of-memory (OOM) errors.
Image size: The image is ~8–10 GB due to bundled PyTorch, CUDA runtime, model weights, and search artifacts.
Rebuilding: Modifying STARLING source code only invalidates the COPY starling/ layer and later; earlier layers (system packages, PyTorch) are cached.