starling.utilities.check_distance_map_for_error

check_distance_map_for_error(distance_map, min_separation=1, max_separation=None, max_bond_length=4.81)

Check a distance map for physically impossible inter-residue distances.

Two residues separated by |i - j| positions in the sequence are connected by |i - j| bonds, so they can be at most |i - j| * max_bond_length apart (the fully extended chain). Any measured distance above this bound is physically impossible and marks the conformation as erroneous.
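The bound described above can be written out directly with NumPy. This is a minimal sketch of the idea, not the library's implementation; the helper name max_possible_distances and the toy straight-chain coordinates are assumptions made for illustration.

```python
import numpy as np


def max_possible_distances(n, max_bond_length=4.81):
    """Hypothetical helper: upper bound on the distance between every pair (i, j)."""
    idx = np.arange(n)
    # Sequence separation |i - j| for every residue pair.
    separation = np.abs(idx[:, None] - idx[None, :])
    # A fully extended chain places the pair |i - j| * max_bond_length apart.
    return separation * max_bond_length


bounds = max_possible_distances(4)

# Toy input: four residues on a straight line, 3.81 A apart.
coords = np.arange(4) * 3.81
distance_map = np.abs(coords[:, None] - coords[None, :])

# No pair exceeds its own bound, so this map would not be flagged.
has_error = bool(np.any(distance_map > bounds))
```

Because the bound grows linearly with separation, each entry of the map is compared against its own limit rather than a shared one.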

Crucially, the bound is applied per residue pair, using each pair’s own sequence separation. A single global threshold (as used previously) either misses short-range errors – a sequence-adjacent pair could be roughly 4x too far apart without being caught – or falsely flags valid long-range pairs.
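The difference between a global threshold and the per-pair bound can be seen on a small synthetic map. The two global thresholds below are illustrative assumptions (the value used previously is not stated here), and the straight-chain distance map is made up for the example.

```python
import numpy as np

max_bond_length = 4.81
n = 10
idx = np.arange(n)
separation = np.abs(idx[:, None] - idx[None, :])

# A valid straight chain with 3.81 A between neighbours.
coords = idx * 3.81
valid_map = np.abs(coords[:, None] - coords[None, :])

# Corrupt one sequence-adjacent pair so it is roughly 4x a real bond length.
bad_map = valid_map.copy()
bad_map[2, 3] = bad_map[3, 2] = 15.0

# Loose global threshold (assumed): longest possible end-to-end distance.
loose_threshold = (n - 1) * max_bond_length
caught_by_loose_global = bool(np.any(bad_map > loose_threshold))

# Tight global threshold (assumed): a single bond length.
# It flags the valid map, because long-range pairs legitimately exceed it.
tight_flags_valid = bool(np.any(valid_map > max_bond_length))

# Per-pair bound: each distance is compared against |i - j| * max_bond_length.
caught_by_pair = bool(np.any(bad_map > separation * max_bond_length))
valid_flagged_by_pair = bool(np.any(valid_map > separation * max_bond_length))
```

In this sketch the loose threshold misses the corrupted adjacent pair, the tight threshold rejects a valid chain, and the per-pair bound handles both cases.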

The bound is a hard physical maximum, so it never produces false positives (assuming no bond exceeds max_bond_length). For large separations it becomes loose and therefore less sensitive, but it remains correct, which is why all pairs can safely be checked at once.

Parameters:
  • distance_map (np.ndarray) – The (n, n) distance map to check for errors.

  • min_separation (int) – The minimum sequence separation |i - j| to check across. Default is 1, which skips only the zero diagonal.

  • max_separation (int or None) – The maximum sequence separation |i - j| to check across. If None (default) every pair of residues is checked.

  • max_bond_length (float) – Maximum physical length of a single bond in Angstroms, including an error term. The Mpipi bond length is 3.81 A; the default of 4.81 A adds a +1 A per-bond error margin to minimise the risk of false positives.
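How the parameters above might combine can be sketched as a masked comparison. This is an assumption-laden illustration rather than the function's actual source: the name check_window_sketch is hypothetical, and treating min_separation and max_separation as inclusive limits is an assumption.

```python
import numpy as np


def check_window_sketch(distance_map, min_separation=1, max_separation=None,
                        max_bond_length=4.81):
    """Hypothetical re-implementation: True if any checked pair exceeds its bound."""
    n = distance_map.shape[0]
    idx = np.arange(n)
    separation = np.abs(idx[:, None] - idx[None, :])

    # Select only pairs whose separation lies inside the requested window
    # (inclusive limits are an assumption).
    in_window = separation >= min_separation
    if max_separation is not None:
        in_window &= separation <= max_separation

    bound = separation[in_window] * max_bond_length
    return bool(np.any(distance_map[in_window] > bound))


# Toy map: a straight 8-residue chain with one impossible distance at |i - j| = 5.
coords = np.arange(8) * 3.81
dmap = np.abs(coords[:, None] - coords[None, :])
dmap[0, 5] = dmap[5, 0] = 30.0  # bound for this pair is 5 * 4.81 = 24.05

flag_all = check_window_sketch(dmap)                      # all pairs checked
flag_short = check_window_sketch(dmap, max_separation=3)  # bad pair outside window
flag_long = check_window_sketch(dmap, min_separation=6)   # bad pair outside window
```

Narrowing the window restricts which pairs are examined, so an error outside the window goes unreported; leaving max_separation as None checks every pair.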

Returns:

True if any residue pair is farther apart than physically possible, and False otherwise.

Return type:

bool