BME Reweighting
===============

Bayesian Maximum Entropy (BME) reweighting adjusts the statistical weights of
conformations in a STARLING ensemble so the weighted average better reproduces
experimental observables while staying as close as possible to the prior
(uniform) distribution.

This guide covers end-to-end reweighting workflows using the
:mod:`starling.structure.bme` module.

.. seealso::

   * :doc:`ensemble` – loading ensembles and computing structural properties.
   * :mod:`starling.structure.bme` – full API reference for the BME classes.
   * :mod:`starling.structure.bme_utils` – helper functions and constants.

Concepts
--------

BME works by solving the constrained optimisation problem:

.. math::

   \min_{\boldsymbol{w}} \; \chi^2(\boldsymbol{w}) + \theta \, D_{\mathrm{KL}}(\boldsymbol{w} \| \boldsymbol{w}_0)

where :math:`\chi^2` measures how well the reweighted ensemble matches
experiment, :math:`D_{\mathrm{KL}}` is the Kullback–Leibler divergence from the
prior weights :math:`\boldsymbol{w}_0`, and :math:`\theta` balances data
fidelity against ensemble diversity.

* **Low** :math:`\theta` → aggressive fitting, risk of overfitting.
* **High** :math:`\theta` → weights stay close to the prior, less data influence.

Defining Experimental Observables
---------------------------------

Wrap each measurement in an
:class:`~starling.structure.bme.ExperimentalObservable`:

.. code-block:: python

   from starling.structure.bme import ExperimentalObservable

   # Equality restraint: measured Rg = 25 ± 2 Å
   rg_obs = ExperimentalObservable(
       value=25.0,
       uncertainty=2.0,
       constraint="equality",
       name="Rg",
   )

   # Upper-bound restraint: end-to-end distance ≤ 60 Å
   ete_obs = ExperimentalObservable(
       value=60.0,
       uncertainty=3.0,
       constraint="upper",
       name="End-to-end distance",
   )

   # Lower-bound restraint: Rh ≥ 15 Å
   rh_obs = ExperimentalObservable(
       value=15.0,
       uncertainty=1.5,
       constraint="lower",
       name="Rh",
   )

Supported ``constraint`` types are ``"equality"``, ``"upper"``, and ``"lower"``.
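
The trade-off described under Concepts can be illustrated without STARLING.
The following is a minimal sketch, assuming a single equality observable, a
uniform prior, and toy per-conformation values. The helper names
(``reweight``, ``chi_squared``, ``kl_divergence``, ``objective``) are
hypothetical and are not part of :mod:`starling.structure.bme`; the weights
are drawn from a one-parameter exponential family rather than from the
library's optimiser.

.. code-block:: python

   import math

   def reweight(values, lam):
       """Illustrative weights w_i proportional to exp(-lam * f_i), uniform prior."""
       logits = [-lam * v for v in values]
       shift = max(logits)  # subtract the maximum for numerical stability
       unnorm = [math.exp(x - shift) for x in logits]
       total = sum(unnorm)
       return [u / total for u in unnorm]

   def chi_squared(weights, values, target, sigma):
       """Squared, uncertainty-scaled deviation of the weighted average."""
       average = sum(w * v for w, v in zip(weights, values))
       return ((average - target) / sigma) ** 2

   def kl_divergence(weights):
       """KL divergence of the weights from a uniform prior w0_i = 1/N."""
       n = len(weights)
       return sum(w * math.log(w * n) for w in weights if w > 0.0)

   def objective(weights, values, target, sigma, theta):
       """chi^2 + theta * D_KL, matching the expression in Concepts."""
       return chi_squared(weights, values, target, sigma) + theta * kl_divergence(weights)

   # Toy per-conformation Rg values (Å) and a made-up target of 23 ± 1 Å
   rg_values = [20.0, 22.0, 24.0, 26.0, 28.0, 30.0]
   target, sigma = 23.0, 1.0

   uniform = reweight(rg_values, 0.0)   # lam = 0 reproduces the prior
   shifted = reweight(rg_values, 0.1)   # favours compact conformations

   for theta in (0.5, 100.0):
       print(
           f"theta={theta}: "
           f"prior objective={objective(uniform, rg_values, target, sigma, theta):.3f}, "
           f"reweighted objective={objective(shifted, rg_values, target, sigma, theta):.3f}"
       )

With these toy numbers the reweighted ensemble scores better than the prior at
the low :math:`\theta` and worse at the high one, because the KL penalty
outweighs the improvement in :math:`\chi^2`.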
Running BME Reweighting
-----------------------

Via the Ensemble helper
~~~~~~~~~~~~~~~~~~~~~~~

The simplest path is to call
:meth:`~starling.structure.ensemble.Ensemble.reweight_bme` directly on an
``Ensemble`` object:

.. code-block:: python

   import numpy as np
   from starling import load_ensemble
   from starling.structure.bme import ExperimentalObservable

   ensemble = load_ensemble("my_ensemble.starling")

   # Compute per-conformation values for each observable
   rg_values = ensemble.radius_of_gyration()
   ete_values = ensemble.end_to_end_distance()
   calculated = np.column_stack([rg_values, ete_values])

   observables = [
       ExperimentalObservable(value=25.0, uncertainty=2.0, name="Rg"),
       ExperimentalObservable(value=55.0, uncertainty=3.0, name="Re"),
   ]

   result = ensemble.reweight_bme(
       observables=observables,
       calculated_values=calculated,
       theta=0.5,
       verbose=True,
   )

   print(f"χ² initial: {result.chi_squared_initial:.3f}")
   print(f"χ² final: {result.chi_squared_final:.3f}")

After reweighting, ensemble analysis methods accept ``use_bme_weights=True`` to
compute weighted averages:

.. code-block:: python

   weighted_rg = ensemble.radius_of_gyration(
       return_mean=True, use_bme_weights=True
   )

Via the low-level BME class
~~~~~~~~~~~~~~~~~~~~~~~~~~~

For more control, instantiate :class:`~starling.structure.bme.BME` directly:

.. code-block:: python

   from starling.structure.bme import BME

   bme = BME(
       observables=observables,
       calculated_values=calculated,
       theta=0.5,
   )
   result = bme.fit(verbose=True)
   optimised_weights = bme.weights

   # Predict reweighted values for new calculated data
   predicted = bme.predict(calculated)

Theta Scanning
--------------

Choosing the right :math:`\theta` is critical. Use
:func:`~starling.structure.bme.theta_scan` to sweep a range and inspect the
trade-off:

.. code-block:: python

   from starling.structure.bme import theta_scan

   scan_result = theta_scan(
       observables=observables,
       calculated_values=calculated,
       theta_values=[0.01, 0.1, 0.5, 1.0, 5.0, 10.0],
   )

The returned :class:`~starling.structure.bme_utils.ThetaScanResult` contains
per-theta :math:`\chi^2`, effective sample sizes, and KL divergence values so
you can select the best balance between fitting and diversity.

Interpreting Results
--------------------

:class:`~starling.structure.bme.BMEResult` provides several diagnostic helpers:

.. code-block:: python

   # Run diagnostics – warns if effective sample size is low
   result.diagnostics(warn_threshold=0.5)

   # Effective sample size (fraction of the original ensemble retained)
   n_eff = result.phi

   # KL divergence from the prior
   kl = result.kl_divergence

A large KL divergence or very small ``phi`` suggests the reweighting had to
deviate substantially from the prior, which may indicate the ensemble is
incompatible with the data or :math:`\theta` is too low.

See Also
--------

* :doc:`ensemble` – Structural analyses on ensembles.
* :doc:`ensemble_generation` – Generating ensembles that can be reweighted.
* :doc:`constraints` – Steering sampling at generation time instead of post-hoc reweighting.