spey.multiparameter.contour.find_contour

Contents

spey.multiparameter.contour.find_contour#

spey.multiparameter.contour.find_contour(stat_model, confidence_level: float = 0.95, poi_indices: list[int] = None, n_radial: int = 300, n_hmc_chains: int = 10, n_hmc_steps: int = 500, hmc_step_size: float = 0.05, n_gap_candidates: int = 3000, max_radial_bracket: float = 30.0, newton_tol: float = 1e-09, newton_max_iter: int = 100, whitener_regularisation: float = 1e-08, random_seed: int | None = None, n_jobs: int = 1, bounds: List[Tuple[float | None, float | None]] | None = None, n_coverage_passes: int = 2, coverage_knn: int = 5, n_multistart: int = 20, basin_dedup_tol: float = 0.001) ContourResult[source]#

Find the \((1-\alpha)\) chi-squared confidence contour of a StatisticalModel in its full parameter space.

The algorithm operates in four stages (see the module docstring for the full mathematical derivation):

  1. Pre-whitening — compute the Fisher information \(G\) at the MLE via the Hessian of the log-likelihood and Cholesky-factor it so that \(\varphi = L(\theta-\hat\theta)\) maps the contour to an approximate sphere.

  2. Radial search — shoot \(N\) random rays from \(\hat\theta\) in whitened space and locate each ray’s crossing of the NLL threshold \(T\) with Brent’s method.

  3. Gap detection — sample \(M \gg N\) candidate directions and select the \(n_\text{chains}\) directions least covered by the radial set as seeds for RATTLE chains.

  4. Constrained RATTLE HMC — for each seed direction, start a Hamiltonian Monte Carlo chain that walks along the constraint manifold \(\mathrm{NLL}(\theta) = T\), projecting both position and momentum at every step.

Added in version 0.2.7.

Parameters:
  • stat_model – A StatisticalModel whose backend is "default.multivariate_normal". The backend must expose get_logpdf_func() and get_hessian_logpdf_func(); these are used to build the NLL, its gradient (via autograd), and the Fisher information for whitening.

  • confidence_level (float, default 0.95) – Confidence level \(1-\alpha\). The chi-squared threshold is \(\Delta_\alpha = F^{-1}_{\chi^2_k}(1-\alpha)\).

  • poi_indices (list[int], default None) – Indices of the model parameters (in the full parameter space, excluding \(\mu\)) to include in the contour search. \(\mu\) itself is always excluded and held fixed at 1. Parameters not listed here — and not equal to \(\mu\) — are treated as profile parameters: they are numerically minimised at every NLL evaluation so that the algorithm maps the marginal confidence boundary in the poi_indices subspace. When None all non-\(\mu\) parameters form the contour space and no profiling is performed.

  • n_radial (int, default 300) – Number of random radial directions to explore in Stage 2.

  • n_hmc_chains (int, default 10) – Number of RATTLE chains to run in Stage 4, one per detected gap direction.

  • n_hmc_steps (int, default 500) – Number of leapfrog steps per RATTLE chain. Each step produces one contour point (after the constraint projection).

  • hmc_step_size (float, default 0.05) – Leapfrog step size \(\varepsilon\) in whitened-coordinate units.

  • n_gap_candidates (int, default 3000) – Number of candidate directions sampled for gap detection. Larger values give a more thorough search at modest extra cost.

  • max_radial_bracket (float, default 30.0) – Upper bracket for the radial root-finder in whitened-space units. If the contour lies further than this from the MLE in some direction, that direction is skipped.

  • newton_tol (float, default 1e-9) – Convergence tolerance \(\varepsilon_\text{tol}\) for the Newton projection step in RATTLE (i.e. for \(|\mathrm{NLL}(\theta)-T|\)).

  • newton_max_iter (int, default 100) – Maximum Newton iterations in the constraint projection step.

  • whitener_regularisation (float, default 1e-8) – Minimum eigenvalue enforced on the Fisher information matrix before Cholesky factorisation, preventing singular whitening transforms when the likelihood is locally flat.

  • random_seed (Optional[int], default None) – Seed for the numpy.random.Generator used throughout, enabling reproducible results.

  • n_jobs (int, default cpu_count() - 1) – Number of parallel worker processes for the RATTLE HMC chains (Stage 4). Each chain is independent and runs in a separate process via joblib.Parallel. Pass 1 to disable parallelism. Values \(\le 0\) are clamped to 1. Note: parallel execution requires the NLL and gradient closures to be picklable; if pickling fails, fall back to n_jobs=1.

  • bounds (default None) – Parameter bounds for each signal parameter (i.e. every parameter accepted by signal_yields, in the same order). Each element is a (lower, upper) pair where either value may be None to indicate no bound on that side. For example, [(0.0, None), (-1.0, 1.0)] constrains the first parameter to be non-negative and the second to [-1, 1]. When None the whole list is passed, no boundaries are applied. The radial search caps its bracket so that roots beyond a bound are never sought; the RATTLE walk terminates a chain the moment it steps outside the bounded region.

  • n_coverage_passes (int, default 2) – Total number of HMC exploration passes. Pass 1 seeds chains from the angular gap directions (Stage 3/4). Each subsequent pass (up to n_coverage_passes - 1 additional rounds) uses \(k\)-nearest-neighbour distances among all collected contour points to identify the most isolated regions and seeds new RATTLE chains there (Stage 5). Set to 1 to disable the adaptive coverage stage and reproduce the original single-pass behaviour.

  • coverage_knn (int, default 5) – The \(k\) in the \(k\)-NN distance used by Stage 5 to measure local point density. Larger values smooth the density estimate over a wider neighbourhood; smaller values focus on very localised gaps. Must satisfy coverage_knn >= 1.

  • n_multistart (int, default 20) – Total number of random starts used to discover disjoint local minima of the NLL (basins) before the per-basin contour search runs. When the NLL has more than one local minimum below the confidence threshold — e.g. an EFT likelihood with an interference term — each such minimum contributes its own component to the confidence region, so all of them must be mapped to recover the full contour. A single start can only ever find one. Values \(\le 1\) disable multi-start and reproduce the legacy single-anchor behaviour.

  • basin_dedup_tol (float, default 1e-3) – Euclidean tolerance in contour-parameter space for clustering duplicate minima found by the multi-start search. Two candidates within this distance are treated as the same basin; only the one with the lower NLL is kept.

Returns:

ContourResult

A dataclass containing theta_mle, nll_min, threshold, contour_points, and associated metadata.

Raises:
  • ValueError – If the model has fewer than one parameter.

  • RuntimeError – If the MLE optimisation fails to converge.

Examples

import numpy as np
import spey
from spey.multiparameter.contour import find_contour

base_signal = np.array([12.0, 15.0])
bkg   = np.array([50.0, 48.0])
data  = np.array([36.0, 33.0])
cov   = np.array([[144.0, 13.0], [25.0, 256.0]])

def signal(pars):
    return base_signal * (1.0 + pars[0])

pdf_wrapper = spey.get_backend('default.multivariate_normal')
stat_model = pdf_wrapper(
    signal_yields=signal,
    background_yields=bkg,
    data=data,
    covariance_matrix=cov,
    n_signal_parameters=1,
    analysis="demo",
)

result = find_contour(
    stat_model,
    confidence_level=0.95,
    n_radial=200,
    n_hmc_chains=5,
    random_seed=42
)

print("MLE :", result.theta_mle)
print("NLL min :", result.nll_min)
print("Contour points:", result.contour_points.shape)