Multi-dimensional chi-squared confidence contour finder

This module implements a multi-stage algorithm (Stages 1 to 5 below) for mapping the boundary of the \((1-\alpha)\) confidence region in the full parameter space of a StatisticalModel. It uses a frequentist approach to find the iso-surface of the profile likelihood ratio

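\[\Delta\chi^2(\mu) = -2\log\frac{\mathcal{L}(\mu, \hat{\hat{\theta}}(\mu))}{\mathcal{L}(\hat{\mu}, \hat{\theta})} = \chi^2_{D,1-\alpha},\]

where \(\mu\) are the parameters of interest, \(\theta\) are the nuisance parameters, \((\hat{\mu}, \hat{\theta})\) is the global maximum-likelihood estimate, \(\hat{\hat{\theta}}(\mu)\) is the value of \(\theta\) that maximises the likelihood at fixed \(\mu\), and \(\chi^2_{D,1-\alpha}\) is the \((1-\alpha)\) quantile of the \(\chi^2\) distribution with \(D\) degrees of freedom.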

Mathematical Framework

Let \(\theta \in \mathbb{R}^k\) be the full model parameter vector (from here on \(\theta\) denotes all \(k\) parameters, not only the nuisance parameters), \(\hat\theta\) the maximum-likelihood estimate (MLE), and

\[\mathrm{NLL}(\theta) = -\log\mathcal{L}(\theta)\]

the negative log-likelihood. Under Wilks’ theorem, the test statistic

\[\chi^2(\theta) = 2\bigl[\mathrm{NLL}(\theta) - \mathrm{NLL}(\hat\theta)\bigr]\]

follows a \(\chi^2_k\) distribution asymptotically. The \((1-\alpha)\) confidence region is

\[\mathcal{C}_\alpha = \bigl\{\theta : \chi^2(\theta) \le \Delta_\alpha\bigr\}, \qquad \Delta_\alpha = F^{-1}_{\chi^2_k}(1-\alpha),\]

and its \((k-1)\)-dimensional boundary — the contour — satisfies

\[\mathrm{NLL}(\theta) = T, \qquad T = \mathrm{NLL}(\hat\theta) + \tfrac{\Delta_\alpha}{2}.\]
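The threshold can be computed directly from the \(\chi^2_k\) quantile function. The following is a minimal sketch, assuming SciPy is available; `contour_threshold` and the numerical inputs are hypothetical and not part of this module's API.

```python
import math
from scipy.stats import chi2

def contour_threshold(nll_min, k, alpha):
    """Return (Delta_alpha, T) for k fitted parameters at confidence level 1 - alpha."""
    delta = float(chi2.ppf(1.0 - alpha, df=k))  # F^{-1}_{chi^2_k}(1 - alpha)
    return delta, nll_min + 0.5 * delta

# Hypothetical numbers: NLL at the MLE is 12.5, two parameters, 95% confidence.
delta_2d, T_2d = contour_threshold(nll_min=12.5, k=2, alpha=0.05)

# Cross-check: for k = 2 the chi^2 CDF is 1 - exp(-x / 2),
# so the quantile has the closed form -2 log(alpha).
closed_form = -2.0 * math.log(0.05)
print(delta_2d, closed_form, T_2d)
```

Note that \(\Delta_\alpha\) grows with \(k\) at fixed \(\alpha\), so the same confidence level corresponds to a larger NLL offset in higher-dimensional parameter spaces.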

Algorithm

Stage 1: Pre-whitening

The Hessian of the NLL at the MLE equals the observed Fisher information:

\[G = \nabla^2 \mathrm{NLL}(\hat\theta) = -\nabla^2 \log\mathcal{L}(\hat\theta).\]

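\(G\) is positive semi-definite at any local minimum and positive definite at a non-degenerate one; the latter is required for the Cholesky factorisation \(G = LL^T\) (with \(L\) lower triangular) to exist. After this factorisation, the whitened coordinate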

\[\varphi = L^T(\theta - \hat\theta), \qquad \theta = \hat\theta + L^{-T}\varphi\]

makes the contour approximately a \((k-1)\)-sphere of radius \(\sqrt{\Delta_\alpha}\):

\[\mathrm{NLL}\!\bigl(\hat\theta + L^{-T}\varphi\bigr) = \mathrm{NLL}(\hat\theta) + \tfrac{1}{2}|\varphi|^2 + O(|\varphi|^3).\]

Sampling uniform random directions on \(S^{k-1}\) in \(\varphi\)-space therefore gives approximately uniform coverage of the contour, even when the original parameter space is strongly anisotropic.
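The whitening map can be illustrated on an exactly quadratic toy NLL, for which the contour in \(\varphi\)-space is an exact sphere. This is a sketch under stated assumptions: the Hessian `G`, the point `theta_hat` and the helper names are hypothetical, and the convention used is \(\varphi = L^T(\theta - \hat\theta)\), which is the inverse of \(\theta = \hat\theta + L^{-T}\varphi\).

```python
import numpy as np

# Hypothetical 2-parameter example: a strongly anisotropic, correlated Hessian.
G = np.array([[400.0, 30.0],
              [30.0, 4.0]])
theta_hat = np.array([1.0, -2.0])

L = np.linalg.cholesky(G)  # lower triangular, G = L @ L.T

def to_whitened(theta):
    """phi = L^T (theta - theta_hat)."""
    return L.T @ (theta - theta_hat)

def from_whitened(phi):
    """theta = theta_hat + L^{-T} phi, via a linear solve instead of an explicit inverse."""
    return theta_hat + np.linalg.solve(L.T, phi)

def nll_quadratic(theta):
    """Exactly quadratic toy NLL with minimum value 0 at theta_hat."""
    d = theta - theta_hat
    return 0.5 * d @ G @ d

delta_alpha = 5.991  # approximately the 95% quantile of chi^2 with 2 degrees of freedom
rng = np.random.default_rng(0)
e = rng.standard_normal(2)
e /= np.linalg.norm(e)  # uniform random direction on S^1

# A point at radius sqrt(delta_alpha) in phi-space lands on the contour in theta-space.
theta_on_contour = from_whitened(np.sqrt(delta_alpha) * e)
print(nll_quadratic(theta_on_contour), 0.5 * delta_alpha)
```

For a non-quadratic NLL the same map only makes the contour approximately spherical, which is why the radius still has to be found numerically along each direction.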

Stage 2: Radial search

For each unit vector \(\hat{e} \in S^{k-1}\) (drawn by normalising a standard Gaussian), define the one-dimensional profile

\[f(r) = \mathrm{NLL}\!\bigl(\hat\theta + L^{-T}(r\hat{e})\bigr) - T.\]

\(f\) is negative at \(r=0\) (since \(\mathrm{NLL}(\hat\theta) < T\)) and positive for sufficiently large \(r\), provided the NLL keeps rising along \(\hat{e}\), so the two values bracket a root. The root \(r^*\), found via Brent’s method, yields the contour point

\[\theta^* = \hat\theta + L^{-T}(r^*\hat{e}).\]
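A radial search of this kind can be sketched with `scipy.optimize.brentq`. The toy NLL, `theta_hat`, `G` and the function `radial_contour_point` are hypothetical; the bracket-doubling loop is an assumption added here to obtain a sign change and is not described in the text above.

```python
import numpy as np
from scipy.optimize import brentq

theta_hat = np.array([0.5, 1.5])
G = np.array([[9.0, 1.0],
              [1.0, 2.0]])
L = np.linalg.cholesky(G)

def nll(theta):
    """Toy NLL: quadratic bowl plus a mild quartic term, so the contour is not an exact ellipse."""
    d = theta - theta_hat
    return 0.5 * d @ G @ d + 0.1 * np.sum(d**4)

T = nll(theta_hat) + 0.5 * 5.991  # threshold for roughly 95% confidence with k = 2

def radial_contour_point(e_hat, r_max=1.0, grow=2.0, max_doublings=50):
    """Return (theta_star, r_star) on the contour along the whitened direction e_hat."""
    def f(r):
        return nll(theta_hat + np.linalg.solve(L.T, r * e_hat)) - T

    # Assumed bracketing strategy: enlarge r_max until f changes sign (f(0) < 0 always).
    for _ in range(max_doublings):
        if f(r_max) > 0.0:
            break
        r_max *= grow
    else:
        raise RuntimeError("no sign change found along this direction")

    r_star = brentq(f, 0.0, r_max, xtol=1e-12)
    return theta_hat + np.linalg.solve(L.T, r_star * e_hat), r_star

rng = np.random.default_rng(1)
dirs = rng.standard_normal((8, 2))
dirs /= np.linalg.norm(dirs, axis=1, keepdims=True)  # unit vectors on S^1
points = np.array([radial_contour_point(e)[0] for e in dirs])
residuals = np.array([nll(p) - T for p in points])
```

Because the quartic term only adds to the NLL, every root here lies inside the whitened radius \(\sqrt{\Delta_\alpha}\) that the purely quadratic approximation would give.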

Stage 3: Gap detection

After the radial search over \(N\) directions, \(M \gg N\) candidate unit vectors are sampled uniformly on \(S^{k-1}\). For each candidate, the maximum cosine similarity to any direction already used in the radial search is computed. Candidates with the smallest maximum similarity correspond to the largest angular gaps.

For each gap direction, a dedicated radial search is run to locate the exact contour crossing along that direction. This ensures that every RATTLE chain (Stage 4) seeded in a sparse region starts directly on the contour there, rather than at the nearest radial point from a dense region, which may be far away geodesically.
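The gap-detection criterion can be sketched in a few lines of NumPy. The function `find_gap_directions` and its parameters are hypothetical names chosen for illustration; the example deliberately leaves part of the circle uncovered so that the selected candidates fall into the gap.

```python
import numpy as np

def find_gap_directions(found_dirs, n_candidates, n_gaps, rng):
    """Return the n_gaps candidate unit vectors angularly farthest from found_dirs."""
    k = found_dirs.shape[1]
    cand = rng.standard_normal((n_candidates, k))
    cand /= np.linalg.norm(cand, axis=1, keepdims=True)  # uniform on S^{k-1}
    # cos_sim[i, j] = cosine of the angle between candidate i and found direction j
    cos_sim = cand @ found_dirs.T
    max_sim = cos_sim.max(axis=1)
    worst = np.argsort(max_sim)[:n_gaps]  # smallest maximum similarity = largest gap
    return cand[worst], max_sim[worst]

# Radial directions that cover only the arc from -1.2 to 1.2 radians.
angles = np.linspace(-1.2, 1.2, 20)
found = np.column_stack([np.cos(angles), np.sin(angles)])

rng = np.random.default_rng(2)
gaps, sims = find_gap_directions(found, n_candidates=2000, n_gaps=5, rng=rng)
print(gaps)  # all selected directions point into the uncovered half-plane x < 0
```

The cost is one \(M \times N\) matrix product, so \(M\) can be taken much larger than \(N\) without evaluating the likelihood at all.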

Stage 4: Constrained Hamiltonian Monte Carlo (RATTLE)

Let \(C(\theta) = \mathrm{NLL}(\theta) - T\) be the constraint function. Starting from a radial contour point, the RATTLE integrator (Andersen [15]) walks along \(\partial\mathcal{C}_\alpha\) while preserving the constraint at each step. One leapfrog step with step size \(\varepsilon\) reads

\[\begin{split}p_{1/2} &= p_0 - \tfrac{\varepsilon}{2}\,\nabla\mathrm{NLL}(\theta_0), \\[3pt] \theta' &= \theta_0 + \varepsilon\, p_{1/2}, \\[3pt] \theta_1 &= \theta' - \lambda\,\nabla\mathrm{NLL}(\theta'), \quad \lambda = \frac{\mathrm{NLL}(\theta')-T} {|\nabla\mathrm{NLL}(\theta')|^2}, \\[3pt] p' &= p_{1/2} - \tfrac{\varepsilon}{2}\,\nabla\mathrm{NLL}(\theta_1), \\[3pt] p_1 &= p' - \frac{p' \cdot \nabla\mathrm{NLL}(\theta_1)} {|\nabla\mathrm{NLL}(\theta_1)|^2} \,\nabla\mathrm{NLL}(\theta_1).\end{split}\]

The third equation is the SHAKE projection onto the constraint surface, iterated via Newton’s method until \(|\mathrm{NLL}(\theta_1) - T| < \varepsilon_\text{tol}\). The fifth equation projects the momentum onto the tangent space of the constraint, ensuring \(p_1 \perp \nabla C(\theta_1)\).
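The five update equations translate almost line by line into code. The sketch below applies them to a hypothetical toy NLL with an analytic gradient; `project_to_contour` and `rattle_step` are illustrative names, and the starting point is obtained with the same Newton projection instead of a radial search purely for brevity.

```python
import numpy as np

theta_hat = np.zeros(2)
G = np.array([[3.0, 0.5],
              [0.5, 1.0]])

def nll(theta):
    d = theta - theta_hat
    return 0.5 * d @ G @ d + 0.05 * np.sum(d**4)

def grad_nll(theta):
    d = theta - theta_hat
    return G @ d + 0.2 * d**3

def project_to_contour(theta, T, tol=1e-10, max_newton=50):
    """Newton iterations along the local gradient until |NLL(theta) - T| < tol."""
    for _ in range(max_newton):
        c = nll(theta) - T
        if abs(c) < tol:
            return theta
        g = grad_nll(theta)
        theta = theta - (c / (g @ g)) * g
    raise RuntimeError("constraint projection did not converge")

def rattle_step(theta0, p0, T, eps):
    p_half = p0 - 0.5 * eps * grad_nll(theta0)              # half kick
    theta1 = project_to_contour(theta0 + eps * p_half, T)   # drift, then position projection
    g1 = grad_nll(theta1)
    p_prime = p_half - 0.5 * eps * g1                       # second half kick
    p1 = p_prime - (p_prime @ g1) / (g1 @ g1) * g1          # momentum onto tangent space
    return theta1, p1

T = nll(theta_hat) + 0.5 * 5.991
theta = project_to_contour(np.array([1.0, 1.0]), T)
g = grad_nll(theta)
p = np.array([1.0, 0.0])
p = p - (p @ g) / (g @ g) * g  # start with a tangent momentum

chain = [theta]
max_normal_momentum = 0.0
for _ in range(50):
    theta, p = rattle_step(theta, p, T, eps=0.05)
    chain.append(theta)
    max_normal_momentum = max(max_normal_momentum, abs(p @ grad_nll(theta)))
chain = np.array(chain)
max_violation = max(abs(nll(t) - T) for t in chain)
```

In this sketch every stored point satisfies the constraint to within the Newton tolerance, and the momentum stays orthogonal to \(\nabla C\) up to rounding, which are the two properties the projections are meant to guarantee.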

Stage 5: Adaptive coverage improvement

After Stage 4, all collected contour points (radial + RATTLE) are gathered into a single point cloud. A k-d tree is used to compute the \(k_\text{NN}\)-th nearest-neighbour distance for every point. The \(n_\text{chains}\) most isolated points (those with the largest \(k_\text{NN}\) distance) seed a new round of RATTLE chains that explore the sparsely covered neighbourhood. This pass is repeated \(n_\text{passes} - 1\) times, providing bias-free coverage of regions that are geometrically sparse on the contour even when they are not easily detected as angular gaps from the MLE.
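Selecting the most isolated points can be sketched with `scipy.spatial.KDTree`. The function `most_isolated` and the synthetic point cloud are hypothetical; the cloud places 200 points densely on one half of a unit circle and 5 points sparsely on the other half.

```python
import numpy as np
from scipy.spatial import KDTree

def most_isolated(points, k_nn, n_chains):
    """Return indices and k_nn-th neighbour distances of the n_chains most isolated points."""
    tree = KDTree(points)
    # Each point is its own nearest neighbour at distance 0,
    # so query k_nn + 1 neighbours and keep the last column.
    dist, _ = tree.query(points, k=k_nn + 1)
    kth = dist[:, -1]
    order = np.argsort(kth)[::-1][:n_chains]  # largest k_nn-th distance first
    return order, kth[order]

dense = np.linspace(0.0, np.pi, 200)               # well-covered arc
sparse = np.linspace(1.2 * np.pi, 1.8 * np.pi, 5)  # poorly covered arc
angles = np.concatenate([dense, sparse])
cloud = np.column_stack([np.cos(angles), np.sin(angles)])

seeds, seed_dist = most_isolated(cloud, k_nn=3, n_chains=5)
print(sorted(seeds.tolist()))  # indices of the five sparse points
```

The selected indices would then serve as starting points for the next round of constrained chains.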

Functions

`contour.find_contour`(stat_model[, ...])	Find the \((1-\alpha)\) chi-squared confidence contour of a `StatisticalModel` in its full parameter space.
`contour.ContourResult`(theta_mle, nll_min, ...)	Container for the output of `find_contour()`.
