Building a plugin#
The spey package is designed to be extensible: it only needs to know certain aspects of the
data structure it is given and a prescription for forming a likelihood function.
What a plugin provides#
This section gives a quick introduction to the terminology of Spey plugins:
A plugin is an external Python package that provides additional statistical model prescriptions to Spey.
Each plugin may provide one or more statistical model prescriptions that are accessible directly through Spey.
Depending on the scope of the plugin, you may wish to provide additional (custom) operations and differentiability through automatic differentiation packages such as
autograd or jax. As long as they are implemented through predefined function names, Spey can automatically detect and use them within the interface.
Creating your Statistical Model Prescription#
The first step in creating your Spey plugin is to create your statistical model interface.
This is as simple as importing the abstract base class BackendBase from spey and
inheriting from it. The most basic implementation of a statistical model can be found below:
import spey


class MyStatisticalModel(spey.BackendBase):
    # Metadata that Spey reads to identify and validate the plugin
    name = "my_stat_model"
    version = "1.0.0"
    author = "John Smith <john.smith@smith.com>"
    spey_requires = ">=0.1.0,<0.2.0"

    def __init__(self, *args, **kwargs):
        # Placeholder signature: accept whatever inputs your model needs
        ...

    @property
    def is_alive(self):
        # True if at least one signal bin has a non-zero yield
        ...

    def config(
        self, allow_negative_signal: bool = True, poi_upper_bound: float = 10.0
    ):
        # Return the model configuration (POI index, bounds, initial values)
        ...

    def get_logpdf_func(
        self, expected=spey.ExpectationType.observed, data=None
    ):
        # Return a callable that maps a parameter vector to log L
        ...

    def expected_data(self, pars):
        # Return the expected data for the given parameters (optional)
        ...
BackendBase requires certain functionality from the statistical model to be
implemented, but let us first go through the above class structure. Spey looks for specific
metadata to track the implementation’s version, author and name. Additionally,
it checks compatibility with the current Spey version to ensure that the plugin works as it should.
Note
The list of metadata that Spey is looking for:
- name (str): Name of the plugin.
- version (str): Version of the plugin.
- author (str): Author of the plugin.
- spey_requires (str): The minimum spey version that the plugin needs, e.g. spey_requires="0.0.1" or spey_requires=">=0.3.3".
- doi (List[str]): Citable DOI numbers for the plugin.
- arXiv (List[str]): arXiv numbers for the plugin.
The MyStatisticalModel class has four main functionalities, namely is_alive(),
config(), get_logpdf_func(), and expected_data(). (Detailed descriptions of these
functions can be found in the BackendBase() documentation.)
- is_alive(): This property returns a boolean indicating whether the statistical model has at least one signal bin with a non-zero yield.
- config(): This function returns a ModelConfig object, which includes information about the model structure, such as the index of the parameter of interest within the parameter list (poi_index), the minimum value that the parameter of interest can take (minimum_poi), suggested initialisation parameters for the optimiser (suggested_init) and suggested bounds for the parameters (suggested_bounds). If allow_negative_signal=True, the lower bound of the POI is expected to be minimum_poi; if False, it is expected to be zero. poi_upper_bound is used to enforce an upper bound on the POI.

  Note
  Suggested bounds and initialisation values should each be a list whose length equals the total number of nuisance parameters and parameters of interest. Initialisation values should be of type List[float, ...] and bounds should be of type List[Tuple[float, float], ...].
- get_logpdf_func(): Returns a callable that computes the log-likelihood for any parameter vector. Mathematically, this function should return \(\log\mathcal{L}(\mu, \theta)\), where the input array contains both the POI (\(\mu\)) and the nuisance parameters (\(\theta\)). Behind the scenes, Spey uses this function within an optimization loop:

  \[(\hat{\mu}, \hat{\theta}) = \arg\min_{\mu, \theta} \left[ -\log\mathcal{L}(\mu, \theta) \right]\]

  The expected argument determines which data to use in the likelihood computation: if expected=spey.ExpectationType.observed, use the actual experimental data; if expected=spey.ExpectationType.apriori, use the background yields as the "observed" data. This ensures that the function correctly computes both fit and Asimov likelihoods. If data is provided explicitly, it overrides the default data selection (used for Asimov data in hypothesis testing).
- expected_data() (optional): This function is crucial for asymptotic hypothesis testing. It generates the expected value of the data for the given fit parameters, i.e. \(\theta\) and \(\mu\). If this function does not exist, exclusion limits can still be computed using the chi_square calculator; see exclusion_confidence_level().
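The four pieces described above can be sketched for a single-parameter Poisson counting model. This is a minimal standalone illustration, not Spey's implementation: the class ToyPoissonModel and its suggested_bounds helper are hypothetical, the class deliberately does not inherit from spey.BackendBase or return a ModelConfig, and the only parameter is the POI \(\mu\) (no nuisance parameters).

```python
import math
from typing import Callable, List, Optional, Sequence, Tuple


class ToyPoissonModel:
    """Hypothetical toy model: expected yield per bin is mu * signal + background."""

    def __init__(self, signal: Sequence[float], background: Sequence[float],
                 data: Sequence[int]):
        self.signal = list(signal)
        self.background = list(background)
        self.data = list(data)

    @property
    def is_alive(self) -> bool:
        # At least one bin must carry a non-zero signal yield.
        return any(s > 0.0 for s in self.signal)

    @property
    def minimum_poi(self) -> float:
        # Smallest mu for which every expected yield stays non-negative.
        return -min((b / s for s, b in zip(self.signal, self.background) if s > 0.0),
                    default=0.0)

    def suggested_bounds(self, allow_negative_signal: bool = True,
                         poi_upper_bound: float = 10.0) -> List[Tuple[float, float]]:
        # One (lower, upper) pair per parameter; here only the POI exists.
        lower = self.minimum_poi if allow_negative_signal else 0.0
        return [(lower, poi_upper_bound)]

    def expected_data(self, pars: Sequence[float]) -> List[float]:
        mu = pars[0]
        return [mu * s + b for s, b in zip(self.signal, self.background)]

    def get_logpdf_func(self, data: Optional[Sequence[float]] = None
                        ) -> Callable[[Sequence[float]], float]:
        # Explicitly supplied data overrides the stored observations.
        observed = self.data if data is None else list(data)

        def logpdf(pars: Sequence[float]) -> float:
            # Sum of Poisson log-probabilities; requires strictly positive yields.
            return sum(n * math.log(lam) - lam - math.lgamma(n + 1.0)
                       for n, lam in zip(observed, self.expected_data(pars)))

        return logpdf


model = ToyPoissonModel(signal=[5.0], background=[10.0], data=[12])
logpdf = model.get_logpdf_func()
```

Here the likelihood peaks at \(\mu = (12 - 10)/5 = 0.4\), so logpdf([0.4]) is larger than logpdf([0.0]) or logpdf([1.0]). Passing the background yields through the data argument plays the role of the a-priori case in this sketch.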
Other available functions that can be implemented are shown in the table below. These are optional optimizations that improve computational efficiency or enable advanced features.
| Functions and Properties | Mathematical Purpose | Use Case |
|---|---|---|
| | Returns \(f(\vec{p}) = -\log\mathcal{L}(\vec{p})\) and optionally its gradient \(\nabla f\). Enables first-order optimization methods that use analytical gradients instead of numerical differentiation. | Significant speedup for high-dimensional fits; essential for automatic differentiation backends |
| | Returns the Hessian matrix \(H_{ij} = \frac{\partial^2 \log\mathcal{L}}{\partial p_i \partial p_j}\). The negative Hessian at the maximum is the observed Fisher information matrix, whose inverse estimates the parameter covariance. | Accurate uncertainty estimation |
| | Returns a function that generates pseudo-datasets by sampling from the likelihood distribution at given parameter values. Enables toy Monte Carlo hypothesis testing. | Toy-based exclusion limits; empirical p-value computation when asymptotic approximations are insufficient |
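The three optional ingredients in the table can be illustrated with a two-bin Poisson counting model. This is a sketch under stated assumptions: the function names (objective_and_grad, hessian_logpdf, get_sampler, fit) and the yields are hypothetical and chosen for this example, the only parameter is the POI \(\mu\), and the code does not use Spey's interface.

```python
import math
import random
from typing import Callable, List, Tuple

# Hypothetical yields chosen for illustration only.
SIGNAL = [5.0, 3.0]
BACKGROUND = [10.0, 20.0]
DATA = [12, 22]


def expected(mu: float) -> List[float]:
    return [mu * s + b for s, b in zip(SIGNAL, BACKGROUND)]


def objective_and_grad(mu: float) -> Tuple[float, float]:
    """f(mu) = -log L(mu), dropping the constant log(n!) terms, and df/dmu."""
    value, grad = 0.0, 0.0
    for n, s, lam in zip(DATA, SIGNAL, expected(mu)):
        value += lam - n * math.log(lam)
        grad += s * (1.0 - n / lam)
    return value, grad


def hessian_logpdf(mu: float) -> float:
    """Second derivative of log L with respect to mu."""
    return -sum(n * s * s / lam ** 2 for n, s, lam in zip(DATA, SIGNAL, expected(mu)))


def fit(mu: float = 0.0, steps: int = 50) -> float:
    # Newton iterations on f: the curvature of f is minus the Hessian of log L.
    for _ in range(steps):
        _, grad = objective_and_grad(mu)
        mu -= grad / (-hessian_logpdf(mu))
    return mu


def get_sampler(mu: float, seed: int = 0) -> Callable[[int], List[List[int]]]:
    rng = random.Random(seed)

    def poisson(lam: float) -> int:
        # Knuth's multiplication method; adequate for small expected yields.
        threshold, count, product = math.exp(-lam), 0, rng.random()
        while product > threshold:
            count += 1
            product *= rng.random()
        return count

    def sampler(n_samples: int) -> List[List[int]]:
        return [[poisson(lam) for lam in expected(mu)] for _ in range(n_samples)]

    return sampler


mu_hat = fit()
# Inverse of the negative Hessian at the maximum estimates the variance of mu.
sigma_mu = math.sqrt(1.0 / -hessian_logpdf(mu_hat))
toys = get_sampler(mu_hat, seed=1)(2000)
```

The analytic gradient lets the fit converge in a handful of iterations, and the pseudo-datasets in toys are the kind of input a toy-based hypothesis test would consume.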
Attention
A simple example implementation can be found in the example-plugin repository which implements
In order to make this model recognised by Spey, the class must be registered either as an entry point or through a decorator. The former is
explained in the next section, while the latter can be done with the register_backend() decorator as follows:
import spey


@spey.register_backend
class MyStatisticalModel(spey.BackendBase):
    name = "my_stat_model"
    ...
    # rest of the implementation
    ...
Note that this method does not require any packaging files such as setup.py, but the statistical model will only be
available if the module is imported before calling AvailableBackends(). Hence, if the goal is to create a package that
can be installed and used as a plugin, the entry point method is preferred.
Identifying and installing your statistical model#
To register your statistical model with Spey, you need to create an entry point. Modern Python projects use pyproject.toml
(recommended), while legacy projects may use setup.py. Both approaches are shown below.
Folder structure (same for both methods):
my_folder
├── my_subfolder
│ ├── __init__.py
│ └── mystat_model.py # this includes class MyStatisticalModel
├── pyproject.toml # Modern approach (recommended)
└── setup.py # Legacy approach (optional)
Using pyproject.toml (Recommended)#
The modern, PEP 517/518 compliant approach uses pyproject.toml:
[build-system]
requires = ["setuptools>=64"]
build-backend = "setuptools.build_meta"
[project]
name = "my-spey-plugin"
version = "1.0.0"
description = "A custom Spey statistical model"
requires-python = ">=3.8"
dependencies = ["spey>=0.1.0"]
[project.entry-points."spey.backend.plugins"]
"my_stat_model" = "my_subfolder.mystat_model:MyStatisticalModel"
[tool.setuptools.packages.find]
where = ["."]
Key components:
- [build-system]: Specifies that the project uses setuptools with a PEP 517 backend.
- [project]: Standard project metadata (name, version, dependencies).
- [project.entry-points."spey.backend.plugins"]: The section where plugins are registered.
  - Key (left of =): The name Spey will use to identify your backend (must match the name class attribute).
  - Value (right of =): The module path and class name in the format "module.path:ClassName".
- [tool.setuptools.packages.find]: Automatically discovers packages in the current directory.
After writing pyproject.toml, install with: pip install -e .
The Spey package will automatically discover your plugin, and AvailableBackends() will list "my_stat_model".
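To check that the entry point was written to the installed package metadata, the standard library can list everything registered under the spey.backend.plugins group. This is a minimal sketch using importlib.metadata only; the helper name list_spey_plugins is hypothetical, and the check is independent of Spey's own discovery logic.

```python
from importlib import metadata
from typing import Dict

GROUP = "spey.backend.plugins"


def list_spey_plugins() -> Dict[str, str]:
    """Map each registered backend name to its "module.path:ClassName" target."""
    eps = metadata.entry_points()
    if hasattr(eps, "select"):
        # Python 3.10+: entry points are filtered with select()
        selected = eps.select(group=GROUP)
    else:
        # Python 3.8/3.9: entry_points() returns a dict keyed by group name
        selected = eps.get(GROUP, [])
    return {ep.name: ep.value for ep in selected}


plugins = list_spey_plugins()
for name, target in plugins.items():
    print(f"{name} -> {target}")
```

If the plugin from this section is installed, the mapping should contain the key "my_stat_model"; an empty result usually means that the package was not installed or that the group name is misspelled.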
Using setup.py (Legacy)#
If you prefer the legacy approach or need maximum compatibility with older tools:
from setuptools import setup

# Each entry has the form "<backend name> = <module path>:<class name>"
stat_model_list = ["my_stat_model = my_subfolder.mystat_model:MyStatisticalModel"]

setup(
    name="my-spey-plugin",
    version="1.0.0",
    description="A custom Spey statistical model",
    # my_subfolder contains __init__.py, so it is declared as a package
    packages=["my_subfolder"],
    install_requires=["spey>=0.1.0"],
    entry_points={"spey.backend.plugins": stat_model_list},
)
Parameters:
- stat_model_list is a list of statistical models to register (can include multiple backends).
- "my_stat_model" is the backend identifier (must match the class's name attribute).
- "my_subfolder.mystat_model:MyStatisticalModel" is the module path and class name.
After writing setup.py, install with: pip install -e .
Both methods achieve the same result—after installation, your plugin is immediately available through Spey.
Choose pyproject.toml for new projects unless you have specific legacy requirements.
Citing Plug-ins#
Since plug-ins can be built by other users, Spey provides a metadata accessor to extract the information
needed to cite them. The get_backend_metadata() function allows the user to extract the name, author, version, DOI and
arXiv number for use in academic publications. This information can be accessed as follows:
>>> import spey
>>> spey.get_backend_metadata("my_stat_model")
>>> # {'name': 'my_stat_model',
... # 'author': 'John Smith <john.smith@smith.com>',
... # 'version': '1.0.0',
... # 'spey_requires': '>=0.1.0,<0.2.0',
... # 'doi': [],
... # 'arXiv': []}
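A dictionary of this shape can be turned into a one-line acknowledgement. The helper below is a hypothetical sketch that relies only on the keys shown above (name, version, author, doi, arXiv); it is not part of Spey.

```python
from typing import Dict, List, Union

Metadata = Dict[str, Union[str, List[str]]]


def format_citation(meta: Metadata) -> str:
    """Build a short citation line from backend metadata."""
    parts = [f"{meta['name']} v{meta['version']}", f"by {meta['author']}"]
    # DOI and arXiv entries are optional and may be empty lists.
    if meta.get("doi"):
        parts.append("DOI: " + ", ".join(meta["doi"]))
    if meta.get("arXiv"):
        parts.append("arXiv: " + ", ".join(meta["arXiv"]))
    return "; ".join(parts)


example = {
    "name": "my_stat_model",
    "author": "John Smith <john.smith@smith.com>",
    "version": "1.0.0",
    "spey_requires": ">=0.1.0,<0.2.0",
    "doi": [],
    "arXiv": [],
}
citation = format_citation(example)
```

Empty doi and arXiv lists are simply skipped, so the same helper works for plugins with or without an associated publication.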