Helper functions
Helper utilities for creating and interpreting pyhf workspace inputs.
This module provides WorkspaceInterpreter, a thin layer around a
pyhf background-only workspace that keeps track of signal injections and
channel removals and converts between signal maps and JSONPatch documents.
It also exposes a small set of pure helper functions that build the patch
operation dictionaries consumed by pyhf and convenience transformations
of the workspace such as luminosity extrapolation and systematic-uncertainty
rescaling.
- class spey_pyhf.WorkspaceInterpreter(background_only_model: Dict)
Bookkeeping wrapper around a pyhf background-only workspace.
The interpreter holds the original background-only pyhf workspace dictionary together with a parallel description of any signal injection, control-region masking and modifier configuration provided by the user. Once populated, it can produce the JSONPatch document that pyhf consumes to build the signal-plus-background statistical model, and it can produce derived workspaces with rescaled luminosity or rescaled systematic uncertainties.
- Parameters:
background_only_model (Dict) – a valid pyhf workspace description for the background-only fit, containing at least the keys channels, observations and measurements.
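To make the expected input concrete, here is a minimal sketch of a background-only workspace dictionary laid out in the pyhf workspace format. The channel, sample, modifier and measurement names and all numbers are invented for the example; only the three top-level keys are taken from the description above.

```python
# Hypothetical single-channel, two-bin background-only workspace.
# Every name and number below is made up for illustration.
background_only_model = {
    "channels": [
        {
            "name": "SR_example",
            "samples": [
                {
                    "name": "background",
                    "data": [50.0, 60.0],
                    "modifiers": [
                        {
                            "name": "bkg_norm",
                            "type": "normsys",
                            "data": {"hi": 1.1, "lo": 0.9},
                        }
                    ],
                }
            ],
        }
    ],
    "observations": [{"name": "SR_example", "data": [52.0, 58.0]}],
    "measurements": [
        {"name": "Measurement", "config": {"poi": "mu", "parameters": []}}
    ],
    "version": "1.0.0",
}

# Check for the three keys the interpreter is documented to require.
required_keys = {"channels", "observations", "measurements"}
missing = required_keys - set(background_only_model)
```

A dictionary of this shape is what the constructor signature above takes as background_only_model.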
- add_patch(signal_patch: List[Dict]) → None
Replace the current signal configuration with one read from a JSONPatch.
- Parameters:
signal_patch (List[Dict]) – JSONPatch document, typically produced by make_patch(), describing signal sample additions and channel removals.
- background_only_model
pyhf workspace description for the background-only fit.
- property bin_map: Dict[str, int]
Number of bins for every channel declared in the workspace.
- Returns:
mapping from channel name to the number of bins of its first sample.
- Return type:
Dict[str, int]
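The mapping described above can be sketched on a plain workspace dictionary. This is an illustration of the documented behaviour, not the library's code; bin_map_sketch and the example workspace are hypothetical.

```python
from typing import Dict


def bin_map_sketch(workspace: Dict) -> Dict[str, int]:
    """Map each channel name to the bin count of its first sample."""
    return {
        channel["name"]: len(channel["samples"][0]["data"])
        for channel in workspace["channels"]
    }


# Hypothetical two-channel workspace fragment.
workspace = {
    "channels": [
        {"name": "SR_a", "samples": [{"name": "bkg", "data": [1.0, 2.0, 3.0]}]},
        {"name": "CR_b", "samples": [{"name": "bkg", "data": [10.0]}]},
    ]
}
bins = bin_map_sketch(workspace)
```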
- property channels: Iterator[str]
Iterate over the channel names declared in the workspace.
- Returns:
generator yielding the channel names in the order they appear in workspace["channels"].
- Return type:
Iterator[str]
- property expected_background_yields: Dict[str, List[float]]
Total expected background yields per channel, given the current configuration.
Channels listed in remove_list are skipped. A warning is emitted once for any channel that is kept but has not been configured with a signal injection.
- Returns:
mapping from channel name to the bin-wise sum of all sample yields contributing to that channel.
- Return type:
Dict[str, List[float]]
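The bin-wise sum can be sketched as follows, assuming the workspace is a plain dictionary and the removal list is a list of channel names. The function name and the example data are hypothetical, and the warning for unconfigured channels is left out.

```python
from typing import Dict, List


def expected_background_sketch(
    workspace: Dict, remove_list: List[str]
) -> Dict[str, List[float]]:
    """Sum all sample yields bin by bin, skipping removed channels."""
    yields = {}
    for channel in workspace["channels"]:
        if channel["name"] in remove_list:
            continue
        per_sample = [sample["data"] for sample in channel["samples"]]
        # zip(*per_sample) groups the yields of all samples bin by bin.
        yields[channel["name"]] = [sum(bin_values) for bin_values in zip(*per_sample)]
    return yields


# Hypothetical workspace fragment with one kept and one removed channel.
workspace = {
    "channels": [
        {
            "name": "SR_a",
            "samples": [
                {"name": "ttbar", "data": [1.0, 2.0]},
                {"name": "wjets", "data": [0.5, 0.5]},
            ],
        },
        {"name": "CR_b", "samples": [{"name": "ttbar", "data": [10.0]}]},
    ]
}
totals = expected_background_sketch(workspace, remove_list=["CR_b"])
```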
- extrapolate_luminosity(factor: float) → WorkspaceInterpreter
Return a luminosity-extrapolated copy of this interpreter.
Every sample yield, observation count and luminosity-sensitive modifier data field is multiplied by factor. The transformation assumes that relative uncertainties remain constant, so absolute per-bin uncertainties (carried by shapesys, staterror and histosys alternative templates) scale linearly with the yields. Dimensionless modifier data (normsys, normfactor, lumi, shapefactor) is left unchanged.
Both the background-only workspace and any registered signal injection are scaled. The original interpreter is not modified.
Added in version 0.2.1.
- Parameters:
factor (float) – luminosity scale factor, typically new_lumi / old_lumi. Must be strictly positive.
- Raises:
ValueError – if factor is not strictly positive.
- Returns:
a new interpreter wrapping a deep copy of the workspace with all yields and absolute uncertainties scaled by factor, preserving the existing signal injections and channel-removal list.
- Return type:
WorkspaceInterpreter
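The scaling rule can be sketched on a plain workspace dictionary. This is a simplified illustration of the documented behaviour, not the library's implementation: extrapolate_sketch is a hypothetical helper, the modifier data layouts follow the pyhf workspace format, and the scaling of registered signal injections is omitted.

```python
import copy
from typing import Dict


def extrapolate_sketch(workspace: Dict, factor: float) -> Dict:
    """Return a deep copy with yields and absolute uncertainties scaled."""
    if factor <= 0:
        raise ValueError("factor must be strictly positive")
    scaled = copy.deepcopy(workspace)
    for channel in scaled["channels"]:
        for sample in channel["samples"]:
            sample["data"] = [factor * y for y in sample["data"]]
            for modifier in sample.get("modifiers", []):
                if modifier["type"] in ("shapesys", "staterror"):
                    # Per-bin absolute uncertainties stored as a flat list.
                    modifier["data"] = [factor * u for u in modifier["data"]]
                elif modifier["type"] == "histosys":
                    # Alternative templates stored under hi_data / lo_data.
                    modifier["data"] = {
                        key: [factor * y for y in template]
                        for key, template in modifier["data"].items()
                    }
                # normsys, normfactor, lumi and shapefactor are dimensionless
                # and therefore left untouched.
    for observation in scaled["observations"]:
        observation["data"] = [factor * n for n in observation["data"]]
    return scaled


# Hypothetical one-channel workspace.
workspace = {
    "channels": [
        {
            "name": "SR_a",
            "samples": [
                {
                    "name": "bkg",
                    "data": [10.0, 20.0],
                    "modifiers": [
                        {"name": "stat", "type": "shapesys", "data": [1.0, 2.0]},
                        {"name": "norm", "type": "normsys", "data": {"hi": 1.1, "lo": 0.9}},
                    ],
                }
            ],
        }
    ],
    "observations": [{"name": "SR_a", "data": [11.0, 19.0]}],
}
doubled = extrapolate_sketch(workspace, factor=2.0)
```

With factor=2.0 the yields, observations and shapesys entries double, while the normsys scale factors and the input dictionary stay as they were.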
- get_channels(channel_index: List[int] | List[str]) → List[str]
Resolve a mix of channel indices and channel names to channel names.
- Parameters:
channel_index (Union[List[int], List[str]]) – indices and/or names of the channels to look up.
- Returns:
channel names whose index or name appears in channel_index.
- Return type:
List[str]
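A minimal sketch of this lookup, assuming the channel names are available as an ordered list. The helper name is hypothetical, and returning the matches in workspace order is an assumption of the sketch rather than documented behaviour.

```python
from typing import List, Union


def get_channels_sketch(
    channel_names: List[str], channel_index: List[Union[int, str]]
) -> List[str]:
    """Keep every channel whose position or name is listed in channel_index."""
    return [
        name
        for position, name in enumerate(channel_names)
        if position in channel_index or name in channel_index
    ]


# Hypothetical channel list; index 0 and the name "VR1" are requested.
selected = get_channels_sketch(["SR1", "CR1", "VR1"], [0, "VR1"])
```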
- guess_CRVR() → List[str]
Return all channel names that look like control or validation regions.
Classification follows guess_channel_type().
- Returns:
channel names classified as "CR" or "VR".
- Return type:
List[str]
- guess_channel_type(channel_name: str) → str
Heuristically classify a channel as control, validation or signal region.
The classification is purely string-based: the uppercased channel name is searched for the substrings "CR", "VR" or "SR" in that order and the first match wins. Any other channel name returns "__unknown__". Because the check is a substring match, channel names that happen to contain these letters for unrelated reasons may be misclassified.
- Parameters:
channel_name (str) – name of the channel to classify.
- Raises:
ValueError – if channel_name is not a channel of this workspace.
- Returns:
one of "CR", "VR", "SR" or "__unknown__".
- Return type:
str
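The heuristic described above is simple enough to sketch directly. The function below is a hypothetical stand-alone version that omits the check that the channel belongs to the workspace; the channel names are invented and include one that shows the misclassification risk.

```python
def guess_channel_type_sketch(channel_name: str) -> str:
    """Classify by the first of "CR", "VR", "SR" found in the uppercased name."""
    upper = channel_name.upper()
    for tag in ("CR", "VR", "SR"):
        if tag in upper:
            return tag
    return "__unknown__"


control = guess_channel_type_sketch("CRtop_cuts")
signal = guess_channel_type_sketch("SR_high_met")
unknown = guess_channel_type_sketch("inclusive")
# "SR_crossing" contains "CR" (from "crossing"), and "CR" is tested first,
# so this signal region is classified as a control region.
misclassified = guess_channel_type_sketch("SR_crossing")
```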
- inject_signal(channel: str, data: List[float], modifiers: List[Dict] | None = None) → None
Register a signal injection in one channel of the workspace.
If modifiers is provided but does not contain the default lumi and normfactor modifiers (with poi_name taken from the first measurement), they are appended automatically.
- Parameters:
channel (str) – name of the target channel; must already exist in the background-only workspace.
data (List[float]) – signal yields, one entry per bin of the channel.
modifiers (Optional[List[Dict]], default None) – modifier dictionaries to attach to the signal sample. When None, _default_modifiers() is used.
- Raises:
ValueError – if channel does not exist in the workspace, or if the length of data does not match the number of bins of channel.
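The validation and modifier-completion rules can be sketched as below. Everything here is illustrative: the helper names, the sample name "Signal", and the exact layout of the default modifier dictionaries are assumptions modelled on the pyhf modifier format, not the library's own definitions.

```python
from typing import Dict, List, Optional


def default_modifiers_sketch(poi_name: str) -> List[Dict]:
    """Assumed shape of the default lumi and normfactor modifiers."""
    return [
        {"name": "lumi", "type": "lumi", "data": None},
        {"name": poi_name, "type": "normfactor", "data": None},
    ]


def prepare_signal_sketch(
    bin_map: Dict[str, int],
    poi_name: str,
    channel: str,
    data: List[float],
    modifiers: Optional[List[Dict]] = None,
) -> Dict:
    """Validate a signal injection and complete its modifier list."""
    if channel not in bin_map:
        raise ValueError(f"unknown channel: {channel}")
    if len(data) != bin_map[channel]:
        raise ValueError(f"expected {bin_map[channel]} bins, got {len(data)}")
    completed = list(modifiers) if modifiers is not None else []
    for default in default_modifiers_sketch(poi_name):
        if default not in completed:
            completed.append(default)
    return {"name": "Signal", "data": list(data), "modifiers": completed}


# Hypothetical channel with two bins and a user-supplied normsys modifier.
user_modifier = {"name": "sig_unc", "type": "normsys", "data": {"hi": 1.2, "lo": 0.8}}
plain = prepare_signal_sketch({"SR1": 2}, "mu", "SR1", [1.0, 2.0])
custom = prepare_signal_sketch({"SR1": 2}, "mu", "SR1", [1.0, 2.0], [user_modifier])
```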
- make_patch() → List[Dict]
Convert the registered signal injections and removals into a JSONPatch.
The returned patch list contains, in order, one add operation per channel registered via inject_signal(), followed by the remove operations for channels registered via remove_channel(), sorted in descending index order so that earlier indices remain valid as pyhf applies the patch.
- Raises:
ValueError – if no signal has been registered yet.
- Returns:
JSONPatch document for the signal-plus-background workspace.
- Return type:
List[Dict]
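The ordering of operations can be illustrated with a small stand-alone sketch. The helper name, the sample name "Signal" and the exact JSON Pointer paths (in particular inserting the signal at sample index 0) are assumptions made for the example; only the add-then-remove order and the descending removal indices come from the description above.

```python
from typing import Dict, List


def make_patch_sketch(
    channel_names: List[str],
    signal_map: Dict[str, List[float]],
    modifier_map: Dict[str, List[Dict]],
    remove_list: List[str],
) -> List[Dict]:
    """Build add operations first, then removals in descending index order."""
    if not signal_map:
        raise ValueError("no signal has been registered yet")
    patch = []
    for index, name in enumerate(channel_names):
        if name in signal_map:
            patch.append(
                {
                    "op": "add",
                    # Assumed path: insert the signal as the first sample.
                    "path": f"/channels/{index}/samples/0",
                    "value": {
                        "name": "Signal",
                        "data": signal_map[name],
                        "modifiers": modifier_map.get(name, []),
                    },
                }
            )
    removal_indices = sorted(
        (i for i, name in enumerate(channel_names) if name in remove_list),
        reverse=True,
    )
    patch.extend({"op": "remove", "path": f"/channels/{i}"} for i in removal_indices)
    return patch


# Hypothetical workspace with three channels, two of which are removed.
names = ["SR1", "CR1", "VR1"]
patch = make_patch_sketch(names, {"SR1": [1.0, 2.0]}, {"SR1": []}, ["CR1", "VR1"])

# Applying the removals in the emitted order to a plain list shows why the
# descending order matters: deleting index 2 first leaves index 1 valid.
remaining = list(names)
for operation in patch:
    if operation["op"] == "remove":
        del remaining[int(operation["path"].split("/")[2])]
```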
- patch_to_map(signal_patch: List[Dict], return_remove_list: bool = False) → Tuple[Dict[str, List[float]], Dict[str, List[Dict]], List[str]] | Tuple[Dict[str, List[float]], Dict[str, List[Dict]]]
Convert a JSONPatch document into the internal signal map.
>>> signal_map = {channel_name: signal_yields}
>>> modifier_map = {channel_name: signal_modifiers}
- Parameters:
signal_patch (List[Dict]) – JSONPatch document for the signal.
return_remove_list (bool, default False) – if True, also return the list of channel names marked for removal.
Added in version 0.1.5.
- Returns:
mapping from channel name to signal yields, mapping from channel name to signal modifiers, and (optionally) the list of channel names marked for removal.
- Return type:
Tuple[Dict[str, List[float]], Dict[str, List[Dict]], List[str]] or Tuple[Dict[str, List[float]], Dict[str, List[Dict]]]
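The reverse conversion can be sketched in the same spirit. The helper below is hypothetical and assumes that each operation path starts with /channels/<index>, so that the channel name can be looked up by position; the patch contents are invented.

```python
from typing import Dict, List, Tuple


def patch_to_map_sketch(
    channel_names: List[str], signal_patch: List[Dict]
) -> Tuple[Dict[str, List[float]], Dict[str, List[Dict]], List[str]]:
    """Split a patch into signal yields, signal modifiers and removed channels."""
    signal_map, modifier_map, remove_list = {}, {}, []
    for operation in signal_patch:
        # "/channels/3/samples/0".split("/") -> ["", "channels", "3", ...]
        name = channel_names[int(operation["path"].split("/")[2])]
        if operation["op"] == "add":
            signal_map[name] = operation["value"]["data"]
            modifier_map[name] = operation["value"]["modifiers"]
        elif operation["op"] == "remove":
            remove_list.append(name)
    return signal_map, modifier_map, remove_list


# Hypothetical patch: one signal injection and one channel removal.
mu_modifier = {"name": "mu", "type": "normfactor", "data": None}
signal_patch = [
    {
        "op": "add",
        "path": "/channels/0/samples/0",
        "value": {"name": "Signal", "data": [1.0, 2.0], "modifiers": [mu_modifier]},
    },
    {"op": "remove", "path": "/channels/1"},
]
signal_map, modifier_map, remove_list = patch_to_map_sketch(["SR1", "CR1"], signal_patch)
```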
- property poi_name: List[Tuple[str, str]]
Parameter-of-interest name for each measurement.
- Returns:
list of (measurement_name, poi_name) tuples, one per entry of workspace["measurements"].
- Return type:
List[Tuple[str, str]]
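As a small illustration of the returned structure, the sketch below reads the pairs from a plain workspace dictionary, assuming each measurement stores its parameter of interest under config["poi"] as in the pyhf workspace format. The helper and the measurement names are hypothetical.

```python
from typing import Dict, List, Tuple


def poi_names_sketch(workspace: Dict) -> List[Tuple[str, str]]:
    """Collect (measurement_name, poi_name) for every measurement."""
    return [
        (measurement["name"], measurement["config"]["poi"])
        for measurement in workspace["measurements"]
    ]


# Hypothetical workspace fragment with two measurements.
workspace = {
    "measurements": [
        {"name": "NormalMeasurement", "config": {"poi": "mu_SIG", "parameters": []}},
        {"name": "AltMeasurement", "config": {"poi": "mu_ALT", "parameters": []}},
    ]
}
pois = poi_names_sketch(workspace)
```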
- remove_channel(channel_name: str) → None
Mark a channel to be removed from the likelihood.
Added in version 0.1.5.
- Parameters:
channel_name (str) – name of the channel to be removed. Channels unknown to the workspace produce an error log and no modification.
- property remove_list: List[str]
Names of channels marked for removal from the model.
Added in version 0.1.5.
- Returns:
channel names registered via remove_channel().
- Return type:
List[str]
- scale_systematics(fraction: float, modifier_types: List[str] | None = None) → WorkspaceInterpreter
Return a copy in which systematic-uncertainty deviations are rescaled.
For each modifier whose type is in modifier_types, the deviation from the nominal value is multiplied by fraction: normsys up/down scale factors are rescaled around 1, so that a fraction of 0 makes the systematic vanish (hi = lo = 1) and a fraction of 1 is a no-op; histosys alternative templates are rescaled around the nominal sample yields with the same convention.
Statistical modifiers (shapesys, staterror) are never modified by this method; including one of them in modifier_types raises a ValueError. Sample yields and observations are unchanged.
The original interpreter is not modified.
Added in version 0.2.1.
- Parameters:
fraction (float) – multiplicative factor applied to each systematic deviation. 1 is a no-op, 0 removes the systematic, intermediate values shrink it. Must be non-negative.
modifier_types (Optional[List[str]], default None) – modifier type values to rescale. When None, defaults to ["normsys", "histosys"]. Statistical modifier types (shapesys, staterror) are not allowed.
- Raises:
ValueError – if fraction is negative, or if modifier_types contains a statistical modifier type.
- Returns:
a new interpreter wrapping a deep copy of the workspace with the requested systematic deviations rescaled, preserving the existing signal injections and channel-removal list.
- Return type:
WorkspaceInterpreter
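The rescaling convention amounts to new = nominal + fraction * (old - nominal), with a nominal of 1 for normsys and the sample yields for histosys. A minimal sketch on a plain workspace dictionary follows; scale_systematics_sketch is a hypothetical helper, the modifier layouts follow the pyhf workspace format, and signal injections are not handled.

```python
import copy
from typing import Dict, List, Optional

STATISTICAL_TYPES = ("shapesys", "staterror")


def scale_systematics_sketch(
    workspace: Dict, fraction: float, modifier_types: Optional[List[str]] = None
) -> Dict:
    """Return a deep copy with normsys/histosys deviations scaled by fraction."""
    if fraction < 0:
        raise ValueError("fraction must be non-negative")
    types = ["normsys", "histosys"] if modifier_types is None else modifier_types
    if any(t in STATISTICAL_TYPES for t in types):
        raise ValueError("statistical modifier types cannot be rescaled")
    scaled = copy.deepcopy(workspace)
    for channel in scaled["channels"]:
        for sample in channel["samples"]:
            nominal = sample["data"]
            for modifier in sample.get("modifiers", []):
                if modifier["type"] not in types:
                    continue
                if modifier["type"] == "normsys":
                    # Scale factors are rescaled around 1.
                    modifier["data"] = {
                        key: 1.0 + fraction * (value - 1.0)
                        for key, value in modifier["data"].items()
                    }
                elif modifier["type"] == "histosys":
                    # Templates are rescaled around the nominal yields.
                    modifier["data"] = {
                        key: [n + fraction * (alt - n) for n, alt in zip(nominal, template)]
                        for key, template in modifier["data"].items()
                    }
    return scaled


# Hypothetical sample with one normsys and one histosys modifier.
workspace = {
    "channels": [
        {
            "name": "SR_a",
            "samples": [
                {
                    "name": "bkg",
                    "data": [10.0, 20.0],
                    "modifiers": [
                        {"name": "norm", "type": "normsys", "data": {"hi": 1.5, "lo": 0.5}},
                        {
                            "name": "shape",
                            "type": "histosys",
                            "data": {"hi_data": [12.0, 24.0], "lo_data": [8.0, 16.0]},
                        },
                    ],
                }
            ],
        }
    ]
}
halved = scale_systematics_sketch(workspace, fraction=0.5)
removed = scale_systematics_sketch(workspace, fraction=0.0)
```

Halving moves each variation halfway back to nominal, and a fraction of 0 collapses hi and lo onto the nominal values.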
- property signal_per_channel: Dict[str, List[float]]
Currently registered signal yields, keyed by channel name.
- Returns:
mapping from channel name to the signal yields registered via inject_signal() or add_patch().
- Return type:
Dict[str, List[float]]
- summary(measurement_name: str | None = None, show_samples: bool = False, show_modifiers: bool = False, max_channels: int = 50) → None
Print a human-readable summary of the workspace and the signal injection state.
The header reports workspace-level statistics (version, number of channels, measurements and observations). Each measurement is listed with its parameter of interest and parameter count. For every channel the summary shows its guessed region type (CR/VR/SR), bin count, observation total, expected-background total, sample count and an aggregated count of modifier types attached to its samples. Injected signals and channels marked for removal are listed at the bottom.
Added in version 0.2.1.
- Parameters:
measurement_name (Optional[str], default None) – if given, restrict the per-measurement section to the named measurement.
show_samples (bool, default False) – if True, list every sample name and its yield total beneath each channel.
show_modifiers (bool, default False) – if True, list every modifier name and type per sample. Implies show_samples.
max_channels (int, default 50) – maximum number of channels to print per measurement.