spey.helper_functions.merge_correlated_bins
- spey.helper_functions.merge_correlated_bins(background_yields: ndarray, data: ndarray, covariance_matrix: ndarray, merge_groups: List[List[int]], signal_yields: ndarray = None, return_group_indices: bool = False) → Dict[str, ndarray]
Merge correlated bins in a histogram/cutflow.
This function takes a set of background yields, data, and a covariance matrix, and merges specified groups of bins into single bins. The resulting background yields, data, and covariance matrix are returned in a dictionary. The merging is done by summing the yields and data for the specified groups, and summing the covariance matrix entries for the merged bins.
Added in version 0.2.4.
Example:
>>> from spey.helper_functions import merge_correlated_bins
>>> import numpy as np
>>> background_yields = np.array([10, 20, 30, 40])
>>> data = np.array([12, 22, 32, 42])
>>> covariance_matrix = np.array(
...     [[4, 1, 0.5, 0.2],
...      [1, 3, 0.3, 0.1],
...      [0.5, 0.3, 5, 0.2],
...      [0.2, 0.1, 0.2, 4]]
... )
>>> merge_groups = [[0, 1], [2, 3]]
>>> result = merge_correlated_bins(
...     background_yields=background_yields,
...     data=data,
...     covariance_matrix=covariance_matrix,
...     merge_groups=merge_groups
... )
>>> print(result)
>>> # {
... #     'background_yields': array([30., 70.]),
... #     'data': array([34., 74.]),
... #     'covariance_matrix': array([[ 9. , 1.1],
... #                                 [ 1.1, 9.4]])
... # }
The resulting result dictionary will contain:
- background_yields: Merged background yields.
- data: Merged data.
- covariance_matrix: Merged covariance matrix.
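The merged values in the example can be reproduced with plain NumPy. The following is a minimal sketch of the summation described above, not spey's actual implementation. The helper name merge_by_summation is hypothetical, and the sketch only handles bins that appear in merge_groups (bins left out of every group are simply dropped here, which may differ from what the library does).

```python
import numpy as np


def merge_by_summation(yields, covariance, merge_groups):
    """Hypothetical helper: sum yields and covariance entries per group."""
    yields = np.asarray(yields, dtype=float)
    covariance = np.asarray(covariance, dtype=float)

    # Indicator matrix M with M[g, i] = 1 when bin i belongs to group g.
    indicator = np.zeros((len(merge_groups), yields.size))
    for g, group in enumerate(merge_groups):
        indicator[g, group] = 1.0

    # Summing yields per group is M @ n; summing covariance entries
    # over every pair of bins in two groups is M @ C @ M.T.
    merged_yields = indicator @ yields
    merged_cov = indicator @ covariance @ indicator.T
    return merged_yields, merged_cov


covariance_matrix = np.array(
    [[4, 1, 0.5, 0.2],
     [1, 3, 0.3, 0.1],
     [0.5, 0.3, 5, 0.2],
     [0.2, 0.1, 0.2, 4]]
)
merged_bkg, merged_cov = merge_by_summation(
    [10, 20, 30, 40], covariance_matrix, [[0, 1], [2, 3]]
)
```

With the inputs from the example, merged_bkg and merged_cov agree with the background_yields and covariance_matrix entries shown there, up to floating-point rounding.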
Note
The function assumes that the input arrays are 1-dimensional and that the covariance matrix is square. It also checks for overlapping indices in merge_groups and raises an assertion error if any are found.
Warning
The function does not check for the validity of the covariance matrix (e.g., positive definiteness). It is assumed that the input covariance matrix is valid for the given background yields and data.
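Since validity is left to the caller, a quick check before merging may be useful. The sketch below tests symmetry and positive semi-definiteness with NumPy. The helper name is_valid_covariance and the tolerance value are assumptions made for illustration and are not part of spey.

```python
import numpy as np


def is_valid_covariance(cov, tol=1e-10):
    """Hypothetical check: square, symmetric, and positive semi-definite."""
    cov = np.asarray(cov, dtype=float)
    if cov.ndim != 2 or cov.shape[0] != cov.shape[1]:
        return False
    if not np.allclose(cov, cov.T):
        return False
    # eigvalsh is intended for symmetric matrices and returns real eigenvalues.
    return bool(np.linalg.eigvalsh(cov).min() >= -tol)


example_cov = np.array(
    [[4, 1, 0.5, 0.2],
     [1, 3, 0.3, 0.1],
     [0.5, 0.3, 5, 0.2],
     [0.2, 0.1, 0.2, 4]]
)
ok = is_valid_covariance(example_cov)
```

Whether a semi-definite matrix is acceptable, or strict positive definiteness is required, depends on the statistical model that consumes the merged result.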
- Parameters:
background_yields (np.ndarray) – background yields for each bin.
data (np.ndarray) – observed data for each bin.
covariance_matrix (np.ndarray) – covariance matrix for the bins.
merge_groups (list[list[int]]) – indices of bins to merge.
signal_yields (np.ndarray, default None) – signal yields for each bin. If provided, these will also be merged according to the specified groups.
return_group_indices (bool, default False) – if True, the function will return the indices of the merged groups in the output dictionary. This helps the user keep track of which bins were merged together and how the bins are reordered. New signal yields can be formed by running the following code:
>>> new_signal_yields = [sum(np.array(signal_yields)[Gi]) for Gi in output["group_indices"]]
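To see what the one-liner does without calling the library, it can be run against a hand-written group list. In this sketch group_indices is a stand-in for output["group_indices"]. Its layout (a list of lists of original bin indices) is inferred from the snippet above and is an assumption, as are the signal values.

```python
import numpy as np

# Illustrative signal yields, one entry per original bin.
signal_yields = np.array([1.0, 2.0, 3.0, 4.0])

# Assumption: hand-written stand-in for output["group_indices"].
group_indices = [[0, 1], [2, 3]]

# Sum the signal yields of the original bins that make up each merged bin.
new_signal_yields = np.array([signal_yields[g].sum() for g in group_indices])
```

Each entry of new_signal_yields is the total signal in one merged bin, in the same order as the groups.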
- Raises:
AssertionError –
* If there are overlapping indices in merge_groups.
* If the lengths of data, background_yields, and signal_yields do not match.
* If the covariance matrix is not square.
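The conditions listed under Raises can be mirrored in a small pre-check. This is a sketch of equivalent assertions, not the library's code. The function name check_merge_inputs and the error messages are hypothetical.

```python
import numpy as np


def check_merge_inputs(background_yields, data, covariance_matrix,
                       merge_groups, signal_yields=None):
    """Hypothetical pre-check mirroring the documented assertion conditions."""
    n_bins = len(background_yields)
    assert len(data) == n_bins, "data and background_yields differ in length"
    if signal_yields is not None:
        assert len(signal_yields) == n_bins, "signal_yields has a different length"

    cov = np.asarray(covariance_matrix)
    assert cov.ndim == 2 and cov.shape[0] == cov.shape[1], "covariance matrix is not square"

    # An index that appears twice, in one group or across groups, is an overlap.
    flat = [index for group in merge_groups for index in group]
    assert len(flat) == len(set(flat)), "merge_groups contain overlapping indices"


# Passes silently for consistent inputs.
check_merge_inputs([10, 20, 30, 40], [12, 22, 32, 42], np.eye(4), [[0, 1], [2, 3]])
```

Passing merge_groups=[[0, 1], [1, 2]] to this sketch raises AssertionError, because bin 1 appears in both groups.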
- Returns:
A dictionary containing the merged background yields, data, and covariance matrix (and the merged signal yields, if signal_yields is provided).
- Return type:
dict[str, np.ndarray]