-
Dear pyhf experts, I am working on estimating the sensitivity of a counting experiment in a ~background free scenario as a function of the integrated luminosity with pyhf. I want to do this with toys, since the asymptotics approximation fails badly in this scenario. Do you have any examples/recommendations for how to run these in separate batch jobs and combine them at the end to extract the p-values? Thanks a lot in advance! |
Beta Was this translation helpful? Give feedback.
Replies: 1 comment 1 reply
-
This is a really good question and I've had a refresh of what we currently do in pyhf. For # all the same code from your example ...
from numpy import random
def job(calculator, seed=0):
random.seed(seed)
return calculator.distributions(0.0)
calculator = pyhf.infer.utils.create_calculator(
calc_type, data, model, test_stat="q0", track_progress=False, **kwargs
)
test_stat = calculator.teststatistic(0.0)
from multiprocessing import Pool
with Pool(processes=4) as pool:
res = pool.map(lambda x: job(calculator, seed=x), range(10))
from pyhf.infer.calculators import EmpiricalDistribution
sig_plus_bkg_dist = EmpiricalDistribution([dist.samples for dist, _ in res])
bkg_dist = EmpiricalDistribution([dist.samples for _, dist in res])
pvalues = calculator.pvalues(test_stat, sig_plus_bkg_dist, bkg_dist) The key point is for the numpy backend (see #1795) for setting the random seed to ensure that generating N toys when you resample from the pdf is "uniquely" and guaranteed random. This is of course going to be backend-dependent, so I'll leave that up to the reader to figure it out for the other backends. The other note is that in the case of toy-based calculator, there's no stateful information from the toys actually stored on the calculator (just information about the model, data, bounds, etc...) so it's kosher in this case to pass in the merged distributions from the toys into the calculator here. This won't necessarily be correct for the asymptotics case, but file a different discussion if you want that. We do need to improve the user experience here more. But this should be enough to get you started. Feel free to respond with a full working solution if you have one. |
Beta Was this translation helpful? Give feedback.
This is a really good question and I've had a refresh of what we currently do in pyhf. For
v0.6.3
of the code, you should be able to do something like this: