Ability to define merged channels in PostFitShapesFromWorkspace (+ a few random fixes) #295

ajgilbert · 2023-03-06T13:51:03Z

Adds option --merged-channels to PostFitShapesFromWorkspace that lets the user define additional channel labels that will be calculated as the sum over all the channels matching a particular regex pattern. E.g., given a card with channels: chn_ee_2016, chn_ee_2017, chn_ee_2018, chn_mm_2016, chn_mm_2017, chn_mm_2018, we could add:
--merged-channels 'chn_ee_tot=chn_ee_201.' 'chn_mm_tot=chn_mm_201.', and these additional merged channels will appear as directories in the ROOT file.
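For illustration, the `label=pattern` convention described above can be sketched in plain Python. This is not the code from the PR (which lives in the C++ tool); the helper names `parse_merged_channels` and `resolve_merged_channels` are hypothetical, and the sketch assumes a pattern has to match the whole channel name rather than a substring.

```python
import re


def parse_merged_channels(specs):
    """Turn 'label=pattern' strings into a {label: compiled regex} dict."""
    merged = {}
    for spec in specs:
        label, sep, pattern = spec.partition("=")
        if not sep or not label or not pattern:
            raise ValueError(f"expected 'label=pattern', got {spec!r}")
        merged[label] = re.compile(pattern)
    return merged


def resolve_merged_channels(specs, channels):
    """Return {label: [channels whose full name matches the pattern]}."""
    # Assumption: full-string matching; a substring search would be looser.
    return {
        label: [chn for chn in channels if regex.fullmatch(chn)]
        for label, regex in parse_merged_channels(specs).items()
    }


channels = [
    "chn_ee_2016", "chn_ee_2017", "chn_ee_2018",
    "chn_mm_2016", "chn_mm_2017", "chn_mm_2018",
]
groups = resolve_merged_channels(
    ["chn_ee_tot=chn_ee_201.", "chn_mm_tot=chn_mm_201."], channels
)
for label, members in groups.items():
    print(label, "<-", members)
```

Each merged label would then be filled with the sum of the shapes of its member channels.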

Partially addresses #291.

This PR also fixes a few assorted bugs, mostly cases where we need to avoid divide by zero.

kcormi

Thanks, this looks good.

I was not familiar with LimitCompare.cc, but I guess it was unused and is being removed to keep things clean?

kcormi · 2023-03-06T15:41:27Z

CombineTools/python/plotting.py

@@ -1191,6 +1192,9 @@ def FixTopRange(pad, fix_y, fraction):
        if ymin == 0.:
            print('Cannot adjust log-scale y-axis range if the minimum is zero!')
            return
+        if fix_y <= 0:
+            print('Cannot adjust log-scale y-axis range if the maximum is zero!')


Should we maybe put "less than or equal to zero" here in the printout, just since that's what the check is doing?

anigamova · 2023-03-22T13:27:16Z

For the merging, would it make sense to check whether the channels a user wants to merge have the same binning and, if not, print a warning message? I think TH1.Add() adds histograms by bin number and completely disregards differences in bin labels.

ajgilbert · 2023-03-22T14:57:35Z

@anigamova I think it's a good idea. It's true that the merge probably does something undesirable if the binning differs.
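A rough sketch of what such a check could look like, written in plain Python rather than against ROOT so that it stays self-contained. Histograms are represented here as `(bin_edges, bin_contents)` tuples, and `same_binning` and `merge_contents` are hypothetical names. Skipping a mismatched channel (instead of adding it anyway) is an assumption of this sketch, not something decided in this thread.

```python
import math


def same_binning(edges_a, edges_b, rel_tol=1e-9):
    """True if both lists of bin edges have equal length and matching values."""
    if len(edges_a) != len(edges_b):
        return False
    return all(
        math.isclose(a, b, rel_tol=rel_tol, abs_tol=1e-12)
        for a, b in zip(edges_a, edges_b)
    )


def merge_contents(hists):
    """Sum bin contents of {name: (edges, contents)} and collect warnings.

    The first histogram defines the reference binning. Channels whose
    binning differs are skipped and reported (an assumption of this sketch).
    """
    names = list(hists)
    ref_name = names[0]
    ref_edges, ref_contents = hists[ref_name]
    total = list(ref_contents)
    warnings = []
    for name in names[1:]:
        edges, contents = hists[name]
        if not same_binning(ref_edges, edges):
            warnings.append(f"binning of {name} differs from {ref_name}")
            continue
        total = [t + c for t, c in zip(total, contents)]
    return total, warnings


hists = {
    "chn_ee_2016": ([0.0, 1.0, 2.0], [1.0, 2.0]),
    "chn_ee_2017": ([0.0, 1.0, 2.0], [3.0, 4.0]),
    "chn_ee_2018": ([0.0, 1.0, 3.0], [5.0, 6.0]),  # last edge differs
}
total, warnings = merge_contents(hists)
for message in warnings:
    print("Warning:", message)
```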

ajgilbert · 2023-03-27T08:33:41Z

Have been rethinking this one a bit, and #267.

We might be able to make a bigger improvement, one that also helps performance, by doing some restructuring. One of the issues often reported is that PostFitShapesFromWorkspace is quite slow for more complex models. This is partly because each call to GetUncertainty or GetShapeWithUncertainty builds its own loop over the covariance matrix resamples, which adds a certain amount of overhead.

A better way could be to add a new function that returns a list of shapes/yields according to some input specification. By default, this could be a list of each category, within which a list of processes, as well as the usual signal/background/total sums. But we could allow custom specs, e.g. defining a new category based on a list of actual categories, or a regex to match them - similarly for sums of processes.

Then all the shapes/yields the user has requested can be computed with just one loop. We might also build into this a mechanism to optionally return the full set of variations.
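To make the idea concrete, here is a minimal sketch of the single-loop approach in plain Python. It is not CombineHarvester code: `expand_spec` and `yields_with_uncertainty` are hypothetical names, yields are stored in dicts keyed by `(category, process)`, each resample is assumed to be available as one such dict, and the uncertainty is taken to be the RMS deviation of the resampled yields from the nominal value (also an assumption).

```python
import math
import re


def expand_spec(spec, keys):
    """Map {name: (category_regex, process_regex)} to matching (cat, proc) keys."""
    return {
        name: [
            key for key in keys
            if re.fullmatch(cat_re, key[0]) and re.fullmatch(proc_re, key[1])
        ]
        for name, (cat_re, proc_re) in spec.items()
    }


def yields_with_uncertainty(nominal, samples, spec):
    """Return {name: (central yield, uncertainty)} using one pass over samples."""
    groups = expand_spec(spec, list(nominal))
    central = {
        name: sum(nominal[key] for key in keys) for name, keys in groups.items()
    }
    sq_dev = dict.fromkeys(groups, 0.0)
    n_samples = 0
    for sample in samples:  # the only loop over the resamples
        n_samples += 1
        for name, keys in groups.items():
            value = sum(sample[key] for key in keys)
            sq_dev[name] += (value - central[name]) ** 2
    return {
        name: (
            central[name],
            math.sqrt(sq_dev[name] / n_samples) if n_samples else 0.0,
        )
        for name in groups
    }


nominal = {("ee", "sig"): 10.0, ("ee", "bkg"): 100.0,
           ("mm", "sig"): 20.0, ("mm", "bkg"): 200.0}
samples = [
    {("ee", "sig"): 11.0, ("ee", "bkg"): 102.0,
     ("mm", "sig"): 19.0, ("mm", "bkg"): 198.0},
    {("ee", "sig"): 9.0, ("ee", "bkg"): 98.0,
     ("mm", "sig"): 21.0, ("mm", "bkg"): 202.0},
]
spec = {"ee_total": ("ee", ".*"), "all_sig": (".*", "sig")}
result = yields_with_uncertainty(nominal, samples, spec)
print(result)
```

Every requested quantity, whether a single process, a merged category or a sum of processes, is updated inside the same pass, so the cost of generating the resamples is paid once rather than once per call.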

pkausw · 2023-03-28T11:25:30Z

Hi @ajgilbert ,

Yes, this seems like a good way to increase performance! Since the changes you propose mostly affect the backend of the harvester, I think it might make sense to still go ahead with this PR and #267. That way, we could already use the improvements to PostFitShapesFromWorkspace from these PRs and perhaps disentangle them from the later backend changes. But I don't have a very strong opinion about this. Would you prefer to do it all in one go?

ajgilbert added 4 commits August 28, 2019 10:15

remove LimitCompare.cpp

64ea0ef

Add option for pattern-based merged bins in PostFitShapesFromWorkspace

10e02d6

Merge branch 'main' into devs-wgamma

2b15f1e

Fix option name

8a39721

ajgilbert mentioned this pull request Mar 6, 2023

Merge processes at runtime #267

Open

kcormi reviewed Mar 6, 2023


anigamova linked an issue Mar 10, 2023 that may be closed by this pull request

PostFitShapesFromWorkspace development #291

Open

4 tasks
