Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add ROOT -> FITS converter scripts for CTA MC IRFs #308

Closed
cdeil opened this issue Jan 20, 2017 · 21 comments
Closed

Add ROOT -> FITS converter scripts for CTA MC IRFs #308

cdeil opened this issue Jan 20, 2017 · 21 comments

Comments

@cdeil
Copy link
Contributor

cdeil commented Jan 20, 2017

@kosack @GernotMaier - As discussed at
https://forge.in2p3.fr/issues/16702#note-4 , we want to move the ROOT -> FITS converter scripts for CTA MC IRFs to ctapipe.

They should for now be implemented without using Gammapy or Gammalib. No interpolation is needed, just changing the format from ROOT histograms to FITS BINTABLE.

Here's some questions I have:

  • @kosack - Please suggest a location for the code.
  • @GernotMaier @kosack - Should we use root_numpy.hist2array? It's convenient and fast (data copy in C, not Python loop over histogram bins). But efficiency is not really a concern here, so if we want to avoid people having to pip install root_numpy, then we can just use ROOT.py directly and add a utility function like here.
  • OK if we just add the command line interface for this in the same file in a if __name__ == '__main__' section at the end, so that it can be driven via python -m ctapipe.io.our_module <options>? Or should the cli be put somewhere else?
  • Any other things that need discussion before we start implementing this?

cc @jjlk @TarekHC @jknodlseder

@kosack
Copy link
Contributor

kosack commented Jan 25, 2017

I'd put the code in the ctapipe/tools directory, and add it to the entry_points in setup.py so it becomes an executable when installed. Of course it requires ROOT, so it should fail gracefully with a nice error message if ROOT is not installed.

@kosack
Copy link
Contributor

kosack commented Jan 25, 2017

I think using just the basic ROOT.py functionality is also best, to avoid requiring users to install lots of dependencies (ROOT is already a big one), and root_numpy isn't in the default conda repo

@cdeil
Copy link
Contributor Author

cdeil commented Apr 23, 2017

Please note that the IRF spec has been amended to include HDU class header keywords:
http://gamma-astro-data-formats.readthedocs.io/en/latest/general/hduclass.html

This allows science tools to read the IRFs without having to do heuristics such as looking at column names and trying to infer the format. It's a simple declarative scheme.

Please adopt it for the CTA IRF writers (for now the ROOT -> FITS converter, later ctapipe). And if you have any suggestions for change or extension, please file issues or pull requests for the spec.

@jknodlseder
Copy link

jknodlseder commented Apr 27, 2017 via email

@TarekHC
Copy link
Contributor

TarekHC commented Apr 27, 2017

I believe now we are recommending using HDUVERS = ‘0.2’.

Although for the moment it does not make that much sense, as there is not stable version yet. We will freeze one version as soon as we do a data release (HESS?).

@jknodlseder
Copy link

jknodlseder commented Apr 27, 2017 via email

@TarekHC
Copy link
Contributor

TarekHC commented Apr 27, 2017

(I got the file privately from Jurgen)

Some issues I found:

  • In the specs we always set HDUCLASS = 'GADF'. In the file you sent you set is as 'OGIP'.
  • For the PSF, you set HDUCLAS2 = 'RPSF' and in the specs we have 'PSF'. If there is any reason why we should change the specs, we could. Opinions?
  • For the background, you set HDUCLAS2 = 'BGD' and in the specs we have 'BKG'. Again, if there is any reason why we should change the specs, we could. Opinions?
  • You set HDUDOC = 'CAL/GEN/92-019' (I see others have HDUDOC = '???'). We have not really converged in this issue yet. Maybe we could just put a link within the comment to the specifications until there is a dedicated document? Opinions? @cdeil

I believe I see no other issues.

@jknodlseder
Copy link

jknodlseder commented Apr 28, 2017 via email

@cdeil
Copy link
Contributor Author

cdeil commented Apr 28, 2017

My 2 cents:

  • I do think we should adopt the old OGIP FITS recommendations where they exist. But I don't think we should claim that our formats are OGIP FITS recommendations (by putting HDUCLASS=OGIP). They are not! Our formats are different in detail here and there and come a decade or two later and there's no way to revive OGIP now to amend their specs. So I prefer a new HDUCLASS = 'GADF' over HDUCLASS = 'OGIP'. I think we discussed this quite a bit at the recent DL3 f2f and at the end there was a preference for HDUCLASS = 'GADF', no?

  • This is also the reason I prefer "PSF" over "RPSF". The "R" stands for radially symmetric, no? We'll probably have non-radially symmetric PSF formats at some point, and I would find it weird to then introduce a new HDUCLAS2 for those. So I'm +1 to stick with "PSF", to have better internal consistency within the spec (e.g. from the index files or your observation definition files, we're not calling the keys RPSF).

  • @jknodlseder - You say there we should format closely to OGIP and use HDUCLASS='OGIP' or HDUCLAS2='RPSF' because there's useful tools that need this. I don't know any that would be useful to us that rely on those header keys. Can you point out a few?

  • Thanks for switching to "BKG", we'll do the same in Gammapy.

  • I'm +1 to drop the duplicated HDUCLAS2 info from HDUCLAS4. @TarekHC - Can you make a pull request for that or should I?

I just realised that this is the ROOT -> FITS converter issue in the ctapipe tracker. If something is controversial (like the change to HDUCLASS='OGIP'), I'd suggest to split it out into a separate issue in the gamma-astro-data-formats issue tracker.

@cdeil
Copy link
Contributor Author

cdeil commented Apr 28, 2017

I just now also saw @TarekHC's question whether to put HDUDOC = 'CAL/GEN/92-019' above.
Like I said in my last comment, I think it would be better if we point to our specs.

Pointing to the OGIP specs means that we have to follow their spec exactly, with no possibility of changing or extending it in any way.

Concretely HDUDOC = 'CAL/GEN/92-019 is
https://heasarc.gsfc.nasa.gov/docs/heasarc/caldb/docs/memos/cal_gen_92_019/cal_gen_92_019.html

It recommends for the energy axis to use keV units (but any units are allowed)

Our corresponding spec requires to have energy in TeV here (because @jknodlseder, you wanted to require fixed units in the files):
https://gamma-astro-data-formats.readthedocs.io/en/latest/irfs/full_enclosure/aeff/index.html#effective-area-vs-true-energy

So already it's different in detail from OGIP, and having our own specs means we can extend and evolve them as use cases arise (e.g. new IRF axes like field of view or event type or ...).

@jknodlseder
Copy link

jknodlseder commented Apr 28, 2017 via email

@GernotMaier
Copy link
Contributor

Can I ask if someone is actually working on the root-to-fits converter?

I didn't find anything in ctapipe/tools

@TarekHC
Copy link
Contributor

TarekHC commented May 3, 2017

Relevant for @GernotMaier @cdeil @jknodlseder

While taking a look at the internal IRF document in preparation by @GernotMaier , I realized we are filling the "EestOverEtrue" histogram as a function of the reconstructed energy (in the MC root files we provide). I asked @moralejo and he believes we use it for checking the energy resolution, and was not added as a histogram to be used by science tools. Science tools should create the EDISP from the (finely binned) migration matrix we provide.

Within the DL3 specs, the energy migration is stored that way (Eest/Etrue) but as a function of the true energy. See: http://gamma-astro-data-formats.readthedocs.io/en/latest/irfs/full_enclosure/edisp/index.html

I was worried that this could be misleading, so I checked the 2 converters linked in this issue:

Again, I'm not sure I fully understand both macros, or if both of them are the ones used at the moment, but @jknodlseder , could you check if what I'm saying is true? Are you using the "EestOverEtrue" histogram and assuming the x axis is true energy? It would be something to be fixed (although not sure about the impact).

@GernotMaier @kosack Shall we create a dedicated very small repo just for this tool within the cta-observatory project? We could put all documentation there, and a single macro that all science tools use.

@cdeil @jknodlseder Regarding the data format issues, I'll try to implement the changes we agreed upon, and create dedicated issues within the correct git repo...

@GernotMaier
Copy link
Contributor

EestOverEtrue is in fact filled as a function of reconstruction energy and not a good replacement for the migration matrix. Never understood why decided on Erec/Etrue for the format, the few empty bins in the migration matrix is not a good reason (but I certainly don't want to open that discussion).

I would be in favor of a separate tool.

@kosack
Copy link
Contributor

kosack commented May 4, 2017

@GernotMaier @kosack Shall we create a dedicated very small repo just for this tool within the cta-observatory project? We could put all documentation there, and a single macro that all science tools use.

Yes, I think a small separate repo is a good idea. Most users won't need to use all (or any of) of ctapipe, for example, so it would be overkill to mix it into that repo. Also, we do not have ROOT as a dependency for ctapipe, and would like to keep it that way.

We can also create a conda package for it, once there is a repo, so users can do the install more easily (assuming there is a compatible version of the ROOT conda package)

@TarekHC
Copy link
Contributor

TarekHC commented May 4, 2017

Yes, I think a small separate repo is a good idea. Most users won't need to use all (or any of) of ctapipe, for example, so it would be overkill to mix it into that repo. Also, we do not have ROOT as a dependency for ctapipe, and would like to keep it that way.

@kosack Could you create it? We can copy there the available code there (both from gammapy and ctools) and continue discussion in that repo on how to merge both. I have no idea about creating a conda package, etc... But at least we can make sure the conversion is done properly (or even by the ASWG people, so science tools directly get the fits files...).

@jknodlseder @cdeil I guess this would also force both science tools to use identical format for IRFs, which is probably desired.

@cdeil
Copy link
Contributor Author

cdeil commented May 7, 2017

@TarekHC - Thanks for pushing on this!

I don't think it matters if we make a separate repo or have the few .py files in a folder in the ctapipe repo. This is a temp solution that will go away and probably some parts of the code will be refactored into code in ctapipe.

The important point for me is that we have a common written down interface definition what's what (like we started in http://gamma-astro-data-formats.readthedocs.io/) and that we have a common set of scripts producing FITS IRFs (to avoid duplication of effort and conversion differences / mistakes). We get this either way, whether the scripts go in the ctapipe repo or a new repo.

The most important question though is the one @GernotMaier asked above:

Can I ask if someone is actually working on the root-to-fits converter?

As far as I know, there's a little work being done here and there, but no-one has been willing so far to take charge of this and spend a week and write a full converter script, that exports the info the STs need to FITS (including the full PSF shape and point-like response info).

I have no idea about creating a conda package

I don't think you need this. This is a temp solution and the script will just need to be run by a few guys. Just a few .py files in a folder and a README is fine, no?

@cdeil
Copy link
Contributor Author

cdeil commented May 7, 2017

EestOverEtrue is in fact filled as a function of reconstruction energy and not a good replacement for the migration matrix. Never understood why decided on Erec/Etrue for the format, the few empty bins in the migration matrix is not a good reason (but I certainly don't want to open that discussion).

@GernotMaier - There is an open issue with extensive discussion / proposals for EDISP:
open-gamma-ray-astro/gamma-astro-data-formats#67
The reason we have y = e_reco / e_true is because that's the format EDISP was stored in HESS and so we just exported it like that to FITS.
This in no way means that CTA should adopt this. I think the reason there has been no action on that again is that no-one is working on the topic.

If anyone wants to promote or even just prototype a different way to store EDISP for CTA, please do and (either before or after prototyping) please make a pull request briefly describing the format in the spec, so that we have a clear common understanding what the content is and how science tools should use it (especially concerning the question of normalisation for EDISP).

@maxnoe
Copy link
Member

maxnoe commented Jan 3, 2019

Do we still want this? If yes, we could use uproot to read the root files.

@maxnoe
Copy link
Member

maxnoe commented Jan 3, 2019

From #83, the answer seems to be "no", so i'm closing this one.

@maxnoe maxnoe closed this as completed Jan 3, 2019
@maxnoe
Copy link
Member

maxnoe commented Jan 3, 2019

Imho it should be the responsibility of whoever creates irfs in root files to convert them to the now standardized file format, not ctapipe.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

6 participants