Merge pull request #126 from cdeil/obs-idx

Update observation index table spec

cdeil authored Jul 29, 2018
2 parents 4f1eff8 + ad4c8dd commit c7de282

Showing 3 changed files with 67 additions and 68 deletions.
27 changes: 14 additions & 13 deletions source/data_storage/hdu_index/index.rst
@@ -12,8 +12,9 @@ The HDU index table is stored in a FITS file as a BINTABLE HDU:

The HDU index table can be used to locate HDUs. E.g. for a given ``OBS_ID`` and
(``HDU_TYPE`` and / or ``HDU_CLASS``), the HDU can be located via the
information in the ``FILE_DIR``, ``FILE_NAME`` and ``HDU_NAME`` columns. The
path listed in ``FILE_DIR`` has to be relative to the location of the index
file.
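
For illustration, a minimal sketch (not part of the spec) of such a lookup with
``astropy``; the file name, the ``OBS_ID`` value and the assumption that the
index table HDU is called ``HDU_INDEX`` are made up for the example:

.. code-block:: python

    from pathlib import Path

    from astropy.io import fits
    from astropy.table import Table

    # Hypothetical index file; the HDU name "HDU_INDEX" is an assumption
    index_path = Path("hdu-index.fits.gz")
    hdu_index = Table.read(index_path, hdu="HDU_INDEX")

    # Select the row for the observation and HDU type / class of interest
    mask = (
        (hdu_index["OBS_ID"] == 23523)
        & (hdu_index["HDU_TYPE"] == "aeff")
        & (hdu_index["HDU_CLASS"] == "aeff_2d")
    )
    row = hdu_index[mask][0]

    # FILE_DIR is relative to the location of the index file
    file_path = index_path.parent / str(row["FILE_DIR"]).strip() / str(row["FILE_NAME"]).strip()
    aeff_hdu = fits.open(file_path)[str(row["HDU_NAME"]).strip()]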

.. _hdu-index-base-dir:

@@ -26,9 +27,9 @@ Tools are expected to support relative file paths in POSIX notation like
``FILE_DIR = "../../data/"`` as well as absolute file path like ``FILE_DIR =
"/data/cta"``.

To allow for some additional flexibility, an optional header keyword
``BASE_DIR`` can be used. If it is given, the file path is ``BASE_DIR / FILE_DIR
/ FILE_NAME``, i.e. the location of the HDU index table becomes irrelevant.
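
A sketch of the path resolution a tool might implement (the helper name and the
way the header is passed in are made up for illustration):

.. code-block:: python

    from pathlib import Path

    def resolve_file_path(index_path, header, row):
        """Resolve FILE_DIR / FILE_NAME for one HDU index table row.

        Sketch only: use BASE_DIR if the header gives it, otherwise interpret
        FILE_DIR relative to the location of the index file itself.
        """
        base_dir = header.get("BASE_DIR")
        base = Path(base_dir) if base_dir else Path(index_path).parent
        return base / str(row["FILE_DIR"]).strip() / str(row["FILE_NAME"]).strip()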

.. _hdu-index-columns:

@@ -56,10 +57,10 @@ HDU_TYPE and HDU_CLASS

The ``HDU_TYPE`` and ``HDU_CLASS`` can be used to select the HDU of interest.

The difference is that ``HDU_TYPE`` corresponds to a general category such as
PSF, whereas ``HDU_CLASS`` corresponds to a specific PSF format. Declaring
``HDU_CLASS`` here means that tools loading these files don't have to guess
the format on load.

Valid ``HDU_TYPE`` values:

@@ -87,14 +88,14 @@ Relation to HDUCLAS
-------------------

At :ref:`hduclass` and throughout this spec, ``HDUCLAS`` header keys are defined
as a declarative HDU classification scheme. That scheme appears similar to this
HDU index table, but is in fact different and incompatible!

Here in the index table, we have ``HDU_CLASS`` and ``HDU_TYPE``. In
:ref:`hduclass`, there is ``HDUCLASS`` which is always "GADF" and then there is
a hierarchical ``HDUCLAS1``, ``HDUCLAS2``, ``HDUCLAS3`` and ``HDUCLAS4`` that
corresponds to the information in ``HDU_CLASS`` and ``HDU_TYPE`` here. Also the
values are different: here we use lower-case values, e.g. ``HDU_CLASS="aeff"``,
whereas in :ref:`hduclass` we use upper-case values, e.g. ``HDUCLAS2="EFF_AREA"``.
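
For illustration only, a tool that needs to translate between the two
conventions could carry a small mapping table; the ``aeff`` entry follows the
example above, while the other entries are assumptions, not normative values:

.. code-block:: python

    # Partial, illustrative mapping from index-table values to HDUCLAS2 values.
    # Only the "aeff" entry is taken from the example above; the others are
    # assumptions and should be checked against the relevant spec sections.
    INDEX_VALUE_TO_HDUCLAS2 = {
        "aeff": "EFF_AREA",
        "edisp": "EDISP",  # assumed
        "bkg": "BKG",      # assumed
    }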

One reason for these inconsistencies is that the spec for this HDU index table
48 changes: 24 additions & 24 deletions source/data_storage/index.rst
@@ -3,29 +3,28 @@
IACT data storage
=================

At the moment there is no agreed way to organise IACT data and to connect
``EVENTS`` with IRFs or with other information needed for analysis, such as
time or pointing information.

Here we document one scheme that is used extensively in H.E.S.S., and partly
also by other IACTs. We expect that it will be superseded in the future by a
different scheme developed by CTA.

The basic idea is that current IACT data consists of "runs" or "observations"
with a given ``OBS_ID``, and that for each observation there is one ``EVENTS``
HDU and several IRF FITS HDUs that contain everything needed to analyse that
data.

A second idea is that with H.E.S.S. we export all data to FITS, so we have many
thousands of observations, and users will usually need to do a run selection,
e.g. by sky position or observation time. They want to do that in an efficient
way that doesn't require globbing for thousands of files and opening up the
FITS headers to find out what data is present.

There are two index tables:

.. toctree::
    :maxdepth: 1

    obs_index/index
    hdu_index/index
@@ -37,9 +36,10 @@ Science tools can make use of these index files to build filenames of required
files according to some user parameters.
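
As an illustration, a minimal sketch of such a run selection with ``astropy``;
the observation index file name, the HDU name ``OBS_INDEX`` and the target
position are made up for the example:

.. code-block:: python

    import astropy.units as u
    from astropy.coordinates import SkyCoord
    from astropy.table import Table

    # Hypothetical observation index file; the HDU name "OBS_INDEX" is an assumption
    obs_index = Table.read("obs-index.fits.gz", hdu="OBS_INDEX")

    # Select runs pointing within 3 deg of the Crab Nebula
    target = SkyCoord(83.633 * u.deg, 22.014 * u.deg)
    pointing = SkyCoord(obs_index["RA_PNT"], obs_index["DEC_PNT"], unit="deg")
    selected = obs_index[pointing.separation(target) < 3 * u.deg]

    print(selected["OBS_ID", "RA_PNT", "DEC_PNT", "TSTART", "TSTOP"])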

Note that the HDU index table would be superfluous if IRFs were always bundled
in the same file with ``EVENTS`` and if the observation index table contained
the location of that file. For HESS this wasn't done initially, because the
background IRFs were large in size and re-used for many runs. The level of
indirection that the HDU index table offers makes it possible to support both
IRFs bundled with ``EVENTS`` ("per-run IRFs" as used in HESS) and a global
lookup database of IRFs located separately from ``EVENTS`` (sometimes called a
CALDB), as used for the CTA first data challenge.
60 changes: 29 additions & 31 deletions source/data_storage/obs_index/index.rst
@@ -24,6 +24,19 @@ Required columns
* Observation pointing right ascension (see :ref:`coords-radec`)
* ``DEC_PNT`` type: float, unit: deg
* Observation pointing declination (see :ref:`coords-radec`)
* ``TSTART`` type: float, unit: s
* Start time of observation relative to the reference time (see :ref:`time`)
* ``TSTOP`` type: float, unit: s
* End time of observation relative to the reference time (see :ref:`time`)
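
The ``TSTART`` / ``TSTOP`` values are relative to the reference time defined in
the header. As a short illustration (not a normative recipe; the keyword names
defined in :ref:`time` are assumed), a tool could convert them to absolute
times like this:

.. code-block:: python

    import astropy.units as u
    from astropy.time import Time, TimeDelta

    def tstart_to_time(tstart_s, mjdrefi, mjdreff, timesys="tt"):
        """Convert TSTART (seconds since the reference time) to an absolute time.

        MJDREFI / MJDREFF / TIMESYS are read from the observation index header.
        """
        t_ref = Time(mjdrefi, mjdreff, format="mjd", scale=timesys.lower())
        return t_ref + TimeDelta(tstart_s * u.s)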

.. _obs-index-optional-columns:

Optional columns
----------------

The following columns are optional. They are sometimes used for observation
selection, data quality checks or analysis, but aren't needed by most users.

* ``ZEN_PNT`` type: float, unit: deg
* Observation pointing zenith angle at observation mid-time ``TMID`` (see :ref:`coords-altaz`)
* ``ALT_PNT`` float, deg
@@ -39,10 +52,6 @@ Required columns
* Dead time correction.
* It is defined such that ``LIVETIME`` = ``DEADC`` * ``ONTIME``,
i.e. ``DEADC`` is the fraction of time the telescope was actually able to take data.
* ``DATE-OBS`` type: string
* Observation start date (see :ref:`time`)
* ``TIME-OBS`` type: string
@@ -61,15 +70,6 @@ Required columns
* 0 = best quality, suitable for spectral analysis.
* 1 = medium quality, suitable for detection, but not spectra (typically if the atmosphere was hazy).
* 2 = bad quality, usually not to be used for analysis.

* ``OBJECT`` type: string
* Primary target of the observation
* Recommendations:
@@ -132,30 +132,28 @@ Mandatory Header keywords
-------------------------

The standard FITS reference time header keywords should be used (see :ref:`time-formats`).
An observatory Earth location should be given as well (see :ref:`coords-location`).
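
For orientation, a sketch of what such header keywords might look like; the
keyword names follow :ref:`time-formats` and :ref:`coords-location` as we read
them, and the values are made up for illustration:

.. code-block:: python

    from astropy.io import fits

    header = fits.Header()
    # Reference time (values are illustrative only)
    header["MJDREFI"] = 51544        # integer part of the reference MJD
    header["MJDREFF"] = 0.5          # fractional part of the reference MJD
    header["TIMEUNIT"] = "s"
    header["TIMESYS"] = "TT"
    # Observatory Earth location (values are illustrative only)
    header["GEOLON"] = 16.5          # longitude in deg
    header["GEOLAT"] = -23.27        # latitude in deg
    header["ALTITUDE"] = 1835.0      # altitude in m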

.. _obs-index-notes:

Notes
-----

* Some of the required columns are redundant. E.g. ``ONTIME`` = ``TSTOP`` - ``TSTART``.
The motivation to declare those columns required is to make it easy for users
and tools to browse the observation lists and select observations via cuts
on these parameters without having to compute them on the fly.
* Observation runs where the telescopes don't point to a fixed RA / DEC position
(e.g. drift scan runs) aren't supported at the moment by this format.
* Purpose / definition of ``BKG_SCALE``:
For a 3D likelihood analysis a good estimate of the background is important.
The run-by-run variation of the background rate is ~20%, mainly due to
changing atmospheric conditions. This parameter allows specifying (from
separate studies) a scaling factor to the :ref:`bkg`. This factor comes e.g.
from the analysis of off runs. The background normalisation usually depends
on e.g. the number of events in a run, the zenith angle and other parameters.
This parameter makes it possible to give the user a better prediction of the
background normalisation. For CTA this might be derived from atmospheric
monitoring and additional diagnostic input. For HESS we try to find a trend
in the off-run background normalisations and other parameters such as the
number of events per unit livetime. The background scale should be around 1.0
if the background model is good. This number should also be set to 1.0 if no
dependency analysis has been performed. If the background model normalisation
is off by a few orders of magnitude for some reason, this can also be
incorporated here, as sketched below.
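
A minimal sketch (an assumption about how a science tool might use this column,
not something mandated by this spec):

.. code-block:: python

    def predicted_background(bkg_model_counts, bkg_scale=1.0):
        """Scale the background-model prediction for one observation.

        bkg_model_counts : counts predicted from the background IRF for this run
        bkg_scale        : value of the BKG_SCALE column for this OBS_ID
                           (use 1.0 if the column is absent)
        """
        return bkg_scale * bkg_model_counts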
