\chapter{Specific Requirements}
\section{External Interfaces}
\subsection{Log entries}
\begin{enumerate}
\item In the current logbook it is not always easy to extract information about a subsystem; obtaining the needed information sometimes still has to be done by hand. Exporting data is also not easy because there is no standard export format. A ROOT file format would come in handy here.\footnote{From: Robert Helmut Munzer}
\item The End Of Shift (EOS) report is often not easily accessible. Shifters type a file in their own way, using their own words and their own order, and sometimes forget things. One idea is to parse this text; another is for the system to ask whether a subsystem should be mentioned. A template for the EOS report is yet another option.\footnote{From: Robert Helmut Munzer}
An on-call expert should use a template per category of issue. Such a template could use questions and contain mandatory information, perhaps based on information that is already collected on the fly. Uniform reports help a lot in gathering information quickly.\footnote{From: Robert Helmut Munzer}
\item Following up on or tracking a specific issue should happen in the logbook rather than in JIRA, as is currently the case. The new system should have the functionality to add comments and edit them. But never, never erase anything!
\item The gas log entries are for gas technicians. This is a general CERN service.\footnote{Roberto Divia}
\item There should be log entries for other users.\footnote{Roberto Divia}
\item In the logbook there should be 3 or 4 roles.\footnote{Roberto Divia}
\item Each role, user or specific person should have read, write or administration access.\footnote{Roberto Divia}
\end{enumerate}
\section{Functional requirements}
\subsection{Operational teams}
\begin{itemize}
\item Run Coordination
\item Technical Coordination
\item Detector experts
\item O2 experts
\item Trigger
\end{itemize}
\subsection{Subsystem actors}
ALICE consists of several subsystems, all of which are able to create log entries.
A subsystem makes an entry, consisting of several items, into the database.
\subsection{User stories}
\subsection{User}
\begin{enumerate}
\item As a user, I want to have a smart editor to create my log entries (WYSIWYG or Markup) and be able to use smart text so that messages look nice (e.g. links, code, …)
\item As a user, I want to attach files to log entries so that I can add additional non-textual information
\item As a user, I want to reply to existing log messages so that a conversation stays in a well-defined thread
\item As a user, I want to list log entries in a summary view so that I can get an overview of what happened in a given period.
\item As a user, I want to list log entries in a detailed view so that I can read them one after the other.
\item As a user, I want to search log entries by different criteria (e.g. title, content, author, creation date, …) and have the results listed.
\item As a user, I want to browse through all the available metadata associated with a given run to understand on which conditions the run was made.
\item As a user, I want to list all runs that match given criteria to create my own run set.
\item As a user, I want to see in a dashboard the metadata associated with an LHC Fill so that I can have a global image of what happened during that LHC Fill.
\item As a user, I want to be able to login with my CERN credentials to avoid having to remember a new set of credentials. This should be done by using the CERN authentication method.
\item As a user, I want to be able to customize dashboards so that I only see the fields relevant to me.
\item As a user, I want to be able to save search criteria for later use so that I don’t lose time defining them at each visit.
\item As an ALICE member, I would like to receive via email a global summary of each LHC Fill in order to follow ALICE operations without visiting the bookkeeping tools. Currently in the ALICE logbook I like that I receive via email a document with info on efficiency and EOR reasons, and that in the body of the email there is a summary for each fill (Vasco Chibante Barroso).
\item As an ALICE collaborator I have to create statistics reports such as number of runs, quantity of data, number of events, summaries by trigger classes etc... These reports will use selection criteria I specify, such as time spans, active systems (e.g. only the runs including my particular system), run type etc...
\item As an ALICE collaborator I may have to open multiple GUIs with independent selection criteria (e.g. one browser window for day-to-day work and a second browser window for statistics) (Roberto Divia).
\item As an ALICE collaborator I need APIs to retrieve specific fields in the Logbook using selection criteria I provide. This access must be restricted, probably API by API or by the fields being accessed, and protected against intentional or unintentional DoS attacks; a sketch of such an endpoint follows this list (Roberto Divia).
\item As an ALICE collaborator I need to be able to access the Logbook in a run-per-run summary view (possibly using selection criteria I specify) and in a log entry by log entry view (possibly using selection criteria I specify) (Roberto Divia).
\item As an ALICE collaborator I need to check the details of any run: EOR reason, statistics, log entries (Roberto Divia).
\item People can create issues (Pierre vanden Vyvre)
\end{enumerate}
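The API access mentioned above could, for example, combine per-token authorization with a simple rate limit. The following is a minimal, self-contained TypeScript sketch; the endpoint shape, criteria fields, token handling and limits are assumptions for illustration, not the project's actual API.
\begin{verbatim}
// Sketch of a criteria-based query handler with access restriction and a
// per-token rate limit as a simple guard against (un)intentional DoS.
interface RunCriteria { from?: string; to?: string; detector?: string; runType?: string; }
interface RunFields { runNumber: number; eorReason?: string; }

const WINDOW_MS = 60_000;   // assumed policy: at most 30 requests per minute per token
const MAX_REQUESTS = 30;
const usage = new Map<string, { count: number; windowStart: number }>();

function allowRequest(token: string, now = Date.now()): boolean {
  const entry = usage.get(token) ?? { count: 0, windowStart: now };
  if (now - entry.windowStart > WINDOW_MS) { entry.count = 0; entry.windowStart = now; }
  entry.count += 1;
  usage.set(token, entry);
  return entry.count <= MAX_REQUESTS;
}

// Placeholder for the real data-access layer (assumed helper).
function queryRuns(criteria: RunCriteria): RunFields[] { return []; }

// The handler a REST layer would call for e.g. GET /api/runs?detector=TPC
function handleRunsQuery(token: string, authorizedTokens: Set<string>,
                         criteria: RunCriteria): { status: number; body: unknown } {
  if (!authorizedTokens.has(token)) return { status: 403, body: { error: 'not authorized' } };
  if (!allowRequest(token)) return { status: 429, body: { error: 'rate limit exceeded' } };
  return { status: 200, body: queryRuns(criteria) };
}
\end{verbatim}
In a real deployment the token check would be replaced by the CERN authentication mechanism chosen for the project, and the criteria would map onto the actual run metadata fields.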
\subsection{Shifter}
\begin{enumerate}
\item A shifter makes an entry into the database consisting of several items.
\item As a shifter, I want to have templates that prefill most of my end-of-shift reports from the available metadata so that I don't need to fill in myself what the system already knows; see the sketch after this list (Vasco Chibante Barroso).
\item As a Shifter, I would like to have templates that automatically compile and format the data available in the system in order to write my end of shift report in a fast and uniform way. Currently in the ALICE logbook I don't like that I need to compile all the information myself and that not all shifters use the same structure (Vasco Chibante Barroso).
\item As a shifter I want to view log entries.
\item As a shifter I want to be able to create log entries.
\item As a shifter I want to view announcements.
\item As a shifter I want to view on call interventions.
\item As a shifter I want to view statistics of runs and related information.
\item As a shifter I want to view data about calibration of the detector.
\item As a shifter I want to be able to have a big screen view.
\item As a shifter I want to view data about the fill.
\item As a shifter I would like to have templates that automatically compile and format the data available in the system in order to write my end of shift report in a fast and uniform way. Currently in the ALICE logbook I don't like that I need to compile all the information myself and that not all shifters use the same structure (Roberto Divia).
\item As a shifter I have to create log entries concerning any system (alone or in combination) (Roberto Divia).
\item As a shifter I may have to cross-reference log entries (e.g. by URL, by unique Reference ID, or by run number) (Roberto Divia).
\item As a shifter I may need to attach files to log entries. These files may contain text or binary information (PNGs, JPGs etc...) (Roberto Divia).
\item As a shifter I may need to cross-reference log entries or other logbook fields (e.g. run numbers, fill numbers etc...) with whatever issue tracking system will be used by the ALICE collaboration (today: JIRA). This association may also be done automatically by daemons (e.g. what is done today for EOR reasons and Jira tickets) (Roberto Divia).
\end{enumerate}
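The template user stories above could be served by prefilling a report skeleton from the run and fill metadata the system already holds. The TypeScript sketch below is illustrative only; the metadata fields and report layout are assumptions, not the actual shift data model.
\begin{verbatim}
// Prefill an end-of-shift (EOS) report from metadata already in the system,
// leaving only the free-text remarks to the shifter.
interface RunSummary { runNumber: number; durationMin: number; eorReason?: string; }
interface ShiftMetadata {
  shifter: string;
  start: Date;
  end: Date;
  runs: RunSummary[];
  fillNumber?: number;
}

function prefillEosReport(m: ShiftMetadata): string {
  const runLines = m.runs
    .map(r => `  - Run ${r.runNumber}: ${r.durationMin} min` +
              (r.eorReason ? ` (EOR reason: ${r.eorReason})` : ''))
    .join('\n');
  return [
    `End-of-shift report - ${m.shifter}`,
    `Shift: ${m.start.toISOString()} to ${m.end.toISOString()}`,
    m.fillNumber !== undefined ? `LHC fill: ${m.fillNumber}` : 'LHC fill: none',
    `Runs taken (${m.runs.length}):`,
    runLines || '  (no runs)',
    '',
    'Remarks (to be completed by the shifter):',
    '',
  ].join('\n');
}
\end{verbatim}
A uniform skeleton of this kind also makes the resulting reports easier to parse later, which addresses the uniformity concern raised in the external-interface requirements.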
\subsection{Run Coordinator}
\begin{enumerate}
\item As run co\"ordinator, I want shifters to use templates so that it is easier and faster to read them (Vasco Chibante Barroso).
\item As run co\"ordinator, I want to attach tags to runs so that I can then use them while searching (Vasco Chibante Barroso).
\item As run co\"ordinator, I want to edit certain specialized fields associated to a run (e.g. EOR Reason) so that I correct wrong information inserted by the $O^2$ software (Vasco Chibante Barroso).
\item As run coordinator, I want to specify acquisition targets for certain time periods and check how far we are in achieving them so that I can keep track of progress (Vasco Chibante Barroso).
\item As run co\"ordinator I need to gather statistics on the runs selected by using custom rules (timestamps, run numbers, run types, included detectors etc...). These statistics will include EOR reasons, per-detector and per-system summaries, error recovery (PARs) rates etc... (Roberto Divia)
\item As run co\"ordinator I have to create Logbook entries that cover almost all the Systems (e.g. global announcements or minutes) (Roberto Divia).
\item As run co\"ordinator I have to create log entries concerning any system (alone or in combination) (Roberto Divia).
\item As run co\"ordinator I need to be able to update the logbook information for what concerns subsystems, in particular the run quality flag and the EOR reason(s). The question arises if subsystem run co\"ordianators can update information associated to other systems (e.g. EOR reasons) as it is the case today (Roberto Divia).
\item As run co\"ordinator I must be able to move collaborators to and out of subsystem teams. These action may be conflict the information stored in SAMS (Roberto Divia).
\item As run co\"ordinator I may request to receive automatic e-mails concerning all Logbook entries that include all systems (either without distinction or using special selection criterias). The e-mail delivery address will probably be an e-group (single e-mail address <...>@cern.ch) (Roberto Divia).
\item As run co\"ordinator access to Logbook actions restricted to my role should be granted without external interventions and for the time span of my duties (e.g. for shifters the shifts before and after mine, plus my own shift) (Roberto Divia).
\item As run co\"ordinator I need to give ALICE collaborators write or read-only access to the logbook. These rights will be superseeded by equivalent rights given according to the function of the user (e.g. a ALICE collaborator with read-only access will be given write access during the time of his/her duties as a shifter, subsystem run co\"ordinator or system team member) (Roberto Divia).
\item As run co\"ordinator I may have to cross-reference log entries (e.g. by URL, by unique Reference ID, or by run number) (Roberto Divia).
\item As run co\"ordinator I may need to attach files to log entries. These files may contain text or binary information (PNGs, JPGs etc...) (Roberto Divia).
\item As run co\"ordinator I may need to cross reference log entries or other logbook fields (e.g. run numbers, fill numbers etc...) with whatever issue tracking system will be used by the ALICE collaboration (today: Jira). This association may also be done automatically by daemons (e.g. what is done today for EOR reasons and Jira tickets) (Roberto Divia).
\end{enumerate}
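For the acquisition-target story, a progress check can be a simple aggregation over the runs in the target period. The following is a minimal TypeScript sketch under assumed field names (targets expressed as a number of events); the real definition of a target is still to be agreed.
\begin{verbatim}
// Percentage of an acquisition target achieved by the runs taken in its period.
interface AcquisitionTarget { label: string; from: Date; to: Date; targetEvents: number; }
interface RunRecord { start: Date; events: number; }

function targetProgress(target: AcquisitionTarget, runs: RunRecord[]): number {
  const acquired = runs
    .filter(r => r.start.getTime() >= target.from.getTime() &&
                 r.start.getTime() <= target.to.getTime())
    .reduce((sum, r) => sum + r.events, 0);
  return Math.min(100, (100 * acquired) / target.targetEvents);
}
\end{verbatim}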
\subsection{Subsystem Run Coordinator}
Each subsystem has its own Subsystem Run Coordinator (SRC).
\begin{enumerate}
\item As a subsystem responsible, I want to be notified by email (or other channels) of log entries which are related to my subsystem so that I can better follow up activities without having to constantly visit the product, e.g. the EOS report; see the sketch after this list (Robert Munzer) (Vasco Chibante Barroso).
\item As a subsystem expert, I want to attach quality flags to runs so that physicists can use them while searching for good data sets for their analysis (Vasco Chibante Barroso).
\item As a subsystem expert, I want to store custom fields that are only relevant to my subsystem so that I can correlate them with the rest of the metadata repository (e.g. ‘fetch all runs with configuration X where this happened to my detector’) (Vasco Chibante Barroso).
\item As a detector expert I would like to be able to extract run/fill information in a format that allows easier analysis than text files, e.g. ROOT files, so that I can do specific statistical analyses (Robert Munzer).
\item As an SRC I would like to be able to create my own detector-specific templates, for example for on-call interventions. In this case I can specify the relevant information which is required from the on-call shifter for different kinds of ``standard'' events (Robert Munzer).
\item As the ECS/DAQ System Run Coordinator I need a way to access information on runs matching selection criteria I specify (timestamps, run numbers, run types, included detectors etc...). Navigation between runs must be easy and quick. The target is to check the global runs (production and tests) for quality and errors (Roberto Divia).
\item As a System Run Coordinator I need ways to interrogate all the runs in which the System I am responsible for participated, and to get access to individual run entries and to summary statistics (Roberto Divia).
\item As subsystem run co\"ordinator I have to create log entries concerning any system (alone or in combination) (Roberto Divia).
\item As subsystem run co\"ordinator I need to be able to update the logbook information for what concerns my system and other systems, in particular the run quality flag and the EOR reason(s). The question arises if subsystem run co\"ordinators can update information associated to other systems (e.g. EOR reasons) as it is the case today (Roberto Divia).
\item As subsystem run co\"ordinator I must be able to move collaborators to and out of subsystem teams. These action may be conflict the information stored in SAMS (Roberto Divia).
\item As subsystem run co\"ordinator I may request to receive automatic e-mails concerning all Logbook entries that include the System I am working for (either without distinction or using special selection criterias). The e-mail delivery address will probably be an e-group (single e-mail address <...>@cern.ch) (Roberto Divia).
\item As subsystem run co\"ordinator access to Logbook actions restricted to my role should be granted without external interventions and for the time span of my duties (e.g. for shifters the shifts before and after mine, plus my own shift) (Roberto Divia).
\item As subsystem run co\"ordinator I need to give ALICE collaborators write or read-only access to the logbook. These rights will be superseeded by equivalent rights given according to the function of the user (e.g. a ALICE collaborator with read-only access will be given write access during the time of his/her duties as a shifter, subsystem run co\"ordinator or system team member) (Roberto Divia).
\item As subsystem run co\"ordinator I may have to cross-reference log entries (e.g. by URL, by unique Reference ID, or by run number) (Roberto Divia).
\item As subsystem run co\"ordinator I may need to attach files to log entries. These files may contain text or binary information (PNGs, JPGs etc...) (Roberto Divia).
\item As subsystem run co\"ordinator I may need to cross reference log entries or other logbook fields (e.g. run numbers, fill numbers etc...) with whatever issue tracking system will be used by the ALICE collaboration (today: Jira). This association may also be done automatically by daemons (e.g. what is done today for EOR reasons and Jira tickets) (Roberto Divia).
\item The subsystem co\"ordinator wants to be reported when something is going with his system. He should not have to take action for himself to find out things (Robert Helmut Munzer).
\end{enumerate}
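The notification story above amounts to routing each new log entry to the e-group addresses of the subsystems it is tagged with. A minimal TypeScript sketch follows; the subsystem tags and e-group addresses are made-up placeholders, not real CERN e-groups.
\begin{verbatim}
// Route a newly created log entry to the e-group addresses of the subsystems
// it is tagged with, so subsystem responsibles are notified by email.
interface LogEntry { id: number; title: string; subsystems: string[]; }

const subsystemEgroups = new Map<string, string>([
  ['TPC', 'alice-tpc-oncall@cern.ch'],   // placeholder address, not a real e-group
  ['ITS', 'alice-its-oncall@cern.ch'],
]);

function notificationRecipients(entry: LogEntry): string[] {
  return entry.subsystems
    .map(s => subsystemEgroups.get(s))
    .filter((addr): addr is string => addr !== undefined);
}

// e.g. notificationRecipients({ id: 1, title: 'HV trip', subsystems: ['TPC'] })
//   -> ['alice-tpc-oncall@cern.ch']
\end{verbatim}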
\subsection{On Call expert}
\begin{enumerate}
\item A person who is called for a specific intervention makes an entry into the log system.
\end{enumerate}
\subsection{System team member}
\begin{enumerate}
\item As a system team member I have to create log entries concerning the system I am responsible for (alone or in combination) (Roberto Divia).
\item As a system team member I may have to cross-reference log entries (e.g. by URL, by unique Reference ID, or by run number) (Roberto Divia).
\item As a system team member I may need to attach files to log entries. These files may contain text or binary information (PNGs, JPGs etc...) (Roberto Divia).
\item As a system team member I may need to cross-reference log entries or other logbook fields (e.g. run numbers, fill numbers etc...) with whatever issue tracking system will be used by the ALICE collaboration (today: Jira). This association may also be done automatically by daemons (e.g. what is done today for EOR reasons and Jira tickets) (Roberto Divia).
\end{enumerate}
\subsection{Physics community}
The Physics Board has several needs or questions:
\begin{enumerate}
\item To make planning possible, an overview of storage and processing power (CPU) is needed.
\item The use of resources per user to run jobs could be more detailed.
\item How many PB are available on disk for storage?
\item For MC storage the information is fine grained, but an overview is lacking.
\item When I want to clean up, where do I have to look?
\item MC production requests.
\item Usage statistics (which data is popular?).
\item Sort out why a train takes a specific time to process.
\end{enumerate}
Most data is replicated because many people use it.
There are two views from the Physics Board:
\begin{itemize}
\item clean up, to know what could be cleaned up
\item planning, when can this MC be run?
\end{itemize}
There is a need for reports on resource usage. This is a management view.
\subsection{ALICE management}
\begin{enumerate}
\item Each week global and specific statistics about the system are needed
\begin{itemize}
\item CPU usage
\item data storage
\item etc.
\end{itemize}
\item evolution of the queue between synchronous and asynchronous computing
\item logbook images of statistics
\item statistics or number of issues per project and detector
\item daily report about data taking and data processing for the meeting at Point 2.
\begin{itemize}
\item how long has data been taken?
\item what are the issues?
\end{itemize}
\item As a manager I want to know whether all the relevant people are involved with respect to an issue (Pierre vanden Vyvre).
\end{enumerate}
\subsection{CERN administration officer}
\begin{itemize}
\item As a CERN administration officer I need to check all the on-call intervention records issued by CERN personnel (a use case to be cross-checked with EP-AID-DA management) (Roberto Divia).
\end{itemize}
\subsection{Developer}
\begin{enumerate}
\item As a developer, I want to programmatically fetch log entries that match given criteria so that I can build custom logic or applications based on existing data; a sketch of such a fetch follows this list (Vasco Chibante Barroso).
\item As a developer, I want to programmatically fetch runs that match given criteria so that I can build custom logic or applications based on existing data (Vasco Chibante Barroso).
\end{enumerate}
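As an illustration of the programmatic access wished for above, a client could query a REST endpoint with its criteria encoded as query parameters. The endpoint path, query parameters and response shape in this TypeScript sketch are assumptions; the actual API remains to be defined.
\begin{verbatim}
// Fetch log entries matching the given criteria from an assumed REST endpoint.
interface LogEntryDto { id: number; title: string; author: string; createdAt: string; }

async function fetchLogEntries(
  baseUrl: string,
  criteria: { author?: string; from?: string; to?: string; text?: string },
): Promise<LogEntryDto[]> {
  const params = new URLSearchParams(
    Object.entries(criteria).filter(([, v]) => v !== undefined) as [string, string][],
  );
  const response = await fetch(`${baseUrl}/api/logs?${params}`);
  if (!response.ok) throw new Error(`request failed: ${response.status}`);
  return (await response.json()) as LogEntryDto[];
}

// e.g. await fetchLogEntries('https://bookkeeping.example.cern.ch',
//                            { author: 'shifter', from: '2018-01-01' });
\end{verbatim}
The same pattern applies to fetching runs; only the endpoint and the criteria fields differ.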
\subsection{Administrator}
\begin{enumerate}
\item System administrators can create an announcement.
\item As an administrator, I want to have a dashboard that gives me log-entry related analytics so that I can follow the evolution of the repository (Vasco Chibante Barroso).
\item As an Administrator I need to be able to update the logbook information concerning the bookkeeping system and other systems, in particular the run quality flag and the EOR reason(s). The question arises whether subsystem run co\"ordinators can update information associated with other systems (e.g. EOR reasons), as is the case today (Roberto Divia).
\item As an administrator I must be able to move collaborators into and out of subsystem teams. These actions may conflict with the information stored in SAMS (Roberto Divia).
\item As an administrator, access to Logbook actions restricted to my role should be granted without external intervention and for the time span of my duties (e.g. for shifters the shifts before and after mine, plus my own shift) (Roberto Divia).
\item Only administrators may be given the possibility to remove log entries (and I am not even sure about this) (Roberto Divia).
\item As an administrator I must be able to trigger the migration of the whole system (server, databases etc...) to alternate sites to cover scenarios such as HW failures or long power interruptions (e.g. the Xmas shutdown). This migration should be as automatic as possible, particularly for the first example above (hot stand-by?) (Roberto Divia).
\item As an administrator I may request to replicate either selected portions or all of the Logbook data to external sites and to provide adequate access tools for it (to facilitate read-only access) (Roberto Divia).
\item As an administrator I must be able to configure the system.
\end{enumerate}
\subsection{Data Preparation Group}
\begin{enumerate}
\item Make different views:
\begin{itemize}
\item shifter
\item analyser
\item detector expert
\end{itemize}
\end{enumerate}
As the DPG for raw data and MC processing we need to:
\begin{enumerate}
\item Access information about runs to be processed
\begin{itemize}
\item Detectors present in acquisition, special conditions/triggers, issues
spotted online
\end{itemize}
\item Track the steps for the preparation of the production
\item Keep track of the status of a given production
\item Keep track of information about raw data reconstruction passes
\begin{itemize}
\item Software version used, calibrations used, known bugs/missing features
\end{itemize}
\item Keep track of MC productions anchored to a given run/period
\begin{itemize}
\item Characteristics of the MC production (event generator, statistics...)
\item Which reconstruction pass the MC was anchored to
\end{itemize}
\item Link each production to its QA results (comments / plots) per run and per period
\begin{itemize}
\item Ideally with the possibility of comparing MC QA plots to the corresponding raw data
\end{itemize}
\item Provide lists of good runs to be used for analysis
\begin{itemize}
\item Lists are analysis-dependent because of different requirements on detectors/triggers ....
\end{itemize}
\end{enumerate}
DPG-QA would like to have:
\begin{enumerate}
\item an automatic tool to ask each subsystem run coordinator for validation,
\item something so that they do not have to manually collect good runs to create a run list,
\item one page for derived data,
\item access different information from a single webpage,
\item in the RCT access to plots and other info,
\item a representation of wall time,
\item the links of the MC which are anchored to a specific run,
\item easy access to data with an API to compare trending plots and correlation plots,
\item notification of alarms,
\item aggregation of data for users,
\item have ONE single entry point for all information, which properly links a specific run to ALL information needed (see the sketch after this list):
\begin{itemize}
\item logbook extracted information
\item all detectors' QA plots
\item the production/QA jira ticket
\item look/browse the QAresults.root file
\end{itemize}
\item cross checks of QA results/anomalies in different detectors, e.g. check ITS QA variables (e.g. occupancy or matching efficiency) against (or close to...) interaction rate from EVS QA
\item easy comparison of data vs. MC
\end{enumerate}
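A single entry point of the kind asked for above would essentially return one aggregate record per run. The TypeScript interface below sketches what such a record could contain; all field names, URLs and values are illustrative assumptions.
\begin{verbatim}
// Per-run aggregate behind a single entry point; field names are placeholders.
interface RunOverview {
  runNumber: number;
  logbookSummary: string;                      // extracted logbook information
  detectorQaPlots: Record<string, string[]>;   // detector name -> QA plot URLs
  productionJiraTicket?: string;               // link to the production/QA Jira ticket
  qaResultsFile?: string;                      // link to browse the QAresults.root file
  anchoredMcProductions: string[];             // MC productions anchored to this run
}

// Example (made-up values):
const example: RunOverview = {
  runNumber: 123456,
  logbookSummary: 'Stable beams, no major issues',
  detectorQaPlots: { ITS: ['https://qa.example.cern.ch/its/occupancy.png'] },
  productionJiraTicket: 'https://jira.example.cern.ch/browse/EXAMPLE-1',
  qaResultsFile: 'https://qa.example.cern.ch/123456/QAresults.root',
  anchoredMcProductions: ['LHC18x1'],
};
\end{verbatim}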
\subsection{Gas technician}
\begin{enumerate}
\item As a gas technician I want to create log entries when I have delivered gas or other substances at Point 2.
\end{enumerate}
\subsection{Observer}
\begin{enumerate}
\item As an observer I want to be able to look at the bookkeeping without being able to add or manipulate data.
\end{enumerate}
\section{Software System Attributes}
The software system attributes, system quality attributes or non-functional requirements are not yet fully determined.
\subsection{Development stack}
The $O^2$ user interaction framework consists of various aspects, which are described at https://\-github.com/\-AliceO2Group/WebUi.
\subsection{Availability}
Given the criticality of the accounting data, the repositories should run in high availability. The views do not strictly need high availability as long as failures remain rare, downtime stays low and operations are not impacted.
\begin{itemize}
\item Every activity should be logged
\item Bookkeeping should not be the reason for EOR
\item GUI is not critical
\item Repository is critical
\item Database should be highly available
\end{itemize}
There are some disaster scenarios worth mentioning. One such disaster is a failure of the platform hosting the logbook, which could mean an EOR. There is a redundant database for all the data, and the logbook is always active. When the power is off (during Christmas) the logbook is transferred to CERN 1.\footnote{Roberto Divia}
\subsection{Backup}
Backups of the repositories are mandatory. CERN facilities such as the tape system can be used to store the backups. For disaster recovery, a remote copy might make sense.
\begin{itemize}
\item A backup is made once a day
\item Redundancy should be considered
\end{itemize}
\subsection{Configuration management}
Recipes for $O^2$ adopted configuration management tools (most likely Puppet or Ansible) are mandatory in order to allow deployment and configuration changes without intervention of main developers.
\subsection{Documentation}
User documentation is needed for complex actions, administrator documentation for system administrators and the on-call support crew during operations, and developer documentation for newcomers, long-term maintainability and in case of a handover.
\begin{itemize}
\item Use pop-ups (contextual help)
\end{itemize}
\subsection{Evolution}
Changes are frequent due to new requirements, changes in workflows and operational optimizations. New developments should be expected during the full lifetime of the software. The software also needs to be supported until the end of Run 4 (currently 2029). SLAs should be agreed on between AUAS and ALICE $O^2$.
\subsection{Guidelines}
Software code should follow $O^2$ guidelines and software processes. Several guidelines are available:
\begin{table}
\begin{tabular}{lp{7cm}}
\hline
Subject & Location (relative to https://github.com/AliceO2Group/)\\
\hline
\hline
C++ & CodingGuidelines\\
\hline
Web development & Gui/blob/master/docs/DEV.md\\
\hline
Build system & AliceO2\\
\hline
\end{tabular}
\end{table}
\subsection{Interoperability}
Software needs to integrate with the following systems:
\begin{itemize}
\item Configuration Management
\item Monitoring (will become clear in February 2018)
\item Logging of the system (Logstash, Elasticsearch, etc.) and of the applications in use.
\item Build system
\item O2
\item Grid operations (AliEn, LPM)
\item AliBuild, which is extensively documented
\item AliEn and LPM need to be changed
\end{itemize}
\subsection{Licensing}
Software needs to be compatible with the $O^2$ project licensing guidelines (GPLv3; the copyright owners are all participating institutes).
The license is available here.
\subsection{Performance}
The repository should handle the load from processes running in the O2 farm and jobs running on the grid. The web server should handle the load from human visitors (should not be very high) and the load from programmatic access via the REST API. The numbers for performance of the elements mentioned are to be decided in due time. Web GUI response time should follow accepted guidelines and the acceptable latency should conform to common standards.
\subsection{Platform compatibility}
The software should be compatible with the $O^2$ supported platforms. Server components will run in the $O^2$ facility at Point 2. The operating system will most likely be CentOS 7 or a similar Linux distribution. Programmatic APIs should run on all target platforms (Linux, macOS). The GUI should run on major browsers (Firefox, Chrome, Safari); Microsoft Edge is to be checked.
\subsection{Security}
There should be integration with the CERN Single Sign-On service, with roles based on CERN e-groups. Users are granted access by use of certificates; this is handled locally.
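The exact SSO integration is not yet defined. As a hedged illustration only, once the SSO layer has authenticated a user and exposed his or her e-group memberships (for example as a claim in the token or session), the application could map those e-groups to its own roles. All group and role names in this TypeScript sketch are placeholders.
\begin{verbatim}
// Map CERN e-group memberships (provided by the SSO layer) to application roles.
type Role = 'admin' | 'shifter' | 'observer';

const egroupToRole = new Map<string, Role>([
  ['alice-o2-admins', 'admin'],          // placeholder e-group names
  ['alice-shifters', 'shifter'],
  ['alice-collaborators', 'observer'],
]);

function rolesFromEgroups(egroups: string[]): Role[] {
  return egroups
    .map(g => egroupToRole.get(g))
    .filter((r): r is Role => r !== undefined);
}

// e.g. rolesFromEgroups(['alice-shifters']) -> ['shifter']
\end{verbatim}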
\subsection{Serviceability}
Support crews should be able to independently diagnose problems and either restore the service to nominal conditions or migrate to a new instance. The access model of the developer team to production instances is to be decided.
\subsection{Connectivity}
The system should be able to function correctly without an internet connection.