modify_after_thesis_delivery.txt

Change descriptions of table 
Table B.5: Content of the output Excel Book ’fitted_
model_result.xlsx’ (sheet pp_pintar). One file by dataset
(ISD, ERA5).
(return_levels.csv)


change estaciones by stations (including 01 estaciones.txt)

label of stations text file. mph by kph


return_levels.csv
change descriptions of 10 to 20 including the intensity function and Yn ecuation in NIST report


change code for raw_data_station_id_fitted.csv (I added one column in position 2): Written By

modify the elements of the table repreenting the matriz ZZZ (to be written to fitted_model_result.xlsx)
# 1: id. - Ya
# 2: t_thresh. - Ya 
# 3: t_mu_location - Ya
# 4: t_psi_scale - Ya
# 5: t_average_events_per_year - Ya
# 6: t_average_time_per_year(At) - Ya (Days)
-# 7: t_gamma_Pot-Poisson-GPD (Need to be calculated using Pot-Poisson-GPD)
# 8: nt_thresh - Ya
# 9: nt_mu_location - Ya
# 10: nt_psi_scale - Ya
# 11: nt_average_events_per_year - Ya
# 12: nt_average_time_per_year(Ant) - Ya
-# 13: nt_gamma_Pot-Poisson-GPD (Need to be calculated using Pot-Poisson-GPD)
# 14: distance_w - ya
# 15: station ID - ya
#Thunderstorm ONLY
 # 16-26 10-20: t_MRI_poissonprocessintfunc.
 # 27-37 21-31: t_MRI_gumbeltailintfunc.
 # 38-48 32-42: t_MRI_gumbelquantilefunc.
 -# 49-59: t_MRI_POT-Poisson-GPD. (Need to be calculated using Pot-Poisson-GPD)
#Non-thunderstorm ONLY
 # 60-70 43-53: nt_MRI_poissonprocessintfunc.
 # 71-81 54-64: nt_MRI_gumbeltailintfunc.
 # 82-92 65-75: nt_MRI_gumbelquantilefunc.
 -# 93-103: nt_MRI_POT-Poisson-GPD. (Need to be calculated using Pot-Poisson-GPD)
#Thunderstorm and Non-thunderstorm simultaneously
 # 104-114 76-86: tnt_MRI_poissonprocessintfunc.


change name plot_t.r to rl_plot_t.r

Verify:
- At y Ant are in days for time dependency (t and nt)
- At y Ant are 1 when no time dependency (t or nt)
- Columnns t_gamma_Pot-Poisson-GPD and nt_gamma_Pot-Poisson-GPD are there to do calculations with Pot-Poisson-GPD.
- Put how was calculated nt_average_events_per_year (key procedure in declustering - distance between non thunderstorm events considered same cluster)
- Write in manual the trick for Equivalent POT-Poisson-GPD of POT-PP
- Write in manual: For Yn
this ecuation is the solution of the integral in time and magnitude using the 
Poisson Process Intensity Function equation 16 page 39 NIST Report 500-301 (same equation page 28, but not depending of time)


***After contract in Ais
--- Raw data files need to have 'kph' instead of 'mph' (text files were changed)

Put a summary of the meaning of the variables names in R code (eJ: rl_plot_nt.r)

Review in the document related to As and Ans (365). Etc


La leyenda de la Figura 5.4 no se ve bien. (modifz eracomparisonideam_188_78_plotxts.rds)


Review the meaning of 3D intensity function. delete that graphic??

Review the sentence "To fit the intensity function to the data " in th PDF

It is nos possible to have only "t" in any dataset???. Verify this and put in the manual. If you do no have t and nt classification, put all as nt (do not use all data as t)


I added 20 new columns to fitted_model_result.xlsx with percentajes of probabilities of storm and non-storm related to the calculation of return value/level (Yn) when the dataset is composed of thunderstorm and non-thunderstorm at the same time. First ten columns 115-125 report the percentaje of thunderstorm probability, and second ten columns 126-136 repor the percentaje of non-thunderstorm probability. Whit this colums is possible to know how much percentage probability does storm bring and how much percentage of probability does non-storm bring to calculate the return level Yn. prob = At*integral + Ant * Integral (Yn is the suscript of the integral). First ten columns contain At*integral, and next ten columns contains Ant*integral.

Put in the manual: It is not possible to leave only t in time series. It can be nt only or (t and nt), but not t only.

Put in theoretical framework:
- According to Coles(2001): in case of having available complete time series (no gaps) and values above certain threshold, the POT approach is more efficient than block maxima method

Put in theoretical framewor:
selection of threshold is a trade-off between bias (low threshold) and variance (high threshold)

threshold selection in coles (2001): (a) exploratory approach previous to the actual model fitting and is based on the mean residual life plot, and (b) assess the stability of the parameters estimates using a range of different thresholds

selection of the thresholds is the most important part of extreme valua analysis using partial duration series, see  Scarrott and MacDonald provide a quite good overview of approaches for threshold estimation in their 2012 article A review of extreme value threshold estimation and uncertainty quantification (REVSTAT 10(1): 33–59) https://www.ine.pt/revstat/pdf/rs120102.pdf

The typical POT approach uses the GP distribution

talk about as, ans time units Yn. Dependence of location parameter of time units!

change 'at' and 'ant' by 'as' and 'ans'