From 3a5eeeec601f7f487d37f20666fc577c2a93763e Mon Sep 17 00:00:00 2001
From: Pao Corrales
Date: Wed, 14 Feb 2024 13:08:14 -0300
Subject: [PATCH] General revision of content

---
 content/gsi/01-gsi.qmd                |  26 ++---
 content/gsi/02-convencionals.qmd      |  22 ++---
 content/gsi/03-radiances.qmd          |  18 ++--
 content/gsi/04-diagfiles.qmd          | 135 +++++++++++++-------------
 content/gsi/05-tutorial.qmd           |  22 +++--
 content/observations/01-bufr.qmd      |  76 +++++++--------
 content/observations/02-conv-bufr.qmd |  53 +++++-----
 content/observations/03-rad-bufr.qmd  |  18 ++--
 8 files changed, 181 insertions(+), 189 deletions(-)

diff --git a/content/gsi/01-gsi.qmd b/content/gsi/01-gsi.qmd
index 6a7386e..0f5ed8e 100644
--- a/content/gsi/01-gsi.qmd
+++ b/content/gsi/01-gsi.qmd
@@ -6,13 +6,13 @@ bibliography: references.bib

The GSI (Gridpoint Statistical Interpolation) System, is a state-of-the-art data assimilation system initially developed by the Environmental Modeling Center at NCEP. It was designed as a traditional 3DVAR system applied in the gridpoint space of models to facilitate the implementation of inhomogeneous anisotropic covariances [@wu2002; @purser2003a; @purser2003]. It is designed to run on various computational platforms, create analyses for different numerical forecast models, and remain flexible enough to handle future scientific developments, such as the use of new observation types, improved data selection, and new state variables [@kleist2009].

-The- 3DVAR system replaced the NCEP regional grid-point operational analysis system by the North American Mesoscale Prediction System (NAM) in 2006 and the *Spectral Statistical Interpolation* (SSI) global analysis system used to generate *Global Forecast System* (GFS) initial conditions in 2007 [@kleist2009]. In recent years, GSI has evolved to include various data assimilation techniques for multiple operational applications, including 2DVAR [e.g., the *Real-Time Mesoscale Analysis* (RTMA) system; @pondeca2011], the hybrid EnVar technique (e.g., data assimilation systems for the GFS, the *Rapid Refresh system* (RAP), the NAM, the HWRF, etc. ), and 4DVAR [e.g., the data assimilation system for NASA's Goddard Earth Observing System, version 5 (GEOS-5); @zhu2008]. GSI also includes a hybrid 4D-EnVar approach that is currently used for GFS generation.
+The 3DVAR system replaced the NCEP regional grid-point operational analysis system used by the North American Mesoscale Prediction System (NAM) in 2006 and the *Spectral Statistical Interpolation* (SSI) global analysis system used to generate *Global Forecast System* (GFS) initial conditions in 2007 [@kleist2009]. In recent years, GSI has evolved to include various data assimilation techniques for multiple operational applications, including 2DVAR [e.g., the *Real-Time Mesoscale Analysis* (RTMA) system; @pondeca2011], the hybrid EnVar technique (e.g., data assimilation systems for the GFS, the *Rapid Refresh system* (RAP), the NAM, the HWRF, etc.), and 4DVAR [e.g., the data assimilation system for NASA's Goddard Earth Observing System, version 5 (GEOS-5); @zhu2008]. GSI also includes a hybrid 4D-EnVar approach that is currently used for GFS generation.

In addition to the development of hybrid techniques, GSI allows the use of ensemble assimilation methods. To achieve this, it uses the same observation operator as the variational methods to compare the preliminary field or background with the observations.
In this way the exhaustive quality controls developed for variational methods are also applied in ensemble assimilation methods. The EnKF code was developed by the Earth System Research Lab (ESRL) of the National Oceanic and Atmospheric Administration (NOAA) in collaboration with the scientific community. It contains two different algorithms for calculating the analysis increment, the serial Ensemble Square Root Filter [EnSRF, @whitaker2002] and the LETKF [@hunt2007] contributed by Yoichiro Ota of the Japan Meteorological Agency (JMA).

-To reduce the impact of spurious covariances on the increment applied to the analysis, ensemble systems apply a localization to the covariance matrix of the errors of the observations $R$ in both the horizontal and vertical directions. GSI uses a polynomial of order 5 to reduce the impact of each observation gradually until a limiting distance is reached at which the impact is zero. The vertical location scale is defined in terms of the logarithm of the pressure and the horizontal scale is usually defined in kilometers. These parameters are important in obtaining a good analysis and depend on factors such as the size of the ensemble and the resolution of the model.
+To reduce the impact of spurious covariances on the increment applied to the analysis, ensemble systems apply a localization to the covariance matrix of the errors of the observations $R$ in both the horizontal and vertical directions. GSI uses a polynomial of order 5 to reduce the impact of each observation gradually until a limiting distance is reached at which the impact is zero. The vertical localization scale is defined in terms of the logarithm of the pressure and the horizontal scale is usually defined in kilometers. These parameters are important in obtaining a good analysis and depend on factors such as the size of the ensemble and the resolution of the model.

-GSI uses the Community Radiative Transfer Model [CRTM, @liu2008] as an operator for the radiance observations that calculates the brightness temperature simulated by the model in order to compare it with satellite sensor observations. GSI also implements a bias correction algorithm for the satellite radiance observations. The preliminary field estimate with the CRMT is compared with the radiance observations to obtain the innovation. This innovation is then used to calculate a bias that is applied to an updated innovation. This process can be repeated several times until the innovation and the bias correction coefficients converge.
+GSI uses the Community Radiative Transfer Model [CRTM, @liu2008] as an operator for the radiance observations that calculates the brightness temperature simulated by the model in order to compare it with satellite observations. GSI also implements a bias correction algorithm for the satellite radiance observations. The preliminary field estimated with CRTM is compared with the radiance observations to obtain the innovation. This innovation is then used to calculate a bias that is then applied to an updated innovation. This process can be repeated several times until the innovation and the bias correction coefficients converge.

## Available observations for assimilation

@@ -83,7 +83,7 @@ Here is the list of observations that can be assimilated by GSI.
In bold are the

## Running GSI

-Every assimilation cycle starts with the background, a forecast generated using a numerical model (WRF-ARW for this guide), that was initialized from previous analysis and observations (in bufr format) that enters the GSI system. GSI will also need "fixed" files with information about the observations. This files define which observations are going to be assimilated, they errors and quality control options.
+Every assimilation cycle starts with the background, a forecast generated using a numerical model (WRF-ARW for this guide), that was initialized from previous analysis and observations (in bufr format) that enters the GSI system. GSI will also need "fixed" files with information about the observations. These "fix" files define which observations are going to be assimilated, their errors, and quality control options.

![Diagram of an assimilation cycle](img/cycle_diagram.png){fig-alt="Diagram that shows a cycle. First the background (forecasts generated using a numerical model, WRF-ARW, initialized from previous analysis) and Observations (from bufr files) enters the GSI system. Inside the system, the Observation Operator takes care of the quality control and bias correction and then calculates the innovation and generates the diag files. Then the ENKF part calculates the update applying the innovation to the background to generate the analysis. The analysis is used to create a new background to start the cycle again"}

@@ -99,9 +99,9 @@ GSI can also be used with the following background files:

- WRF-Chem GOCART input fields with NetCDF format
- CMAQ binary file

-And the official tutorials are a good starting point to grasp the use of this options.
+And the [official tutorials in the DTCenter webpage](https://dtcenter.org/community-code/gridpoint-statistical-interpolation-gsi/gsi-tutorial-online) are a good starting point to grasp the use of these options.

-GSI can also be run *without observations* to test the code, this is with a single synthetic observation defined in the SINGLEOB_TEST section in the namelist.
+GSI can also be run *without observations* to test the code, using a single synthetic observation defined in the SINGLEOB_TEST section of the GSI namelist (see the sketch below). This is another good thing to try at the beginning.

The fixed files are located in the `fix/` folder and include statistic files, configuration files, bias correction files, and CRTM coefficient files[^1]. The information of the configuration files is saved in the output files after running GSI.
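As a reference for the single-observation test mentioned above, the `SINGLEOB_TEST` section of the namelist looks roughly like the following sketch. The variable names follow the GSI user guide, but the values are only illustrative and need to be adapted to your domain and date:

``` bash
&SINGLEOB_TEST
   maginnov=1.0,        ! magnitude of the synthetic innovation
   magoberr=0.8,        ! observation error assigned to the synthetic observation
   oneob_type='t',      ! variable to test, here temperature
   oblat=-35.0,         ! latitude of the observation (illustrative)
   oblon=295.0,         ! longitude of the observation (illustrative)
   obpres=850.,         ! pressure level in hPa
   obdattim=2018112212, ! date and time of the observation (yyyymmddhh)
   obhourset=0.,
/
```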
@@ -112,16 +112,16 @@ The fixed files are located in the `fix/` folder and includes statistic files, c

| berror_stats | background error covariance (for variational methods) | nam_nmmstat_na.gcv, nam_glb_berror.f77.gcv, |
| errtable | Observation error table | prepobs_errtable.global |
| convinfo | Conventional observation information file | global_convinfo.txt |
-| satinfo | satellite channel information file | global_satinfo.txt |
-| pcpinfo | precipitation rate observation information file | global_pcpinfo.txt |
-| ozinfo | ozone observation information file | global_ozinfo.txt |
-| satbias_angle | satellite scan angle dependent bias correction file | global_satangbias.txt |
-| satbias_in | satellite mass bias correction coefficient file | sample.satbias |
-| satbias_in | combined satellite angle dependent and mass bias correction coefficient file | gdas1.t00z.abias.new |
+| satinfo | Satellite channel information file | global_satinfo.txt |
+| pcpinfo | Precipitation rate observation information file | global_pcpinfo.txt |
+| ozinfo | Ozone observation information file | global_ozinfo.txt |
+| satbias_angle | Satellite scan angle dependent bias correction file | global_satangbias.txt |
+| satbias_in | Satellite mass bias correction coefficient file | sample.satbias |
+| satbias_in | Combined satellite angle dependent and mass bias correction coefficient file | gdas1.t00z.abias.new |

## About the GSI code

-GSI is writen in fortran and the code is separated in more than 5 hundred files. While GSI has 2 good user guides, not everything is documented and sometimes you will need to read the code.
+GSI is written in fortran and the code is split across more than five hundred files. While GSI has [2 good user guides](https://github.com/paocorrales/comGSIv3.7_EnKFv1.3/tree/main/docs), not everything is documented and sometimes you will need to read the code.

To swim around the code I found the `grep -r "key word"` command very useful. Each file and subroutine inside it has a header with information about what it does, changes and input and output arguments. It's worth mentioning a few key files:

diff --git a/content/gsi/02-convencionals.qmd b/content/gsi/02-convencionals.qmd
index 0834309..707d9b1 100644
--- a/content/gsi/02-convencionals.qmd
+++ b/content/gsi/02-convencionals.qmd
@@ -6,11 +6,11 @@ Conventional observations are assimilated from PREPBUFR files. NCEP ADP Global U

While the PREPBUFR includes wind derived from satellite observations, GSI ignores these observations and uses the ones provided by the specific bufr file `gdas.t00z.satwnd.tm00.bufr_d`.

-PREPBUFR files usually contains observations from a 6 to 12 h window and can be modify using FORTRAN routines provided with the GSI code (see `util/bufr_tools` in the GSI source code folder). You can also create your own bufr file or add new observation to an existing bufr file (see [Working with bufr files](../observations/01-bufr.qmd)).
+PREPBUFR files usually contain observations from a 6 to 12 h window and can be modified using FORTRAN routines provided with the GSI code (see `util/bufr_tools` in the [GSI source code folder](https://github.com/paocorrales/comGSIv3.7_EnKFv1.3)). You can also create your own bufr file or add new observations to an existing bufr file (see [Working with bufr files](../observations/01-bufr.qmd)).

### Controlling which observations are assimilated

-The assimilation of conventional observations is controlled with the `convinfo` file.
Let's check the `global_convinfo.txt` file we get as an example:
+The assimilation of conventional observations is controlled with the `convinfo` file. Let's check the `global_convinfo.txt` file we get as an example in the `fix` folder:

``` bash
! otype = observation type (a7, t, uv, q, etc.)
@@ -41,14 +41,14 @@ The assimilation of conventional observations is controlled with the `convinfo`

The head of the file explains the content of each column but there are a few more things to add:

-- type: this is defined by the bufr tables, particular [Table 2](https://www.emc.ncep.noaa.gov/mmb/data_processing/prepbufr.doc/table_2.htm). It is worth checking this table as includes information about which observations are assimilated in GFS, errors asociated to specific instruments and other details.
+- type: this is defined by the bufr tables, in particular [Table 2](https://www.emc.ncep.noaa.gov/mmb/data_processing/prepbufr.doc/table_2.htm). It is worth checking this table as it includes information about which observations are assimilated in GFS, errors associated with specific instruments, and other details.

- twindow: while the assimilation window is defined in the gsi namelist, it is possible to control an assimilation window for specific observations. This is useful if, for example, the assimilation window is 3 h and you want to assimilate temperature in a 1h window.

-In general you only change the *isue* column to assimilate or not a type of observation and maybe just maybe the *gross*, *ermax*, and *ermin* parameters if you want to modify the quality control of the observations.
+In general you only change the *iuse* column to assimilate or not a type of observation and maybe, just maybe, the *gross*, *ermax*, and *ermin* parameters if you want to modify the quality control of the observations.

### Observation errors and quality control

-For **regional assimilation** GSI uses an error table located in the `errtable` file that you'll find in the `./fix` folder under the name `prepobs_errtable.global` (it is confusing that the table for regional errors in in a file called global). Here is a small example of the content of the file for observations from surface stations.
+For **regional assimilation** GSI uses an error table located in the `errtable` file that you'll find in the `./fix` folder under the name `prepobs_errtable.global` (yes, it is confusing that the table for regional errors is in a file called global). Here is a small example of the content of the file for observations from surface stations.

``` bash
181 OBSERVATION TYPE
@@ -76,11 +76,11 @@ GSI will perform a quality control for each observation. In general terms this i

For the gross check, GSI first calculates a ratio:

-$$ ratio = (obs - bk)/max(ermin, min(ermax, obserror)) $$ The main error parameters are controlled by the `convinfo` file. The `obserror` is the observation error defined in the prepbufr file for each observation as a result to the quality control perform while generating that file.
+$$ ratio = (obs - bk) / \max(ermin, \min(ermax, obserror)) $$ The main error parameters are controlled by the `convinfo` file. The `obserror` is the observation error defined in the prepbufr file for each observation plus information in the `prepobs_errtable`.

If $ratio > gross$ the observation is rejected.

-Other piece of information used during the quality control is the quality control flag that is included in the prepbufr file a part if it quality control process.
The possible values for conventional observations are:
+Another piece of information used during the quality control is the quality control flag that is included in the prepbufr file as part of the quality control process performed by NCEP. The possible values for conventional observations are:

| qc flag | meaning |
|:------------------------------:|:---------------------------------------|

You can find more details about the quality control flags in [Table 7](https://www.emc.ncep.noaa.gov/mmb/data_processing/prepbufr.doc/table_7.htm).

-GSI can also perform a thinning for conventional observations. You can activate that option for each type of observation changing `ithin = 1` in the `convinfo` file. There are other important columns, `rmesh`, `pmesh`, in the `convinfo` file to configure conventional data thinning:
+GSI can also perform a thinning for conventional observations. You can activate that option for each type of observation by changing `ithin = 1` in the `convinfo` file. There are other important columns in the `convinfo` file, `rmesh` and `pmesh`, to configure conventional data thinning:

-* `ithin`: 0 = no thinning; 1 = thinning with grid mesh decided by `rmesh` and `pmesh`
-* `rmesh`: horizontal thinning grid size in km
-* `pmesh`: vertical thinning grid size in mb; if 0, then use background vertical grid
+- `ithin`: 0 = no thinning; 1 = thinning with grid mesh decided by `rmesh` and `pmesh`
+- `rmesh`: horizontal thinning grid size in km
+- `pmesh`: vertical thinning grid size in mb; if 0, then use background vertical grid

**For each observation GSI will check different things and change the observation error accordingly.** The final observation error is recorded in the `diag` file and then used during the assimilation.

diff --git a/content/gsi/03-radiances.qmd b/content/gsi/03-radiances.qmd
index 15bb26b..88be6d4 100644
--- a/content/gsi/03-radiances.qmd
+++ b/content/gsi/03-radiances.qmd
@@ -3,13 +3,13 @@ title: "Assimilating Radiance Observations" bibliography: references.bib ---

-Assimilating radiance observations is more complicated than assimilating conventional observations as radiances are not state variables. We need a observation operator to transform the model variables to radiances. GSI uses the Community Radiative Transfer Model [CRTM, @liu2008] as an operator of the radiance observations that calculates the brightness temperature simulated by the model in order to compare it with the observations from satellite sensors.
+Assimilating radiance observations is more complicated than assimilating conventional observations as radiances are not state variables. We need an observation operator to transform the model variables to brightness temperature. GSI uses the Community Radiative Transfer Model [CRTM, @liu2008] as an operator of the radiance observations that calculates the brightness temperature simulated by the model in order to compare it with the observations from satellite sensors.

## The CRTM radiative transfer model

-The CRTM is a fast radiative transfer model that was jointly developed by the *NOAA Center for Satellite Applications and Research* and the *Joint Center for Satellite Data Assimilation* (JCSDA). It is a model widely used by the remote sensing community as it is open source and publicly available.
In addition, it is used for satellite instrument calibration [@weng2013; @iacovazzi2020; @crews2021], and in turn to generate retrievals from satellite observations [@boukabara2011; @hu2019; @hu2021]. It is also used as an operator of observations as part of the assimilation of satellite radiances [@tong2020; @barton2021].
+The CRTM is a fast radiative transfer model that was jointly developed by the *NOAA Center for Satellite Applications and Research* and the *Joint Center for Satellite Data Assimilation* (JCSDA). It is a model widely used by the remote sensing community as it is open source and publicly available. In addition, it is used for satellite instrument calibration [@weng2013; @iacovazzi2020; @crews2021], and to generate retrievals from satellite observations [@boukabara2011; @hu2019; @hu2021]. It is also used as an observation operator as part of the assimilation of satellite radiances [@tong2020; @barton2021].

-The CRTM is capable of simulating microwave, infrared and visible radiances, under clear and cloudy sky conditions, using atmospheric profiles of pressure, temperature, humidity and other species such as ozone. Recently @cutraro2021 evaluated their takeoff in the region with good results in simulating GOES-16 observations.
+The CRTM is capable of simulating microwave, infrared and visible radiances, under clear and cloudy sky conditions, using atmospheric profiles of pressure, temperature, humidity and other species such as ozone. Recently @cutraro2021 evaluated its use in Argentina with good results in simulating GOES-16 observations.

CRTM is a sensor-oriented radiative transfer model, i.e. it contains pre-calculated parameterizations and coefficient tables specifically for operational sensors. It includes modules that calculate thermal radiation from gaseous absorption, absorption and scattering by aerosols and clouds, and emission and reflection of radiation by the Earth's surface. The inputs of CRTM include atmospheric state variables, e.g., temperature, water vapor, pressure, and ozone concentration in user-defined layers, and surface state variables and parameters, including emissivity, surface temperature, and wind.

@@ -45,7 +45,7 @@ The assimilation of radiance observations is controlled with the `satinfo` file
 hirs3_n17 1 -1 2.000 0.000 4.500 10.000 0.000 -1 -1 -1
```

-The file includes a line for each sensor/satellite and each channel (chan). Usually you will change the `iuse` and `error` columns to configure the assimilation of each channel. The options for `iuse` are:
+The file includes a line for each sensor/satellite and each channel (`chan`). Usually you will change the `iuse` and `error` columns to configure the assimilation of each channel. The options for `iuse` are:

- -2 do not use
- -1 monitor if diagnostics produced

@@ -66,7 +66,7 @@ Here is an incomplete list of parameters in the GSI and ENKF namelists.

**GSI namelist**

| Parameter | Description | Needed value |
-|--------------|--------------------------------------------|--------------|
+|------------------|----------------------------------------------------------------------------------------------------------------------------------|--------------|
| passive_bc | option to turn on bias correction for passive (monitored) channels | .true. |
| use_edges | option to exclude radiance data on scan edges | .false. |
| lwrite_predterms | option to write out actual predictor terms instead of predicted bias to the radiance diagnostic files | .true.
|

@@ -78,14 +78,14 @@ This namelist will also includes a list with the type of observations and the na

**ENKF namelist**

| Parameter | Description | Needed value |
-|------------|------------------------------------------------|------------|
+|-------------|---------------------------------------------------------------------------------------------|--------------|
| adp_anglebc | turn off or on the variational radiance angle bias correction | .true. |
| angord | order of polynomial for angle bias correction | 4 |
| use_edges | logical to use data on scan edges (.true.=to use) | .false. |
| emiss_bc | If true, turn on emissivity bias correction | .true. |
| upd_pred | bias update indicator for radiance bias correction; 1.0=bias correction coefficients evolve | 1 |

-: This namelist will also includes a list with the type of observations and the name of each one inside GSI. For example `amsua_n15` represent the observations of the AMSU-A sensor on board NOAA-15.
+This namelist also includes a list with the types of observations and the name of each one inside GSI. For example `amsua_n15` represents the observations from the AMSU-A sensor on board NOAA-15.

## Observation errors and quality control

@@ -180,7 +180,9 @@ On the other hand the predictors are saved in the diag files (see the [Undestand

### Cloud detection

-The cloud pixel detection methodology depends on the wavelength of the observations. For microwave radiances, potentially cloud-contaminated observations are detected using scattering and Liquid Water Path (LWP) indices calculated from differences between different channels of each sensor [@weston2019; @zhu2016]. For infrared channels, cloud contaminated observations are detected using the transmittance profile calculated by the CRTM model. In addition, GSI checks the difference between the observations and the simulated brightness temperature to detect cloudy pixels. A particular case is the ABI observations since the cloud mask (level 2 product) available at the same resolution as the observations is used. This cloud mask is generated by combining information from 8 channels of the ABI sensor from the spatial and temporal point of view.
+The cloud pixel detection methodology depends on the wavelength of the observations. For microwave radiances, potentially cloud-contaminated observations are detected using scattering and Liquid Water Path (LWP) indices calculated from differences between different channels of each sensor [@weston2019; @zhu2016]. For infrared channels, cloud contaminated observations are detected using the transmittance profile calculated by the CRTM model. In addition, GSI checks the difference between the observations and the simulated brightness temperature to detect cloudy pixels.
+
+A particular case is the ABI observations, since GSI uses the cloud mask (level 2 product) available at the same resolution as the observations. This cloud mask is generated by combining spatial and temporal information from 8 channels of the ABI sensor.

### Other quality controls

diff --git a/content/gsi/04-diagfiles.qmd b/content/gsi/04-diagfiles.qmd
index f7d4f2e..6c56e66 100644
--- a/content/gsi/04-diagfiles.qmd
+++ b/content/gsi/04-diagfiles.qmd
@@ -6,11 +6,11 @@ The `diag_` files save the key information of how the observations where as

It is important to check that the `write_diag` option in the GSI namelist is set to `.true.`.
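For reference, this option lives in the `&SETUP` section of the GSI namelist, with one flag per outer loop. A minimal sketch, following the GSI user guide (the `niter` values are only illustrative):

``` bash
&SETUP
   niter(1)=50, niter(2)=50,
   write_diag(1)=.true.,    ! write diag files before the first outer loop (O-B)
   write_diag(2)=.false.,
   write_diag(3)=.true.,    ! write diag files after the last outer loop (O-A)
/
```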
-By default the diag files are saved in binary format, and while GSI should be able to saved them as netcdfs, I have never made it work. So, this section will concentrate on how to decode the binary files and how to interpret the information.
+By default the diag files are saved in binary format, and while GSI should be able to save them as netCDFs, I have never made it work. So, this section will concentrate on how to decode the binary files and how to interpret the information.

-Here is a list of the files you get if you run GSI as observation operator, in this case we are assimilating AMSU-A observation from the NOAA-18 and METOP-A satellites, ABI observations from GOES-16 and conventional observations. To be able to use this output with the Kalman Filter methods we need the diag files for the ensemble mean and every member.
+Here is a list of the files you get if you run GSI as an observation operator; in this case we are assimilating AMSU-A observations from the NOAA-18 and METOP-A satellites, ABI observations from GOES-16 and conventional observations. To be able to use this output with the Kalman Filter methods we need the diag files for the ensemble mean and each member.

-```bash
+``` bash
diag_abi_g16_ges.ensmean
diag_abi_g16_ges.mem001
diag_abi_g16_ges.mem002
@@ -57,11 +57,11 @@ diag_conv_ges.mem009
 diag_conv_ges.mem010
```

-GSI includes some fortran routines you can use to decode the binary files. In may case I decided to modify those routines to get the information as a tidy table (1 observation per row, variables in columns) and to include more details present in the diagfiles.
+GSI includes some fortran routines you can use to decode the binary files. In my case I decided to modify those routines to get the information as a tidy table (1 observation per row, variables in columns) and to include more details present in the diagfiles.

-The code is published [in this repository](https://github.com/paocorrales/comGSIv3.7_EnKFv1.3/tree/main/util/read_diag), that includes a version of GSI with some modidications:
+The code is published [in this repository](https://github.com/paocorrales/comGSIv3.7_EnKFv1.3/tree/main/util/read_diag), which includes a version of GSI with my modifications:

-```bash
+``` bash
read_diag/
├── convinfo
├── namelist.conv
@@ -81,41 +81,41 @@ read_diag/
 └── read_diag_rad.f90_original
```

-There are 2 fortran routines, `read_diag_conv.f90` for conventional diag files and `read_diag_rad.f90` for radiances. To compile the routines it is necessary to link them with the libraries that GSI uses. An example of how to compile the code can be found in `compile_gcc` and `compile_ifort`.
+There are 2 fortran routines, `read_diag_conv.f90` for conventional diag files and `read_diag_rad.f90` for radiances. To compile the routines it is necessary to link them with the libraries that GSI uses. An example of how to compile the code can be found in `compile_gcc` and `compile_ifort`.

-The resulting executables are `read_diag_conv.x` and `read_diag_rad.x`. Each one is asociated to a namelist that you need to modify each time in order to run the code and decode an especific diagfile.
+The resulting executables are `read_diag_conv.x` and `read_diag_rad.x`. Each one is associated with a namelist that you need to modify each time in order to run the code and decode a specific diagfile.
See for example the content of `namelist.conv`:

-```bash
+``` bash
&iosetup
infilename='/home/paola.corrales/datosmunin3/EXP/E6_long/ANA/20181112220000/diagfiles/diag_conv_ges.ensmean',
outfilename='/home/paola.corrales/datosmunin3/EXP/E6_long/ANA/20181112220000/diagfiles/asim_conv_20181112220000.ensmean',
/
```

-The namelist is very simple, it only need the path to the diag file and the path to the output: a plain text file. But if you need to do this for every diag file, it is very time consuming. For that reason I wrote in bash some loops to go through all the diagfiles and decode them automatically. There are 4 bash files, to decode conventional diagfiles (ensemble mean or the members of the ensemble) and 2 for the radiance diagfiles. I've also kept the original fortran routines just in case.
+The namelist is very simple, it only needs the path to the diag file and the path to the output: a plain text file. But if you need to do this for every diag file, it is very time consuming. For that reason I wrote in bash some loops to go through all the diagfiles and decode them automatically (a sketch is shown at the end of this section). There are 4 bash files, 2 to decode conventional diagfiles (ensemble mean or the members of the ensemble) and 2 for the radiance diagfiles. I've also kept the original fortran routines just in case.

### Conventional obs

This is the information you get when you decode a conventional diagfile using the `read_diag_conv.x`:

-* variable
-* stationID
-* type (acording to the prepbufr)
-* dhr (difference between the obserbation time and the analysis time)
-* latitude
-* longitude
-* pressure
-* usage flag (defined by gsi)
-* usage flag preprepbufr
-* observation
-* observation minus guess
-* observation (only of uv)
-* observation minus guess (only of uv)
-* observation error
+- variable
+- stationID
+- type (according to the prepbufr)
+- dhr (difference between the observation time and the analysis time)
+- latitude
+- longitude
+- pressure
+- usage flag (defined by gsi)
+- usage flag from the prepbufr
+- observation
+- observation minus guess
+- observation (only if uv)
+- observation minus guess (only if uv)
+- observation error

Each row is an observation, except for wind that has u and v components in the same row.

-```bash
+``` bash
ps @ SCVD : 187 -0.50 -39.61 286.94 0.101E+04 1 0 0.101E+04 -0.142E+01 0.100E+11 0.181E+03 0.161E+01
ps @ 85782 : 181 -0.50 -40.60 286.95 0.999E+03 1 0 0.999E+03 -0.176E+01 0.100E+11 0.181E+03 0.227E+01
ps @ 85766 : 181 -0.50 -39.65 287.92 0.101E+04 1 0 0.101E+04 -0.139E+00 0.100E+11 0.187E+03 0.295E+01
@@ -130,56 +130,55 @@ Each row is an observation, except for wind that has u and v components in the s
 uv @ IR270 : 245 0.00 -39.31 286.11 0.269E+03 -1 100 0.434E+02 -0.225E+01 -0.193E+02 -0.108E+01 0.685E+01
```

-The diag file includes more details. It may be useful to read the subroutine that write the diag files. The subroutine is called `contents_binary_diag_` and is present in each `setup*.f90` file. But here is a tip, it is much easier to read the `contents_netcdf_diag_` subroutine because it mention the name of each variable to create the metadata of the netcdf file.
+The diag file includes more details. It may be useful to read the subroutine that writes the diag files. The subroutine is called `contents_binary_diag_` and is present in each `setup*.f90` file. But here is a tip: it is much easier to read the `contents_netcdf_diag_` subroutine because it mentions the name of each variable to create the metadata of the netCDF file.
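As mentioned above, the decoding loops are simple. A minimal sketch of the idea (paths and output names are illustrative; the actual scripts live in the repository):

``` bash
# Decode the conventional diag file of every ensemble member.
# DIAGDIR is an illustrative path, adapt it to your experiment.
DIAGDIR=/path/to/diagfiles
for mem in $(seq -f "%03g" 1 10); do
  cat > namelist.conv << EOF
&iosetup
  infilename='${DIAGDIR}/diag_conv_ges.mem${mem}',
  outfilename='${DIAGDIR}/asim_conv_mem${mem}.txt',
/
EOF
  ./read_diag_conv.x
done
```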
### Radiance obs

For radiances we'll get a diag file for each sensor and satellite. But the structure of the binary files is always the same. Here is the list of variables I save from the diagfiles:

-* sensor
-* channel
-* frequency
-* latitude
-* longitude
-* elevation at observation location according guess (mb)
-* pressure at max of weighting function (mb)
-* dhr (difference between the observation time and the analysis time)
-* observation (BT)
-* observation minus guess with bias correction
-* observation minus guess without bias correction
-* inverse observation error
-* quality control flag
-* emissivity from surface
-* stability index
-* satellite zenith angle (degrees)
-* satellite azimuth angle (degrees)
-* fractional coverage by land
-* fractional coverage by ice
-* fractional coverage by ice
-* cloud fraction (%)
-* cloud top pressure (hPa)
-* predictor 1
-* predictor 2
-* predictor 3
-* predictor 4
-* predictor 5
-* predictor 6
-* predictor 7
-* predictor 8
-* predictor 9
-* predictor 10
-* predictor 11
-* predictor 12
+- sensor
+- channel
+- frequency
+- latitude
+- longitude
+- elevation at observation location according to the guess (mb)
+- pressure at max of weighting function (mb)
+- dhr (difference between the observation time and the analysis time)
+- observation (BT)
+- observation minus guess with bias correction
+- observation minus guess without bias correction
+- inverse observation error
+- quality control flag
+- emissivity from surface
+- stability index
+- satellite zenith angle (degrees)
+- satellite azimuth angle (degrees)
+- fractional coverage by land
+- fractional coverage by ice
+- fractional coverage by snow
+- cloud fraction (%)
+- cloud top pressure (hPa)
+- predictor 1
+- predictor 2
+- predictor 3
+- predictor 4
+- predictor 5
+- predictor 6
+- predictor 7
+- predictor 8
+- predictor 9
+- predictor 10
+- predictor 11
+- predictor 12

-Again, it is worth cheeking the subroutine that write the diagfiles for radiances, in case there are other details you need to include in the decodification.
-
-`read_diag_rad.x` will write a plain text file with all the variables listed above for each observation (from each satellite/sensor/channel).
+Again, it is worth checking the subroutine that writes the diagfiles for radiances, in case there are other details you need to include in the decoding.
+
+`read_diag_rad.x` will write a plain text file with all the variables listed above for each observation (from each satellite/sensor/channel).

### Important information in the diagfiles

While all variables included in the diagfiles are necessary for the assimilation, there are a few that I found particularly important to monitor the assimilation process:

-* **observation minus guess:** this variable should have a normal distribution centered in zero. If a bias correction was perform, it is important to compare the distribution with and without bias correction.
-* **quality control flag:** if this variable is not zero, the observation will be rejected during the assimilation. To understand why this happened you need to check the GSI code and find the corresponding qc value. It will be different for each type of observation.
-* **error:** this variable is also used to decide if the observation will be assimilated or not. If the value is to big (and remember, GSI changes the value of the error depending on the quality of the observation), it means that the observation is not good.
-
+- **observation minus guess:** this variable should have a normal distribution centered at zero. If a bias correction was performed, it is important to compare the distribution with and without bias correction.
+- **quality control flag:** if this variable is not zero, the observation will be rejected during the assimilation. To understand why this happened you need to check the GSI code and find the corresponding qc value. It will be different for each type of observation.
+- **error:** this variable is also used to decide if the observation will be assimilated or not. If the value is too big (and remember, GSI changes the value of the error depending on the quality of the observation), it means that the observation is not good.

diff --git a/content/gsi/05-tutorial.qmd b/content/gsi/05-tutorial.qmd
index e71aad5..58b74f0 100644
--- a/content/gsi/05-tutorial.qmd
+++ b/content/gsi/05-tutorial.qmd
@@ -2,7 +2,7 @@ title: "GSI tutorial" ---

-This tutorial is intended to showcase some of the capabilities of the GSI system. For a more complete set of examples, please visit the official website for GSI: [variational methods](https://dtcenter.org/community-code/gridpoint-statistical-interpolation-gsi) and \[Kalman Filter\]() methods.
+This tutorial is intended to showcase some of the capabilities of the GSI system. For a more complete set of examples, please visit the official website for GSI: [variational methods](https://dtcenter.org/community-code/gridpoint-statistical-interpolation-gsi) and [Kalman Filter](https://dtcenter.org/community-code/ensemble-kalman-filter-system-enkf) methods.

The tutorial uses the serial Ensemble Square Root Filter (EnSRF) algorithm but could also be used to run the Local Ensemble Kalman Filter (LETKF) algorithm if the GSI system code is compiled using an intel compiler[^1]. It also uses a version of the code that has the capability of assimilating GOES-16 observations.

## Moving parts

-The first component to run the tutorial is the GSI system code that can be found in this repository: [github.com/paocorrales/comGSIv3.7_EnKFv1.3](https://github.com/paocorrales/comGSIv3.7_EnKFv1.3). Alternatively, the original code without modifications is available [here](https://dtcenter.org/community-code/gridpoint-statistical-interpolation-gsi). The repository includes an example script `compile_gsi_yakaira` to compile the code. In any case it is important to include the option `-DBUILD_WRF=ON` to use the ENKF algorithm with WRF.
+### GSI code
+
+The first component to run the tutorial is the GSI system code that can be found in this repository: [github.com/paocorrales/comGSIv3.7_EnKFv1.3](https://github.com/paocorrales/comGSIv3.7_EnKFv1.3). Alternatively, the original code without modifications is available [here](https://dtcenter.org/community-code/gridpoint-statistical-interpolation-gsi). The repository includes an example script `compile_gsi_yakaira` to compile the code. In any case it is important to include the option `-DBUILD_WRF=ON` to use the ENKF algorithm with WRF as the model.

Then, clone or download the repository with the scripts, namelists and specific config files from <https://github.com/paocorrales/tutorial_gsi>. This repository also includes a bash script to download the necessary data to run the tutorial. This data is also available as a [Zenodo record](https://zenodo.org/records/10439645).
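If you prefer to compile the GSI code by hand instead of using the provided script, a generic out-of-source build along these lines should work (a sketch, assuming a standard cmake setup; `compile_gsi_yakaira` remains the reference):

``` bash
# Illustrative build sketch; compilers and modules depend on your machine.
git clone https://github.com/paocorrales/comGSIv3.7_EnKFv1.3.git
cd comGSIv3.7_EnKFv1.3
mkdir build && cd build
cmake -DBUILD_WRF=ON ..   # needed to use the ENKF algorithm with WRF
make -j4
# the executables should end up inside build/, where the run scripts look for them
```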
@@ -22,7 +24,7 @@ The observations included in this tutorial are:

- File: `cimap.20181122.t12z.01h.prepbufr.nqc`

-**Radiance observations:** radiances from polar and geostationary satellites. Observations from polar satellites comes from the Global Data Assimilation System (GDAS) Model: . The GOES-16 observations (ABI sensor) are derived from the public netcdf files published by NOAA.
+**Radiance observations:** radiances from polar and geostationary satellites. Observations from polar satellites come from the Global Data Assimilation System (GDAS) model: . The GOES-16 observations (ABI sensor) are derived from the public netCDF files published by NOAA.

Files:

@@ -46,7 +48,7 @@ The background files includes the 10-member ensemble generated using the WRF-ARW

## Structure

-By cloning the tutorial repo and downloading the associated data with the provied script you will end up with the following folder structure.
+By cloning the tutorial repo and downloading the associated data with the provided script ([download_data.sh](https://github.com/paocorrales/tutorial_gsi/blob/main/download_data.sh "download_data.sh")) you will end up with the following folder structure.

``` bash
tutorial_gsi/
@@ -173,7 +175,7 @@ A `GSI` folder will be created when running the `run_gsi.sh` script and a `ENKF`

## Running GSI {#fgat}

-As we focus on running GSI with the Kalman Filter method, the first stem is to run GSI as Observation operator. So, the system will compare the observations with the background state and and save that information in diagnostic files.
+As we focus on running GSI with the Kalman Filter method, the first step is to run GSI as an observation operator. The system will compare the observations with the background state and save that information in diagnostic files.

The example used in this tutorial is relatively small, so while you may need an HPC system for real cases, this one can be run in a small server or even a computer with at least 10 processors.

@@ -203,7 +205,7 @@ GSIPROC=10
set-x
```

-In principle, you only need to change `BASEDIR` and `GSIDIR` variables that are the path to the tutorial folder and the path to where GSI is compiled (the code expects to find a `build` folder with the executable files.
+In principle, you only need to change the `BASEDIR` and `GSIDIR` variables, which are the path to the tutorial folder and the path to where GSI is compiled (the code expects to find a `build` folder with the executable files inside `GSIDIR`).

So, with that, you can run the script or send it to a queue.

@@ -231,7 +233,7 @@ I had to slightly change that first line every time I changed machines. So, if y

If you get a `exit 0` at the end, it probably means that everything went well. However, I recommend you check a few things to make sure everything went *really* well.

- Check that all the `diag*` files are there. You will get 1 file per member and ensemble mean for each type of observation. If you don't see any of these files, check the [Issues](@issues-gsi) section. If you are missing the files for 1 type of observation, that probably means that the bufr file with the observations was not read properly or that is missing in the folder. Check if the script is linking the correct file to the `GSI` folder.
- Check the statistics for each type of observations. You will find this information in the `fit_.` files. Each one may have different information or structure depending on the type of observation, but make sure to check the number of observations read and keep by the system.
This information is also included in the `stdout` file, you can sear `READ_*` to find the section in the file.
+- Check the statistics for each type of observation. You will find this information in the `fit_.` files. Each one may have different information or structure depending on the type of observation, but make sure to check the number of observations read and kept by the system. This information is also included in the `stdout` file; you can search for `READ_*` to find the section in the file.

If you got an error number instead, and if you are lucky, the error code may be described in `gsimain.f90`.

@@ -282,11 +284,11 @@ If you get a `exit 0` at the end, it probably means that everything went well. O

## Running GSI using the FGAT method

-The tutorial is configured to use the FGAT (First Guess at Appropriate Time) methods, this means that will GSI attempt to read multiple time level backgrounds a show in the following diagram.
+The tutorial is configured to use the FGAT (First Guess at Appropriate Time) method; this means that GSI will attempt to read backgrounds at multiple time levels, as shown in the following diagram.

![FGAT method](img/fgat_diagram.png){fig-alt="A diagram that shows the assimilation window of 1 h centered at 12 UTC. The observations are used every 10 minutes"}

-There is no option in the namelist or configuration files to use the FGAT method. To "activate" this option GSI needs to find the appropiate files in appropriate folder. In this example we have 7 files in total for each member, 1 background file at the assimilation time plus 3 files every 10 minutes before and after the assimilation time.
+There is no option in the namelist or configuration files to use the FGAT method. To "activate" this option GSI needs to find the appropriate files in the appropriate folder. In this example we have 7 files in total for each member, 1 background file at the assimilation time plus 3 files every 10 minutes before and after the assimilation time.

So, GSI will expect to find files called `wrf_inou1` to `wrf_inou7`. The folder structure looks like this:

@@ -433,7 +435,7 @@ convert wrf_inou1 to sigf01

You'll know that GSI is using the FGAT method. The example here shows that GSI found the `wrf_inou1` file, which is renamed to `sigf01`, and then prints information related to the time and domain characteristics. This will be repeated for each background file. In this case the first file corresponds to 11:30 UTC of 22 November 2018 and after that there is 1 file every 10 minutes.

-This period between background files (10 minutes) is defined by the user when saving the background files. Again there is no configuration option to do this. GSI relies on finding the files in the especified folder.
+This period between background files (10 minutes) is defined by the user when saving the background files. Again there is no configuration option to do this. GSI relies on finding the files in the specified folder.
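A run script can prepare those files with a few links. Here is a sketch of the idea (the wrfout file names and `BGDIR` path are hypothetical; only the `wrf_inou1` to `wrf_inou7` targets are fixed):

``` bash
# Link the 7 backgrounds (11:30 to 12:30 UTC, every 10 minutes) to the
# names GSI expects when running with FGAT.
i=1
for time in 11:30 11:40 11:50 12:00 12:10 12:20 12:30; do
  ln -sf ${BGDIR}/wrfout_d01_2018-11-22_${time}:00 wrf_inou${i}
  i=$((i + 1))
done
```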
------------------------------------------------------------------------

diff --git a/content/observations/01-bufr.qmd b/content/observations/01-bufr.qmd
index d25f530..ac0f78d 100644
--- a/content/observations/01-bufr.qmd
+++ b/content/observations/01-bufr.qmd
@@ -13,7 +13,7 @@ A BUFR file is divided into one or more messages, each message containing one or

A message is a continuous binary stream that can be divided into 6 sections:

| Section Number | Name | Contents |
-|-----------|-----------------------|---------------------------------------|
+|----------------|---------------------|------------------------------------|
| 0 | Indicator section | "BUFR", length of message, edition number |
| 1 | Identification section | Originating center and subcenter, data category and subcategory, master table and version number |
| 2 | (Optional) Local Use section | (Optional) free-format additional information of potential interest to users |

@@ -25,7 +25,7 @@ A message is a continuous binary stream that can be divided into 6 sections:

Section 1 of each message will include the relevant information associated with the data: data type and observation date and time. However, the information is encoded using *mnemonics*. To decode the information in a BUFR file we will need the table that was used to create the file. In the case of prepBUFR files the table has different sections (check the complete table [here](https://github.com/paocorrales/comGSIv3.7_EnKFv1.3/blob/main/util/bufr_tools/prepbufr.table)). In general these tables include the mnemonics and a short description. The first part of the prepBUFR table includes the mnemonics associated with the data type:

-```bash
+``` bash
.------------------------------------------------------------------------------.
| ------------ USER DEFINITIONS FOR TABLE-A TABLE-B TABLE D -------------- |
|------------------------------------------------------------------------------|

@@ -39,7 +39,7 @@ Section 1 of each message with include the relevant information associated to th

As an example, let's say we get the message type `ADPUPA`. The first part of the table tells us that this message includes upper-air observations. In a following section of the table we find the structure of this type of messages:

-```bash
+``` bash
|------------------------------------------------------------------------------|
| MNEMONIC | SEQUENCE |
|----------|-------------------------------------------------------------------|

@@ -50,31 +50,31 @@ As an example, let say we get the message type `ADPUPA`. The first part of the t

This tells us that the message has a header (`HEADR` - report header sequence) among other things. When we search for `HEADR` inside the table, we will find the content and its definition:

-```bash
+``` bash
| HEADR | SID 207003 XOB YOB 207000 DHR ELV TYP T29 TSB ITP SQN |
| HEADR | PROCN RPT TCOR
```

Again, we will need to look up each mnemonic to understand its meaning:

-* SID: STATION IDENTIFICATION
-* XOB: LONGITUDE (DEG E)
-* YOB: LATITUDE (DEG N)
-* DHR: OBSERVATION TIME MINUS CYCLE TIME (HOURS)
-* ELV: STATION ELEVATION (METERS)
-* TYP: PREPBUFR REPORT TYPE
-* T29: DATA DUMP REPORT TYPE
-* TSB: REPORT SUBTYPE (HAS VARIOUS MEANINGS DEPENDING ON TYPE)
-* ITP: INSTRUMENT TYPE
-* SQN: REPORT SEQUENCE NUMBER
-* PROCN: PROCESS NUMBER FOR THIS MPI RUN
-* RPT: REPORTED OBSERVATION TIME (HOURS)
-* TCOR: INDICATOR WHETHER OBS.
TIME IN "DHR" WAS CORRECTED
-* RSRD_SEQ: RESTRICTIONS ON REDISTRIBUTION SEQUENCE
+- SID: STATION IDENTIFICATION
+- XOB: LONGITUDE (DEG E)
+- YOB: LATITUDE (DEG N)
+- DHR: OBSERVATION TIME MINUS CYCLE TIME (HOURS)
+- ELV: STATION ELEVATION (METERS)
+- TYP: PREPBUFR REPORT TYPE
+- T29: DATA DUMP REPORT TYPE
+- TSB: REPORT SUBTYPE (HAS VARIOUS MEANINGS DEPENDING ON TYPE)
+- ITP: INSTRUMENT TYPE
+- SQN: REPORT SEQUENCE NUMBER
+- PROCN: PROCESS NUMBER FOR THIS MPI RUN
+- RPT: REPORTED OBSERVATION TIME (HOURS)
+- TCOR: INDICATOR WHETHER OBS. TIME IN "DHR" WAS CORRECTED
+- RSRD_SEQ: RESTRICTIONS ON REDISTRIBUTION SEQUENCE

Each of these mnemonics is described in a specific section:

-```bash
+``` bash
|------------------------------------------------------------------------------|
| MNEMONIC | SCAL | REFERENCE | BIT | UNITS |-------------|
|----------|------|-------------|-----|--------------------------|-------------|

@@ -89,23 +89,22 @@ Each of this mnemonics are described in an specific section:

| DHR | 5 | -2400000 | 23 | HOURS |-------------|
```

-Half of the information included in the header is not useful from the assimilation point of view but it is good to know how it works. For this example we have the `{PRSLEVEL}` sequence included. This is the "PRESSURE LEVEL SEQUENCE" for upper-air observations and includes humidity, temperature and wind observations at each pressure level.
+Half of the information included in the header is not useful from the assimilation point of view but it is good to know how it works. For this example we have the `{PRSLEVEL}` sequence included. This is the "PRESSURE LEVEL SEQUENCE" for upper-air observations and includes humidity, temperature and wind observations at each pressure level.

-```bash
+``` bash
| PRSLEVEL | CAT |
| PRSLEVEL | [W1_EVENT]
```

-A surface station (`ADPSFC`) will have a different structure but the idea is the same, everything can be decode from the table.
+A surface station (`ADPSFC`) will have a different structure but the idea is the same, everything can be decoded from the table.

## An introduction on how to read and write BUFR files

-Working with BUFR files requires to work with fortran routines. Fortunately there is a library that contains subroutines, functions and other utilities that can be used to read (decode) and write (encode) data in BUFR: [ NCEPLIBS-bufr](https://noaa-emc.github.io/NCEPLIBS-bufr/). This library is compiled during the compilation of GSI but can also be used independently.
+Working with BUFR files requires working with fortran routines. Fortunately there is a library that contains subroutines, functions and other utilities that can be used to read (decode) and write (encode) data in BUFR: [NCEPLIBS-bufr](https://noaa-emc.github.io/NCEPLIBS-bufr/). This library is compiled during the compilation of GSI but can also be used independently.

Here is a simplified example of a routine to decode a bufr file:

-```fortran
+``` fortran
open(unit_in,file='sample.bufr',action='read',form='unformatted')
call openbf(unit_in,'IN',unit_in)
@@ -123,21 +122,20 @@
call closbf(unit_in)
```

-* The first level of code open and close the file. It uses the `openbf()` and `closebf()`functions from the BUFR library.
-* The second level will read each message. Each loop reads in one message until the last message in the file is reached. It uses the `ireadmg()` function.
-* The third level will read the reports inside each message using the `ireadsb()`.
-* And finally inside each report, we access the data values using the `ufbint()` function. In this example we save the header in formation in one array and the observation information in a second array.
+- The first level of code opens and closes the file. It uses the `openbf()` and `closbf()` functions from the BUFR library.
+- The second level will read each message. Each loop reads in one message until the last message in the file is reached. It uses the `ireadmg()` function.
+- The third level will read the reports inside each message using the `ireadsb()` function.
+- And finally inside each report, we access the data values using the `ufbint()` function. In this example we save the header information in one array and the observation information in a second array.

-GSI includes some example routines as part ad the bufr tools kit. A complete routine to decode a BUFR file can be found [here](https://github.com/paocorrales/comGSIv3.7_EnKFv1.3/blob/main/util/bufr_tools/bufr_decode_sample.f90). The routine to [encode a BUFR file](https://github.com/paocorrales/comGSIv3.7_EnKFv1.3/blob/main/util/bufr_tools/bufr_encode_sample.f90) will have similar characteristics.
-
-And again, BUFR files will be different for different type of observations and you always need the associated mnemonics table.
+GSI includes some example routines as part of the bufr tools kit. A complete routine to decode a BUFR file can be found [here](https://github.com/paocorrales/comGSIv3.7_EnKFv1.3/blob/main/util/bufr_tools/bufr_decode_sample.f90). The routine to [encode a BUFR file](https://github.com/paocorrales/comGSIv3.7_EnKFv1.3/blob/main/util/bufr_tools/bufr_encode_sample.f90) will have similar characteristics.
+
+And again, BUFR files will be different for different types of observations and you always need the associated mnemonics table.

### Decode prepBUFR files

-This is not different that we've seen previously and the BUFR tools fonder in the GSI code has very good examples to use. But I like to have the data in tidy tables, so I've modified that code to decode a prepBUFR file into a table where each row is an observation. This is the routine:
+This is not different from what we've seen previously, and the [BUFR tools folder](https://github.com/paocorrales/comGSIv3.7_EnKFv1.3/tree/main/util/bufr_tools) in the GSI repository has very good examples to use. But I like to have the data in tidy tables, so I've modified that code to decode a prepBUFR file into a table where each row is an observation. This is the routine:

-```fortran
+``` fortran
program prepbufr_decode_all_df
!
! read all observations out from prepbufr.
!
@@ -196,15 +194,9 @@ program prepbufr_decode_all_df

 end program
```

-
-
-
-
-
-
::: callout-important
There are excellent resources for learning how to work with BUFR files from which I have taken many things mentioned here:

-* [BUFR/PrepBUFR User's Guide](https://dtcenter.org/sites/default/files/community-code/gsi/downloads/BUFR/BUFR_PrepBUFR_User_Guide_v1.pdf) from the Developmental Testbed Center.
-* [BUFR Reference Manual](https://www.ecmwf.int/sites/default/files/elibrary/2008/18792-bufr-reference-manual.pdf) from the ECMWF.
+- [BUFR/PrepBUFR User's Guide](https://dtcenter.org/sites/default/files/community-code/gsi/downloads/BUFR/BUFR_PrepBUFR_User_Guide_v1.pdf) from the Developmental Testbed Center.
+- [BUFR Reference Manual](https://www.ecmwf.int/sites/default/files/elibrary/2008/18792-bufr-reference-manual.pdf) from the ECMWF.
:::
diff --git a/content/observations/02-conv-bufr.qmd b/content/observations/02-conv-bufr.qmd
index debfc6f..342db8d 100644
--- a/content/observations/02-conv-bufr.qmd
+++ b/content/observations/02-conv-bufr.qmd
@@ -5,7 +5,7 @@ bibliography: references.bib

This section covers how to add new surface observations to prepBUFR and how to divide a file into smaller subsets.

-One of the first challenges (besides to learning how to use GSI and deal with bufr files) was to incorporate surface observations from automatic stations to the assimilation. In Argentina there are very few official stations to cover the entire country but there are private automatic station network that could potentially help to improve weather forecasts thank to their higher frequency (10 minutes) and spacial resolution.
+One of the first challenges (besides learning how to use GSI and deal with bufr files) was to incorporate surface observations from automatic stations into the assimilation. In Argentina there are very few official stations to cover the entire country, but there are private automatic station networks that could potentially help to improve weather forecasts thanks to their higher frequency (10 minutes) and spatial resolution.

![Location of automatic stations (green squares) and conventional stations (orange triangles)](img/ema.png){width="500"}

@@ -13,52 +13,49 @@ This requires to incorporate these new observations [available at @garcia2019] t

## Add observations to a prepBUFR file

-In the previous section I briefly explained how to read and write a bufr file. To *add* new observations to an existing file (o to a new file!) we use the same functions mentioned [here](#an-introduction-on-how-to-read-and-write-bufr-files) and structure of the routine is also similar. But we also need to figure out how to read the new observations and how to create new reports.
-
-The code I'm sharing here assume that the new observations are in csv format and has the following look:
-
-|LAT |LON |ELV |STATION| yyyymmddhhmmss |TOB |QOB |UOB |VOB |PRSS |
-|------|------|------|------|------|------|------|------|------|------|
-|-19.0124|-65.2921|2907|"BLV102"|20181111200000|28.9|0.00624936719941647|0|-2.88088888888889|71950
-|-17.4111|-66.1744|2548|"BLV124"|20181111200000|29|0.00636021217641851|-3.27390439689371|-3.27390439689372|74280
-|-16.8378|-64.7925|195|"BLV141"|20181111200000|32.9|0.0439748860789023|-1.34593847427853|-1.34593847427853|50240
-|-17.9747|-67.08|4057|"BLV154"|20181111201500|19.2|0.00317768840091676|-6.08364406830543|-2.51992788174274|62030
-|-17.6364|-67.2003|3798|"BLV157"|120181111200000|19|0.00626217287136847|-5.34737718159307|-5.34737718159307|64460
+In the [previous section](obs/01-bufr.qmd) I briefly explained how to read and write a bufr file. To *add* new observations to an existing file (or to a new file!) we use the same functions mentioned [here](#an-introduction-on-how-to-read-and-write-bufr-files), and the structure of the routine is also similar. But we also need to figure out how to read the new observations into fortran and how to create new reports, as sketched below.
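+
+Before looking at the actual program, here is a simplified sketch of what appending one new surface report to a prepbufr file looks like with the bufr library. This is **not** the actual `adpsfc_rw_prepbufr()` subroutine: the name and the values are illustrative (taken from the first row of the csv sample below), and a real prepbufr report also carries quality markers and event program codes.
+
+``` fortran
+! Sketch only: assumes unit_out is already connected (Fortran open) to an
+! existing prepbufr file, which provides the embedded bufr table.
+subroutine append_one_report(unit_out, idate)
+  implicit none
+  integer, intent(in) :: unit_out, idate
+  real(8) :: hdr(6), obs(5)
+  character(8) :: c_sid
+  real(8) :: r_sid
+  equivalence (r_sid, c_sid)   ! SID is a character string stored in a real(8)
+  integer :: iret
+
+  call openbf(unit_out, 'APN', unit_out)   ! append to the existing file
+  call openmb(unit_out, 'ADPSFC', idate)   ! open a message for this type/date
+
+  c_sid = 'BLV102'
+  ! XOB is in degrees east, so longitude -65.2921 becomes 294.7079
+  hdr = (/ r_sid, 294.7079d0, -19.0124d0, 0.0d0, 187.0d0, 2907.0d0 /)
+  call ufbint(unit_out, hdr, 6, 1, iret, 'SID XOB YOB DHR TYP ELV')
+
+  ! POB (hPa), QOB (mg/kg), TOB (deg C), UOB and VOB (m/s)
+  obs = (/ 719.5d0, 6249.4d0, 28.9d0, 0.0d0, -2.88d0 /)
+  call ufbint(unit_out, obs, 5, 1, iret, 'POB QOB TOB UOB VOB')
+
+  call writsb(unit_out)   ! write the report into the open message
+  call closbf(unit_out)   ! flush the last message and close
+end subroutine append_one_report
+```
+
+Note the unit conversions: the csv stores pressure in Pa and humidity in kg/kg, while the prepbufr mnemonics expect hPa and mg/kg.
+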
+The code I'm sharing here assumes that the new observations are in csv format and look like this (without the header):
+
+| LAT | LON | ELV | STATION | yyyymmddhhmmss | TOB | QOB | UOB | VOB | PRSS |
+|--------|--------|--------|--------|--------|--------|--------|--------|--------|--------|
+| -19.0124 | -65.2921 | 2907 | "BLV102" | 20181111200000 | 28.9 | 0.00624936719941647 | 0 | -2.88088888888889 | 71950 |
+| -17.4111 | -66.1744 | 2548 | "BLV124" | 20181111200000 | 29 | 0.00636021217641851 | -3.27390439689371 | -3.27390439689372 | 74280 |
+| -16.8378 | -64.7925 | 195 | "BLV141" | 20181111200000 | 32.9 | 0.0439748860789023 | -1.34593847427853 | -1.34593847427853 | 50240 |
+| -17.9747 | -67.08 | 4057 | "BLV154" | 20181111201500 | 19.2 | 0.00317768840091676 | -6.08364406830543 | -2.51992788174274 | 62030 |
+| -17.6364 | -67.2003 | 3798 | "BLV157" | 20181111200000 | 19 | 0.00626217287136847 | -5.34737718159307 | -5.34737718159307 | 64460 |

::: callout-important
-If you want to convert the files available here into csv tables, [here is an R script to do that](https://gist.github.com/paocorrales/e68daf085cf16b5d2e48f11513392084).
+If you want to convert the files available [here](https://data.eol.ucar.edu/dataset/553.008) into csv tables, [I wrote an R script to do that](https://gist.github.com/paocorrales/e68daf085cf16b5d2e48f11513392084).
:::

-The program read the configuration file that define where is the namelist and the prepbufr table and call a function that start the process. The namelist will list the name of the prepbufr to modify and the csv files with the new observations. The most important subroutine is `adpsfc_rw_prepbufr()`:
+The program reads the configuration file (`add_obs.conf`) that defines where the namelist and the prepbufr table are, and calls a function that starts the process. The namelist lists the name of the prepbufr file to modify and the csv files with the new observations. The most important subroutine is `adpsfc_rw_prepbufr()`, which:

-1. Reads the bufr file
-2. Reads the csv with new observations
-3. Decides if the observations fall into the time window
-4. Creates a new report type 187
-5. Write the bufr file
+1. Reads the bufr file
+2. Reads the csv with new observations
+3. Decides if the observations fall into the time window
+4. Creates a new report of type 187 (check obs types [here](https://www.emc.ncep.noaa.gov/mmb/data_processing/prepbufr.doc/table_2.htm))
+5. Writes the bufr file

![add_obs routine flowchart](img/add_obs.png)

## Filter observations in a prepBUFR

-When I started using GSI I thought that the bufr files needed to have the observations for the assimilation window only, in other words, that GSI was not capable of filtering and ignoring the observations outside the assimilation window. I know, it does not make sense. Before I realized I was wrong I created a program to read a prepbufr and write a new one only with the observations in a specific time window.
-
-The general structure is similar to the previous routine. In this case the `read_namelist` module will read the namelist and list all the available prepbufrs. The magic then happens inside `filter_obs_prepbufr()` that loops over each observation and check the difference between the observation time and the analysis time. If that difference is less that half the window, the observation is written in the new prepBUFR file.
+When I started using GSI I thought that the bufr files needed to have only the observations for the assimilation window; in other words, that GSI was not capable of filtering and ignoring the observations outside the assimilation window. I know, it does not make sense. Before I realized I was wrong I created a program to read a prepbufr and write a new one with only the observations in a specific time window.
+
+The general structure is similar to the previous routine. In this case the `read_namelist` module reads the namelist and lists all the available prepbufrs. The magic then happens inside `filter_obs_prepbufr()`, which loops over each observation and checks the difference between the observation time and the analysis time. If that difference is less than half the window, the observation is written in the new prepBUFR file, as sketched below.
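+
+A minimal sketch of that logic could look like this. The subroutine name, the half-window value and the table handling are illustrative, not the actual `filter_obs_prepbufr()` code:
+
+``` fortran
+! Copy only the reports whose time offset |DHR| is within half the window.
+subroutine filter_obs_sketch(unit_in, unit_out)
+  implicit none
+  integer, intent(in) :: unit_in, unit_out
+  real(8), parameter :: half_window = 0.5d0   ! hours, illustrative
+  character(8) :: subset
+  integer :: idate, iret
+  integer :: ireadmg, ireadsb                 ! NCEPLIBS-bufr functions
+  real(8) :: dhr(1, 1)
+
+  call openbf(unit_in, 'IN', unit_in)
+  call openbf(unit_out, 'OUT', unit_in)   ! reuse the input file's bufr table
+
+  do while (ireadmg(unit_in, subset, idate) == 0)
+    do while (ireadsb(unit_in) == 0)
+      ! DHR is the observation time minus the cycle time, in hours
+      call ufbint(unit_in, dhr, 1, 1, iret, 'DHR')
+      if (abs(dhr(1, 1)) <= half_window) then
+        call openmb(unit_out, subset, idate)   ! same message type and date
+        call ufbcpy(unit_in, unit_out)         ! copy the whole report as is
+        call writsb(unit_out)
+      end if
+    end do
+  end do
+
+  call closbf(unit_in)
+  call closbf(unit_out)
+end subroutine filter_obs_sketch
+```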

![prepBUFR generation / filter obs routine flowchart](img/filter_obs.png)

## The code

-The source code is publicly available in [this repository](https://github.com/paocorrales/prepbufr_tools). Each source folder includes a `compile` file as example of how to compile the program. It needs the bufr and [w3lib](https://www.nco.ncep.noaa.gov/pmb/docs/libs/w3lib/ncep_w3lib.shtml) to work with times.
+The source code is publicly available in [this repository](https://github.com/paocorrales/prepbufr_tools). Each source folder includes a `compile` file as an example of how to compile the program. It needs the bufr library to work with bufr files and the [w3lib](https://www.nco.ncep.noaa.gov/pmb/docs/libs/w3lib/ncep_w3lib.shtml) library to work with times.

-The `run` folder includes all the configuration files, namelists and bufr tables. The can be uses to run the programs using the example observations available in the `example_obs` folder.
+The `run` folder includes all the configuration files, namelists and bufr tables. They can be used as they are to run the programs using the example observations available in the `example_obs` folder.

-Finally, if you ever need to do this for many files, the `run_bash` folder has 2 bash scripts that modify the namelist and run the programs in a loop.
+Finally, if you ever need to do this for many files, the `run_bash` folder has 2 bash scripts that modify the namelist and run the programs in a loop.

-```bash
+``` bash
prepbufr_tools
├── example_obs
│   ├── 2018111719.csv
@@ -90,5 +87,5 @@ prepbufr_tools
```

::: callout-important
-Once again, this routines are based on the INPE/CPTEC modules and programs that are part of their operational assimilation system.
-:::
\ No newline at end of file
+Once again, these routines are based on the INPE/CPTEC modules and programs that are part of their operational assimilation system. I wouldn't have been able to work with bufr files without the help and guidance of Luiz Sapucci and his team.
+:::
diff --git a/content/observations/03-rad-bufr.qmd b/content/observations/03-rad-bufr.qmd
index 8c729f9..f6b8f31 100644
--- a/content/observations/03-rad-bufr.qmd
+++ b/content/observations/03-rad-bufr.qmd
@@ -2,17 +2,17 @@ title: "Radiance obs bufr"
---

-If you've read the previous sections, you know that working with bufr files is not easy. It is possible by now that the bufr files for ABI observations (from GOES satellites) are available everywhere. But if you try to run assimilation experiments for "old" cases like me you will need to convert the observations in netCDF to the bufr format. So, this section explains how to do that.
+If you've read the previous sections, you know that working with bufr files is not easy. By now, the bufr files for ABI observations (from GOES satellites) may be available everywhere. But if, like me, you try to run assimilation experiments for "old" cases, you will need to convert the observations from netCDF to bufr format. So, this section explains how to do that.

::: callout-important
The GSI version I used (available [here](https://github.com/paocorrales/comGSIv3.7_EnKFv1.3)) to assimilate ABI observations assumes that the abi bufrs use the table I mention in this section and have the structure that the routine produces. If you get the bufr files from somewhere else, it may not work, sorry.
:::

-The routine I used is based on a a routine written by Jamie Bresch from NCAR/MMM. I made a few modifications to work with the current GSI version.
+The routine I used is based on a routine written by Jamie Bresch from NCAR/MMM. I made a few modifications to make it work with the current GSI version, but it is essentially the same.

-It read the netCDF metadata from the files listed in `flist.txt` (usually 1 per channel) and the cloud mask. Then, it calculate geometric/geographic variables like longitude, latitude, the projection, the zenith angle of the sun on that specific time, etc.
+It reads the netCDF metadata from the files listed in `flist.txt` (usually 1 file per channel, plus the cloud mask). Then, it calculates geometric/geographic variables like the longitude, the latitude, the projection, the zenith angle of the sun at that specific time, etc.

-After that it will read each variable, calculate the brightness temperature from radiance and the standard deviation of the brightness temperature at each point. It will then write the bufr file using the `bufrtab_NC021046.txt` table.
+After that it will read each variable from the files, calculate the brightness temperature from the radiance (see the sketch below) and the standard deviation of the brightness temperature at each point. It will then write the bufr file using the `bufrtab_NC021046.txt` table.
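+
+The radiance-to-brightness-temperature step uses the standard GOES-R relation with the Planck and band-correction coefficients stored in each ABI L1b netCDF file. A minimal sketch (the function name is made up; the coefficient names follow the L1b files, and missing-value handling is omitted):
+
+``` fortran
+! Convert ABI L1b radiance to brightness temperature (Kelvin).
+elemental real function rad_to_tb(rad, fk1, fk2, bc1, bc2)
+  implicit none
+  real, intent(in) :: rad        ! radiance, mW m-2 sr-1 (cm-1)-1
+  real, intent(in) :: fk1, fk2   ! planck_fk1, planck_fk2 from the nc file
+  real, intent(in) :: bc1, bc2   ! planck_bc1, planck_bc2 from the nc file
+
+  rad_to_tb = (fk2 / log(fk1 / rad + 1.0) - bc1) / bc2
+end function rad_to_tb
+```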

Like the other bufr routines, it uses a namelist:

@@ -22,7 +22,7 @@ Like the other bufr routines, it uses a namelist:
  data_dir = '.', ! path of the GRB nc files
  data_id = 'OR_ABI-L1b-RadF-M3' ! prefix of the downloaded GRB nc files
  sat_id = 'G16'
-  bufr_tbl_file = '../bufrtab_NC021046.txt'
+  bufr_tbl_file = './bufrtab_NC021046.txt'
  n_subsample = 4 ! stride for reducing the output volume
/
```

@@ -47,21 +47,21 @@ The header of a bufr message includes:

- SOLAZI: solar azimuth

-And the type of message will be NC021046.
+And the type of message will be NC021046 (ABI observations in clear sky).

### Variables in the bufr file

-The mnemonics for the important variables in the bufr are: **TMBRST** (brightness temperature in Kelvin), **SDTB** (standard deviation for brightness temperature in Kelvin), **NCLDMNT** (% of no cloud).
+The mnemonics for the important variables in the bufr are: **TMBRST** (brightness temperature in Kelvin), **SDTB** (standard deviation for brightness temperature in Kelvin) and **NCLDMNT** (% of no cloud).

It will include:

- **Brightness Temperature (TB)** for each infrared channel.
-- **Standard Deviation for Brightness Temperature (SDTB)** is a 2D field. The SDTB a each grid point is calculated from the TB field using a 3 by 3 region around the point. If the TB changes much in that region it may indicate that is a cloud border. GSI uses this variable for channel 10 to reject observations contaminated by clouds.
+- **Standard Deviation for Brightness Temperature (SDTB)** is a 2D field. The SDTB at each grid point is calculated inside the routine from the TB field using a 3-by-3 region around the point. If the TB changes a lot in that region, i.e. has a big SDTB, it may indicate a cloud border (see the sketch at the end of this section). GSI uses this variable for channel 10 to reject observations contaminated by clouds.
- **Cloud Mask** defined as the percentage of no cloud, meaning 1 = no clouds.

## The code

-The source code is publicly available in [this repository](https://github.com/paocorrales/goesbufr). The root folder includes a `compile` file as example of how to compile the program. It needs the bufr library to work and you can use the want that get compiled along with GSI.
+The source code is publicly available in [this repository](https://github.com/paocorrales/goesbufr). The root folder includes a `compile` file as an example of how to compile the program. It needs the bufr library to work, and you can use the one that gets compiled along with GSI.

The `rundir` folder includes all the configuration files, namelists and bufr tables. Once you compile the routine, you can try running an example using the example observations available in the `example_obs` folder. The `flist.txt` and `namelist.goes_nc2bufr` files are ready to work.
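+
+As promised in the SDTB bullet above, here is a sketch of how the 3-by-3 standard deviation could be computed from the TB field. It is illustrative only: edge pixels are skipped, and the actual routine may handle borders and missing values differently.
+
+``` fortran
+! Standard deviation of TB over a 3-by-3 neighbourhood (interior points only).
+subroutine compute_sdtb(nx, ny, tb, sdtb)
+  implicit none
+  integer, intent(in) :: nx, ny
+  real, intent(in)  :: tb(nx, ny)     ! brightness temperature (K)
+  real, intent(out) :: sdtb(nx, ny)   ! standard deviation of TB (K)
+  integer :: i, j
+  real :: mean, var
+
+  sdtb = 0.0
+  do j = 2, ny - 1
+    do i = 2, nx - 1
+      mean = sum(tb(i-1:i+1, j-1:j+1)) / 9.0
+      var  = sum((tb(i-1:i+1, j-1:j+1) - mean)**2) / 9.0
+      sdtb(i, j) = sqrt(var)
+    end do
+  end do
+end subroutine compute_sdtb
+```
+
+A big `sdtb` value means the TB changes a lot inside the 3-by-3 box, which is the signal GSI uses on channel 10 to flag likely cloud borders.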