General revision of content
paocorrales committed Feb 14, 2024
1 parent 8168e7a commit 3a5eeee
Showing 8 changed files with 181 additions and 189 deletions.
26 changes: 13 additions & 13 deletions content/gsi/01-gsi.qmd
@@ -6,13 +6,13 @@ bibliography: references.bib

The GSI (Gridpoint Statistical Interpolation) system is a state-of-the-art data assimilation system initially developed by the Environmental Modeling Center at NCEP. It was designed as a traditional 3DVAR system applied in the gridpoint space of models to facilitate the implementation of inhomogeneous anisotropic covariances [@wu2002; @purser2003a; @purser2003]. It is designed to run on various computational platforms, create analyses for different numerical forecast models, and remain flexible enough to handle future scientific developments, such as the use of new observation types, improved data selection, and new state variables [@kleist2009].

The- 3DVAR system replaced the NCEP regional grid-point operational analysis system by the North American Mesoscale Prediction System (NAM) in 2006 and the *Spectral Statistical Interpolation* (SSI) global analysis system used to generate *Global Forecast System* (GFS) initial conditions in 2007 [@kleist2009]. In recent years, GSI has evolved to include various data assimilation techniques for multiple operational applications, including 2DVAR [e.g., the *Real-Time Mesoscale Analysis* (RTMA) system; @pondeca2011], the hybrid EnVar technique (e.g., data assimilation systems for the GFS, the *Rapid Refresh system* (RAP), the NAM, the HWRF, etc. ), and 4DVAR [e.g., the data assimilation system for NASA's Goddard Earth Observing System, version 5 (GEOS-5); @zhu2008]. GSI also includes a hybrid 4D-EnVar approach that is currently used for GFS generation.
The 3DVAR system replaced the NCEP regional grid-point operational analysis system used by the North American Mesoscale Prediction System (NAM) in 2006 and the *Spectral Statistical Interpolation* (SSI) global analysis system used to generate *Global Forecast System* (GFS) initial conditions in 2007 [@kleist2009]. In recent years, GSI has evolved to include various data assimilation techniques for multiple operational applications, including 2DVAR [e.g., the *Real-Time Mesoscale Analysis* (RTMA) system; @pondeca2011], the hybrid EnVar technique (e.g., the data assimilation systems for the GFS, the *Rapid Refresh* system (RAP), the NAM, the HWRF, etc.), and 4DVAR [e.g., the data assimilation system for NASA's Goddard Earth Observing System, version 5 (GEOS-5); @zhu2008]. GSI also includes a hybrid 4D-EnVar approach that is currently used for GFS generation.

In addition to the development of hybrid techniques, GSI allows the use of ensemble assimilation methods. To achieve this, it uses the same observation operator as the variational methods to compare the preliminary field or background with the observations. In this way the exhaustive quality controls developed for variational methods are also applied in ensemble assimilation methods. The EnKF code was developed by the Earth System Research Lab (ESRL) of the National Oceanic and Atmospheric Administration (NOAA) in collaboration with the scientific community. It contains two different algorithms for calculating the analysis increment, the serial Ensemble Square Root Filter [EnSRF, @whitaker2002] and the LETKF [@hunt2007] contributed by Yoichiro Ota of the Japan Meteorological Agency (JMA).

To reduce the impact of spurious covariances on the increment applied to the analysis, ensemble systems apply a localization to the covariance matrix of the errors of the observations $R$ in both the horizontal and vertical directions. GSI uses a polynomial of order 5 to reduce the impact of each observation gradually until a limiting distance is reached at which the impact is zero. The vertical location scale is defined in terms of the logarithm of the pressure and the horizontal scale is usually defined in kilometers. These parameters are important in obtaining a good analysis and depend on factors such as the size of the ensemble and the resolution of the model.
To reduce the impact of spurious covariances on the increment applied to the analysis, ensemble systems apply a localization to the covariance matrix of the errors of the observations $R$ in both the horizontal and vertical directions. GSI uses a polynomial of order 5 to reduce the impact of each observation gradually until a limiting distance is reached at which the impact is zero. The vertical localization scale is defined in terms of the logarithm of the pressure and the horizontal scale is usually defined in kilometers. These parameters are important in obtaining a good analysis and depend on factors such as the size of the ensemble and the resolution of the model.
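
The tapering described above can be illustrated with a short sketch. It assumes the fifth-order piecewise polynomial is the widely used Gaspari-Cohn function, which the text does not name explicitly; the function name `gaspari_cohn` and the `cutoff` argument (the limiting distance at which the weight reaches zero) are hypothetical names chosen for this illustration and are not taken from the GSI source.

```python
def gaspari_cohn(distance, cutoff):
    """Fifth-order piecewise polynomial taper (assumed Gaspari-Cohn form).

    Returns 1.0 at distance 0 and 0.0 at or beyond `cutoff`.
    `distance` and `cutoff` must share units (e.g. km in the horizontal,
    or log-pressure units in the vertical).
    """
    # Scale so that the function support [0, 2] maps onto [0, cutoff].
    r = 2.0 * abs(distance) / cutoff
    if r <= 1.0:
        return (-0.25 * r**5 + 0.5 * r**4 + 0.625 * r**3
                - (5.0 / 3.0) * r**2 + 1.0)
    if r < 2.0:
        return (r**5 / 12.0 - 0.5 * r**4 + 0.625 * r**3
                + (5.0 / 3.0) * r**2 - 5.0 * r + 4.0 - 2.0 / (3.0 * r))
    return 0.0


# Illustrative values only: weights for a 100 km limiting distance.
for d in (0.0, 25.0, 50.0, 75.0, 100.0):
    print(d, round(gaspari_cohn(d, 100.0), 4))
```

The weight decreases smoothly with distance, so an observation close to a grid point keeps most of its impact while one near the limiting distance contributes almost nothing.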

GSI uses the Community Radiative Transfer Model [CRTM, @liu2008] as an operator for the radiance observations that calculates the brightness temperature simulated by the model in order to compare it with satellite sensor observations. GSI also implements a bias correction algorithm for the satellite radiance observations. The preliminary field estimate with the CRMT is compared with the radiance observations to obtain the innovation. This innovation is then used to calculate a bias that is applied to an updated innovation. This process can be repeated several times until the innovation and the bias correction coefficients converge.
GSI uses the Community Radiative Transfer Model [CRTM, @liu2008] as the observation operator for radiance observations: it calculates the brightness temperature simulated by the model so that it can be compared with satellite observations. GSI also implements a bias correction algorithm for the satellite radiance observations. The preliminary field estimated with CRTM is compared with the radiance observations to obtain the innovation. This innovation is then used to calculate a bias, which is applied to obtain an updated innovation. This process can be repeated several times until the innovation and the bias correction coefficients converge.

## Available observations for assimilation

@@ -83,7 +83,7 @@ Here is the list of observations that can be assimilated by GSI. In bold are the

## Running GSI

Every assimilation cycle starts with the background, a forecast generated using a numerical model (WRF-ARW for this guide), that was initialized from previous analysis and observations (in bufr format) that enters the GSI system. GSI will also need "fixed" files with information about the observations. This files define which observations are going to be assimilated, they errors and quality control options.
Every assimilation cycle starts with the background, a forecast generated using a numerical model (WRF-ARW for this guide) that was initialized from a previous analysis, and observations (in bufr format) that enter the GSI system. GSI will also need "fixed" files with information about the observations. These "fix" files define which observations are going to be assimilated, their errors and quality control options.

![Diagram of an assimilation cycle](img/cycle_diagram.png){fig-alt="Diagram that shows a cycle. First the background (forecasts generated using a numerical model, WRF-ARW, initialized from previous analysis) and Observations (from bufr files) enter the GSI system. Inside the system, the Observation Operator takes care of the quality control and bias correction and then calculates the innovation and generates the diag files. Then the ENKF part calculates the update applying the innovation to the background to generate the analysis. The analysis is used to create a new background to start the cycle again"}

@@ -99,9 +99,9 @@ GSI can also be used with the following background files:
- WRF-Chem GOCART input fields with NetCDF format
- CMAQ binary file

And the official tutorials are a good starting point to grasp the use of this options.
The [official tutorials on the DTCenter webpage](https://dtcenter.org/community-code/gridpoint-statistical-interpolation-gsi/gsi-tutorial-online) are a good starting point to grasp the use of these options.

GSI can also be run *without observations* to test the code, this is with a single synthetic observation defined in the SINGLEOB_TEST section in the namelist.
GSI can also be run *without observations* to test the code, that is, with a single synthetic observation defined in the SINGLEOB_TEST section of the gsi namelist. This is another thing worth trying at the beginning.

The fixed files are located in the `fix/` folder and include statistics files, configuration files, bias correction files, and CRTM coefficient files[^1]. The information in the configuration files is saved in the output files after running GSI.

@@ -112,16 +112,16 @@ The fixed files are located in the `fix/` folder and includes statistic files, c
| berror_stats | Background error covariance (for variational methods) | nam_nmmstat_na.gcv, nam_glb_berror.f77.gcv |
| errtable | Observation error table | prepobs_errtable.global |
| convinfo | Conventional observation information file | global_convinfo.txt |
| satinfo | satellite channel information file | global_satinfo.txt |
| pcpinfo | precipitation rate observation information file | global_pcpinfo.txt |
| ozinfo | ozone observation information file | global_ozinfo.txt |
| satbias_angle | satellite scan angle dependent bias correction file | global_satangbias.txt |
| satbias_in | satellite mass bias correction coefficient file | sample.satbias |
| satbias_in | combined satellite angle dependent and mass bias correction coefficient file | gdas1.t00z.abias.new |
| satinfo | Satellite channel information file | global_satinfo.txt |
| pcpinfo | Precipitation rate observation information file | global_pcpinfo.txt |
| ozinfo | Ozone observation information file | global_ozinfo.txt |
| satbias_angle | Satellite scan angle dependent bias correction file | global_satangbias.txt |
| satbias_in | Satellite mass bias correction coefficient file | sample.satbias |
| satbias_in | Combined satellite angle dependent and mass bias correction coefficient file | gdas1.t00z.abias.new |

## About the GSI code

GSI is writen in fortran and the code is separated in more than 5 hundred files. While GSI has 2 good user guides, not everything is documented and sometimes you will need to read the code.
GSI is written in Fortran and the code is split across more than five hundred files. While GSI has [2 good user guides](https://github.com/paocorrales/comGSIv3.7_EnKFv1.3/tree/main/docs), not everything is documented and sometimes you will need to read the code.

To swim around the code I found the `grep -r "key word"` command very useful. Each file, and each subroutine inside it, has a header with information about what it does, its changes, and its input and output arguments. It's worth mentioning a few key files:

22 changes: 11 additions & 11 deletions content/gsi/02-convencionals.qmd
@@ -6,11 +6,11 @@ Conventional observations are assimilated from PREPBUFR files. NCEP ADP Global U

While the PREPBUFR includes wind derived from satellite observations, GSI ignores these observations and uses the ones provided by the specific bufr file `gdas.t00z.satwnd.tm00.bufr_d`.

PREPBUFR files usually contains observations from a 6 to 12 h window and can be modify using FORTRAN routines provided with the GSI code (see `util/bufr_tools` in the GSI source code folder). You can also create your own bufr file or add new observation to an existing bufr file (see [Working with bufr files](../observations/01-bufr.qmd)).
PREPBUFR files usually contain observations from a 6 to 12 h window and can be modified using FORTRAN routines provided with the GSI code (see `util/bufr_tools` in the [GSI source code folder](https://github.com/paocorrales/comGSIv3.7_EnKFv1.3)). You can also create your own bufr file or add new observations to an existing bufr file (see [Working with bufr files](../observations/01-bufr.qmd)).

### Controlling which observations are assimilated

The assimilation of conventional observations is controlled with the `convinfo` file. Let's check the `global_convinfo.txt` file we get as an example:
The assimilation of conventional observations is controlled with the `convinfo` file. Let's check the `global_convinfo.txt` file we get as an example in the `fix` folder:

``` bash
! otype = observation type (a7, t, uv, q, etc.)
@@ -41,14 +41,14 @@ The assimilation of conventional observations is controlled with the `convinfo`
The head of the file explains the content of each column but there are a few more things to add:
- type: this is defined by the bufr tables, particular [Table 2](https://www.emc.ncep.noaa.gov/mmb/data_processing/prepbufr.doc/table_2.htm). It is worth checking this table as includes information about which observations are assimilated in GFS, errors asociated to specific instruments and other details.
- type: this is defined by the bufr tables, in particular [Table 2](https://www.emc.ncep.noaa.gov/mmb/data_processing/prepbufr.doc/table_2.htm). It is worth checking this table as it includes information about which observations are assimilated in GFS, errors associated with specific instruments and other details.
- twindow: while the assimilation window is defined in the gsi namelist, it is possible to control an assimilation window for specific observations. This is useful if, for example, the assimilation window is 3 h and you want to assimilate temperature in a 1 h window.
In general you only change the *isue* column to assimilate or not a type of observation and maybe just maybe the *gross*, *ermax*, and *ermin* parameters if you want to modify the quality control of the observations.
In general you only change the *iuse* column to assimilate or not a type of observation and maybe, just maybe, the *gross*, *ermax*, and *ermin* parameters if you want to modify the quality control of the observations.
### Observation errors and quality control
For **regional assimilation** GSI uses an error table located in the `errtable` file that you'll find in the `./fix` folder under the name `prepobs_errtable.global` (it is confusing that the table for regional errors in in a file called global). Here is a small example of the content of the file for observations from surface stations.
For **regional assimilation** GSI uses an error table located in the `errtable` file that you'll find in the `./fix` folder under the name `prepobs_errtable.global` (yes, it is confusing that the table for regional errors is in a file called global). Here is a small example of the content of the file for observations from surface stations.
``` bash
181 OBSERVATION TYPE
@@ -76,11 +76,11 @@ GSI will perform a quality control for each observation. In general terms this i
For the gross check, GSI first calculates a ratio:
$$ ratio = (obs - bk)/max(ermin, min(ermax, obserror)) $$ The main error parameters are controlled by the `convinfo` file. The `obserror` is the observation error defined in the prepbufr file for each observation as a result to the quality control perform while generating that file.
$$ ratio = \frac{obs - bk}{\max(ermin, \min(ermax, obserror))} $$
The main error parameters are controlled by the `convinfo` file. The `obserror` is the observation error defined in the prepbufr file for each observation plus information in the `prepobs_errtable`.
If $ratio > gross$ the observation is rejected.
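
As a rough illustration of the gross check, the sketch below applies the ratio formula in Python. The function name `gross_check` and all numeric values are hypothetical and do not come from a real `convinfo` file. One assumption is made beyond the formula as written: the absolute value of the departure `obs - bk` is used, so that large negative departures are also rejected. The actual Fortran code should be checked before relying on this.

```python
def gross_check(obs, bk, obserror, ermin, ermax, gross):
    """Return (ratio, rejected) for a single observation.

    obs      : observed value
    bk       : background value at the observation location
    obserror : observation error assigned to this observation
    ermin, ermax, gross : parameters from the convinfo file
    """
    # Clamp the observation error to the [ermin, ermax] range.
    error = max(ermin, min(ermax, obserror))
    # Assumption: use the magnitude of the departure (not in the formula above).
    ratio = abs(obs - bk) / error
    return ratio, ratio > gross


# Illustrative values only (temperatures in K).
print(gross_check(obs=285.0, bk=280.0, obserror=1.0, ermin=0.8, ermax=5.6, gross=7.0))
print(gross_check(obs=290.0, bk=280.0, obserror=1.0, ermin=0.8, ermax=5.6, gross=7.0))
```

With these made-up numbers the first observation passes (the ratio is 5) and the second is rejected (the ratio is 10, which exceeds `gross`).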
Other piece of information used during the quality control is the quality control flag that is included in the prepbufr file a part if it quality control process. The possible values for conventional observations are:
Another piece of information used during the quality control is the quality control flag that is included in the prepbufr file as part of the quality control process performed by NCEP. The possible values for conventional observations are:
| qc flag | meaning |
|:------------------------------:|:---------------------------------------|
@@ -90,10 +90,10 @@ Other piece of information used during the quality control is the quality contro
You can find more details about the quality control flags in [Table 7](https://www.emc.ncep.noaa.gov/mmb/data_processing/prepbufr.doc/table_7.htm).
GSI can also perform a thinning for conventional observations. You can activate that option for each type of observation changing `ithin = 1` in the `convinfo` file. There are other important columns, `rmesh`, `pmesh`, in the `convinfo` file to configure conventional data thinning:
GSI can also perform a thinning for conventional observations. You can activate that option for each type of observation by setting `ithin = 1` in the `convinfo` file. Two other columns in the `convinfo` file, `rmesh` and `pmesh`, are important to configure conventional data thinning:
* `ithin`: 0 = no thinning; 1 = thinning with grid mesh decided by `rmesh` and `pmesh`
* `rmesh`: horizontal thinning grid size in km
* `pmesh`: vertical thinning grid size in mb; if 0, then use background vertical grid
- `ithin`: 0 = no thinning; 1 = thinning with grid mesh decided by `rmesh` and `pmesh`
- `rmesh`: horizontal thinning grid size in km
- `pmesh`: vertical thinning grid size in mb; if 0, then use background vertical grid
**For each observation GSI will check different things and change the observation error accordingly.** The final observation error is recorded in the `diag` file and then used during the assimilation.