BCtable is an ASCII file that customizes the way preprocessor.ESGF processes variables. For each variable, it selects a particular version, realm and frequency. Additionally, the variables can be filtered (see Filters below).


The file starts with a set of keyword-value pairs used to set some mandatory metadata (model, experiment) along with any other extra keywords. A blank line separates the keyword-value header from the variable table itself. Variable processing lines start after a line beginning with at least three dashes (---). This is a sample BCtable file:

data_path /oceano/gmeteo/DATA/CMIP5/output1/IPSL/IPSL
project cmip5
institute IPSL
model IPSL-CM5A-MR
experiment rcp85
product output1
remove_leap false
interval 3hour
era_interim_path /oceano/gmeteo/DATA/ECMWF/INTERIM/Analysis

abbr    grib ltype version   ensemble freq realm  table   filter
------- ---- ----- --------- -------- ---- ------ ------- ----------------- 
sftlf    172  1    v20111119  r0i0p0  fx   atmos  fx      only_ic|is_land_mask|percent2one|set_start_time|rename(sftlf)
orog     129  1    v20111119  r0i0p0  fx   atmos  fx      only_ic|set_start_time|rename(orog)
ta       11   109  v20111119  r1i1p1  6hr  atmos  6hrLev  set_hybrid_levels
ua       33   109  v20111119  r1i1p1  6hr  atmos  6hrLev  set_hybrid_levels
va       34   109  v20111119  r1i1p1  6hr  atmos  6hrLev  set_hybrid_levels
hus      52   109  v20111119  r1i1p1  6hr  atmos  6hrLev  set_hybrid_levels
ps       1    1    v20111119  r1i1p1  6hr  atmos  6hrLev  
psl      2    1    v20111119  r1i1p1  6hr  atmos  6hrPlev 
sic      31   1    v20111119  r1i1p1  day  seaIce day     remapnn|sea_masked|tinterp2interval
tos      37   1    v20111119  r1i1p1  day  ocean  day     remapnn|sea_masked|tinterp2interval
soil139  139  112  v20120430  r1i1p1  mon  land   Lmon    BEGIN|use_era_interim|maskregion|fixed2interval|set_extension(grb)|unleap|END
soil170  170  112  v20120430  r1i1p1  mon  land   Lmon    BEGIN|use_era_interim|maskregion|fixed2interval|set_extension(grb)|unleap|END
soil183  183  112  v20120430  r1i1p1  mon  land   Lmon    BEGIN|use_era_interim|maskregion|fixed2interval|set_extension(grb)|unleap|END
soil236  236  112  v20120430  r1i1p1  mon  land   Lmon    BEGIN|use_era_interim|maskregion|fixed2interval|set_extension(grb)|unleap|END
soil39   39   112  v20120430  r1i1p1  mon  land   Lmon    BEGIN|use_era_interim|maskregion|fixed2interval|set_extension(grb)|unleap|END
soil40   40   112  v20120430  r1i1p1  mon  land   Lmon    BEGIN|use_era_interim|maskregion|fixed2interval|set_extension(grb)|unleap|END
soil41   41   112  v20120430  r1i1p1  mon  land   Lmon    BEGIN|use_era_interim|maskregion|fixed2interval|set_extension(grb)|unleap|END
soil42   42   112  v20120430  r1i1p1  mon  land   Lmon    BEGIN|use_era_interim|maskregion|fixed2interval|set_extension(grb)|unleap|END

Lines can be commented out by using a # symbol.
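
The layout described above (keyword-value header, blank line, column names, dashed separator, variable lines) can be read with a few lines of Python. This is only a sketch, not the parser used by preprocessor.ESGF: the function name parse_bctable is hypothetical, and it assumes that comments are whole lines starting with # and that the filter column contains no spaces.

```python
def parse_bctable(text):
    """Split BCtable text into a header dict and a list of variable rows."""
    header, rows, columns = {}, [], []
    state = "header"
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("#"):
            continue  # commented-out line (assumption: whole-line comments only)
        if state == "header":
            if not line:
                if header:
                    state = "columns"  # blank line ends the keyword-value header
                continue
            parts = line.split(None, 1)
            header[parts[0]] = parts[1] if len(parts) > 1 else ""
        elif state == "columns":
            if line.startswith("---"):
                state = "rows"  # variable lines start after the dashed line
            elif line:
                columns = line.split()  # column names (abbr, grib, ...)
        elif line:
            # Columns missing at the end of a line (e.g. no filter) stay empty.
            row = dict.fromkeys(columns, "")
            row.update(zip(columns, line.split()))
            rows.append(row)
    return header, rows


sample = """\
model IPSL-CM5A-MR
experiment rcp85
interval 3hour

abbr  grib ltype version   ensemble freq realm table  filter
----- ---- ----- --------- -------- ---- ----- ------ ------
# ua  33   109   v20111119 r1i1p1   6hr  atmos 6hrLev set_hybrid_levels
ta    11   109   v20111119 r1i1p1   6hr  atmos 6hrLev set_hybrid_levels
ps    1    1     v20111119 r1i1p1   6hr  atmos 6hrLev
"""
header, rows = parse_bctable(sample)
print(header["model"], [row["abbr"] for row in rows])
```

Here the commented-out ua line is skipped, and the ps line, which has no filters, gets an empty filter field.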

data_path is the path to a directory containing model data for a particular GCM. It should point to the directory where the <experiment>/<frequency>/<realm>/... structure lives.

era_interim_path is the path to a directory containing ERA-Interim soil data.

remove_leap controls the removal of 29 February: if "true", that date is removed whenever it exists in the data; if "false", it is kept.

interval refers to the frequency of the output files after performing temporal interpolation (tinterp2interval, fixed2interval).

The grib and ltype columns in the variable section should match those in the WRF Vtable used.
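
To illustrate how an interval value such as 3hour drives the temporal interpolation, here is a minimal sketch. It is not the code used by preprocessor.ESGF: the helper names parse_interval and interp_to_interval are hypothetical, only values of the form <N>hour are handled, and linear interpolation is an assumption, since the method actually applied by the filters is not stated here.

```python
import re
from datetime import datetime, timedelta


def parse_interval(value):
    """Turn a header value such as '3hour' into a timedelta (hours only)."""
    match = re.fullmatch(r"(\d+)hour", value)
    if match is None:
        raise ValueError(f"unsupported interval: {value!r}")
    return timedelta(hours=int(match.group(1)))


def interp_to_interval(times, values, step):
    """Linearly interpolate (times, values) onto a regular grid of spacing step.

    Assumes at least two input records, sorted by time.
    """
    out, i, t = [], 0, times[0]
    while t <= times[-1]:
        # Advance to the input segment [times[i], times[i + 1]] containing t.
        while i < len(times) - 2 and times[i + 1] < t:
            i += 1
        t0, t1 = times[i], times[i + 1]
        w = (t - t0) / (t1 - t0)  # timedelta / timedelta gives a float
        out.append((t, values[i] + w * (values[i + 1] - values[i])))
        t += step
    return out


# Three 6-hourly records interpolated to the 3-hourly interval of the sample header.
times = [datetime(2006, 1, 1, h) for h in (0, 6, 12)]
values = [280.0, 286.0, 283.0]
for t, v in interp_to_interval(times, values, parse_interval("3hour")):
    print(t.isoformat(), v)
```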


Filters are simple processing units that can be combined in a pipe to accomplish the required transformations for a given variable. There might be additional filters (beyond those shown in the BCtable) defined in the prefilter and postfilter variables in preprocessor.ESGF. These apply to all variables. Per-variable filters defined in the BCtable are executed in between the common pre- and post-filters. To avoid the use of the common pre- and/or post-filters for a particular variable, the special tags BEGIN and/or END, respectively, can be used.
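
The ordering rules above can be sketched as follows. This is only an illustration, not the code in preprocessor.ESGF: the function name build_pipeline is hypothetical, the values passed as prefilter and postfilter in the example are assumptions (the real defaults are defined in preprocessor.ESGF), and BEGIN and END are assumed to appear only as the first and last items of the filter column.

```python
def build_pipeline(table_filters, prefilter, postfilter):
    """Return the ordered list of filter names to run for one variable."""
    steps = [s for s in table_filters.split("|") if s]
    use_pre = use_post = True
    if steps and steps[0] == "BEGIN":
        use_pre = False  # BEGIN: skip the common pre-filters
        steps = steps[1:]
    if steps and steps[-1] == "END":
        use_post = False  # END: skip the common post-filters
        steps = steps[:-1]
    pipeline = []
    if use_pre:
        pipeline += [s for s in prefilter.split("|") if s]
    pipeline += steps  # per-variable filters run in between
    if use_post:
        pipeline += [s for s in postfilter.split("|") if s]
    return pipeline


# Illustrative values only; the real ones live in preprocessor.ESGF.
PRE = "time_slice"
POST = "convert2grb|set_grb_code|set_grb_ltype"

print(build_pipeline("remapnn|sea_masked|tinterp2interval", PRE, POST))
print(build_pipeline("BEGIN|use_era_interim|maskregion|END", PRE, POST))
```

With these example values, the first call wraps the three per-variable filters between time_slice and the post-filters, while the second returns only use_era_interim and maskregion.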

There are three special filters that can be used at the beginning of a pipe. They create a stream of data, by default from the files stored following the ESGF DRS:

only_closest_to_sdate : Take only the time record closest to the start date of the period requested. This is intended for constant fields or those used only as initial conditions (e.g. soil data).

time_slice : Takes the requested time period from a variable and makes it available for further filtering. This is the default starting filter in preprocessor.ESGF. To override it in the BCtable, use the BEGIN keyword followed by another start filter (e.g. BEGIN|use_era_interim).

use_era_interim : Use ERA-Interim data to replace a missing soil variable (moisture and/or temperature). These variables are only required as initial conditions and are therefore only needed for the model spin-up. They are not used along the boundaries. The path to the ERA-Interim data MUST be set in the header (era_interim_path, see example above).
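
The selection made by only_closest_to_sdate can be pictured as a nearest-neighbour search along the time axis. The sketch below is an assumption about the behaviour described above, not the filter's actual implementation; the function name closest_record and the example dates are hypothetical.

```python
from datetime import datetime


def closest_record(times, sdate):
    """Return the index of the time record nearest to sdate.

    On a tie, the earliest of the equally close records is returned.
    """
    return min(range(len(times)), key=lambda i: abs(times[i] - sdate))


# Monthly records stamped mid-month, and a simulation starting on 1 January.
times = [datetime(2005, 12, 16), datetime(2006, 1, 16), datetime(2006, 2, 15)]
index = closest_record(times, datetime(2006, 1, 1))
print(index, times[index].date())
```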

Most filters take an input stream, process it, and send out the result as an output stream to be used by the next filter. This is a list of available filters:

celsius2K : Unit conversion from Celsius degrees to Kelvin units.

convert2grb : Converts the stream to GRIB format.

tinterp2interval : Interpolation in time to N-hourly frequency (e.g. from daily files, or from 6-hourly to 3-hourly frequency). This filter is controlled by the "interval" keyword given in the header.

fixed2interval : Interpolation in time of fixed/slow-varying variables to N-hourly frequency. Recommended only for slow-varying fields (SST, sea ice, soil moisture...). This filter is controlled by the "interval" keyword given in the header.

is_land_mask : Flags the current variable as the land mask, so that it is the one used for land masking (e.g. by sea_masked).

maskregion : Masks (in practice, crops, due to missing value compression) a given region. The default in preprocessor.ESGF is the EURO-CORDEX domain.

only_ic : Tags a variable as needed only for initial conditions. This makes the processor skip the files for other years.

percent2one : Unit conversion from percentage (0 to 100) to values between 0 and 1. Common for fractional land masks or sea ice cover.

remapnn : Regridding to the land mask grid by nearest neighbour interpolation.

rename(name) : Renames a variable. Useful to avoid overwriting files when the same variable is processed twice. E.g. skin temperature might contain SST values over the sea.

ringregion : Masks only the boundaries of a given region, further reducing the file size by removing data not only outside the domain, but also in the interior of the domain. This option is not usable by WRF when sst_update is active and/or a rotated lat-lon projection is in use. In practice, this is unusable for regional climate simulations with WRF, but it could be usable by other models.

sea_masked : The field is defined only over the sea. Mask out all land points according to the variable flagged as is_land_mask.

set_extension(ext) : Sets (or changes) the file extension of the output data to ext.

set_grb_code : For grib streams (i.e. after the convert2grb filter), sets the grib variable code indicated in the table.

set_grb_ltype : For grib streams (i.e. after the convert2grb filter), sets the grib level type indicated in the table.

set_hybrid_levels : For grib streams (i.e. after the convert2grb filter), converts the hybrid vertical levels coordinate to GRIB conventions.

set_start_time : Sets the date and time of the stream to the start date and time of the selected period. Usually applied to constant fields (orog, sftlf) or to initial conditions (soil variables).

shift_time(shift) : Shifts the time axis by shift. This can be useful when a variable is defined over a time period and the time coordinate does not match the others. An example could be SLP averaged e.g. between 00:00 and 06:00 and stored as time 03:00. If other variables are stored at 00:00, 06:00 and so on, a time shift such as shift_time(-3hour) can be applied. This is not an optimal solution since averaged variables should NOT be used as input for an RCM, which expects instantaneous fields.

unleap : Removes 29 February from the data if "remove_leap" in the header is set to "true" and that date exists in the processed file. The filter leaves the data untouched if the calendar in the files is "360_day" (i.e. every month has 30 days, as is typical of the GCMs from the Hadley Centre, whose model names start with HadGEM).

split_soil_mois_grb : This is an ad-hoc filter that splits the soil moisture information into one file per level. GRIB output is assumed.

split_soil_temp_grb : This is an ad-hoc filter that splits the soil temperature information into one file per level. GRIB output is assumed.
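
As an illustration of how a filter such as unleap combines a header flag with the calendar of the files, here is a minimal sketch. It is an assumption about the logic described in the list above, not the filter's real implementation: the function name and signature are hypothetical, and dates are plain (year, month, day) tuples because a 360_day calendar contains dates such as 30 February that datetime cannot represent.

```python
def unleap(dates, remove_leap, calendar="standard"):
    """Drop 29 February from a list of (year, month, day) tuples.

    remove_leap mirrors the header flag; a "360_day" calendar is left untouched.
    """
    if not remove_leap or calendar == "360_day":
        return list(dates)
    return [d for d in dates if (d[1], d[2]) != (2, 29)]


dates = [(2008, 2, 28), (2008, 2, 29), (2008, 3, 1)]
header = {"remove_leap": "true"}  # hypothetical parsed header
flag = header["remove_leap"].lower() == "true"

print(unleap(dates, flag))
print(unleap(dates, flag, calendar="360_day"))
```

The first call drops (2008, 2, 29); the second returns the list unchanged because of the 360_day calendar.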