Forward Operators
In data assimilation, the process of computing an observation's expected value from a model state is called the forward operator.
In DART the forward operator code is kept separate from any model-specific details. This enables the same forward operator to run with any model of the same type (atmosphere, ocean, etc.). It is accomplished by restricting the forward operator code to interpolating state fields only through calls to the interpolate() function. The forward operator can make no other assumptions about model-specific details, for example the grid size or spacing.
A simple forward operator computes the expected value by calling the model_mod.f90 interpolate() function on a field contained in the model state and returning the value. The base DART code does this without requiring the user to add additional code.
Generally a model’s interpolate function should be written so it only does spatial interpolation on fields in the model state. Derived values and more complex computations should be placed in forward operator code.
A more complex forward operator either modifies the state value returned by interpolate, or it calls interpolate for multiple state fields and calculates an expected value from those. The convention is to make a separate Fortran module for each forward operator and place the code in an observations/forward_operator/obs_def_xxx_mod.f90 file, where xxx is specific to the type of forward operator.
Forward operator code is called by other DART code, so the subroutine calling sequences are fixed. See the documentation for MODULE obs_def_mod for more details.
The routine that must return an ensemble-sized array of expected values is usually called get_expected_xxx. It is passed a handle to access the state values and the observation location. It must return an array of expected values, one for each ensemble member, for that observation type at that location.
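As an illustration, a skeleton of such a routine is sketched below. The module and quantity names (get_expected_total_water, QTY_SPECIFIC_HUMIDITY) and the exact argument lists are assumptions made for this sketch; an existing observations/forward_operator/obs_def_xxx_mod.f90 file is the authoritative reference for the real conventions.

subroutine get_expected_total_water(state_handle, ens_size, location, expected_obs, istatus)

   use              types_mod, only : r8
   use           location_mod, only : location_type
   use   ensemble_manager_mod, only : ensemble_type
   use        assim_model_mod, only : interpolate
   use           obs_kind_mod, only : QTY_SPECIFIC_HUMIDITY

   type(ensemble_type), intent(in)  :: state_handle            ! handle to the distributed state
   integer,             intent(in)  :: ens_size
   type(location_type), intent(in)  :: location                ! observation location
   real(r8),            intent(out) :: expected_obs(ens_size)  ! one value per ensemble member
   integer,             intent(out) :: istatus(ens_size)       ! 0 = success

   real(r8) :: qv(ens_size)

   ! Only spatial interpolation of a state field is requested here;
   ! no model-specific grid information is used.
   call interpolate(state_handle, ens_size, location, QTY_SPECIFIC_HUMIDITY, qv, istatus)
   if (any(istatus /= 0)) return

   ! A real operator would combine one or more interpolated fields into the
   ! observed quantity; the identity is used here only as a placeholder.
   expected_obs = qv

end subroutine get_expected_total_water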
Approaches for Common Situations
Users who are familiar with data assimilation systems which are tightly integrated with a particular model often assume that it is either impossible or too inefficient to separate the model details from the forward operator code. In our experience with running DART across a wide range of models, this is not the case.
Here are approaches successfully used for some common situations.
Summing or Integrating a Column Value
Some instruments return an observation of a vertically integrated value, e.g. Total Precipitable Water. If this field is not already in the model state the forward operator needs to sample the model in a vertical column and sum the moisture values to return the total water in the column.
There are two common approaches to implementing code of this type.
Use a synthetic set of pressure or height vertical values at the horizontal location. The forward operator code uses the horizontal components of the requested location and constructs a set of synthetic locations with fixed steps in pressure or height in the vertical, calls interpolate for each vertical location, and sums the results.
Use a synthetic set of model levels in the vertical at the horizontal location. The forward operator code uses the horizontal components of the requested location and constructs a set of synthetic locations with model level numbers in the vertical, calls interpolate for each vertical location, and sums the results. This approach has the advantage of minimizing the interpolation needed for the model-specific code to compute the vertical value.
In both cases the code must expect errors when the vertical coordinate moves outside of the model grid. Often the code will verify the horizontal location is within the grid, then loop over vertical levels to count the expected number of successful locations. It can then allocate arrays of that size and call interpolate for the actual state vector fields required for this observation type.
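For example, the first approach might look like the following sketch. The number of levels, the pressure values, the quantity name, and the layer weighting are all illustrative assumptions; a production operator would also count the usable levels before allocating work arrays, as described above.

subroutine get_expected_total_precipitable_water(state_handle, ens_size, location, tpw, istatus)

   use            types_mod, only : r8
   use         location_mod, only : location_type, get_location, set_location, VERTISPRESSURE
   use ensemble_manager_mod, only : ensemble_type
   use      assim_model_mod, only : interpolate
   use         obs_kind_mod, only : QTY_SPECIFIC_HUMIDITY

   type(ensemble_type), intent(in)  :: state_handle
   integer,             intent(in)  :: ens_size
   type(location_type), intent(in)  :: location
   real(r8),            intent(out) :: tpw(ens_size)
   integer,             intent(out) :: istatus(ens_size)

   integer,  parameter :: nlevels = 20
   real(r8), parameter :: dp = 5000.0_r8           ! synthetic pressure spacing (Pa)
   real(r8)            :: loc_array(3), pressure, qv(ens_size)
   type(location_type) :: probe
   integer             :: k, this_istatus(ens_size)

   ! Keep the horizontal part of the observation location; the vertical is
   ! replaced by a series of synthetic pressure values.
   loc_array = get_location(location)

   tpw     = 0.0_r8
   istatus = 0

   do k = 1, nlevels
      pressure = 100000.0_r8 - (k - 1) * dp
      probe = set_location(loc_array(1), loc_array(2), pressure, VERTISPRESSURE)

      call interpolate(state_handle, ens_size, probe, QTY_SPECIFIC_HUMIDITY, qv, this_istatus)

      ! Levels outside the model grid are expected to fail; a real operator
      ! would track which members succeeded rather than simply skipping the level.
      if (any(this_istatus /= 0)) cycle

      ! Simple column sum; a real operator multiplies by the layer mass (dp/g).
      tpw = tpw + qv * dp
   end do

end subroutine get_expected_total_precipitable_water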
Optional Precomputed Fields
If a complex computation is already being performed by the model for an observed type, those values can be added to the state vector and the forward operator code can interpolate values from that field directly. However, if this forward operator is used when that field is not present, it should fall back to a computation using other fields present in the model state.
See the radar forward operator for an example in the 'expected fall velocity' section. If this has already been computed by the model and is available in the model state, the code interpolates in this field and returns. If not, it interpolates values for other constituent fields and does a generic computation.
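In outline, the logic inside such a get_expected_xxx routine is roughly the following. The quantity names and the fall-speed formula are placeholders for this sketch; obs_def_radar_mod.f90 contains the actual code.

   real(r8) :: fall_speed(ens_size), qr(ens_size), rho(ens_size)
   integer  :: istatus(ens_size)

   ! Try the precomputed field first.
   call interpolate(state_handle, ens_size, location, QTY_POWER_WEIGHTED_FALL_SPEED, &
                    fall_speed, istatus)

   if (any(istatus /= 0)) then
      ! Field not present in the state: fall back to a generic computation
      ! from constituent fields that are present.
      call interpolate(state_handle, ens_size, location, QTY_RAINWATER_MIXING_RATIO, qr, istatus)
      if (any(istatus /= 0)) return
      call interpolate(state_handle, ens_size, location, QTY_DENSITY, rho, istatus)
      if (any(istatus /= 0)) return

      ! Illustrative power-law form only; not the coefficients DART uses.
      fall_speed = 5.4_r8 * (rho * qr)**0.125_r8
   end if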
Parallelism Implementation Details
The forward operator code is called for each observation to be assimilated. DART does this in parallel with each MPI task computing the values for all ensemble members for a subset of the observations. The state data for the ensembles is distributed across all tasks, so MPI one-sided communication is used for tasks to retrieve needed data values from other tasks.
See sections below for more details on other parallelism options and how this differs from previous versions of DART.
The forward operator is performed in get_obs_ens_distrb_state. A limited call tree for get_obs_ens_distrb_state is shown below. The QC_LOOP is in get_obs_ens_distrb_state because the QC across the ensemble is known.
The window opening and closing in get_obs_ens_distrb_state is as follows:
State window created (tasks can access other tasks' memory)
Forward operator called
State window destroyed (tasks can no longer access other tasks' memory)
QC calculated for the ensemble
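The window mechanism can be illustrated with a small, generic MPI one-sided example. This is not DART's implementation; it only shows what "window created" and "window destroyed" mean in terms of remote memory access.

program one_sided_sketch

   ! Generic illustration of the one-sided pattern, not DART code: each task
   ! exposes its share of the state in an MPI window, and while the window is
   ! open any task can read remote values with MPI_Get.

   use mpi
   implicit none

   integer, parameter :: my_num_vars = 1000
   double precision   :: my_state(my_num_vars), remote_val(1)
   integer            :: win, ierr, rank, ntasks, target_task
   integer(kind=MPI_ADDRESS_KIND) :: winsize, disp

   call MPI_Init(ierr)
   call MPI_Comm_rank(MPI_COMM_WORLD, rank, ierr)
   call MPI_Comm_size(MPI_COMM_WORLD, ntasks, ierr)

   my_state = rank                     ! stand-in for this task's part of the state

   ! "State window created": expose my_state for remote access.
   winsize = int(my_num_vars, MPI_ADDRESS_KIND) * 8
   call MPI_Win_create(my_state, winsize, 8, MPI_INFO_NULL, MPI_COMM_WORLD, win, ierr)

   ! "Forward operator called": fetch a value owned by another task.
   target_task = mod(rank + 1, ntasks)
   disp = 0
   call MPI_Win_lock(MPI_LOCK_SHARED, target_task, 0, win, ierr)
   call MPI_Get(remote_val, 1, MPI_DOUBLE_PRECISION, target_task, disp, 1, &
                MPI_DOUBLE_PRECISION, win, ierr)
   call MPI_Win_unlock(target_task, win, ierr)

   ! "State window destroyed": remote memory can no longer be accessed.
   call MPI_Win_free(win, ierr)

   call MPI_Finalize(ierr)

end program one_sided_sketch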
However, there may be occasions when having only the first ens_size tasks perform the forward operator is desired: for example, if the forward operator is being read from a file, or if the forward operator uses a large portion of the state. When debugging it may also be easier to have one task per ensemble member.
To transpose and do the forward operators without using MPI one-sided communication, you can use the filter_nml namelist option distribute_state = .false.
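For example, the relevant fragment of input.nml would look like the following (all other filter_nml entries are omitted here and should be kept as your experiment requires):

&filter_nml
   distribute_state = .false.
/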
The process is the same as above except the window creation and destruction steps transpose the state.
State window created (state ensemble is transposed to var complete)
Forward operator called
State window destroyed (state ensemble is transposed to copy complete)
QC calculated for the ensemble
Note that if you have fewer tasks than ensemble members, some tasks will still be doing vectorized forward operators (because they own more than one ensemble member).
State Access
Model_mod routines no longer get an array containing the state. The state is accessed through the function get_state:
x = get_state(i, state_handle)
where x is the state at index i and state_handle is passed from above. During model_interpolate, get_state returns an ensemble-sized array. In routines that operate on the ensemble mean, get_state returns a single value (the mean state).
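As a sketch of how this looks inside a model_mod interpolation routine (the index variable and its value are illustrative; i8 and r8 are the usual DART integer and real kinds from types_mod):

   integer(i8) :: indx
   real(r8)    :: corner_vals(ens_size)

   ! 'indx' would come from the model's own grid lookup; a fixed value is
   ! used here only for illustration.
   indx = 12345_i8

   ! During model_interpolate this returns one value per ensemble member.
   corner_vals = get_state(indx, state_handle)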
Other Parallelism Options
In releases of DART before Manhattan (Lanai and earlier), transposes of the ensemble of states were used. If all data values for a single ensemble were needed, an all-to-some transpose was used to collect the state vector data onto the first N tasks, where N is the ensemble size. If all ensemble values for a single item were needed, a some-to-all transpose was used to spread a subset of the state vector data to all tasks, where each task contained all ensemble member values for the items it held. This is transpose mode, where the full state vector must fit into a single task's memory.
The current Manhattan release runs in distributed mode, where state data is read into memory and distributed across all tasks. Each task stores all ensemble member values for a subset of the items in the state vector. The entire filter run keeps this data distribution, and the data is only collected back into individual ensembles at output time. Collection and I/O can be done on a field-by-field basis, so the entire state vector is never present at one time on a task.
In the Manhattan release it is still possible to run in transpose mode, which avoids using MPI one-sided communication. It is generally only feasible for models with smaller state vector sizes since the first N tasks will need to store an entire state vector in memory. However, in most cases run time is still faster for distributed mode even for smaller models, because computing forward operators for an entire ensemble at once amortizes the overhead parts of the computation across the N values.
Transpose mode collects the entire state vector for an ensemble onto the first N tasks. The forward operator code has access to the entire state vector and can reference any item with a simple array access. Each of the first N tasks computes the expected value for all observations for that single ensemble member. Tasks N+1 and higher are idle during this part of the execution.
In distributed mode each task computes the expected values for a subset of the observations. The observations are equally distributed round-robin. For K observations and T tasks each task gets K/T obs. MPI one-sided communication is used to retrieve needed state values from other tasks.
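Schematically, the round-robin assignment amounts to a strided loop over the observations. The sketch below is an illustration, not the actual filter code; my_task_id() and task_count() are DART utility routines used here only to show the idea.

   integer :: iobs, t, ntasks

   t      = my_task_id()     ! 0-based task number
   ntasks = task_count()

   ! Task t handles observations t+1, t+1+ntasks, t+1+2*ntasks, ...
   do iobs = t + 1, num_obs, ntasks
      ! compute the expected value for all ensemble members for observation
      ! iobs, using one-sided communication to fetch remote state values
   end do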