How DART stores observations: observation sequence (obs_seq) files

Since DART is designed to assimilate observations from any data source, it includes a set of programs to convert observations from their original format to DART’s own observation sequence, or obs_seq, format. The obs_seq format is designed to allow DART to accommodate a myriad of source observation file formats, structures and metadata. Many original source observation files don’t contain the necessary information about the error characteristics and spatial structure of the data needed to perform an assimilation.

There are three types of obs_seq files.

obs_seq.in

An obs_seq.in file actually contains no observation quantities. It may be best thought of as a perfectly laid-out notebook waiting for an observer to fill in the actual observation quantities.

All the rows and columns are ready, labelled, and repeated for every observation time and platform. The obs_seq.in file is generally the start of a “perfect model” experiment.

In a perfect model experiment, one instance of the model is run through the DART program perfect_model_obs, which applies the appropriate forward operators to the model state and writes the observations generated by the model down in the perfectly laid-out notebook.

The completed notebook is then renamed obs_seq.out.
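The notebook analogy can be illustrated with a small, self-contained Python sketch. It is not DART code and does not read or write the obs_seq format. The model state, the dictionary field names, the linear-interpolation forward operator, and the Gaussian noise drawn from each entry's error variance are all assumptions made for illustration.

```python
import random


def forward_operator(state, location):
    """Linearly interpolate a 1-D model state (unit grid spacing) to a location."""
    lower = int(location)
    upper = min(lower + 1, len(state) - 1)
    weight = location - lower
    return (1.0 - weight) * state[lower] + weight * state[upper]


def fill_in_notebook(state, blank_entries, rng):
    """Fill each blank entry with a truth value and a noisy synthetic observation."""
    completed = []
    for entry in blank_entries:
        truth = forward_operator(state, entry["location"])
        # Assumption: observation noise is Gaussian with the entry's error variance.
        noise = rng.gauss(0.0, entry["error_variance"] ** 0.5)
        completed.append({**entry, "truth": truth, "observation": truth + noise})
    return completed


# Hypothetical model state and an "obs_seq.in"-like layout:
# locations and error variances are present, observation values are not.
model_state = [280.0, 282.0, 285.0, 283.0]
blank_entries = [
    {"location": 0.5, "error_variance": 1.0},
    {"location": 2.25, "error_variance": 0.25},
]

completed = fill_in_notebook(model_state, blank_entries, random.Random(42))
for row in completed:
    print(row)
```

The blank entries play the role of obs_seq.in, and the completed list plays the role of obs_seq.out: the layout is unchanged, and only the observation values have been filled in.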

obs_seq.out

An obs_seq.out file contains a linked list of observations. The observations can potentially be (and usually are) from different platforms and of different quantities, each with their own error characteristics and metadata.

An obs_seq.out file containing real data can be generated by using one of DART’s many observation converter programs. Additionally, an obs_seq.out file containing synthetic data can be created by running DART’s perfect_model_obs program.

The observations in the obs_seq.out files are assimilated into the model ensemble by DART’s filter program.

To learn more about the structure of the obs_seq.out file, see Detailed structure of an obs_seq file.

If you want to create an observation sequence file from real observations, you should contact DAReS staff by emailing dart@ucar.edu for advice regarding your specific types of observations.

obs_seq.final

When running an assimilation, DART’s filter program assimilates the observations contained in the obs_seq.out file and generates an obs_seq.final file.

The obs_seq.final file contains everything in the obs_seq.out file and also contains a few additional ‘copies’ of the observation.

Since DART is an ensemble algorithm, each ensemble member must compute its own estimate of the observation for the algorithm. You can save the ensemble members’ estimates of the observation in the obs_seq.final file by setting the num_output_obs_members entry in the filter_nml namelist of input.nml to a value greater than zero.

Minimally, filter will record the mean and spread of the ensemble estimates in the obs_seq.final file.
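As a rough illustration of what the mean and spread summarize, the following Python sketch computes both for a single observation. The member estimates are made-up numbers, and treating spread as the sample standard deviation (N - 1 denominator) is an assumption here rather than a statement about filter's internals.

```python
import statistics

# Hypothetical estimates of one observation from a 5-member ensemble.
member_estimates = [281.2, 280.7, 281.9, 280.4, 281.3]

ensemble_mean = statistics.mean(member_estimates)
# Assumption: spread is the sample standard deviation of the member estimates.
ensemble_spread = statistics.stdev(member_estimates)

print(f"mean = {ensemble_mean:.3f}, spread = {ensemble_spread:.3f}")
```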

To learn more about the structure of the obs_seq.final file, see Detailed structure of an obs_seq file.

Using obs_seq.final for observation-space diagnostics

The best method to determine the performance of an experiment in which you assimilate data from real-world sources is to compare the ensemble estimates of the observation to your real-world data. You can estimate the bias and error of the ensemble mean or gauge how many of the real-world observations are actually being assimilated. These diagnostics are known as observation-space diagnostics.

DART provides the obs_diag program and MATLAB observation-space diagnostics for you to use to quickly assess the performance of your experiment.
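To make the diagnostic quantities concrete, here is a minimal Python sketch that computes a bias, a root-mean-square error, and the fraction of observations assimilated. It does not parse obs_seq.final and is not a substitute for the DART tools. The records, the field names, the boolean assimilated flag, and the sign convention (ensemble mean minus observation) are all assumptions for illustration.

```python
import math

# Hypothetical records: real observation, ensemble mean estimate,
# and whether the observation was actually assimilated.
records = [
    {"obs": 281.0, "ens_mean": 281.4, "assimilated": True},
    {"obs": 284.5, "ens_mean": 284.0, "assimilated": True},
    {"obs": 290.0, "ens_mean": 283.0, "assimilated": False},  # e.g. rejected
    {"obs": 279.5, "ens_mean": 279.8, "assimilated": True},
]

# Only observations that were assimilated contribute to the error statistics.
used = [r for r in records if r["assimilated"]]
# Assumed sign convention: ensemble mean minus observation.
errors = [r["ens_mean"] - r["obs"] for r in used]

bias = sum(errors) / len(errors)
rmse = math.sqrt(sum(e * e for e in errors) / len(errors))
fraction_assimilated = len(used) / len(records)

print(f"bias = {bias:.3f}, rmse = {rmse:.3f}, "
      f"fraction assimilated = {fraction_assimilated:.2f}")
```

A low fraction of assimilated observations is often as informative as the error statistics themselves, since it suggests that many real-world observations are being rejected before they can influence the ensemble.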

Note

Since each ‘observation type’ may require different amounts of metadata to be read or written, any routine to read or write an observation sequence must be compiled with support for those particular observations. The supported observations are listed in the obs_kind_nml namelist of input.nml. For more information, see How DART supports different types of observations: the preprocess program.