Total Precipitable Water Observations
Overview
Several satellites contain instruments that return observations of integrated Total Precipitable Water (TPW). There are two MODIS Spectroradiometers, one aboard the TERRA satellite, and the other aboard the AQUA satellite. There is also an AMSR-E instrument on the AQUA satellite.
These instruments produce a variety of data products which are generally distributed in HDF format using the HDF-EOS libraries. The converter code in this directory IS NOT USING THESE FILES AS INPUT. The code is expecting to read ASCII TEXT files, which contain one line per observation, with the latitude, longitude, TPW data value, and the observation time. The Fortran read line is:
read(iunit, '(f11.6, f13.5, f10.4, 4x, i4, 4i3, f7.3)') &
lat, lon, tpw, iyear, imonth, iday, ihour, imin, seconds
No program to convert between the HDF and text files is currently provided. Contact dart@ucar.edu for more information if you are interested in using this converter.
Data sources
This converter reads files produced as part of a data research effort. Contact dart@ucar.edu for more information if you are interested in this data.
Alternatively, if you can read HDF-EOS files and output a text line per observation in the format listed above, then you can use this converter on TPW data from any MODIS file.
Programs
The programs in the DART/observations/tpw
directory extract data from the distribution text files and create DART
observation sequence (obs_seq) files. Build them in the work
directory by running the ./quickbuild.sh
script.
In addition to the converters, several other general observation sequence file utilities will be built.
Generally the input data comes in daily files, with the string YYYYMMDD (year, month, day) as part of the name. This converter has the option to loop over multiple days within the same month and create an output file per day.
Like many kinds of satellite data, the TWP data is dense and generally needs to be subsampled or averaged (super-ob’d) before being used for data assimilation. This converter will average in both space and time. There are 4 namelist items (see the namelist section below) which set the centers and widths of time bins for each day. All observations within a single time bin are eligible to be averaged together. The next available observation in the bin is selected and any other remaining observations in that bin that are within delta latitude and delta longitude of it are averaged in both time and space. Then all observations which were averaged are removed from the bin, so each observation is only averaged into one output observation. Observations that are within delta longitude of the prime meridian are handled correctly by averaging observations on both sides of the boundary.
It is possible to restrict the output observation sequence to contain data from a region of interest using namelist settings. If your region spans the Prime Meridian min_lon can be a larger number than max_lon. For example, a region from 300 E to 40 E and 60 S to 30 S (some of the South Atlantic), specify min_lon = 300, max_lon = 40, min_lat = -60, max_lat = -30. So ‘min_lon’ sets the western boundary, ‘max_lon’ the eastern.
The specific type of observation created in the output observation sequence file can be select by namelist. “MODIS_TOTAL_PRECIPITABLE_WATER” is the most general term, or a more satellite-specific name can be chosen. The choice of which observations to assimilate or evaluate are made using this name. The observation-space diagnostics also aggregate statistics based on this name.
Namelist
This namelist is read from the file input.nml
. Namelists start with an ampersand ‘&’ and terminate with a slash ‘/’.
Character strings that contain a ‘/’ must be enclosed in quotes to prevent them from prematurely terminating the
namelist.
&convert_tpw_nml
start_year = 2008
start_month = 1
start_day = 1
total_days = 31
max_obs = 150000
time_bin_start = 0.0
time_bin_interval = 0.50
time_bin_half_width = 0.25
time_bin_end = 24.0
delta_lat_box = 1.0
delta_lon_box = 1.0
min_lon = 0.0
max_lon = 360.0
min_lat = -90.0
max_lat = 90.0
ObsBase = '../data'
InfilePrefix = 'datafile.'
InfileSuffix = '.txt'
OutfilePrefix = 'obs_seq.'
OutfileSuffix = ''
observation_name = 'MODIS_TOTAL_PRECIPITABLE_WATER'
/
Item |
Type |
Description |
---|---|---|
start_year |
integer |
The year for the first day to be converted. (The converter will optionally loop over multiple days in the same month.) |
start_month |
integer |
The month number for the first day to be converted. (The converter will optionally loop over multiple days in the same month.) |
start_day |
integer |
The day number for the first day to be converted. (The converter will optionally loop over multiple days in the same month.) |
total_days |
integer |
The number of days to be converted. (The converter will optionally loop over multiple days in the same month.) The observations for each day will be created in a separate output file which will include the YYYYMMDD date as part of the output filename. |
max_obs |
integer |
The largest number of obs in the output file. If you get an error, increase this number and run again. |
time_bin_start |
real(r8) |
The next four namelist values define a series of time intervals that define time bins which are used for averaging. The input data from the satellite is very dense and generally the data values need to be subsetted in some way before assimilating. All observations in the same time bin are eligible to be averaged in space if they are within the latitude/longitude box. The input files are distributed as daily files, so use care when defining the first and last bins of the day. The units are in hours. This item defines the midpoint of the first bin. |
time_bin_interval |
real(r8) |
Increment added the time_bin_start to compute the center of the next time bin. The units are in hours. |
time_bin_half_width |
real(r8) |
The amount of time added to and subtracted from the time bin center to define the full bin. The units are in hours. |
time_bin_end |
real(r8) |
The center of the last bin of the day. The units are in hours. |
delta_lat_box |
real(r8) |
For all observations in the same time bin, the next available observation is selected. All other observations in that bin that are within delta latitude or longitude of it are averaged together and a single observation is output. Observations which are averaged with others are removed from the bin and so only contribute to the output data once. The units are degrees. |
delta_lon_box |
real(r8) |
See delta_lat_box above. |
min_lon |
real(r8) |
The output observations can be constrained to only those which lie between two longitudes and two latitudes. If specified, this is the western-most longitude. The units are degrees, and valid values are between 0.0 and 360.0. To define a box that crosses the prime meridian (longitude = 0.0) it is legal for this value to be larger than max_lon. Observations on the boundaries are included in the output. |
max_lon |
real(r8) |
The output observations can be constrained to only those which lie between two longitudes and two latitudes. If specified, this is the eastern-most longitude. The units are degrees, and valid values are between 0.0 and 360.0. To define a box that crosses the prime meridian (longitude = 0.0) it is legal for this value to be smaller than min_lon. Observations on the boundaries are included in the output. |
min_lat |
real(r8) |
The output observations can be constrained to only those which lie between two longitudes and two latitudes. If specified, this is the southern-most latitude. The units are degrees, and valid values are between -90.0 and 90.0. Observations on the boundaries are included in the output. |
max_lat |
real(r8) |
The output observations can be constrained to only those which lie between two longitudes and two latitudes. If specified, this is the northern-most latitude. The units are degrees, and valid values are between -90.0 and 90.0. Observations on the boundaries are included in the output. |
ObsBase |
character(len=128) |
A directory name which is prepended to the input filenames only. For files in the current directory, specify ‘.’ (dot). |
InfilePrefix |
character(len=64) |
The input filenames are constructed by prepending this string before the string ‘YYYYMMDD’ (year, month, day) and then the suffix is appended. This string can be ‘ ‘ (empty). |
InfileSuffix |
character(len=64) |
The input filenames are constructed by appending this string to the filename. This string can be ‘ ‘ (empty). |
OutfilePrefix |
character(len=64) |
The output files are always created in the current directory, and the filenames are constructed by prepending this string before the string ‘YYYYMMDD’ (year, month day) and then the suffix is appended. This string can be ‘ ‘ (empty). |
OutfileSuffix |
character(len=64) |
The output filenames are constructed by appending this string to the filename. This string can be ‘ ‘ (empty). |
observation_name |
character(len=31) |
The specific observation type to use when creating the output observation sequence file. The possible values are:
These must match the parameters defined in the ‘obs_def_tpw_mod.f90’ file in the DART/obs_def directory. There is a maximum limit of 31 characters in these names. |
Known Bugs
The input files are daily; be cautious of time bin boundaries at the start and end of the day.
Future Plans
This program should use the HDF-EOS libraries to read the native MODIS granule files.
This program could loop over arbitrary numbers of days by using the time manager calendar functions to increment the bins across month and year boundaries; it could also use the schedule module to define the bins.