North American Integrated Fine Particle Data Sets
Bret Schichtel, Stefan Falke, and Rudolf Husar, Center for Air Pollution Impact and Trend Analysis (CAPITA), 7/23/98
Long-term North American PM data sets representative of urban and rural air quality are being created by integrating data from research and routine fine particle monitoring networks that are or have been in operation. The data sets will be composed of fine, coarse, and PM10 mass, elemental composition, organics, and ions. In addition, visibility data and extinction and scattering coefficients will be included.
The process for creating the data sets involves gathering the data from various data suppliers, reformatting the data to a standard format and adding location, variable and data sampling metadata. The data and metadata will be available for each individual network including all locations and variables, as well as integrated datasets where select variables are grouped into one data set from select networks.
This is an ongoing project, with intermediate data sets created periodically. This report describes the currently available data sets, the networks and providers, the processing of the data for integration, and the format of the final data sets.
This project is being supported by EPA's Office of Air Quality Planning and Standards (OAQPS).
Currently Available Integrated Data Sets
Data Set: NAMPM_F
Data URL: http://capita.wustl.edu/datawarehouse/Datasets/CAPITA/NAMPM_fine/Data/NAMPM_f.html
Networks: IMPROVE, NESCAUM, GAViM, SCENES, CARB, NPMR, and NAPS
Variables: Fine mass, elemental composition including organics, and ions
#Stations: 172
Time Span: 3/88 - 12/97
Note: The elements included in this data set are only those that are included in the IMPROVE data. However, the organic variables from all data sets are included.
Data Set: NAMPM25
Data URL: http://capita.wustl.edu/datawarehouse/Datasets/CAPITA/NAMPM_fine/Data/NAMPM25.html
Networks: IMPROVE, NESCAUM, GAViM, SCENES, CARB, NPMR, NAPS, PHLS, PHCO, and AIRS
Variables: Fine mass (PM2.5)
#Stations: 318
Time Span: 3/88 - 12/97
These data are provided as received from their sources. No quality control beyond that done by the supplying organizations has been performed. Some networks corrected their data to a reference temperature and pressure; others did not. No attempt has yet been made to reconcile these differences.
Data Networks and Suppliers
The data networks and their suppliers to be included in the North American integrated data sets are listed in Tables 1-3:
Table 1. Data sets processed.
| Network | Network Abbr. | Supplying Organization | # Sites | Time Span | PM Variables |
|---|---|---|---|---|---|
| Interagency Monitoring of Protected Visual Environments (IMPROVE) | IMPR | National Park Service | 93 | 3/88 - 2/98 | fine PM mass, elements, organics, ions, bab; PM10 |
| Measurement of Haze and Visual Effects (MOHAVE) | MOHA | National Park Service | 43 | 1/10 - 2/15/92 | fine PM mass, elements, organics, ions, bab; PM10 |
| Pacific Northwest Regional Visibility Experiment Using Natural Tracers (PREVENT) | PREV | National Park Service | 34 | 6/90 - 9/90 | fine PM mass, elements, organics, ions, bab; PM10 |
| Stack Filter Units (SFU) | SFU | National Park Service | 80 | 7/79 - 11/93 | fine PM mass, elements, organics, ions, bab |
| Winter Haze Intensive Tracer Experiment (WHITEX) | WHIT | National Park Service | 14 | 1/1/87 - 2/18/87 | fine PM mass, elements, organics, ions, bab |
| Northeast States for Coordinated Air Use Management (NESCAUM) | NESC | NESCAUM | 11 | 9/88 - 11/93 | fine PM mass, elements, organics, ions, bab |
| Guelph Aerosol and Visibility Monitoring program (GAViM) | GAVM | University of Guelph, Ontario | 4 | 6/94 - 12/97 | fine PM mass, elements, bab |
| National PM Research Monitoring Network | NPMR | EPA | 2 | 3/95 - 12/97 | fine and coarse PM mass, elements, and organics |
| Aerometric Information Retrieval System (AIRS) | AIRS | EPA | ~1500 | 1/85 - 12/97 | PM2.5, PM10, TSP |
| Philadelphia Saturation Study | PHLS | EPA | 16 | 9/11/94 - 9/10/94 | PM2.5, PM10 |
| Philadelphia 1979-1983 Study | PHCO | EPA | 9 | 4/24/79 - 12/26/83 | PM1.5 fine, coarse, total, PM10, coarse > 10 um, Sulfate, Nitrate, Lead |
| Philadelphia 1992-1995 Study | PHCO | EPA | 1 | 12/5/92 - 12/4/95 | PM2.5 fine, coarse, total |
| Clean Air Status and Trends Network (CASTNet) Visibility Chemistry | CAST | EPA | 12 | 10/93 - 12/97 | fine PM mass, elements, organics, ions, bab |
| Clean Air Status and Trends Network (CASTNet) Dry Deposition Chemistry | CAST | EPA | 94 | 1/87 - 12/97 | fine ions |
| Subregional Cooperative Electric Utility, Department of Defense, National Park Service, and EPA study (SCENES) | SCEN | EPRI | 7 | 11/84 - 10/89 | fine PM mass, elements, organics, and bab |
| Eulerian Model Evaluation Field Study (US and Canada) (EMEFS) | EMEF | EPRI | 129 | 6/88 - 6/90 | fine ions |
| California Air Resources Board (CARB) | CARB | California Air Resources Board | 26 | 1/89 - 8/97 | fine and coarse mass, and elements |
| National Air Pollution Surveillance Network (NAPS) | NAPS | Environment Canada | 29 | 1/90 - 12/96 | fine and coarse PM mass, elements, and ions |
| Tennessee Valley Authority (TVA) | TVA1 | Tennessee Valley Authority | 9 | 5/80 - 9/87 | fine mass; PM10 |
Table 2. Data sets to be processed.
| Network | Network Abbr. | Supplying Organization | # Sites | Time Span | Variables |
|---|---|---|---|---|---|
| Eastern Regional Air Quality Study (ERAQS) | ERAQ | EPRI | 9 | 11/78 - 3/80 | total mass, ions, organics, and bab |
Table 3. Data sets on order.
| Network | Network Abbr. | Supplying Organization | # Sites | Time Span | Variables |
|---|---|---|---|---|---|
| SEAVS | SEAV | National Park Service | ?? | | fine mass + |
| Mexican Texas Border Study | MTBS | National Park Service | 18 | 9/9/96 - 10/13/96 | fine mass, elements, ++ |
| Sulfate Regional Experiment (SURE) | SURE | EPRI | 56 | 1977 - 1978 | fine mass, ions, organic carbon |
| Canadian Air and Precipitation Monitoring Network (CAPMoN) | CAPM | NATChem Particle | 10 | 1983 - 1997 | fine mass + |
| Canadian Acid Aerosols Monitoring Program (CAAMP) | CAAM | NATChem Particle | ?? | 5/92 - 3/96 | fine mass, PM10 |
| New Brunswick Precipitation (and Air) Monitoring Network (NBPMN) | NBPM | NATChem Particle | 11 | 1980 - 1997 | TSP, PM2.5, PM10 trace metals |
| Desert Research Institute | DRI | DRI | | | |
Data Processing and Format
A relational database is created for each network's data, consisting of a main Data table containing all of the data, a Location table containing the location metadata, and a Variable table containing the variable metadata. The data are then passed through a set of standardization routines that create a data set ready for integration with all other processed data.
The standardization process includes:
0 - Valid values
1 - Data below the instrument's minimum detection limit
2 - Questionable data, i.e. data values flagged due to non-standard sampling, potential contamination, etc.
3 - Invalid sample
NULL - Flag did not exist
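The flag values above can be handled in code along the following lines. This is only a minimal sketch: the names `Flag`, `describe_flag`, and `is_usable` are hypothetical, Python's `None` stands in for a NULL flag, and the screening policy in `is_usable` is an assumption made for illustration, not a rule applied to the data sets.

```python
from enum import IntEnum
from typing import Optional


class Flag(IntEnum):
    """Standardized flag values as listed above."""
    VALID = 0
    BELOW_MDL = 1
    QUESTIONABLE = 2
    INVALID = 3


def describe_flag(flag: Optional[int]) -> str:
    """Return a readable label; None stands in for a NULL flag."""
    if flag is None:
        return "NO_FLAG"
    return Flag(flag).name


def is_usable(flag: Optional[int], allow_questionable: bool = False) -> bool:
    """Hypothetical screening policy (an assumption, not part of the data sets)."""
    if flag is None or flag == Flag.VALID:
        return True
    if flag == Flag.QUESTIONABLE:
        return allow_questionable
    return False  # below the detection limit, or an invalid sample
```

A user who wants to keep questionable samples would call `is_usable(flag, allow_questionable=True)`; whether below-detection-limit values should be dropped or substituted is left to the analysis at hand.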
Data Table
A sample Data table is presented in Table 4. The first two columns, Loc_Code and Date, are the key (dimensional) fields, identifying the monitor and the sample date and time, respectively, for each data record. Additional information about the monitoring site is contained in the Location table. Each of the remaining columns contains the data for a single variable. The variable code (Var_Code) is used as the column name; additional variable metadata is located in the Variable table.
Table 4. A sample Data Table.
Key fields: Loc_Code and Date. Variable fields: all remaining columns.
| Loc_Code | Date | MF_cf111 | Al_cf112 | Si_cf112 | P__cf112 | S__cf112 |
|---|---|---|---|---|---|---|
| NAPSCANS1HAL | 1/5/90 | 14800 | 15.3 | 0.45 | 28.6 | 1830 |
| NAPSCANS1HAL | 1/11/90 | 13100 | 19.2 | 25.4 | 45.3 | 2230 |
| NAPSCANS1HAL | 1/17/90 | 21300 | 1 | 37.2 | 66.3 | 3660 |
| NAPSCANS1HAL | 1/23/90 | 17500 | 1 | 19.3 | 58.8 | 3420 |
| NAPSCANS1HAL | 1/29/90 | 9100 | 12.8 | 35.9 | 18.7 | 1400 |
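The relationship between the Data table and the Variable table can be sketched with an in-memory SQLite database. This is an illustrative layout only: the report does not name the database product, the column types and ISO date strings are assumptions, and `column_units` is a hypothetical helper. The sample values are taken from the first two rows of Table 4 and from the fine mass entry of the Variable table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Data table: key fields (Loc_Code, Date) plus one column per variable.
conn.execute(
    "CREATE TABLE Data ("
    " Loc_Code TEXT, Date TEXT, MF_cf111 REAL, S__cf112 REAL,"
    " PRIMARY KEY (Loc_Code, Date))"
)
# Variable table: one row of metadata per Data column name.
conn.execute(
    "CREATE TABLE Variable (Var_Abbr TEXT PRIMARY KEY, Var_Desc TEXT, Units TEXT)"
)

conn.executemany(
    "INSERT INTO Data VALUES (?, ?, ?, ?)",
    [
        ("NAPSCANS1HAL", "1990-01-05", 14800, 1830),
        ("NAPSCANS1HAL", "1990-01-11", 13100, 2230),
    ],
)
conn.execute("INSERT INTO Variable VALUES ('MF_cf111', 'Fine Mass', 'ng/m3')")


def column_units(connection, column_name):
    """Look up the units of a Data column in the Variable table (None if absent)."""
    row = connection.execute(
        "SELECT Units FROM Variable WHERE Var_Abbr = ?", (column_name,)
    ).fetchone()
    return row[0] if row else None


mean_fine_mass = conn.execute(
    "SELECT AVG(MF_cf111) FROM Data WHERE Loc_Code = ?", ("NAPSCANS1HAL",)
).fetchone()[0]
print(mean_fine_mass, column_units(conn, "MF_cf111"))
```

Keeping one column per variable makes the Data table wide, so the column name is the only link back to the metadata; the lookup helper shows that link explicitly.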
Location Table
The Location table is made up of the location code (Loc_Code) followed by the station name and location information (Table 5). Each station is assigned a unique Loc_Code based upon the "Station ID" format used in the Canadian NAtChem Particle database. The location code encodes the network as well as location information (see Table 6). The Loc_Code for a monitoring site with an AIRS site code uses characters 1-4 for the network abbreviation and characters 5-13 for the AIRS site code. The Location table's StationID field contains the original location identifier that comes with the database.
Table 5. A sample Location Table.
| Loc_Code | StationID | Land Use | Network | Country | state/province | City | Address | Station Name | Loc_Lon | Loc_Lat | Elevation |
|---|---|---|---|---|---|---|---|---|---|---|---|
| NESCUSNY1WHM | WHMO1 | | NESC | United States | New York | | | Whiteface Mt., NY | -1.28893 | 0.774635 | 639.94 |
| NESCUSCT1MOM | MOMO1 | | NESC | United States | Connecticut | | | Mohawk Mt., CT | -1.27933 | 0.730129 | 459.84 |
| NESCUSVT1PMR | PMRF1 | | NESC | United States | Vermont | | | Proctor Maple R. F. Underhill 1 | -1.27176 | 0.777253 | 396.16 |
| NESCUSVT2PMR | PMRF2 | | NESC | United States | Vermont | | | Proctor Maple R. F. Underhill 2 | -1.27176 | 0.777253 | 396.16 |
| NESCUSMA1QUR | QURE1 | | NESC | United States | Massachusetts | | | Quabbin Summit, MA | -1.26245 | 0.738274 | 310.83 |
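The Loc_Lon and Loc_Lat values in the sample rows are too small to be degrees for sites in the northeastern United States, and they are consistent with coordinates stored in radians. The report does not state the unit, so the conversion below rests on that assumption; `to_degrees` is a hypothetical helper name.

```python
import math


def to_degrees(loc_lon, loc_lat):
    """Convert Loc_Lon/Loc_Lat to decimal degrees, ASSUMING they are in radians."""
    return math.degrees(loc_lon), math.degrees(loc_lat)


# Whiteface Mt., NY row from the sample Location table.
lon_deg, lat_deg = to_degrees(-1.28893, 0.774635)
print(round(lon_deg, 2), round(lat_deg, 2))
```

For the Whiteface Mt. row this gives roughly 73.85 degrees west and 44.38 degrees north, which is in the expected area for that site.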
Table 6. The Loc_Code encoding definitions.
| Characters | Definition |
|---|---|
| 1-4 | Network Abbreviation |
| 5-6 | Country |
| 7-8 | State/Province |
| 9 | Monitor number |
| 10-12 | Location Name Abbreviation |
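The character positions in Table 6 translate directly into string slices. The following is a minimal sketch; the names `LocCode`, `parse_loc_code`, and `parse_airs_loc_code` are hypothetical, and treating AIRS-based codes as exactly 13 characters is an inference from the "characters 5-13" description rather than a documented rule.

```python
from typing import NamedTuple, Tuple


class LocCode(NamedTuple):
    network: str   # characters 1-4
    country: str   # characters 5-6
    state: str     # characters 7-8
    monitor: str   # character 9
    site: str      # characters 10-12


def parse_loc_code(code: str) -> LocCode:
    """Split a standard 12-character Loc_Code into its Table 6 components."""
    if len(code) != 12:
        raise ValueError(f"expected 12 characters, got {len(code)}: {code!r}")
    return LocCode(code[0:4], code[4:6], code[6:8], code[8], code[9:12])


def parse_airs_loc_code(code: str) -> Tuple[str, str]:
    """Split an AIRS-style Loc_Code into (network, AIRS site code).

    Assumes a 13-character code: network in 1-4, AIRS site code in 5-13.
    """
    if len(code) != 13:
        raise ValueError(f"expected 13 characters, got {len(code)}: {code!r}")
    return code[0:4], code[4:13]


print(parse_loc_code("NESCUSNY1WHM"))
```

For example, `parse_loc_code("NAPSCANS1HAL")` separates the NAPS network, country CA, province NS, monitor 1, and site abbreviation HAL.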
Variable Table
The Variable table is made up of the variable abbreviation (Var_Abbr) followed by the variable name and sampler information (Table 7). Each variable is assigned a unique Var_Abbr, which encodes the species type and the sampler and analysis methods. Table 8 lists the meaning of each character in the Var_Abbr.
Table 7. A sample Variable Table.
| Var_Abbr | Var_Desc | Units | Species_Abbr | Species_Name | Attribute | Cut Point | Temp | Sampler | Filter | Analysis Method |
|---|---|---|---|---|---|---|---|---|---|---|
| MF_cf111 | Fine Mass | ng/m3 | MF | Fine Mass | Concentration | <2.5 um | | dichotomous | Teflon | Gravimetric |
| MF_ef111 | Fine Mass | ng/m3 | MF | Fine Mass | Error | <2.5 um | | dichotomous | Teflon | Gravimetric |
| MF_mf111 | Fine Mass | ng/m3 | MF | Fine Mass | Minimum Detection Limit | <2.5 um | | dichotomous | Teflon | Gravimetric |
| MF_ff111 | Fine Mass | ng/m3 | MF | Fine Mass | Flag | <2.5 um | | dichotomous | Teflon | Gravimetric |
| NA_cf112 | Sodium | ng/m3 | NA | Sodium | Concentration | <2.5 um | | dichotomous | Teflon | PIXE or XRF |
| NA_ef112 | Sodium | ng/m3 | NA | Sodium | Error | <2.5 um | | dichotomous | Teflon | PIXE or XRF |
| NA_mf112 | Sodium | ng/m3 | NA | Sodium | Minimum Detection Limit | <2.5 um | | dichotomous | Teflon | PIXE or XRF |
| NA_ff112 | Sodium | ng/m3 | NA | Sodium | Flag | <2.5 um | | dichotomous | Teflon | PIXE or XRF |
Table 8. The Var_Abbr encoding definitions. The meanings of the abbreviations and codes are listed in the Variable Table.
| Characters | Definition |
|---|---|
| 1-3 | Species Abbreviation |
| 4 | Attribute |
| 5 | Cut Point |
| 6 | Sampler Type |
| 7 | Filter |
| 8 | Analysis Method |
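The Var_Abbr layout in Table 8 can be decoded with fixed slices in the same way as the Loc_Code. The sketch below uses hypothetical names (`VarAbbr`, `parse_var_abbr`, `sibling`). The attribute letters are read off the sample Variable table (c, e, m, f); treating the underscore as padding for short species abbreviations is an inference from codes such as `MF_cf111` and `P__cf112`.

```python
from typing import NamedTuple

# Attribute letters as they appear in the sample Variable table.
ATTRIBUTES = {
    "c": "Concentration",
    "e": "Error",
    "m": "Minimum Detection Limit",
    "f": "Flag",
}


class VarAbbr(NamedTuple):
    species: str      # characters 1-3, underscore padding removed
    attribute: str    # character 4
    cut_point: str    # character 5
    sampler: str      # character 6
    filter_code: str  # character 7
    analysis: str     # character 8


def parse_var_abbr(abbr: str) -> VarAbbr:
    """Split an 8-character Var_Abbr into its Table 8 components."""
    if len(abbr) != 8:
        raise ValueError(f"expected 8 characters, got {len(abbr)}: {abbr!r}")
    return VarAbbr(abbr[0:3].rstrip("_"), abbr[3], abbr[4], abbr[5], abbr[6], abbr[7])


def sibling(abbr: str, attribute: str) -> str:
    """Return the Var_Abbr of a related column, e.g. the flag for a concentration."""
    if attribute not in ATTRIBUTES:
        raise ValueError(f"unknown attribute letter: {attribute!r}")
    return abbr[0:3] + attribute + abbr[4:]


print(parse_var_abbr("MF_cf111"), sibling("MF_cf111", "f"))
```

The `sibling` helper is useful when screening a concentration column such as `MF_cf111` against its flag column `MF_ff111` or its detection limit column `MF_mf111`.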
Data Quality Control
No quality control beyond that done by the supplying organizations has been performed on the data sets. However, as the data are used and problems are identified, appropriate procedures will be applied to remedy those problems in the data sets.
Data File Formats
The data are provided in three file formats:
Associated with each fixed-length ASCII file is a data dictionary that gives the field names, the number of bytes for each field, and the NULL value. The order of the fields in the dictionary from top to bottom is the order of the fields in the ASCII file from left to right.
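A fixed-length record can be read by walking the data dictionary in order and slicing each field by its byte count. The sketch below is only illustrative: the on-disk layout of the dictionary is not described here, so it is represented as an in-memory list of (field name, number of bytes, NULL value) tuples, and the widths and the `-999` NULL marker are made-up values.

```python
from typing import Dict, List, Optional, Tuple

# (field name, number of bytes, NULL value), in file order (left to right).
Field = Tuple[str, int, str]


def parse_record(line: str, dictionary: List[Field]) -> Dict[str, Optional[str]]:
    """Slice one fixed-length ASCII record into named fields.

    A field equal to its NULL value (after stripping padding) becomes None.
    """
    record: Dict[str, Optional[str]] = {}
    position = 0
    for name, width, null_value in dictionary:
        raw = line[position:position + width].strip()
        record[name] = None if raw == null_value else raw
        position += width
    return record


# Hypothetical dictionary and records; widths and NULL marker are assumptions.
dictionary = [("Loc_Code", 12, "-999"), ("Date", 10, "-999"), ("MF_cf111", 8, "-999")]
line = "NAPSCANS1HAL" + "1/5/90".ljust(10) + "14800".rjust(8)
missing = "NAPSCANS1HAL" + "1/11/90".ljust(10) + "-999".rjust(8)

print(parse_record(line, dictionary))
print(parse_record(missing, dictionary))
```

Because ASCII uses one byte per character, slicing the decoded string by byte count is safe here; values are left as strings so the caller can decide how to convert dates and numbers.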