NORTH AMERICAN INTEGRATED FINE PARTICLE DATA SET

Paper 99-398

Bret Schichtel

Center for Air Pollution Impact and Trend Analysis, Washington University, One Brookings Drive, Campus Box 1124, St Louis, Missouri 63130 bret@mecf.wustl.edu

Stefan R. Falke and Rudolf B. Husar

Center for Air Pollution Impact and Trend Analysis, Washington University, One Brookings Drive, Campus Box 1124, St Louis, Missouri 63130 bret@mecf.wustl.edu

ABSTRACT

Two long-term North American fine particle (<2.5 micrometers) data sets were created by integrating data from 18 historical and active monitoring networks supplied by eight different organizations. One data set consists of PM2.5 mass and the other data set consists of PM2.5 mass and its elemental composition, organic and elemental carbon, ions, and light absorption for ~600 urban and rural monitoring sites in the US and Canada from 1979 through February 1997. Data processing involved reformatting the data into a common format and units with uniform geographic and temporal coding, and creating a consistent set of data flags. A consistent set of metadata describing the networks, variables, sampling sites, samplers and analysis methods was also added. Subsequently, the data were merged into a single database. No modifications were made to the data values beyond unit conversions. Data used in the integrated data sets came from the following networks: IMPROVE, NESCAUM, GAViM, SCENES, CARB, NAPS, AIRS, MOHAVE, PREVENT, WHITEX, National Park Service’s SFU, CASTNet, National PM Research Monitoring Network, Tennessee Valley Authority, and two specialty studies in Philadelphia. The integrated data sets are publicly available via the world wide web.

INTRODUCTION

The promulgation of the new PM2.5 air quality standard requires the measurement of ambient PM2.5 mass and its constituents for compliance testing, determining source attribution, model evaluation, and air quality tracking and evaluation. The national PM2.5 network started to collect data in January 1999 and will require another three years of monitoring before compliance testing of the annual standard can begin. However, there are a number of historical and currently active PM monitoring programs with multiple years of data that can be drawn upon to obtain a better understanding of the PM2.5 concentrations and their causes and trends. Many of these networks have been drawn together and integrated into a single North American Fine Particle Database.

The creation of the integrated database involved gathering the data from the various data suppliers, reformatting the data to a standard format, and adding location, variable and data sampling metadata. The data sets grouped together were collected using different samplers, sampling durations and sample analysis techniques, and some of the networks corrected the data to a reference temperature and pressure while others did not. No modifications were made to the data to account for these differing sampling and reporting techniques.

DATABASE DESCRIPTION AND DATA AVAILABILITY

The North American Integrated PM fine database contains approximately 600 locations with some data spanning nearly 20 years (1979 – 1998) (Figure 1). Canada has 60 of the stations, with the remaining stations in the US. The database was created by integrating 18 data sets supplied by eight different organizations (Table 1). Each network contained one or more of the following variables: particulate mass, elemental composition, organic and elemental carbon, ions and absorption. Two data sets were created from the database and are available on the world wide web:

North American Integrated PM2.5 Data Set, located at: http://capita.wustl.edu/datawarehouse/Datasets/CAPITA/NAMPM_25/Data/NAMPM25.html. This data set contains only PM2.5 mass and data quality flags from all monitoring networks.

North American Integrated PM2.5 Speciated Data Set, located at: http://capita.wustl.edu/datawarehouse/Datasets/CAPITA/NAMPM_fine/Data/NAMPM_f.html. This data set contains the PM2.5 mass and its elemental composition, organic and elemental carbon, and ion concentrations and associated quality flags from all monitoring networks.

The data formats for these data sets are described in the Appendix.

DATA SUPPLIERS AND NETWORKS

This section describes the content of each network and the network's custodian or data supplier. Any special data processing that was required to integrate a network's data with the other data sets is also described.

IMPROVE

The Interagency Monitoring of Protected Visual Environments (IMPROVE)1 network was established to protect visibility at Class I areas. The IMPROVE steering committee is composed of representatives from the National Park Service (NPS), the Forest Service (USFS), the Bureau of Land Management, the Fish and Wildlife Service (FWS), the Environmental Protection Agency, and regional-state organizations.

The IMPROVE fine particle network collects PM2.5 and PM10 samples over a 24 hour period every Monday and Friday using IMPROVE samplers. The network consists of 93 monitoring sites, located in rural areas (Figure 2), operating from 3/88 to the present. The PM samples are analyzed for PM2.5 mass and its elemental constituents, organics, ions, light absorption and PM10 mass. The data set contains the concentrations, minimum detection limit, error, and data quality flag. The data were downloaded from the University of California Davis ftp site: caesar.ucdavis.edu

NESCAUM

The NESCAUM (Northeast States for Coordinated Air Use Management) fine mass network was an extension of the IMPROVE network supported by NESCAUM and the National Park Service. The network consisted of 10 IMPROVE type samplers located in rural areas of the northeastern US (Figure 2), and collected three 24 hour samples per week.2 The PM samples were analyzed for PM2.5 mass and its elemental constituents, organics, ions, and light absorption. The data set contains the concentrations, minimum detection limit, error, and data quality flags. The data were downloaded from the University of California Davis ftp site: caesar.ucdavis.edu

GAViM

Guelph Aerosol and Visibility Monitoring program (GAViM) is run by the Guelph Scanning Proton Microprobe (GSPM) laboratory. The network consists of four Canadian monitoring sites (Figure 2) using IMPROVE samplers and analytical protocols. The network collects 24 hour samples every Wednesday and Saturday, and analyzes them for PM2.5 mass and its elemental constituents, and light absorption. The network has been operational since 6/94. The data set contains the concentrations, minimum detection limit, error, and data quality flags. The data were downloaded from the GSPM laboratory web site: http://www.physics.uoguelph.ca/PIXE/airq/airq.html

NPS-SFU

The National Park Service’s Stack Filter Units (NPS-SFU) network consisted of 80 monitoring sites which collected particulate samples in rural regions throughout the United States (Figure 3). The network operated from 7/79 to 11/93 with monitoring sites coming on and off line throughout this time period. The network used two stage stacked filter samplers collecting fine (< 2.5 µm) and coarse (> 2.5 µm) particulate samples over a 72 hour sampling period from 7/79 – 5/86 and a 24 hour sampling period from 6/86 – 11/93. The samples were analyzed for PM2.5 mass and its elemental constituents and light absorption, and for PM coarse mass and its elemental constituents. The data set contains the concentrations, minimum detection limit, error, and data quality flag. The data were obtained from the National Park Service.

MOHAVE

Project MOHAVE (Measurement of Haze and Visual Effects) was established to determine what contributions the Mohave Power Plant and other sources make to haze at the Grand Canyon National Park and other mandatory Class I areas. The MOHAVE network employed 43 IMPROVE type samplers in the Southwest (Figure 3) collecting daily particulate samples over a 24 hour sampling period. Several sites collected two 12 hour samples a day. The network collected data over a winter and a summer period, from 1/10/92 – 2/15/92 and 7/11/92 – 9/2/92 respectively. The particulate samples were analyzed for PM2.5 and its elemental constituents, organics, ions, light absorption and PM10. The data set contains the concentrations, minimum detection limit, error, and quality flags. The data were obtained from the National Park Service.

All sites which collected two 12 hour samples per day were aggregated to 24 hour samples prior to integration with other data.
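The paper does not state how the 12 hour samples were combined. As a minimal sketch, assuming a duration-weighted mean and assuming that a site-day is kept only when its samples cover the full 24 hours, the aggregation could look like the following Python. The function name, the record layout and the site name SITE_A are hypothetical.

```python
from collections import defaultdict


def aggregate_to_daily(samples, required_hours=24):
    """Combine sub-daily samples into 24 hour values.

    samples: iterable of (loc_code, date, duration_hours, concentration).
    Returns {(loc_code, date): duration-weighted mean concentration} for
    every site-day whose valid samples add up to exactly required_hours.
    (Assumed rule; the paper does not describe the actual procedure.)
    """
    totals = defaultdict(lambda: [0.0, 0.0])  # [sum(conc * hours), sum(hours)]
    for loc_code, date, hours, conc in samples:
        if conc is None:  # skip missing values
            continue
        acc = totals[(loc_code, date)]
        acc[0] += conc * hours
        acc[1] += hours
    return {
        key: weighted / hours
        for key, (weighted, hours) in totals.items()
        if hours == required_hours
    }


# Hypothetical input: two complete 12 hour samples on the first day,
# one missing half-day on the second day.
samples = [
    ("SITE_A", "1992-07-11", 12, 4.0),
    ("SITE_A", "1992-07-11", 12, 6.0),
    ("SITE_A", "1992-07-12", 12, 5.0),
    ("SITE_A", "1992-07-12", 12, None),
]
daily = aggregate_to_daily(samples)
print(daily)  # only the first day has full 24 hour coverage
```

Under these assumptions an incomplete day is dropped rather than reported from a single half-day sample. A different completeness rule would only change the final filter.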

PREVENT

The Pacific Northwest Regional Visibility Experiment Using Natural Tracers (PREVENT) network was established to study visibility causes and effects in Washington state, west of the Cascades. The network consisted of 34 monitors located in Washington and Oregon (Figure 3). Daily particulate samples were collected from 6/90–9/90 and analyzed for PM2.5 mass and its elemental constituents and light absorption. The data set contains the concentrations and error. The data were obtained from the National Park Service.

WHITEX

The Winter Haze Intensive Tracer Experiment (WHITEX) was established to study the visibility impacts of emissions from the Navajo Generating Station. The database contained data from 13 locations which sampled from 1/1/87 – 2/18/87 (Figure 3). A number of different samplers were employed at each location, including IMPROVE, stack filter unit, dichotomous samplers, and SCISAS. Samples were collected every 6 hours, 12 hours, and 24 hours depending on the site and sampler. The particulate samples were analyzed for PM2.5 mass and its elemental constituents, organics, ions, and light absorption. The data were obtained from the National Park Service.

Only data from one sampler per monitoring site were extracted from the database and integrated with data from the other data sets. Nine sites used IMPROVE samplers, three sites used stack filter units, and one site used the SCISAS sampler. All data were aggregated to 24 hour samples.

NAPS

The National Air Pollution Surveillance (NAPS) Network was established to monitor and assess the air quality in Canadian urban regions. Fine (< 2.5 µm) and coarse (> 2.5 µm and < 10 µm) particulate data from 29 sites operating for some time between 1/90 and 12/96 were available (Figure 4). The data were collected over 24 hour periods every 6th day. The samples were analyzed for fine and coarse mass, their elemental constituents and ions. The data set contains the concentrations and data quality flag. The data were obtained from Environment Canada, http://www.etcentre.org/NAPS/NAPS_main_page.html

CARB

The California Air Resource Board (CARB) collects fine (< 2.5 µm) and coarse (> 2.5 µm and < 10 µm) particulate samples at 26 monitoring sites throughout California, from 1/89 to the present (Figure 4). The particulate samples are collected over 24 hour periods every 6th day using dichotomous samplers. The samples are analyzed for fine and coarse mass and their elemental composition. Only concentration values are available. The data were obtained from CARB: http://www.arb.ca.gov/aqd/aqd.htm.

AIRS

The Aerometric Information Retrieval System (AIRS) network consists of 119 PM2.5 monitoring sites which collected 24 hour samples every 6th day from 1/85 - 12/97 (Figure 4). The monitoring sites are located throughout the US mostly in and around urban and industrial regions. The AIRS PM2.5 data were obtained from the AIRS database at EPA.

CASTNet - Visibility Chemistry

The purpose of the Clean Air Status and Trends Network (CASTNet) Visibility Chemistry network is to measure visibility and related parameters defining status and trends. The network consists of 12 monitoring sites located in rural areas of the Eastern US, which collected data for some time between 10/93 and the present (Figure 5). Three stage filter packs were used to collect 24 hour particulate samples every 6th day. The particulate samples are analyzed for PM2.5 and its elemental constituents, organics, ions, and light absorption. The data set contains the concentrations, and data quality flag. The data were obtained from the USEPA.

CASTNet - Dry Deposition

The Clean Air Status and Trends Network (CASTNet) Dry Deposition Network measures fine (<2.5 µm) ions at 96 sites between 1/87 and the present (Figure 5). Three stage filter packs are used to collect weekly particulate samples. The data set contains the concentrations and data quality flag. The data were obtained from the USEPA.

National PM Research Monitoring Network

The National PM Research Monitoring Network was established with the primary objective of providing ambient air quality data for relating health effects to chemical and/or physical properties of PM and to support emerging regulatory implementation and development issues. This network began collecting fine and coarse speciated PM data and meteorological data in Phoenix, AZ in February of 1995. Monitoring platforms at Baltimore, MD and Fresno, CA were added in 1997. The monitoring platforms had a dichotomous sampler collecting fine (<2.5 µm) and coarse (>2.5 µm and <10 µm) 24 hour integrated particulate samples every 3 days and a dual fine particle sequential sampler (DFPSS) collecting fine 24 hour integrated particulate samples every day. The fine particulate samples are analyzed for PM2.5 and its elemental constituents, and organics. The coarse particulate samples are analyzed for mass and elemental constituents. The data set contains the concentrations and error. The data were obtained from the USEPA.

Only the Baltimore and Phoenix data through 1997 (Figure 5) were available for integration with the other data networks.

Philadelphia 1992-1995 Study

The Philadelphia 1992-1995 Study measured PM2.5 and PM10 at one monitoring site in the Philadelphia metropolitan area from 5/92 – 4/95 (Figure 5). The network collected 24 hour samples every day. The data were obtained from the USEPA.

Philadelphia Saturation Study

The Philadelphia Saturation Study measured PM2.5 and PM10 at sixteen monitoring sites in Philadelphia from 9/11/94 – 10/9/94 (Figure 5)3. The network collected 24 hour samples every other day. The data were obtained from the USEPA.

SCENES

The Subregional Cooperative Electric Utility, Department of Defense, National Park Service, and EPA study (SCENES) was a long-term observational study conducted by several industry and government groups to understand the factors influencing atmospheric visibility in the southwestern United States.

The SCENES network collected fine (< 2.5 µm) and total (< 15 µm) particulate samples at seven sites from 11/84 – 10/89 (Figure 6). Particulate samples were collected every third day using WRAQS-2 and SCISAS samplers running over 8, 12, 16, and 24 hour periods, depending on the location and year. The particulate samples were analyzed for PM2.5 and its elemental constituents, organics, ions and light absorption. The data set contains the concentrations, minimum detection limit, error, and data quality flag. The data can be obtained from the Electric Power Research Institute (EPRI) at: http://src.com/~epriasdc/index.htm.

The SCENES data incorporated into the integrated data sets came from Vasconcelos4. If a 24 hour sample was not available for a given location and time, it was created by aggregating the two twelve or four eight hour samples together. Only the concentration values were available for inclusion in the integrated data sets.

EMEFS

The purpose of the Eulerian Model Evaluation and Field Study (EMEFS) network was to evaluate comprehensive regional Eulerian acid deposition models from the US and Canada. The EMEFS network is a composite of the following networks: APIOS (OME); CAPMon (AES); FADMP (FCG); MODES (TVA); MODES-GRAD (EPA); MODES-VAR (EPA); NDDN (EPA); OEN (EPRI). The EMEFS data set consists of data from 129 stations over the eastern US and Ontario, Canada from 6/88 – 5/90 (Figure 6). The particulate data were collected over 24 hour periods using filter pack techniques, and analyzed for ions. The data set contains the concentrations, minimum detection limit, and data quality flag. The data were obtained from the Electric Power Research Institute (EPRI) at: http://src.com/~epriasdc/index.htm.

TVA

The Tennessee Valley Authority (TVA) network consists of 9 monitoring sites in Tennessee and surrounding states (Figure 6). PM2.5 and PM10 samples were collected every 6th day using dichotomous samplers from 5/80 – 9/87. Only concentration values were available. The data were obtained from the Tennessee Valley Authority.

DATA PROCESSING AND QUALITY CONTROL

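In order to merge the data from the different networks, the data were passed through a set of standardization routines homogenizing the data formats and metadata. The standardization process included reformatting the data into a common format and common units with uniform geographic and temporal coding, creating a consistent set of data flags, and adding a consistent set of metadata describing the networks, variables, sampling sites, samplers and analysis methods.

No quality control of the data beyond that done by the supplying organization has been performed on the data sets. However, as the data are used and problems are identified, appropriate procedures to remedy the problems in the data sets will be conducted.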
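As a rough illustration of the kind of standardization described in this paper (common units, uniform temporal coding, a common record layout), the following Python sketch converts one network-native value into a common record. It is not the authors' processing code. The function name, the conversion table, the set of input units and the ISO date output are all assumptions made for illustration.

```python
from datetime import datetime

# Hypothetical conversion factors to a common unit of ug/m3.
TO_UG_PER_M3 = {"ug/m3": 1.0, "ng/m3": 1e-3, "mg/m3": 1e3}


def standardize(loc_code, date_text, date_format, value, unit):
    """Return one record in a common layout: (Loc_Code, ISO date, ug/m3).

    date_format is the strptime pattern used by the supplying network.
    Missing values (None) are passed through unchanged.
    """
    date = datetime.strptime(date_text, date_format).date()
    converted = None if value is None else value * TO_UG_PER_M3[unit]
    return (loc_code, date.isoformat(), converted)


# The site code, date and value are taken from Table A-1.
record = standardize("NAPSCANS1HAL", "1/5/90", "%m/%d/%y", 14.8, "ug/m3")
print(record)
```

Consistent with the statement that no modifications were made beyond unit conversions, the sketch changes only the unit and the date coding, never the measured value itself.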

DISCUSSION

The integration process has completed the first steps of homogenization, description and finally integration of the data from the multiple networks. This process has grouped together data collected using different samplers, sampling durations and sample analysis techniques; in addition, some of the networks corrected the data to a reference temperature and pressure while others did not. The next step in the integration process will be to assess the impact of these network variations on the PM concentrations and possibly to adjust some of the data to account for them. Even without these adjustments, the North American Fine Particle Database is a rich resource for studying and addressing fine particulate issues. These data have already become the foundation of several analyses that are available on EPA's PM2.5 Analysis Workbook - Virtual Workgroup Web site at http://capita.wustl.edu/databases/userdomains/pmfine/.

ACKNOWLEDGMENTS

This project is supported by EPA’s Office of Air Quality Planning and Standards (OAQPS). The authors would like to thank all of the data suppliers who helped us obtain the data and assisted us in translating and describing the data sets.

REFERENCES

1. Sisler, J.F.; Huffman, D.; Latimer, D.A.; Malm, W.C.; Pitchford, M. Report #ISSN No. 0737-5352-26, CIRA, CSU, Fort Collins, CO, 1993.

2. Poirot, R.L.; Galvin, P.J.; Gordon, N.; Quan, S.; Arsdale, A.V.; Flocchini, R.G. "Annual and seasonal fine particle composition in the Northeast: Second year results from the NESCAUM monitoring network", presented at the 84th Annual A&WMA Meeting, Vancouver, Canada, 1991, Paper No. 91-49.1.

3. Tropp, R.J.; Sleva, S.F.; Ramadan, W.; Harris, C.J.; Berg, N.J., Jr. "Results of the 1994 Philadelphia PM2.5 and PM10 Saturation Study", presented at the 89th Annual Air & Waste Management Meeting & Exhibition, Nashville, TN, June 23-28, 1996, Paper No. 96-MP3.03.

4. Vasconcelos, L.A. Aerosol and Transport Climatology at the Grand Canyon; Doctoral Dissertation, Washington University, St. Louis, MO, 1995.

5. Husar, R.B.; Frank, N.H. "Interactive Exploration and Analysis of EPA's Aerometric Information and Retrieval System (AIRS) Data Sets", presented at the Air & Waste Management Association 84th Annual Meeting, Vancouver, BC, June 16-21, 1991.

6. NAtChem. The National Atmospheric Chemistry Database For Particles and Related Trace Gases / Toxics website: http://airquality.tor.ec.gc.ca/natchem/particles/

APPENDIX

Data File Format

The data set consists of a main data table containing the fine mass concentration and flag values, and location and variable tables which describe the monitoring sites and variables. The data are available in three file formats: Fixed Length ASCII, Voyager5, and Microsoft Access.

Data Table

A sample Data table is presented in Table A-1. The first two columns, Loc_Code and Date, are the key (dimensional) fields identifying the monitor and the sample date and time, respectively, for each data record. Additional information about the monitoring site is contained in the Location table. Each of the remaining columns contains the data for a single variable. The variable code (Var_Code) is used for the column name; additional variable metadata are located in the Variable table.
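To make the layout concrete, the short Python sketch below holds the sample rows of Table A-1 in memory, indexes them by the two key fields, and pulls out the series for one variable column. The helper names index_by_key and series are hypothetical, and the sketch does not read any of the three distributed file formats.

```python
# Rows copied from Table A-1; each row is (Loc_Code, Date, MF_cf, MF_ff).
COLUMNS = ("Loc_Code", "Date", "MF_cf", "MF_ff")
ROWS = [
    ("NAPSCANS1HAL", "1/5/90", 14.8, 0),
    ("NAPSCANS1HAL", "1/11/90", 13.1, 0),
    ("NAPSCANS1HAL", "1/17/90", 21.3, 0),
    ("NAPSCANS1HAL", "1/23/90", 17.5, 2),
    ("NAPSCANS1HAL", "1/29/90", 9.1, 0),
]


def index_by_key(rows, columns=COLUMNS):
    """Map (Loc_Code, Date) -> {variable code: value} for the other columns."""
    table = {}
    for row in rows:
        record = dict(zip(columns, row))
        key = (record.pop("Loc_Code"), record.pop("Date"))
        table[key] = record
    return table


def series(table, loc_code, var_code):
    """Return [(date, value)] for one site and one variable column."""
    return [
        (date, record[var_code])
        for (loc, date), record in table.items()
        if loc == loc_code
    ]


table = index_by_key(ROWS)
print(table[("NAPSCANS1HAL", "1/23/90")])       # both variables for one key
print(series(table, "NAPSCANS1HAL", "MF_cf"))   # fine mass concentrations
```

Because every non-key column is named by its variable code, adding another variable to the data table only adds another entry to each record.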

Location Table

The Location table is made up of the location code (Loc_Code) followed by the location name, longitude and latitude (Table A-2). Each station is assigned a unique Loc_Code based upon the "Station ID". The Station ID format is based on the format used in the Canadian NAtChem – Particle database6. The network as well as location information is encoded in the location code (see Table A-3). The Loc_Code for monitoring sites with AIRS site codes uses characters 1-4 for the network abbreviation and characters 5-13 for the AIRS site code. The Location table’s StationID field contains the original location identifier that comes with the database. The location name is made up of the location code, the station name, elevation and sampling starting date.
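The character positions given in Table A-3 translate directly into string slices. The Python sketch below decodes a standard 12-character Loc_Code and, separately, the 13-character form described for sites with AIRS site codes. The function and field names are hypothetical, and the AIRS example value is a made-up placeholder, not a real site code.

```python
from collections import namedtuple

LocCode = namedtuple("LocCode", "network country state monitor name_abbr")


def decode_loc_code(loc_code):
    """Split a 12-character Loc_Code using the positions in Table A-3."""
    if len(loc_code) != 12:
        raise ValueError("expected 12 characters, got %r" % loc_code)
    return LocCode(
        network=loc_code[0:4],     # characters 1-4: network abbreviation
        country=loc_code[4:6],     # characters 5-6: country
        state=loc_code[6:8],       # characters 7-8: state/province
        monitor=loc_code[8],       # character 9: monitor number
        name_abbr=loc_code[9:12],  # characters 10-12: location name abbreviation
    )


def decode_airs_loc_code(loc_code):
    """Split the AIRS form: characters 1-4 network, 5-13 AIRS site code."""
    if len(loc_code) != 13:
        raise ValueError("expected 13 characters, got %r" % loc_code)
    return loc_code[0:4], loc_code[4:13]


print(decode_loc_code("IMPRUSUT1ARC"))        # sample code from Table A-2
print(decode_airs_loc_code("AIRS000000000"))  # placeholder site code
```

Validating the length first keeps the two encodings from being confused, since a 13-character AIRS-style code would otherwise be sliced as if it followed Table A-3.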

Variable Table

The Variable table is made up of the variable abbreviation (Var_Abbr) and variable descriptive information (Table A-4). Each variable is assigned a unique Var_Abbr based upon the species, attribute (concentration or flag) and sampling cut point. Table A-5 lists the meaning of each character position in the Var_Abbr.
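A Var_Abbr can be decoded the same way, using the positions in Table A-5. The Python sketch below is illustrative only. The two lookup dictionaries contain just the codes that can be read off the sample rows of Table A-4 (c and f for the attribute, f for the cut point), and treating the underscore in MF_ as padding of the three-character species field is an assumption. Unknown codes are returned unchanged.

```python
# Codes inferred from the two sample rows in Table A-4 (MF_cf and MF_ff).
# The full code lists are in the Variable table and are not reproduced here.
ATTRIBUTES = {"c": "Concentration", "f": "Flag"}
CUT_POINTS = {"f": "<2.5 um"}


def decode_var_abbr(var_abbr):
    """Split a 5-character Var_Abbr using the positions in Table A-5."""
    if len(var_abbr) != 5:
        raise ValueError("expected 5 characters, got %r" % var_abbr)
    attribute_code = var_abbr[3]  # character 4: attribute
    cut_code = var_abbr[4]        # character 5: cut point
    return {
        # characters 1-3: species; trailing "_" assumed to be padding
        "species": var_abbr[0:3].rstrip("_"),
        "attribute": ATTRIBUTES.get(attribute_code, attribute_code),
        "cut_point": CUT_POINTS.get(cut_code, cut_code),
    }


print(decode_var_abbr("MF_cf"))
print(decode_var_abbr("MF_ff"))
```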

TABLES

Table 1. Data sets processed.

| Network | Network Abbr. | Data Source | # Sites | Time Span | PM Variables |
|---|---|---|---|---|---|
| Interagency Monitoring of Protected Visual Environments (IMPROVE) | IMPR | University of California Davis, caesar.ucdavis.edu | 93 | 3/88 – 2/98 | PM2.5, elemental composition, organics, ions, bab; PM10 |
| Northeast States for Coordinated Air Use Management (NESCAUM) | NESC | University of California Davis, caesar.ucdavis.edu | 11 | 9/88 – 11/93 | PM2.5, elemental composition, organics, ions, bab |
| Guelph Aerosol and Visibility Monitoring program (GAViM) | GAVM | University of Guelph, Ontario, http://www.physics.uoguelph.ca/PIXE/airq/airq.html | 4 | 6/94 – 12/97 | PM2.5, elemental composition, bab |
| National Park Service’s Stack Filter Units (SFU) | SFU | National Park Service | 80 | 7/79 – 11/93 | PM2.5, elemental composition, bab; and Coarse PM mass and elemental composition |
| Measurement of Haze and Visual Effects (MOHAVE) | MOHA | National Park Service | 43 | 1/10/92 – 2/15/92; 7/11/92 – 9/2/92 | PM2.5, elemental composition, organics, ions, bab; PM10 |
| Pacific Northwest Regional Visibility Experiment Using Natural Tracers (PREVENT) | PREV | National Park Service | 34 | 6/90 – 9/90 | PM2.5, elemental composition, bab |
| Winter Haze Intensive Tracer Experiment (WHITEX) | WHIT | National Park Service | 13 | 1/1/87 – 2/18/87 | PM2.5, elemental composition, organics, ions, bab |
| National Air Pollution Surveillance Network (NAPS) | NAPS | Environment Canada, http://airquality.tor.ec.gc.ca/rdmq | 29 | 1/90 – 12/96 | PM2.5, elemental composition, organics, and coarse PM mass and elemental composition |
| California Air Resource Board (CARB) | CARB | California Air Resource Board, http://www.arb.ca.gov/aqd/aqd.htm | 26 | 1/89 – 8/97 | PM2.5, elemental composition, and coarse PM mass and elemental composition |
| Aerometric Information Retrieval System (AIRS) | AIRS | EPA | 119 | 1/85 – 12/97 | PM2.5 |
| Clean Air Status and Trends Network (CASTNet) Visibility Chemistry | CAST | EPA - National Exposure Research Lab (NERL) | 12 | 10/93 – 12/97 | PM2.5, elemental composition, organics, ions, bab |
| Clean Air Status and Trends Network (CASTNet) Dry Deposition | CAST | EPA - National Exposure Research Lab (NERL) | 96 | 1/87 – 12/97 | Ions |
| National PM Research Monitoring Network | NPMR | EPA - National Exposure Research Lab (NERL) | 2 | 3/95 – 12/97 | PM2.5, elemental composition, organics and coarse PM mass and elemental composition |
| Philadelphia Saturation Study | PHLS | EPA | 16 | 9/11/94 – 10/9/94 | PM2.5, PM10 |
| Philadelphia 1992-1995 Study | PHCO | EPA | 1 | 12/5/92 – 12/4/95 | PM2.5, coarse PM mass, total PM mass |
| Subregional Cooperative Electric Utility, Department of Defense, National Park Service, and EPA study (SCENES) | SCEN | EPRI, http://src.com/~epriasdc/index.htm | 7 | 11/84 – 10/89 | PM2.5, elemental composition, organics, bab |
| Eulerian Model Evaluation and Field Study (EMEFS) | EMEF | EPRI, http://src.com/~epriasdc/index.htm | 129 | 6/88 – 5/90 | Ions |
| Tennessee Valley Authority (TVA) | TVA1 | Tennessee Valley Authority | 9 | 5/80 – 9/87 | PM2.5; PM10 |

Table A-1. A sample Data Table.

| Loc_Code | Date | MF_cf | MF_ff |
|---|---|---|---|
| NAPSCANS1HAL | 1/5/90 | 14.8 | 0 |
| NAPSCANS1HAL | 1/11/90 | 13.1 | 0 |
| NAPSCANS1HAL | 1/17/90 | 21.3 | 0 |
| NAPSCANS1HAL | 1/23/90 | 17.5 | 2 |
| NAPSCANS1HAL | 1/29/90 | 9.1 | 0 |

Table A-2. A sample Location Table.

| Loc_Code | Loc_Name | Loc_Lon | Loc_Lat |
|---|---|---|---|
| IMPRUSUT1ARC | IMPRUSUT1ARC__Arches National Park;_Devils Garden Campgr_1722_03/02/1988 | -1.9127980 | 0.67681439 |
| IMPRUSSD1BAD | IMPRUSSD1BAD__Badlands National Park;_Park Headquarters___760_03/02/1988 | -1.7791979 | 0.76346988 |
| IMPRUSNM1BAN | IMPRUSNM1BAN__Bandelier National Monument;_Fire tower____2000_03/02/1988 | -1.8546591 | 0.62464351 |
| IMPRUSTX1BIB | IMPRUSTX1BIB__Big Bend National Park;_3 miles SE of Pant_1067_03/02/1988 | -1.8009951 | 0.51186132 |
| IMPRUSCA1BLI | IMPRUSCA1BLI__Bliss State Park(TRPA);_1/4 mile beyond he_2043_11/17/1990 | -2.0961399 | 0.68038739 |
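The units of Loc_Lon and Loc_Lat are not stated in this paper. The sample values in Table A-2 are too small to be decimal degrees for these sites and appear to be in radians. This is an inference from the numbers, not a documented property of the data set. Under that assumption, a short Python sketch converts them to decimal degrees:

```python
import math


def to_degrees(lon_rad, lat_rad):
    """Convert a (longitude, latitude) pair from radians to decimal degrees.

    Assumes the Loc_Lon and Loc_Lat fields are stored in radians, which is
    an inference from the sample values and is not stated in the paper.
    """
    return math.degrees(lon_rad), math.degrees(lat_rad)


# First row of Table A-2 (Arches National Park).
lon, lat = to_degrees(-1.9127980, 0.67681439)
print(round(lon, 2), round(lat, 2))
```

For the Arches National Park row this gives roughly 109.6 degrees west and 38.8 degrees north, which is consistent with a site in eastern Utah.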

Table A-3. The Loc_Code encoding definitions.

| Characters | Definition |
|---|---|
| 1-4 | Network Abbreviation |
| 5-6 | Country |
| 7-8 | State/Province |
| 9 | Monitor number |
| 10-12 | Location Name Abbreviation |

Table A-4. A sample Variable Table.

| Var_Abbr | Var_Desc | Units | Species_Abbr | Species_Name | Attribute | Cut Point |
|---|---|---|---|---|---|---|
| MF_cf | Fine Mass Concentration | ug/m3 | MF | Fine Mass | Concentration | <2.5 um |
| MF_ff | Fine Mass Flag | ug/m3 | MF | Fine Mass | Flag | <2.5 um |

Table A-5. The Var_Abbr encoding definitions. The meanings of the abbreviations and codes are listed in the Variable Table.

| Characters | Definition |
|---|---|
| 1-3 | Species Abbreviation |
| 4 | Attribute |
| 5 | Cut Point |

FIGURES

Figure 1. Monitoring site locations and time trends for the North American Fine Particle Database.

Figure 2. Monitoring site locations and time trends for the IMPROVE, NESCAUM, and GAViM particulate networks.

Figure 3. Monitoring site locations and time trends for the NPS-SFU, MOHAVE, PREVENT, and WHITEX particulate networks.

Figure 4. Monitoring site locations and time trends for the NAPS, CARB, and AIRS particulate networks.

Figure 5. Monitoring site locations and time series for the CASTNet - Dry Deposition, CASTNet - Visibility Chemistry, Philadelphia Saturation Study, Philadelphia 1992-1995 Study and National PM Research Monitoring networks.

Figure 6. Monitoring site locations and time series for the SCENES, EMEFS, and TVA particulate networks.