Analysis of Chromium Emissions Data

Nagaraj Neerchal and Justin Newcomer, UMBC and OIAA/OEI, and Mohamed Seregeldin, Office of Air Quality Planning and Standards, EPA, RTP.


  1. Analysis of Chromium Emissions Data. Nagaraj Neerchal and Justin Newcomer, UMBC and OIAA/OEI, and Mohamed Seregeldin, Office of Air Quality Planning and Standards, EPA, RTP

  2. Objective • To develop a protocol (methodology) for obtaining confidence bounds for the “Mean Chromium Emissions” for each welding process and rod type combination. • Incorporate all the data, including the averages, to the best of our ability.

  3. About The Data • Three Welding Processes • GMAW, SMAW, FCAW • Three Rod Types • E308, E309, and E316 • Multiple Sources of Data • Some report individual measurements • Some report only averages without the original observations. • Units of reporting vary—all are converted to g/kg

  4. Summary Statistics Note: Summary statistics are based only on observations with a single measurement.

  5. Combining Rod Types • Combine E308+E316 because of the similar technology and small sample size • Sample Sizes:

  6. Summary Statistics After Combining Data for Rod Types Note: Summary statistics are based only on observations with a single measurement.

  7. Traditional Approaches • Assume Normality? • Normality is not a good assumption for this data set at all • Sample sizes are very small for certain combinations • Bounds obtained assuming normality give meaningless results (e.g. negative lower bounds for an emission rate that cannot be negative) when the data are not normally distributed • 95% Confidence Intervals for the Mean: Note: Summary statistics are based only on observations with a single measurement.
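To illustrate the negative-bound problem, here is a minimal sketch of a normal-theory (Student t) 95% confidence interval for the mean. The measurements are hypothetical right-skewed values in g/kg, not the presentation's data, and the critical value 2.776 is the standard t quantile for 4 degrees of freedom.

```python
import math
import statistics

# Hypothetical right-skewed emission measurements in g/kg (not the real data).
measurements = [0.1, 0.2, 0.3, 0.5, 6.0]

n = len(measurements)
mean = statistics.mean(measurements)
std_err = statistics.stdev(measurements) / math.sqrt(n)

# 0.975 quantile of Student's t with n - 1 = 4 degrees of freedom.
T_CRIT = 2.776

lower = mean - T_CRIT * std_err
upper = mean + T_CRIT * std_err

print(f"mean = {mean:.3f}, 95% CI = ({lower:.3f}, {upper:.3f})")
# The lower limit falls below zero, which is meaningless for an emission rate.
```

With a small, skewed sample like this one, a single large value inflates the standard error enough to push the lower limit below zero.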

  8. Traditional Approaches • Transform the data to normality • The optimal transformation for the Total Chromium data is different from the optimal transformation for the Chrom6 data. • It is hard to transform the confidence bounds back to the original scale (the mean of the log is not the same as the log of the mean!) • Box-Cox Log-Likelihood Plots:
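The back-transformation problem can be seen numerically. The sketch below uses hypothetical values (not the presentation's data) and compares the arithmetic mean with the result of averaging on the log scale and exponentiating, which gives the geometric mean instead.

```python
import math
import statistics

# Hypothetical positive measurements in g/kg (not the real data).
measurements = [0.1, 0.2, 0.3, 0.5, 6.0]

arithmetic_mean = statistics.mean(measurements)

# Average on the log scale, then transform back with exp().
mean_of_logs = statistics.mean([math.log(x) for x in measurements])
back_transformed = math.exp(mean_of_logs)  # this is the geometric mean

print(f"arithmetic mean          = {arithmetic_mean:.3f}")
print(f"exp(mean of log values)  = {back_transformed:.3f}")
# The back-transformed value is smaller than the arithmetic mean, so bounds
# computed on the log scale do not translate directly into bounds on the mean.
```

For positive data that are not all equal, the back-transformed value is always below the arithmetic mean, which is why an interval for the mean of the logs is not an interval for the mean emission.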

  9. Traditional Approaches • Weighted regression to incorporate the averages

  10. Traditional Approaches • Weighted Regression • Estimates have good properties (such as being the best linear unbiased estimator, BLUE) in general, not only for normal data • But the confidence bounds are sensitive to the normality assumption, especially when the sample sizes are small, as in our case.
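As a rough sketch of how averages can enter a weighted estimate, the code below treats the simplest case: an intercept-only weighted least squares fit, where each reported value is weighted by the number of tests behind it. This weighting is an assumption for illustration (it presumes independent tests with a common variance); the slides do not show the exact model used. The function name and records are hypothetical.

```python
def weighted_mean(records):
    """Intercept-only weighted least squares estimate of the mean.

    records: list of (reported_value, n_tests) pairs, where a single
    measurement has n_tests == 1 and a reported average has n_tests > 1.
    The weight for each record is n_tests (an assumption for illustration).
    """
    total_weight = sum(n_tests for _, n_tests in records)
    weighted_sum = sum(value * n_tests for value, n_tests in records)
    return weighted_sum / total_weight


# Hypothetical records in g/kg: two single tests and one average of four tests.
records = [(0.8, 1), (1.2, 1), (1.0, 4)]

estimate = weighted_mean(records)
print(f"weighted mean = {estimate:.3f}")
```

Weighting by the number of tests means an average of four tests counts as much as four individual measurements would, rather than as a single observation.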

  11. Traditional Approaches • Nonparametric Approaches? • Nonparametric approaches usually use ranks. When only averages are reported, we completely lose the information regarding ranks. Therefore, the reported averages cannot be incorporated into nonparametric approaches. • Bootstrapping? • Made popular by Bradley Efron in the 1980s • Efron and Tibshirani (1993) • Millard, S. P. and Neerchal, N. K. (2000)

  12. Bootstrapping • What is Bootstrapping? • Resampling the observed data • It is a simulation type of method where the observed data (not a mathematical model) is repeatedly sampled for generating representative data sets • Only indispensable assumption is that “observations are a random sample from a single population” • There are some fixes available when the single population assumption is violated as in our case. • Can be implemented in quite a few software packages: e.g. SPLUS, SAS • Millard and Neerchal (2000) gives S-Plus code

  13. Bootstrapping - The Details Bootstrapping inference is based on the distribution of the replicated values of the statistic: T*_1, T*_2, ..., T*_B. For example, the bootstrap 95% upper confidence bound based on T is given by the 95th percentile of the distribution of the T* values.
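The percentile rule described above can be sketched in a few lines. This is a minimal illustration using the sample mean as the statistic T and hypothetical data; the function name, the number of replicates, and the simple order-statistic percentile are choices made here, not taken from the presentation or from the S-Plus code it cites.

```python
import random
import statistics


def bootstrap_upper_bound(data, stat=statistics.mean, n_boot=2000,
                          level=0.95, seed=0):
    """Percentile bootstrap upper confidence bound for stat(data)."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(data)
    # T*_1, ..., T*_B: the statistic recomputed on resamples drawn with
    # replacement from the observed data, each of the original size n.
    replicates = sorted(stat(rng.choices(data, k=n)) for _ in range(n_boot))
    # Simple order-statistic approximation of the requested percentile.
    index = min(n_boot - 1, int(level * n_boot))
    return replicates[index]


# Hypothetical right-skewed measurements in g/kg (not the real data).
measurements = [0.1, 0.2, 0.3, 0.5, 6.0]

ucb = bootstrap_upper_bound(measurements)
print(f"bootstrap 95% upper confidence bound for the mean: {ucb:.3f}")
```

Because every replicate is a mean of observed values, the bound always lies between the smallest and largest observation, so it cannot be negative for nonnegative data.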

  14. Bootstrapping Single Tests Data Note: Columns in yellow represent the 95% upper confidence bound

  15. Bootstrapping the Combined Data • Group the data points according to the number of tests used in reporting the average, within each welding process and rod type combination. Then bootstrap within each such group. • e.g. for GMAW and E316: Note: Each color represents a separate group
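A possible reading of the grouped procedure is sketched below for one welding process and rod type combination. Resampling is done separately within each group of records that share the same number of tests. How the resampled groups are then combined into one statistic is not stated on the slide; the test-count-weighted mean used here is an assumption, and the function name and records are hypothetical.

```python
import random
from collections import defaultdict


def grouped_bootstrap_ucb(records, n_boot=2000, level=0.95, seed=0):
    """Upper confidence bound from a bootstrap stratified by number of tests.

    records: list of (reported_value, n_tests) pairs for ONE welding
    process and rod type combination.
    """
    rng = random.Random(seed)

    # Group reported values by the number of tests behind each value.
    groups = defaultdict(list)
    for value, n_tests in records:
        groups[n_tests].append(value)

    replicates = []
    for _ in range(n_boot):
        weighted_sum = 0.0
        total_tests = 0
        for n_tests, values in groups.items():
            # Resample with replacement within the group only.
            resample = rng.choices(values, k=len(values))
            # ASSUMPTION: combine groups with a test-count-weighted mean.
            weighted_sum += n_tests * sum(resample)
            total_tests += n_tests * len(resample)
        replicates.append(weighted_sum / total_tests)

    replicates.sort()
    index = min(n_boot - 1, int(level * n_boot))  # approximate percentile
    return replicates[index]


# Hypothetical records in g/kg: (reported value, number of tests).
records = [(0.8, 1), (1.1, 1), (0.6, 1), (0.9, 3), (1.3, 3), (1.0, 5)]

ucb = grouped_bootstrap_ucb(records)
print(f"grouped bootstrap 95% upper confidence bound: {ucb:.3f}")
```

Note that a group containing a single record, such as the five-test average here, contributes the same value to every replicate, so it adds no resampling variability.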

  16. Bootstrapping - Results Note: Columns in yellow represent the 95% upper confidence bound

  17. Final Remarks • Normality assumption is not appropriate for either Total Chromium or Chromium6 data. • Weighted regression model can accommodate the averages into the estimates. • Bootstrapping the data seems to be a way to ensure that meaningful confidence bounds are obtained • More work is needed to study the robustness of Bootstrapping results with respect to some extreme values in the data
