A Bayesian Likelihood Display Approach to Enhancing Detection of Illicit Radioactive Materials at Border Crossings and Ports. Raja Parasuraman George Mason University Thomas Sanquist Pacific Northwest National Laboratory.
A Bayesian Likelihood Display Approach to Enhancing Detection of Illicit Radioactive Materials at Border Crossings and Ports • Raja Parasuraman, George Mason University • Thomas Sanquist, Pacific Northwest National Laboratory • IEEE Transactions on Systems, Man, and Cybernetics: Part C: Applications, in press
National Academy of Sciences Panel: Making the Nation Safer • 16 Technology Solutions offered • ALL involved • Human operators • Human-machine interfaces • Organizational changes • Selection and training issues • NO Human Factors analysis • 1 Panelist: Don Norman • 1 Presenter: Raja Parasuraman
Overview • Human Factors Problem: Radiation Portal Monitors (RPMs) • Have high false alarm rates and very low true threat detection rate • Create excessive workload for customs personnel • Lower operator trust and may create “cry wolf” effect • Theoretical approach • Bayesian theory • Signal detection theory and ROC analysis • Likelihood display concept • Field studies at Canada-US border crossings
Radiation Portal Monitors (RPMs) • Object (e.g., truck) passes through the large panel of the RPM • Detectors measure gamma and neutron radiation • Alarm results if radiation count exceeds threshold • If alarm occurs, secondary confirming scan conducted • Identification of alarm source • Resolution and discharge
Naturally Occurring Sources of Radiation (NORMs) • Ceramics • Fertilizer • Cat litter • Some fruits and vegetables (e.g., bananas) • Medical implants
Daily RPM Freight Activity at All Canada-US Border Crossings • Truck False Alarm Rate: 0.5 - 2% • Total Daily False Alarms in 2006: 468 • Typical Daily False Alarms at Single Crossing: 30-40 • Source: North American Transportation Statistics Database, 2007
What is the Detection Sensitivity (d’) of RPMs? • With a known radioactive test source, RPM yields an alarm 95% of the time • Hit rate = 0.95 (International Atomic Energy Agency, 2002) • When no radioactive source is present, RPM alarms 2% of the time • False alarm rate = 0.02 • Detection sensitivity d’ = z(p(hit)) - z(p(false alarm)) ≈ 3.7 • Good, but not great • Improved radiation detectors currently under development
What is the Sensitivity (d’) of RPMs for Classification of True Threats? • Detection performance refers to any radioactive source, including innocent radiation (NORMs) • When an illicit radioactive source is present, RPM alarms 95% of the time • Hit rate = 0.95 • When non-threatening radiation is present (NORM), RPM alarms 98% of the time • False alarm rate = 0.98 • Threat classification sensitivity d’ ≈ 0 (no better than chance)
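The two sensitivity figures above can be checked with a short sketch. It assumes the standard equal-variance definition d’ = z(hit rate) - z(false alarm rate); the function name `d_prime` is illustrative and not from the slides. Under that assumption the detection value comes out at roughly 3.7, and the classification value comes out slightly below zero, which the slide treats as zero sensitivity.

```python
from statistics import NormalDist


def d_prime(hit_rate: float, false_alarm_rate: float) -> float:
    """Equal-variance sensitivity index: z(hit rate) - z(false alarm rate)."""
    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    return z(hit_rate) - z(false_alarm_rate)


# Detection: alarm on any radioactive source vs. no source present.
detection = d_prime(0.95, 0.02)

# Classification: alarm on illicit material vs. alarm on innocent NORM.
classification = d_prime(0.95, 0.98)

print(f"detection d' = {detection:.2f}")
print(f"classification d' = {classification:.2f}")
```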
ROC Analysis • [Figure: ROC curves in two panels, Detection and Classification]
How Can Classification Sensitivity Be Increased? • Use more reliable algorithms for classification • However, even if d’ can be made higher, the low base rate of true threats poses a problem • Need to increase the posterior probability of a true threat given an alarm — positive predictive value (PPV) of alarm
Bayes’ Theorem • Easy form: Odds in favor of hypothesis = Prior odds x Likelihood ratio of data • H1 = Hypothesis • H2 = Alternative hypothesis • D = Data (diagnostic information) • P(H1) = Prior probability of hypothesis 1 • P(H2) = Prior probability of hypothesis 2 • P(H1/D) = Posterior probability of hypothesis 1 given data • $\frac{P(H_1/D)}{P(H_2/D)} = \frac{P(H_1)}{P(H_2)} \times \frac{P(D/H_1)}{P(D/H_2)}$ • Reverend Thomas Bayes (1763). "An Essay towards solving a Problem in the Doctrine of Chances". Philosophical Transactions of the Royal Society of London.
Underestimation of effect of base rates • H1 = Breast cancer • H2 = Normal • D = Positive mammography test • P(H1) = Population base rate of breast cancer = 1% = .01 • P(H2) = 1 - .01 = .99 • P(D/ H1) = Given cancer (H1), positive test (D) occurs 90% of the time • How well do people estimate the probability of cancer given a positive test?
Underestimation of effect of base rates • $\frac{P(H_1/D)}{P(H_2/D)} = \frac{P(H_1)}{P(H_2)} \times \frac{P(D/H_1)}{P(D/H_2)}$ • Prior odds = P(H1) / P(H2) = .01/.99 ≈ .01 (about a 1 in 100 chance) • P(D/H1) = 90% correct = .9 • P(D/H2) = 1 - .9 = .1 • Likelihood ratio = P(D/H1) / P(D/H2) = .9/.1 = 9 (9 to 1) • Posterior odds of breast cancer = .01 x 9 = .09, or about a 1 in 10 chance • Most people misestimate the odds as closer to 9 to 1 than to 1 in 10
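As a check on the arithmetic above, here is a minimal sketch of the odds form of Bayes' theorem applied to the mammography numbers. The function name `posterior_odds` is illustrative; the false positive rate of .1 is the slide's own assumption (1 - .9).

```python
def posterior_odds(prior_h1: float, p_d_given_h1: float, p_d_given_h2: float) -> float:
    """Posterior odds of H1 over H2 = prior odds x likelihood ratio."""
    prior_odds = prior_h1 / (1.0 - prior_h1)
    likelihood_ratio = p_d_given_h1 / p_d_given_h2
    return prior_odds * likelihood_ratio


# Base rate 1%, positive test in 90% of cancer cases and 10% of normal cases.
odds = posterior_odds(prior_h1=0.01, p_d_given_h1=0.9, p_d_given_h2=0.1)

# Convert odds to a probability: p = odds / (1 + odds).
probability = odds / (1.0 + odds)

print(f"posterior odds = {odds:.3f}, posterior probability = {probability:.3f}")
```

The posterior odds stay close to the slide's .09 because the low base rate dominates the 9 to 1 likelihood ratio.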
Base Rate Estimates for RPM • Nuclear smuggling incidents in Europe as monitored by the International Atomic Energy Agency • 450 attempts in 10 years = 45/year * • 11,000,000 annual border crossings • IAEA Base rate = 45/11,000,000 = 0.0000041 • United Nations Non-Proliferation Committee estimate based on additional data on theft of weapons-grade plutonium from countries of the former Soviet Union • UNNPC Base rate = 0.0000076 • * Orlov, V.A. (2004). Illicit nuclear trafficking and the new agenda. IAEA Bulletin, 46:1, 53-56.
RPM Posterior Probability of True Alarm • Posterior probability of a true threat given an alarm? • $P(\text{threat}/\text{alarm}) = \frac{P(\text{alarm}/\text{threat})}{P(\text{alarm}/\text{threat}) + P(\text{alarm}/\text{no threat}) \cdot \frac{1 - \text{base rate}}{\text{base rate}}}$ • IAEA estimate: Posterior probability = 0.0000039 • UNNPC estimate: Posterior probability = 0.00000073 • Alarm rate = 0.5 - 2% of traffic at Canada-US crossings • True alarm: ONE every 2 to 6 years
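The posterior formula above can be sketched as a small function. The slide does not say which alarm rates were plugged in, so the two calls below are assumptions: one uses the threat-classification rates from the earlier slide (0.95 and 0.98), the other the detection rates (0.95 and 0.02). The base rate is the IAEA ratio of 45 attempts per 11,000,000 crossings; the name `posterior_threat` is illustrative.

```python
def posterior_threat(hit_rate: float, false_alarm_rate: float, base_rate: float) -> float:
    """P(threat/alarm) in the form shown on the slide."""
    prior_ratio = (1.0 - base_rate) / base_rate
    return hit_rate / (hit_rate + false_alarm_rate * prior_ratio)


iaea_base_rate = 45 / 11_000_000

# Assumption: classification rates (illicit source vs. NORM).
ppv_classification = posterior_threat(0.95, 0.98, iaea_base_rate)

# Assumption: detection rates (any source vs. no source).
ppv_detection = posterior_threat(0.95, 0.02, iaea_base_rate)

print(f"classification rates: {ppv_classification:.2e}")
print(f"detection rates:      {ppv_detection:.2e}")
```

With the classification rates the result lands on the same order of magnitude as the IAEA posterior quoted on the slide; either way, the posterior is dominated by the tiny base rate.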
RPM Human Factors Problems • Resolving and clearing a freight false alarm takes 30-45 minutes and two customs officers • Staffing shortages exacerbate the workload problem • Study conducted by Pacific Northwest National Lab at two Canada-US border crossings • N=12 customs personnel (8-hour work shift) • NASA-TLX workload and trust (credibility) ratings of RPM at different periods during shift
How Can RPM Performance Be Enhanced? • Step 1: Setting a minimum decision criterion • Step 2: Setting a decision criterion that maximizes the posterior probability of a correct alarm response • The second step is usually ignored in automated alerting system design
Step 1. Setting Minimum Decision Thresholds for Warning Systems [Necessary but Not Sufficient] • [Figure: Receiver operating characteristic (ROC), P(R|S) plotted against P(R|N), with the criterion running from ß = ∞ at the origin to ß = 0 at the upper right; the maximum permissible false alarm rate f fixes the minimum decision threshold ßf]
Example Using Step 1 • Requirement: For a warning system with d’ = 6, the false alarm rate should not exceed .001 • Then the decision threshold for the system should be set such that ßf = 1.71 or higher • For ßf = 1.71, only .18% of hazardous events will be missed, and only .1% of non-hazardous events will be responded to falsely
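The Step 1 numbers can be reproduced with a short sketch, assuming the usual equal-variance Gaussian signal detection model in which ß is the ratio of the signal density to the noise density at the criterion. That model and the function name `step1_threshold` are assumptions; the slides do not spell out how ßf was computed.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal, mean 0 and sd 1


def step1_threshold(d_prime: float, max_false_alarm_rate: float) -> tuple[float, float]:
    """Return (miss rate, beta) at the criterion that yields the maximum false alarm rate."""
    # Criterion location on an axis where noise is centred at 0 and signal at d'.
    criterion = -STD_NORMAL.inv_cdf(max_false_alarm_rate)
    hit_rate = 1.0 - STD_NORMAL.cdf(criterion - d_prime)
    # Likelihood ratio of signal to noise densities at the criterion.
    beta = STD_NORMAL.pdf(criterion - d_prime) / STD_NORMAL.pdf(criterion)
    return 1.0 - hit_rate, beta


miss_rate, beta_f = step1_threshold(d_prime=6.0, max_false_alarm_rate=0.001)
print(f"miss rate = {miss_rate:.4f}, beta_f = {beta_f:.2f}")
```

Under these assumptions the miss rate and threshold agree with the slide's .18% and 1.71 to within rounding.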
POSTERIOR PROBABILITY FUNCTION • [Figure: posterior probability p(S|R), from 0.0 to 1.0, plotted against the a priori probability p or base rate, from 0.00 to 0.10, as a family of curves for increasing ß]
Despite the apparently impressive statistics of the warning system with d’ = 6, if the base rate or a priori probability p is low, then the posterior probability p(threat/alarm) can also be very low • If p = .001, then the posterior probability p(threat/alarm) = .49 • Hence only 1 in 2 alarms will be true alarms • If p = .0001, the posterior odds of a true alarm are only 1 in 11!
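To see how quickly the posterior collapses with the base rate, the sketch below reuses the operating point from the Step 1 example (hit rate about .9982, false alarm rate .001). Plugging those two rates in is an assumption about how the slide's figures were derived; the function name `ppv` is illustrative.

```python
def ppv(hit_rate: float, false_alarm_rate: float, base_rate: float) -> float:
    """Posterior probability of a threat given an alarm (positive predictive value)."""
    true_alarms = hit_rate * base_rate
    false_alarms = false_alarm_rate * (1.0 - base_rate)
    return true_alarms / (true_alarms + false_alarms)


HIT_RATE = 0.9982         # 1 - .18% misses, from the Step 1 example
FALSE_ALARM_RATE = 0.001  # maximum permissible false alarm rate

ppv_at_1_in_1000 = ppv(HIT_RATE, FALSE_ALARM_RATE, 0.001)
ppv_at_1_in_10000 = ppv(HIT_RATE, FALSE_ALARM_RATE, 0.0001)

for base_rate in (0.1, 0.01, 0.001, 0.0001):
    print(f"base rate {base_rate:<7} -> PPV {ppv(HIT_RATE, FALSE_ALARM_RATE, base_rate):.3f}")
```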
Step 2. Setting Warning System Parameters for High Posterior Probability or Positive Predictive Value (PPV) • 0.5 < p(threat/alarm) = PPV < 1.0 • Determine decision criterion ßPPV that leads to a minimum posterior probability PPV
POSTERIOR PROBABILITY FUNCTION • [Figure: posterior probability p(threat/alarm), from 0.0 to 1.0, plotted against the a priori probability p or base rate, from 0.00 to 0.10; the curve ß = ßPPV, the minimum required posterior probability PPV, and the minimum base rate b bound the space of admissible alarm performance]
Example Using Step 2 • For a warning system with d’ = 6, the false alarm rate should not exceed .005, and the posterior probability PPV must exceed .8 • Then, if the base rate b is .001, ßPPV must be at least 181.1 • The posterior odds of a true alarm with these parameters will then be 8 in 10 • However, even better performance could be obtained if d’ could be increased. But by how much?
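Design Criteria for Increasing the RPM Predictive Value of a True Threat • $P(\text{threat}/\text{alarm}) = \frac{P(\text{alarm}/\text{threat})}{P(\text{alarm}/\text{threat}) + P(\text{alarm}/\text{no threat}) \cdot \frac{1 - \text{base rate}}{\text{base rate}}}$ [1] • $P(\text{alarm}/\text{no threat}) = K \cdot P(\text{alarm}/\text{threat})$ [2] • where $K = \frac{\text{base rate} \cdot (1 - \text{PPV})}{\text{PPV} \cdot (1 - \text{base rate})}$ [3] • $d' = z(p(\text{alarm}/\text{threat})) - z(p(\text{alarm}/\text{no threat}))$ [4] • Equations 1-4 can be displayed as a family of functions for the required minimum d’ for a specified PPV and hit rate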
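Equations 1-4 chain together into a single calculation of the minimum d’ for a target PPV, hit rate, and base rate. The sketch below follows them in order; the function name `required_d_prime` and the example inputs are illustrative choices, not values from the slides, apart from the IAEA ratio of 45 attempts per 11,000,000 crossings.

```python
from statistics import NormalDist


def required_d_prime(ppv: float, hit_rate: float, base_rate: float) -> tuple[float, float]:
    """Return (minimum d', maximum false alarm rate) for a target PPV."""
    k = base_rate * (1.0 - ppv) / (ppv * (1.0 - base_rate))  # Eq. 3
    false_alarm_rate = k * hit_rate                           # Eq. 2
    z = NormalDist().inv_cdf
    return z(hit_rate) - z(false_alarm_rate), false_alarm_rate  # Eq. 4


# Illustrative target: PPV of .8 at a hit rate of .95 and a base rate of .001.
d_min, fa_max = required_d_prime(ppv=0.8, hit_rate=0.95, base_rate=0.001)

# Same hit rate, a modest PPV of .5, and the IAEA base rate.
d_iaea, _ = required_d_prime(ppv=0.5, hit_rate=0.95, base_rate=45 / 11_000_000)

print(f"base rate .001: d' >= {d_min:.2f}, false alarm rate <= {fa_max:.2e}")
print(f"IAEA base rate: d' >= {d_iaea:.2f}")
```

Sweeping `ppv` and `hit_rate` over a grid would give the family of functions the slide refers to.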
Likelihood Displays as an Additional Approach • Current radiation detection technologies cannot reach these required levels of d’ (6 and higher) • Likelihood display concept (Sorkin & Kantowitz, 1988) • Number of levels of uncertainty for optimal visualization not clear: N = 3, 4, 5, ...? • In time-stressed conditions, not much benefit beyond 4 levels (Schinzer et al., 2000)
Likelihood Display Concept • Levels of resolution • Example: Danger (D), Uncertain (U), Safe (S) • D - S • D - U - S • D - D/U - U - U/S - S • Likelihood displays • improve user diagnostic performance (St. John & Mannes, 2001) • reduce user workload (Sorkin & Kantowitz, 1988) • How many levels needed? • ~ 4 may be sufficient (Schinzer et al., 2000)
Predictive Probabilistic and Temporal Conflict Avoidance Displays (courtesy of Jason Telner & Paul Milgram, University of Toronto)
RPM Likelihood Display Concept • Inputs • Use radiation energy spectra • Use cargo manifest data • Statistical processor • Weighted threshold • Bayesian classifier • Neural network classifier • 3 Level Display • No material of concern • Alert: Naturally occurring radioactive material • Alarm: Radioactive threat material
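One way to picture the Bayesian classifier option feeding the 3-level display is the toy sketch below. Everything numeric in it is made up for illustration: the priors, the likelihoods of an observed spectrum under each class, and the rule of displaying the most probable class are assumptions, not the processor used in the field study.

```python
DISPLAY_LEVELS = (
    "No material of concern",
    "Alert: Naturally occurring radioactive material",
    "Alarm: Radioactive threat material",
)


def classify(priors: list[float], likelihoods: list[float]) -> tuple[str, list[float]]:
    """Return the display level with the highest posterior, plus all posteriors."""
    joint = [p * l for p, l in zip(priors, likelihoods)]
    total = sum(joint)
    posteriors = [j / total for j in joint]
    best = max(range(len(posteriors)), key=posteriors.__getitem__)
    return DISPLAY_LEVELS[best], posteriors


# Hypothetical priors for (background, NORM, threat) and hypothetical likelihoods
# of one observed energy spectrum under each class.
hypothetical_priors = [0.98, 0.02, 0.000004]
hypothetical_likelihoods = [0.01, 0.90, 0.05]

level, posteriors = classify(hypothetical_priors, hypothetical_likelihoods)
print(level, [round(p, 6) for p in posteriors])
```

A real design would more likely apply separate thresholds per level than a plain maximum, since the very low threat prior means the threat class almost never wins outright.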
Data Input for Likelihood Display • Energy spectra for NORMs • Background • Fertilizer • Tile • Energy spectra for threat material • Weapons grade Plutonium • Highly enriched Uranium • Cargo manifest data • Available on Dept. of Commerce and Dept. of Transportation databases • Unable to use: these are separate computer databases that cannot be accessed by border customs computers
Energy Spectra for NORMs and Threat Material • NORM classification at Middle frequencies • Threat classification at Low frequencies • WGPu = Weapons grade Plutonium • HEU = Highly enriched Uranium
RPM Likelihood Display • Radioactive threat material • Naturally occurring radioactive material • No material of concern
The Bottom Line: RPM Likelihood Display Performance • Test conducted during 7 month period in 2004 at 1 Crossing • Compared regular RPM display with Likelihood Display RPM • Likelihood Display RPM with Energy Window Ratio Processor • N=14 customs personnel • Paper records of cargo manifest analyzed post hoc • 1,740 alarms occurred • 1,617 NORMs • 123 Medical treatment of driver or passenger
Likelihood Display RPM False Alarm Performance • 88% reduction of false alarms (from 1,617 to 201) using Energy Window Ratio Likelihood Display • 100% reduction of false alarms (from 1,617 to 0) if we could have used Energy Window + Cargo Information (post-hoc analysis) • Medical alarms not affected significantly: these are surrogates for true threat materials
Conclusions • Current RPMs have high false alarm rates and very low true threat detection rates • RPM operators have high workload and low trust • A Bayesian signal detection theory approach can be used to set design criteria for RPM systems with high posterior probability of true threat detection • A likelihood display concept based on energy spectra and cargo data can enhance RPM performance • Field study confirms that Likelihood Display RPM is associated with lower workload and higher trust
Future Work • Fuse computerized cargo manifest database with RPM system • Use Bayesian or neural network methods to classify threats based on energy spectra, cargo data, and other information • Expand likelihood uncertainty levels and examine different display candidates • Validate in field studies