
Industrial Diagnosis by Hyper Space Data Mining

Explore how hyperspace data mining revolutionizes industrial diagnosis with examples in steel making, gasoline production, and chemical plants. Discover solutions for improving product quality, resolving bottlenecks, and enhancing yield. Learn about MasterMiner™ demo and web-based diagnostic systems. Discover the evolution of diagnostic techniques and data mining technologies, such as correlation analysis, neural networks, and hyper space data mining. Uncover the applications of hyperspace data mining in optimizing processes and achieving better results in complex industrial scenarios. Stay ahead with cutting-edge tools like data loaders, factor analysis, and target-factor analysis provided by MasterMiner™ to streamline your industrial diagnosis processes.


Presentation Transcript


  1. Industrial Diagnosis by Hyper Space Data Mining
     Presented at AAAI 99 Spring Symposium on Equipment Diagnosis, Stanford University, March 23, 1999
     Dr. Dongping (Daniel) Zhu
     Zaptron Systems, Inc., Mountain View, CA 94043
     Tel: 650-966-8700, Fax: 650-966-8780
     E-mail: zhu@zaptron.com
     http://www.zaptron.com
     Zaptron, 1999

  2. OUTLINE
     • Diagnosis overview: applications & technologies
     • Hyperspace data mining
     • Diagnostic examples
       • product quality control (steel making)
       • resolve bottleneck (gasoline production)
       • improve yield (chemical plant)
     • Conclusions
     • MasterMiner™ demo

  3. Diagnosis & Trouble-Shooting
     • Cost of support to products/services
     • Customer satisfaction
     • Key Issues
       • how to best approach the same problem next time
       • how to use history information - data mining
       • how to update KB
     • Solutions
       • on-line help
       • web-based, remote diagnostics
       • knowledge management tools
       • data mining (history data are available)

  4. A Web-based Diagnostic System
     Diagram labels: D Mining; KD (D+K); K Updating; Call Centers; Service Teams; Support Teams; Data Collecting Mechanisms; Standardization; Data Management; Product Delivery Mechanisms; Training tools; Web-based diagnosis; On-line Help SW; Remote Repairs; Factor Analysis; KB manage

  5. Rule-based Diagnostic Process
     Diagram labels: History Database; Fault Physics; Primary Cases; Cause Analysis; Fix Fault; Diagnose; Rule Base; Diagnostic Matrix; Self Learning; Query; New data & Cases; Update Database

  6. Expert System Architecture
     Diagram labels: Data Base {a, b}; KB {Mij}; Web Users; Interviewer (fi, hj); Web GUI; K Collector (aijl, bikl); Analyzer, Visualizer; KB Builder (Mijk); Problem Solver (Search Engine); Self Learner rijk

  7. Evolution of Diagnostic Techniques
     • Equipment and Processes
     • Sensors
     • Data
     • Databases
     • Data Models
     • Data Patterns (behavior in space)
     • Data Fusion, sensor fusion
     • Data Mining
     • Data ……

  8. Data Mining: Techniques
     • Correlation/association analysis
     • Factor analysis
     • Trend prediction & forecasting
     • Neural networks
     • Genetic algorithms
     • Fuzzy logic, expert systems
     • Uncertainty reasoning (DS, rough sets)
     • Bayesian networks
     • Hyper space data mining
       • find data pattern first
       • no model assumption
       • provide solutions to failure isolation/recognition

  9. Hyper Space Data Mining
     • Introduction
     • Diagnosis - An optimization problem
     • A Hyper Space Technology
     • Application Examples
     • SW: MasterMiner™

  10. A General Issue
      • For any system - find a model to describe relationships
      • Relationships: nonlinear, high noise, multi-variant (no model)
      • From: operating data record, in situ sensor report, raw materials composition, design/operating process parameters
      • To: failure & fault, bottleneck, energy use, cost/risk, quality, yield/returns, reliability, productivity

  11. A Catch-22 Problem
      Data Pattern <--?--> Data Model
      Questions:
      • what type of data to collect
      • which data to use in modeling
      Solution:
      • Hyperspace data mining

  12. To Start - A Real Case
      Aluminum Production Problem
      Target: to optimize the leaching rate of Al2O3
      Factors:
      • a1 - Fe/Al in the ore
      • a2 - Sodium, Na/(Al2O3+Fe2O3)
      • a3 - leaching temperature
      • a4 - lime, (CaO)/(SiO2-TiO2)
      2 Solutions:
      • Principal Component Analysis (PCA) by SAS JMP or RS/1 - bad result
      • Hyperspace data mining by Zaptron MasterMiner™ - good result

  13. Can you see the pattern?
      • If not, do data mining to separate into subspaces

  14. A Real Case - PCA Result: no separation

  15. A Real Case - MasterMiner: good separation

  16. MasterMiner 2nd step: complete separation

  17. A Real Case - MasterMiner: build a model

  18. Steps in Data Mining
      • History Data
      • Separability Test
      • Pretreatment: local view, delete outliers
      • Linearity, topological type, correlation, association, best matching point, NN points
      • Data Mining
      • Feature reduction (entropy, voting)
      • Feature Selection
      • Inequality, equations, PLS, sensitivity, advisory
      • Modeling (PH, MREC, ANN, GA)
      • State diagnosis by using current operation data
      • Extrapolation to optimal zone for max yield
      • Propose an optimal operating condition or new materials
      • Equations as criteria for optimal control
      • Map description of cross-sections of normal op zone & failure zones

  19. Clustering - Data Separation
      • PCA - projection in the max separable direction
      • Fisher - line projection with max distance between clusters
      • MREC - projective geometry, better than either
      Diagram labels: Data Base; Data Mining; Data Patterns
      Data pattern types: One-sided (voting); Inclusive (entropy); Exclusive; Sandwich
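
As a rough illustration of the Fisher line projection named on this slide, the Python sketch below computes a Fisher discriminant direction for two small 2-D clusters and projects the samples onto it. The sample coordinates and the function names (`fisher_direction`, `project`) are hypothetical and do not come from the presentation or from MasterMiner™. This is the textbook two-class Fisher criterion, shown only to make the idea of "line projection with max distance between clusters" concrete.

```python
def mean(points):
    """Component-wise mean of a list of 2-D points."""
    n = len(points)
    return [sum(p[i] for p in points) / n for i in range(2)]


def scatter(points, center):
    """2x2 scatter matrix of the points around their class center."""
    sxx = sum((p[0] - center[0]) ** 2 for p in points)
    sxy = sum((p[0] - center[0]) * (p[1] - center[1]) for p in points)
    syy = sum((p[1] - center[1]) ** 2 for p in points)
    return [[sxx, sxy], [sxy, syy]]


def fisher_direction(class_a, class_b):
    """Unit vector w proportional to Sw^-1 (mean_b - mean_a)."""
    ma, mb = mean(class_a), mean(class_b)
    sa, sb = scatter(class_a, ma), scatter(class_b, mb)
    # Within-class scatter is the sum of the two class scatter matrices.
    sw = [[sa[i][j] + sb[i][j] for j in range(2)] for i in range(2)]
    det = sw[0][0] * sw[1][1] - sw[0][1] * sw[1][0]
    if det == 0:
        raise ValueError("within-class scatter matrix is singular")
    diff = [mb[0] - ma[0], mb[1] - ma[1]]
    # Closed-form 2x2 inverse applied to the mean difference.
    w = [
        (sw[1][1] * diff[0] - sw[0][1] * diff[1]) / det,
        (-sw[1][0] * diff[0] + sw[0][0] * diff[1]) / det,
    ]
    norm = (w[0] ** 2 + w[1] ** 2) ** 0.5
    return [w[0] / norm, w[1] / norm]


def project(points, w):
    """Scalar projection of each point onto direction w."""
    return [p[0] * w[0] + p[1] * w[1] for p in points]


# Hypothetical "good" and "bad" operating samples (two factors each).
good = [(1.0, 2.0), (1.5, 1.8), (2.0, 2.6), (1.2, 2.4)]
bad = [(3.0, 0.5), (3.5, 1.0), (4.0, 0.4), (3.2, 0.9)]

w = fisher_direction(good, bad)
proj_good = project(good, w)
proj_bad = project(bad, w)
print("direction:", w)
print("gap between clusters on the line:", min(proj_bad) - max(proj_good))
```

If the two projected ranges do not overlap, a single threshold on the line separates the classes. The slides claim MREC does better than this on their data, which this sketch does not attempt to reproduce.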

  20. Software Architecture
      Diagram labels: DataBase; Pattern Recognition; GUI; KnowBase; Artificial Neural Nets; Genetic Algorithm

  21. MasterMiner™ Functions

  22. MasterMiner™ Tools
      • Data loading, editing, sorting, calculation
      • Preprocessing: statistics, feature selection, folding
      • Factor analysis
        • target-factor analysis
        • factor-factor analysis
      • Projections
        • Fisher, LMAP, PCA, PLS, MREC
      • Modeling
        • envelope, auto-box, Sphere, KL, ANN (train, estimation, sensitivity)
      • Extrapolation
        • PLS vector (linear), Simplex, appending

  23. Virtual Mining Tools for Convex and Concave Space
      • Virtual mining in hyper space
        • Hidden projection - tunnel model
        • Envelope - generate a convex polyhedron
        • Use “auto-box” for concave polyhedrons of samples
        • Interchange of data classes
        • Folding transform (to change data pattern in space)
      • Virtual mining of data samples
        • divide into multiple segments
        • convert concave polyhedron into convex ones
        • build the model for each subspace
        • separability went from 31% to 96% in one case
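
The idea of dividing a concave sample region into segments and enclosing each one separately can be sketched as follows. This is only an assumption-laden illustration, not the MasterMiner™ "auto-box" or envelope algorithm, which the slides do not specify. Here axis-aligned bounding boxes stand in for convex polyhedrons, the data are made up, and the names `bounding_box`, `split_boxes` and `excluded_fraction` are hypothetical.

```python
def bounding_box(points):
    """Axis-aligned box (lows, highs) enclosing all points."""
    dims = range(len(points[0]))
    lows = [min(p[d] for p in points) for d in dims]
    highs = [max(p[d] for p in points) for d in dims]
    return lows, highs


def inside(point, box):
    """True if the point lies within the box on every axis."""
    lows, highs = box
    return all(lo <= x <= hi for x, lo, hi in zip(point, lows, highs))


def split_boxes(good, axis, n_segments):
    """Sort good samples along one axis, cut into segments, box each segment."""
    ordered = sorted(good, key=lambda p: p[axis])
    size = -(-len(ordered) // n_segments)  # ceiling division
    return [bounding_box(ordered[i:i + size])
            for i in range(0, len(ordered), size)]


def excluded_fraction(bad, boxes):
    """Fraction of bad samples that fall outside every box."""
    outside = [p for p in bad if not any(inside(p, b) for b in boxes)]
    return len(outside) / len(bad)


# Hypothetical L-shaped (concave) region of good samples.
good = [
    (0.2, 0.5), (0.5, 1.5), (0.3, 2.5), (0.6, 3.5), (0.4, 4.0),  # vertical arm
    (1.5, 0.3), (2.5, 0.6), (3.5, 0.2), (4.0, 0.5), (3.0, 0.4),  # horizontal arm
]
# Hypothetical bad samples sitting in the notch of the L.
bad = [(2.0, 2.0), (3.0, 3.0), (2.5, 3.5), (3.5, 2.5)]

single = excluded_fraction(bad, [bounding_box(good)])
segmented = excluded_fraction(bad, split_boxes(good, axis=0, n_segments=2))
print("bad samples excluded by one box:", single)
print("bad samples excluded by two boxes:", segmented)
```

One box around a concave region also swallows the bad samples in the notch, while one box per segment excludes them. That is the general effect the slide describes when it reports separability improving after segmentation.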

  24. Virtual Mining Methods
      (a) Tunnel model to separate data samples in hyper space
      (b) The Envelope-Boxing method
      (c) Generate convex polyhedrons from a concave one

  25. Iterative Feature Selection/Reduction
      • Data pattern classified into 2 topological classes
        • “one-sided class”
        • “inclusive class”
      • Hidden projections applied
        • Projected factors are orthogonal in hyper space
      • Feature selection method (highly effective):
        • Entropy method is used for inclusive pattern
        • Voting method is used for one-sided pattern
      • Reduce features to reduce noise & complexity
        • e.g., good result based on 5 features out of 500
      • Reduced feature set needs to pass the separation test
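
The slides do not define the "entropy method", so the sketch below substitutes a generic, well-known stand-in: ranking features by the best information gain of a single threshold split. It is an assumption for illustration only, not the procedure used in MasterMiner™. The factor names and sample values are hypothetical.

```python
import math


def entropy(labels):
    """Shannon entropy (bits) of a list of class labels."""
    n = len(labels)
    result = 0.0
    for cls in set(labels):
        p = labels.count(cls) / n
        result -= p * math.log2(p)
    return result


def best_information_gain(values, labels):
    """Largest entropy reduction over all single-threshold splits of one feature."""
    base = entropy(labels)
    best = 0.0
    # The largest value is skipped: splitting there leaves the right side empty.
    for threshold in sorted(set(values))[:-1]:
        left = [l for v, l in zip(values, labels) if v <= threshold]
        right = [l for v, l in zip(values, labels) if v > threshold]
        weighted = (len(left) * entropy(left)
                    + len(right) * entropy(right)) / len(labels)
        best = max(best, base - weighted)
    return best


def rank_features(samples, labels, names):
    """Return feature names sorted by information gain, plus the gains."""
    gains = {
        name: best_information_gain([s[i] for s in samples], labels)
        for i, name in enumerate(names)
    }
    return sorted(names, key=lambda n: gains[n], reverse=True), gains


# Hypothetical records: three factors per sample, with a quality label.
names = ["factor_a", "factor_b", "factor_c"]
samples = [
    (1.0, 5.0, 0.10),
    (1.2, 2.0, 0.20),
    (0.9, 7.0, 0.90),
    (3.0, 4.0, 0.80),
    (3.1, 6.0, 0.70),
    (2.8, 3.0, 0.95),
]
labels = ["good", "good", "good", "bad", "bad", "bad"]

ranking, gains = rank_features(samples, labels, names)
print("ranking:", ranking)
print("gains:", gains)
```

Keeping only the top-ranked features and then re-running a separation test mirrors the loop the slide outlines, in which the reduced feature set must still separate the classes.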

  26. MREC - Map Recognition Method
      • MREC: projection in the best direction, complete separation in 2 steps
      • PCA: no separation

  27. We have Improved the Quality of
      • alloy steels
      • carbon fiber reinforced, resin-based composite materials
      • Bi2O3-containing high-Tc superconductors
      • rare earth containing phosphor
      • electrode materials of Ni/H batteries
      • VPTC ceramic semi-conductor
      • high temperature, SiC-based structural ceramics
      • high-polymers: PVC, synthetic fiber & rubber, polyethylene, ...
      • high energy materials
      • semi-conductor devices
      • MOCVD method of III-V compound film

  28. We have applied MasterMiner™ to Industrial Optimization & Diagnosis
      • Petrochemical industry
        • distillation
        • hydro-cracking
        • vapor recovery
        • platinum reforming
        • delayed coking
        • de-waxing
        • vinyl acetate
        • polypropylene
        • jet fuel (Union Oil recipe, yield 87% -> 94%, +6,000 ton/yr)
        • increase life of catalyst in polyvinyl plant (catalyst cost $1.2MM)
        • etc.

  29. We have applied MasterMiner™ to Industrial Optimization & Diagnosis
      • Metallurgical Industry
        • blast furnace
        • casting
        • alloy steels quality improvement (60% -> 80%)
        • energy saving in aluminum production
      • Automobile Industry
        • electro-plating
        • heat treatment
      • Chemical Industry
        • PVC, polyformaldehyde
        • butadiene rubber

  30. Application Areas
      Diagram labels: Data Mining; Process Optimization; Materials Design; Equipment Process Diagnosis; Petrochemical Industry; Metallurgical Industry; Semiconductor Industry
      • GOAL: Optimal control of complex processes involving
        • Heat transfer
        • Mass transfer
        • Fluid flow
        • Chemical reactions

  31. Pattern Recognition Methods
      • Linear Regression (LS) - “forced fitting”
        • LS fitting coefficients as model parameters, the “best wish”
      • PCA - principal component analysis
        • projection in “best” direction, select two directions, LS
      • LMAP - linear mapping
      • NN - neural nets
        • blind learning, over-fitting, forced fitting
        • origin at cluster center, covered with an ellipsoid, PCA
      • MREC - map recognition (non linear)
        • polyhedrons, hidden projections, separation, back-mapping
      • NNREC - neural nets + MREC

  32. Comparison of Various Methods
      CONDITION                                                      METHOD TO USE
      1. Mechanism known (in some cases)                             Rule-based expert systems
      2. Linear w/o noise (in 20% cases)                             Linear regression, statistical method
      3. Highly noisy, multi-variant, non-Gaussian (in most cases)   Hyper-space data mining

  33. Why not Principal Component Analysis (PCA)?
      Principal Component Analysis (PCA)   Data Mining by MasterMiner
      Linear                               Nonlinear, hierarchical
      Gaussian                             Non-Gaussian
      Low noise                            High noise
      Use all data in modeling             Use subset of data in modeling
      20 projections                       2 projections
      No separation                        Good separation

  34. Why not Least Squares Only?
      PLS applies when PRESS < 0.3 (1/4 of cases in our practice)
      PROJECT                            PRESS (Error)
      synthetic rubber                   0.2052 (can use PLS)
      steel plate for ship building      0.6419 (cannot use PLS)
      rare earth phosphor                0.3067
      Baoshan Iron & Steel               0.3441
      Ni/H battery                       0.7389
      Ni/H materials                     0.1932
      propylene recovery (noisy data)    0.7755
      propylene recovery                 0.3752
      solvent oil                        0.3975
      VPTC                               0.1330
      hydro-cracking plant               0.2055
      methanol production                0.8255
      casting for car                    0.9157
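
The PRESS criterion on this slide can be illustrated with a small leave-one-out calculation. The slide does not say how PRESS is computed or scaled. Because all reported values lie between 0 and 1, the sketch below assumes PRESS is the leave-one-out prediction error divided by the total sum of squares. For brevity it also uses ordinary one-variable least squares in place of PLS. The data and function names are hypothetical. Only the 0.3 threshold comes from the slide.

```python
PLS_THRESHOLD = 0.3  # from the slide: PLS applies when PRESS < 0.3


def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for one predictor."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


def normalized_press(xs, ys):
    """Leave-one-out squared prediction error divided by total sum of squares."""
    press = 0.0
    for i in range(len(xs)):
        # Refit without sample i, then predict the held-out sample.
        slope, intercept = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        press += (ys[i] - (slope * xs[i] + intercept)) ** 2
    my = sum(ys) / len(ys)
    total = sum((y - my) ** 2 for y in ys)
    return press / total


# Hypothetical data: one nearly linear response, one that is mostly noise.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
nearly_linear = [2.1, 3.9, 6.2, 7.8, 10.1, 12.0]
noisy = [5.0, 1.0, 6.0, 2.0, 7.0, 3.0]

for name, ys in [("nearly linear", nearly_linear), ("noisy", noisy)]:
    score = normalized_press(xs, ys)
    verdict = "linear model usable" if score < PLS_THRESHOLD else "linear model not usable"
    print(name, round(score, 4), verdict)
```

Under this assumed definition, a value near 0 means the linear model predicts held-out samples well, and a value near or above 1 means it predicts no better than the mean.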

  35. Why not Neural Networks (GA) Only?
      • Over-fitting problem by NN (GA)
      • Industrial records are not complete
      • e.g. leaching rate problem at an aluminum company
        • Leaching rate = f(a, b, c, T)
        • A cross-section of the optimal zone:
          • by ANN: too large
          • by our Yield Mater™: smaller
      [Figure: cross-section of the optimal zone, axes b and c, with regions labeled “Wrong zone by ANN” and “Zone by MasterMiner”]

  36. Applications in Diagnosis
      • Equipment setup
        • steel making (roller distance)
        • oil refinery (bottleneck in gasoline production)
        • chemical plants (cooling pipe length, inlet position)
      • Process optimization
        • drug fermentation
        • environmental emission controls
        • materials manufacturing

  37. E.g. 1 Steel Making
      Process stages: Blast furnace, Steel making, Casting, Hot rolling, Cold rolling
      ST14 steel plate for auto body
      • German equipment, yield 10,000 tons/yr
      • Problem - “deep pressing” property
      • 100 = 5x20 factors in 5 stages
      • 2 major factors:
        • N2 - nitrogen content should be reduced
        • d1/d2 - distance ratio of cold rollers should be increased
      • Benefit - wasted steel reduced by a factor of 5

  38. 2nd issue: QC in ST14 Steel Plate Making
      Diagram labels: Feed of Scrap, CaO, MgO, Iron Ore; O2 blower; Ladle

  39. Problem Background
      • After each batch, samples were taken in a 3-min test for QC
      • Need to control the amount of O2 blown and scrap added
      • Japanese case-based reasoning SW --> 65% separability
      • Problem: ST14 quality is off-spec
      • We used MasterMiner to build a model for QC
        • Target: FC (C content in steels, 17-30% by customer spec)
        • 13 Factors
      • Model built and used to control product quality
      • Result: 100% separability, products are on-spec

  40. Feature Selection
      Feature selected   Property
      LY                 age of O2 gun (years)
      PLH                height of O2 gun
      DYSLT              O2 amount (m3) before sampling
      DYCD               C content at sampling time (10^-2 %)
      DYTEMP             liquid iron temperature when sampling (°C)
      PCAO               amount of CaO used
      PMGO               amount of MgO added
      PORE               amount of iron ore added
      WCH                total charge of the converter in ton
      TOIRON             total liquid iron
      SCAPT              amount of scrap
      LDLIFE             life of ladle used to transport liquid iron
      QO2                amount of O2 blown after sampling

  41. 114 Sample Data

  42. Target-Feature Maps

  43. Data Separation by MasterMiner: 100%

  44. Data Separation by PCA: 30%

  45. Feature Selection (1) - Principal component regression

  46. Feature Selection (2) - PLS (partial least squares)

  47. Feature Selection (3) - KW method (linear)

  48. Tunnel Models: 32 Inequalities

  49. Quality Control Issue
      • Solve the set of 32 inequalities
      • or use the “appending” operation
        • assign values to uncontrollable factors
        • add N random samples
        • project them onto the N-dimensional space
        • select those falling into the optimal space
      • Results:
        • The C content of ST14 products is on-spec
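
The “appending” steps listed on this slide can be sketched as random sampling followed by a filter. The slides do not give the 32 tunnel-model inequalities, so everything below is a hypothetical stand-in: the three linear inequalities, the factor ranges, the fixed value, and the function names. The sketch shows only the general pattern of fixing the uncontrollable factors, drawing random candidates for the controllable ones, and keeping the candidates that satisfy every inequality.

```python
import random


def satisfies(sample, inequalities):
    """True if sum(coeff * x) <= bound holds for every (coeffs, bound) pair."""
    return all(
        sum(c * x for c, x in zip(coeffs, sample)) <= bound
        for coeffs, bound in inequalities
    )


def append_candidates(fixed, ranges, inequalities, n, seed=0):
    """Draw n random settings for the controllable factors and keep feasible ones."""
    rng = random.Random(seed)  # seeded so that runs are repeatable
    kept = []
    for _ in range(n):
        controllable = [rng.uniform(lo, hi) for lo, hi in ranges]
        sample = fixed + controllable  # uncontrollable factors come first
        if satisfies(sample, inequalities):
            kept.append(sample)
    return kept


# Hypothetical setup: one uncontrollable factor (already measured) and
# two controllable factors, each allowed to vary between 0 and 10.
fixed = [1.0]
ranges = [(0.0, 10.0), (0.0, 10.0)]

# Hypothetical stand-ins for the tunnel-model inequalities, written as
# (coefficients, bound) meaning coefficients . sample <= bound.
inequalities = [
    ([0.0, 1.0, 1.0], 12.0),   # x1 + x2 <= 12
    ([0.0, -1.0, 0.0], -2.0),  # x1 >= 2
    ([0.5, 0.0, 1.0], 8.0),    # 0.5 * u + x2 <= 8
]

candidates = append_candidates(fixed, ranges, inequalities, n=200)
print(len(candidates), "of 200 random samples fall inside the feasible zone")
```

Any retained candidate is a proposed operating point for the controllable factors. In practice one would presumably pick among them using a further criterion such as cost or distance from the zone boundary, which the slides leave open.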

  50. Add Random Samples (green)
