FWA for Noisy Optimization Problems JunQi Zhang (张军旗) Department of Computer Science and Technology, Tongji University, Shanghai, China zhangjunqi@tongji.edu.cn
Content • Noisy Optimization Problem • Resampling Methods • Fireworks Algorithms • From Resampling to Non-resampling in FWA • Novel Directions for Noisy Optimization
Classes of Uncertainties • (A) Environmental uncertainty: changing environmental and uncertain operating conditions (via the a-variable), f = f(x, a) • (B) Input uncertainty: design-parameter tolerances and actuator imprecision up to a certain degree of accuracy, f = f(x + b, a) • (C) Output uncertainty: uncertainty concerning the observed system performance, f' = f'[f(x + b, a)] • Beyer, H.-G., Sendhoff, B., "Robust optimization – a comprehensive survey", Computer Methods in Applied Mechanics and Engineering, vol. 196, no. 33-34, pp. 3190–3218, 2007.
Optimization Objectives of Noisy Optimization Problems • 2019-TEVC-Robust Multiobjective Optimization via Evolutionary Algorithms
Input Uncertainty and Multi-fidelity • Input Uncertainty • 2018-TAC-Simulation Budget Allocation for Selecting the Top-m Designs with Input Uncertainty • 2019-TEVC-New Sampling Strategies When Searching for Robust Solutions • Multi-fidelity • 2018-TEVC-A Generic Test Suite for Evolutionary Multifidelity Optimization • 2019-TAC-Efficient Simulation Budget Allocation for Subset Selection Using Regression Metamodels
Noisy Optimization Problem • Without noise: minimize f(x), x ∈ Ω, where D is the number of dimensions and Ω ⊆ R^D is the feasible region of x. • With noise: • Additive: F(x) = f(x) + ξ • Multiplicative: F(x) = f(x) · (1 + ξ), where ξ is a zero-mean random variable, e.g. ξ ~ N(0, σ²).
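As a concrete illustration of the two noise models above, the sketch below perturbs a noise-free sphere function additively and multiplicatively (the sphere function and the σ value are illustrative choices, not from the slides):

```python
import random

def sphere(x):
    """Noise-free objective: f(x) = sum of squares (illustrative choice)."""
    return sum(xi * xi for xi in x)

def additive_noise(f, x, sigma=0.1):
    """Additive model: F(x) = f(x) + xi, with xi ~ N(0, sigma^2)."""
    return f(x) + random.gauss(0.0, sigma)

def multiplicative_noise(f, x, sigma=0.1):
    """Multiplicative model: F(x) = f(x) * (1 + xi), xi ~ N(0, sigma^2)."""
    return f(x) * (1.0 + random.gauss(0.0, sigma))

x = [0.5, -0.3, 1.2]
print(sphere(x))                        # exact fitness
print(additive_noise(sphere, x))        # perturbation independent of f(x)
print(multiplicative_noise(sphere, x))  # perturbation proportional to f(x)
```

Note how the multiplicative model's perturbation scales with |f(x)|: for a positive minimization objective it is largest far from the optimum and vanishes near it, while the additive model corrupts every evaluation equally.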
Benchmark Functions • Wang, Handing, Yaochu Jin, and John Doherty, "A Generic Test Suite for Evolutionary Multifidelity Optimization", IEEE Transactions on Evolutionary Computation, vol. 22, no. 6, pp. 836-850, 2018. • G. H. Wu, R. Mallipeddi, P. N. Suganthan, "Problem Definitions and Evaluation Criteria for the CEC 2017 Competition and Special Session on Constrained Single Objective Real-Parameter Optimization", Technical Report, 2016. • J. J. Liang, B. Y. Qu, P. N. Suganthan, et al., "Problem Definitions and Evaluation Criteria for the CEC 2015 Competition on Learning-based Real-Parameter Single Objective Optimization", Technical Report, 2014. • J. J. Liang, B. Y. Qu, P. N. Suganthan, Alfredo G. H., "Problem definitions and evaluation criteria for the CEC 2013 special session on real-parameter optimization", IEEE Congress on Evolutionary Computation (CEC), 2013. • K. Tang, X. Li, P. N. Suganthan, Z. Yang and W. Thomas, "Benchmark functions for the CEC'2010 special session and competition on large-scale global optimization", Technical Report, 2010. • R. Mallipeddi, P. N. Suganthan, "Problem definitions and evaluation criteria for the CEC 2010 competition on constrained real parameter optimization", IEEE Congress on Evolutionary Computation (CEC), 2010. • K. Tang, X. Yao, P. N. Suganthan, C. MacNish, Y. P. Chen, C. M. Chen, Z. Yang, "Benchmark functions for the CEC'2008 special session and competition on large scale global optimization", IEEE Congress on Evolutionary Computation (CEC), 2008. • P. N. Suganthan, N. Hansen, J. J. Liang, K. Deb, Y.-P. Chen, A. Auger, S. Tiwari, "Problem definitions and evaluation criteria for the CEC 2005 special session on real-parameter optimization", IEEE Congress on Evolutionary Computation (CEC), 2005.
Benchmark Functions on Large-Scale Optimization in CEC 2010 • Original, additive noisy, and multiplicative noisy variants
Effects of Noisy Fitness Evaluation • Undesirable selection behavior • A superior candidate may be erroneously believed to be inferior, causing it to be eliminated. • An inferior candidate may be erroneously believed to be superior, causing it to survive and reproduce. • Undesirable effects • The system does not retain what it has learnt. • Exploitation is limited. • Fitness does not improve monotonically with generation. • The learning rate is reduced. • Di Pietro, A., While, L., Barone, L., "Applying evolutionary algorithms to problems with noisy, time-consuming fitness functions", IEEE Congress on Evolutionary Computation (CEC), pp. 1254–1261, 2004.
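The undesirable selection behavior above can be quantified with a small simulation: for two candidates whose true fitness differs by a fixed gap, estimate how often a single noisy comparison picks the wrong one (all numbers here are illustrative, not from the slides):

```python
import random

def misselection_rate(f_good, f_bad, sigma, trials=10000, seed=1):
    """Estimate how often additive Gaussian noise makes the worse candidate
    (higher true fitness, minimization) look better than the good one."""
    rng = random.Random(seed)
    errors = 0
    for _ in range(trials):
        obs_good = f_good + rng.gauss(0.0, sigma)
        obs_bad = f_bad + rng.gauss(0.0, sigma)
        if obs_bad < obs_good:  # the inferior candidate wins the comparison
            errors += 1
    return errors / trials

# A true-fitness gap of 0.5 under heavy noise: selection errors are frequent.
print(misselection_rate(1.0, 1.5, sigma=1.0))
# The same gap under low noise is resolved almost always.
print(misselection_rate(1.0, 1.5, sigma=0.05))
```

This is exactly why resampling helps: averaging k samples shrinks the effective σ by a factor of √k, at the cost of k times the evaluation budget.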
Learning and Optimization are Both Needed in Noisy Environments
Resampling Methods: ER • Equal resampling (ER): every solution is re-evaluated the same fixed number of times, and the average of its sampled fitness values is used as its fitness estimate.
Resampling Methods: ERN • ERN • Evaluate all solutions a fixed number of times in a first round • Allocate an extra re-evaluation budget to the top solutions in a second round, and rank by average fitness
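A minimal sketch of ER and ERN as described above, assuming additive Gaussian noise and minimization (the parameter names `k`, `extra`, and `top_m` are hypothetical, not from the slides):

```python
import random

def noisy_eval(f, x, sigma, rng):
    """One noisy evaluation with additive Gaussian noise."""
    return f(x) + rng.gauss(0.0, sigma)

def equal_resampling(f, solutions, k, sigma, rng):
    """ER: evaluate every solution k times and rank by the average."""
    return {i: sum(noisy_eval(f, x, sigma, rng) for _ in range(k)) / k
            for i, x in enumerate(solutions)}

def ern(f, solutions, k, extra, top_m, sigma, rng):
    """ERN: a first ER round with k samples each, then `extra` additional
    samples for the current top_m solutions; averages are merged."""
    means = equal_resampling(f, solutions, k, sigma, rng)
    top = sorted(means, key=means.get)[:top_m]
    for i in top:
        s = sum(noisy_eval(f, solutions[i], sigma, rng) for _ in range(extra))
        means[i] = (means[i] * k + s) / (k + extra)
    return means

sphere = lambda x: sum(v * v for v in x)
rng = random.Random(0)
scores = ern(sphere, [[0.1], [1.0], [2.0]], k=5, extra=20, top_m=2,
             sigma=0.3, rng=rng)
print(sorted(scores, key=scores.get))  # indices ranked after refinement
```

ERN concentrates the budget where a selection mistake is most costly: the contenders for the top ranks get sharper estimates, while clearly poor solutions are not re-sampled further.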
Resampling Methods: OCBA Idea • Optimal Computing-Budget Allocation (OCBA) • Maximize the probability of identifying the true optimal design • Maximize the simulation accuracy with a given simulation budget • Minimize the total number of simulation runs needed to achieve a desired simulation accuracy
Ordinal Optimization (by Ho et al.) • Idea • "Order" is much more robust against noise and easier to estimate than "Value" • 80/20 rule: it is easier to identify the tallest person in a classroom than to measure everyone's height • Goal softening • Do not insist on getting the "Best"; be willing to settle for the "Good Enough" • Recent pioneering algorithm • Optimal Computing-Budget Allocation (OCBA, by Chen et al. in 2000)
OCBA: Solution • Approach: Chen et al. (2000) • Use a common approximation to the probability of correct selection, P{CS} • Derive the allocation ratio from the Karush–Kuhn–Tucker (KKT) conditions • The resulting allocation is asymptotically optimal
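For reference, the asymptotic allocation derived by Chen et al. (2000) can be stated as follows; the notation here follows the common textbook form rather than this slide's own symbols (X̄_i and σ_i are the sample mean and standard deviation of design i, b is the observed best of k designs, and N_i is the number of simulation replications given to design i):

```latex
\frac{N_i}{N_j} = \left( \frac{\sigma_i / \delta_{b,i}}{\sigma_j / \delta_{b,j}} \right)^{2},
\quad i \neq j \neq b,
\qquad
N_b = \sigma_b \sqrt{ \sum_{i=1,\, i \neq b}^{k} \frac{N_i^{2}}{\sigma_i^{2}} },
\qquad
\delta_{b,i} = \bar{X}_b - \bar{X}_i .
```

Intuitively, a design receives more budget when it is noisy (large σ_i) or hard to distinguish from the best (small δ_{b,i}), which is exactly the KKT trade-off mentioned above.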
OCBA: Variants • JunQi Zhang, et al., "Approximately Optimal Computing-Budget Allocation for Selection of Best and Worst Designs", IEEE Trans. on Automatic Control, 2015 (SCI, EI). • JunQi Zhang, et al., "Approximate Simulation Budget Allocation for Subset Ranking", IEEE Trans. on Automation Science and Engineering, 2015 (SCI, EI).
Resampling Methods: Learning Automata • Environment: a reward probability set, e.g. {0.10, 0.45, 0.84, 0.76, 0.20, 0.40, 0.60, 0.70, 0.50, 0.30}; its feedback to the automaton is a binary reward/penalty input, e.g. {0} • Estimator: d_e(i) = w(i)/z(i), the estimated reward probabilities, e.g. {0.10, 0.45, 0.79, 0.82, 0.20, 0.40, 0.60, 0.70, 0.50, 0.30} • Action probability vector, e.g. {0.00, 0.10, 0.20, 0.50, 0.00, 0.00, 0.10, 0.20, 0.00, 0.00} • Learning strategy: the automaton chooses an action (e.g. output α4) according to the action probabilities and updates them from the environment's feedback [1] K. S. Narendra and M. A. L. Thathachar, "Learning automata – a survey," IEEE Trans. SMC, 1974 [2] M. A. L. Thathachar and K. R. Ramakrishnan, "Hierarchical system of learning automata", IEEE Trans. SMC, 1981 [3] Rujing Dou, Chengwu He (East China Institute of Computing Technology), "A survey of learning automata", Acta Automatica Sinica, 1984 [4] K. S. Narendra and M. A. L. Thathachar, "Learning Automata: An Introduction", Englewood Cliffs, NJ: Prentice-Hall, 1989 [5] K. Najim and A. S. Poznyak, "Learning Automata: Theory and Applications", New York: Pergamon, 1994 [6] Anastasios A. Economides, "Multiple response learning automata", IEEE Trans. SMC-B, 1996 [7] M. A. L. Thathachar and P. S. Sastry, "Varieties of Learning Automata: An Overview", IEEE Trans. SMC-B, 2002 [8] Mohammad S. Obaidat, Georgios I. Papadimitriou and Andreas S. Pomportsis, "Guest editorial: Learning automata: Theory, paradigms, and applications", IEEE Trans. SMC-B, 2002
Advantages of LA • Learns in the probability space instead of the solution space • Performs well in noisy environments • The structure and parameters of an LA are self-adaptive in noisy environments
Roles of the Action Probability P • Adaptively determines the amount of budget • Allocates the budget among actions so as to identify the optimal one
Learning Automata: CPrp • First step: update the estimator d_e(i) = w(i)/z(i), e.g. {0.10, 0.45, 0.79, 0.82, 0.20, 0.40, 0.60, 0.70, 0.50, 0.30} • Second step: update the action probability vector toward the action with the highest estimate, e.g. {0.00, 0.10, 0.20, 0.50, 0.00, 0.00, 0.10, 0.20, 0.00, 0.00} [1] S. Mukhopadhyay and M. A. L. Thathachar, "Associative learning of Boolean functions," IEEE Trans. SMC, 1989.
Learning Automata: DPri • (a) Continuous case: the action probability P(t), P(t+1), P(t+2), … (e.g. 0.98, 0.99, …) only approaches 1 asymptotically • (b) Discrete case: the action probability moves in fixed steps (e.g. 0.98 → 0.99 → 1.0) and can reach 1 exactly [1] B. J. Oommen and J. K. Lanctôt, "Discretized pursuit learning automata", IEEE Trans. SMC-B, 1990 [2] Georgios I. Papadimitriou, "Hierarchical discretized pursuit nonlinear learning automata with rapid convergence and high accuracy", IEEE Trans. TKDE, 1994 [3] B. J. Oommen and Mariana Agache, "Continuous and discretized pursuit learning schemes: various algorithms and their comparison", IEEE Trans. SMC-B, 2001
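A compact, illustrative pursuit-automaton sketch in the spirit of CPrp/DPri: it keeps the estimator d(i) = w(i)/z(i) and moves probability mass in fixed discretized steps toward the action with the best current estimate. The warm-up sampling and all parameter values are assumptions for this sketch, not taken from the cited papers:

```python
import random

def pursuit_la(reward_probs, steps=10000, resolution=2000, warmup=20, seed=7):
    """Discretized pursuit sketch: estimator d(i) = w(i)/z(i); probability
    mass moves in steps of 1/resolution toward the best-estimated action."""
    rng = random.Random(seed)
    r = len(reward_probs)
    w, z = [0] * r, [0] * r
    for i in range(r):                # warm-up: initial estimate per action
        for _ in range(warmup):
            z[i] += 1
            if rng.random() < reward_probs[i]:
                w[i] += 1
    p = [1.0 / r] * r                 # action probability vector
    delta = 1.0 / resolution          # fixed discretized step
    for _ in range(steps):
        a = rng.choices(range(r), weights=p)[0]
        z[a] += 1
        if rng.random() < reward_probs[a]:
            w[a] += 1
        best = max(range(r), key=lambda i: w[i] / z[i])
        for i in range(r):            # pursue the currently best estimate
            if i != best:
                dec = min(delta, p[i])
                p[i] -= dec
                p[best] += dec
    return p

p = pursuit_la([0.2, 0.8, 0.4])
print(p)  # probability mass concentrates on the best action
```

Because the step size is a fixed multiple of 1/resolution, the winning action's probability can reach exactly 1 in finitely many steps, which is the convergence advantage of the discretized family over the continuous one.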
Learning Automata: DGPA (Discretized Generalized Pursuit Algorithm) (Figure: partitioning of the solution space.) [1] M. Agache and B. J. Oommen, "Generalized pursuit learning schemes: New families of continuous and discretized learning automata," IEEE Trans. SMC-B, 2002 [2] Katja Verbeeck and Ann Nowe, "Colonies of learning automata," IEEE Trans. SMC-B, 2002
Learning Automata: LELA • (Figure: mean of the action probability E{Pm(t)} of the optimal action versus t, in environments (a) E1, (b) E2, (c) E3, (d) E4, (e) E5.) • (Figure: variance coefficient Var{Pm(t)}/E{Pm(t)} of the optimal action versus t, in environments (a) E1, (b) E2, (c) E3, (d) E4, (e) E5.) • JunQi Zhang, et al., "Last-position Elimination-based Learning Automata", IEEE Trans. on Cybernetics (JCR Q1, IF 7.384), vol. 44, no. 12, pp. 2484-2492, 2014 (SCI, EI).
Learning Automata: FLA • JunQi Zhang, et al., "Fast and Epsilon-optimal Discretized Pursuit Learning Automata", IEEE Trans. on Cybernetics (JCR Q1, IF 7.384), vol. 45, no. 10, pp. 2089-2099, 2015 (SCI, EI).
Learning Automata: LA-OCBA • JunQi Zhang, et al., "Incorporation of Optimal Computing Budget Allocation for Ordinal Optimization into Learning Automata", IEEE Trans. on Automation Science and Engineering (JCR Q2, IF 3.502), vol. 13, no. 2, pp. 1008-1017, 2016 (SCI, EI).
Learning Automata: Stochastic Point Location • Deterministic point location problem • Learning mechanism (LM): a robot, an algorithm • Deterministic point: the target parameter λ* is constant • Environment: guides the LM to move left or right toward λ*
Stochastic Point Location Problem • Stochastic environment: the environment points toward the target parameter λ* correctly with probability p and incorrectly with probability 1 − p
Target Parameter: Static or Dynamic? • Static: λ* stays constant • Dynamic: λ* changes over time
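For the static case, a stochastic point location learner can be sketched as a biased walk on a discretized interval; the environment is informative when p > 0.5, and the bias then drives the walker toward λ* (all parameter values below are illustrative):

```python
import random

def spl_search(target, p=0.8, resolution=1000, steps=20000, seed=5):
    """SPL sketch: walk on a discretized [0, 1]; the environment points
    toward the unknown target correctly with probability p (informative
    when p > 0.5), and the walker follows every suggestion."""
    rng = random.Random(seed)
    pos = resolution // 2                        # start mid-interval
    for _ in range(steps):
        go_right = (pos / resolution) < target   # the true direction
        if rng.random() >= p:                    # the environment lies
            go_right = not go_right
        pos = min(resolution, pos + 1) if go_right else max(0, pos - 1)
    return pos / resolution

print(spl_search(0.9123))  # the estimate settles close to the target
```

When p < 0.5 the same walk is driven away from λ*, which is exactly the deceptive-environment case that SHSSL's symmetrical hierarchy is designed to detect and invert.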
Learning Automata: SHSSL • Symmetrical Hierarchical Stochastic Searching on the Line (SHSSL) • JunQi Zhang, et al., "Symmetrical Hierarchical Stochastic Searching on the Line in Informative and Deceptive Environments", IEEE Trans. on Cybernetics (JCR Q1, IF 7.384), vol. 47, no. 3, pp. 626-635, 2017 (SCI, EI). • JunQi Zhang*, et al., "Fast Variable Structure Stochastic Automaton for Discovering and Tracking Spatiotemporal Event Patterns", IEEE Trans. on Cybernetics (JCR Q1, IF 7.384), vol. 48, no. 3, pp. 890-903, 2018 (SCI, EI).
Example of SHSSL in Deceptive Environments • d = 3, p = 0.2 • λ* = 0.9123, and the green feedback is the truth. Step 1: Current node: Node{0,1} Feedback of environment: [R,L,L] Decision Table: Informative Decision: LeftChild Next node: Node{1,1} Informative Table
Example of SHSSL in Deceptive Environments Step 2: Current node: Node{1,1} Feedback of environment: [L,L,L] Decision Table: Informative Decision: Parent Next node: Node{0,1} Informative Table
Example of SHSSL in Deceptive Environments Step 3: Current node: Node{0,1} Feedback of environment: [L,R,R] Decision Table: Informative Decision: Parent Next node: Node{-0,1} Informative Table
Example of SHSSL in Deceptive Environments Step 4: Current node: Node{-0,1} Feedback of environment: [L,L,R] Decision Table: Deceptive Decision: RightChild Next node: Node{-1,2} Deceptive Table
Example of SHSSL in Deceptive Environments Step 5: Current node: Node{-1,2} Feedback of environment: [L,R,R] Decision Table: Deceptive Decision: LeftChild Next node: Node{-2,3} Deceptive Table
Example of SHSSL in Deceptive Environments Step 6: Current node: Node{-2,3} Feedback of environment: [L,L,L] Decision Table: Deceptive Decision: Parent Next node: Node{-1,2} Deceptive Table
Example of SHSSL in Deceptive Environments Step 7: Current node: Node{-1,2} Feedback of environment: [L,L,R] Decision Table: Deceptive Decision: RightChild Next node: Node{-2,4} Deceptive Table
Example of SHSSL in Deceptive Environments Step 8: Current node: Node{-2,4} Feedback of environment: [R,L,R] Decision Table: Deceptive Decision: RightChild Next node: Node{-3,8} Deceptive Table
Example of SHSSL in Deceptive Environments Step 9: Current node: Node{-3,8} Feedback of environment: [L,R,R] Decision Table: Deceptive Decision: LeftChild Next node: Node{-3,8} Deceptive Table
PSO-LA & LAPSO & PSO-OCBA • Apply LA, SubsetLA, OCBA, and SHSSL/ASS to PSO, respectively • JunQi Zhang, et al., "Integrating Particle Swarm Optimization with Stochastic Point Location Method in Noisy Environment", IEEE International Conference on Systems, Man, and Cybernetics (SMC), Budapest, Hungary, pp. 145-150, 2016 (EI). • JunQi Zhang, et al., "A Learning Automata-based Particle Swarm Optimization Algorithm for Noisy Environment", IEEE Congress on Evolutionary Computation (CEC), pp. 141-147, 2015 (EI, ISTP), (Corresponding Author). (Runner-Up Overall Paper Award). • JunQi Zhang, et al., "Integrating Particle Swarm Optimization with Learning Automata to Solve Optimization Problems in Noisy Environment", IEEE International Conference on Systems, Man, and Cybernetics (SMC), pp. 1451-1456, San Diego, CA, USA, 2014 (EI).
Challenges for Noisy Optimization • Resampling is computationally expensive • The resampling budget is still hard to predetermine • JunQi Zhang, et al., "From Resampling to Non-Resampling: A Fireworks Algorithm Based Framework for Solving Noisy Optimization Problems", The 8th International Conference on Swarm Intelligence (ICSI), Fukuoka, Japan, pp. 485-492, 2017 (EI). (Best Paper)
The Proposed FWA for Noisy Optimization (FWANO) • Use the consensus of the top sparks, rather than the single best spark, when deciding the spark allocation; the "fitness" of the consensus takes the place of the best spark's fitness. (Figure: firework, sparks, top sparks, and their consensus.)
The Consensus • When the amplitude is small, the number of top sparks N should be large and the number of resamplings B should be small. • When the amplitude is large, N should be small and B should be large. (Figure: firework, sparks, top sparks, and their consensus.)
From Resampling to Non-resampling • The dynamic amplitude tends to decrease as the search proceeds. • The change of B and N: as the number of evaluations grows, the number of resamplings B decreases while the number of top sparks N increases, where T_max is the maximum number of evaluations and t is the current number of evaluations. (Figure: B decreasing and N increasing over the generations.)
Algorithm FWANO
  Initialize n fireworks and evaluate their fitness f(xi) B times each
  while (stopping criterion not met) do
    Update the number of resamplings B
    Update the number of top sparks N
    Compute the number of sparks for each firework
    Compute the amplitude of each firework
    for each firework do
      Generate explosion sparks
      Re-evaluate the generated sparks
      Select the new firework independently
      Compute the consensus of its top sparks
    end for
  end while
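The loop above can be sketched in Python on a noisy sphere function. The linear B/N schedule, the amplitude decay factor, and the mean-of-top-sparks consensus are simplifying assumptions for this single-firework sketch, not the exact published FWANO rules:

```python
import random

def sphere(x):
    return sum(v * v for v in x)

def noisy(x, sigma, rng):
    """One additive-Gaussian-noise evaluation of the sphere function."""
    return sphere(x) + rng.gauss(0.0, sigma)

def schedule(t, t_max, b_max=10, n_top_max=10):
    """Linear budget schedule: resampling count B decays from b_max to 1
    while the top-spark count N grows from 1 to n_top_max."""
    frac = t / t_max
    return max(1, round(b_max * (1 - frac))), max(1, round(n_top_max * frac))

def fwano_sketch(dim=2, sigma=0.3, max_evals=6000, n_sparks=20, seed=11):
    """Single-firework sketch of the FWANO loop on a noisy sphere."""
    rng = random.Random(seed)
    firework = [rng.uniform(-5, 5) for _ in range(dim)]
    amplitude, t = 5.0, 0
    while t < max_evals:
        B, N = schedule(t, max_evals)
        sparks = [[xi + rng.uniform(-amplitude, amplitude) for xi in firework]
                  for _ in range(n_sparks)]
        scored = []
        for s in sparks:             # re-evaluate each spark B times
            fit = sum(noisy(s, sigma, rng) for _ in range(B)) / B
            t += B
            scored.append((fit, s))
        scored.sort(key=lambda fs: fs[0])
        top = [s for _, s in scored[:N]]
        # Consensus (mean) of the top sparks replaces the single best spark.
        firework = [sum(col) / len(top) for col in zip(*top)]
        amplitude *= 0.93            # shrinking explosion amplitude
    return firework

best = fwano_sketch()
print(best, sphere(best))
```

Early on, heavy resampling (large B) denoises selection while the amplitude is large; late in the run, averaging many nearby top sparks (large N) takes over the denoising role, so the resampling budget can be released — the "from resampling to non-resampling" idea.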
Experimental Results • FWANO performs: • better than the algorithm without resampling methods: • CoFFWA deteriorates severely as the noise level increases. • better than algorithms with only resampling methods: • FWANO-ER outperforms CoFFWA-ER on 13 out of 15 functions. • FWANO-ERN outperforms CoFFWA-ERN on 12 out of 15 functions. • The shift from resampling to non-resampling saves resampling resources.
Challenges for Uncertain Environments • Challenges • Resampling still costs too much • Whether an environment is noisy is itself uncertain • The degree of noise is unknown • The resampling budget is hard to predetermine • Idea • Non-resampling • DEPSO (Dual-Environmental PSO) • DEFWA (Dual-Environmental FWA)
Motivation 1: Group Decision-Making in Animal Societies • In animal societies, decisions are made not by individuals alone but collectively by groups. • Prior studies show that shared decisions, in which many individuals pool their information, are advantageous: • they are more likely to be correct than decisions made by one or a few "leaders". L. Conradt and T. J. Roper, "Group decision-making in animals," Nature, vol. 421, no. 6919, pp. 155–158, 2003. C. List, "Democracy in animal groups: a political science perspective," Trends in Ecology & Evolution, vol. 19, no. 4, pp. 168–169, 2004.
Motivation 2: A General Algorithm in Dual Environments • Silver, D., Hubert, T., Schrittwieser, J., et al., "A general reinforcement learning algorithm that masters chess, shogi, and Go through self-play", Science, 2018, 362(6419): 1140-1144.
DEPSO • The exemplar is the consensus of the top-k particles with the best fitness in the current iteration. • The coefficient c is used to balance exploration and exploitation. • The historical information (gbest/pbest) is no longer used; it is replaced by the consensus. • The exemplar/leader is not one or two individuals but is formed through group decision. • JunQi Zhang, et al., "Dual-Environmental Particle Swarm Optimizer in Noisy and Noise-free Environments", IEEE Trans. on Cybernetics (JCR Q1, IF 7.384), vol. 49, no. 6, pp. 2011-2021, 2019 (SCI, EI).
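A hedged sketch of such a consensus-guided update (an illustrative variant, not the exact DEPSO equations from the cited paper): each particle is attracted toward the mean position of the top-k particles of the current iteration instead of toward gbest/pbest; the parameter values are assumptions:

```python
import random

def depso_step(positions, velocities, fitnesses, k=3, w=0.7, c=1.5, rng=None):
    """One consensus-guided update: attract every particle toward the mean
    of the top-k particles of the current iteration (minimization)."""
    rng = rng or random.Random()
    order = sorted(range(len(positions)), key=lambda i: fitnesses[i])
    top = [positions[i] for i in order[:k]]
    consensus = [sum(col) / k for col in zip(*top)]  # group decision
    for i, (x, v) in enumerate(zip(positions, velocities)):
        velocities[i] = [w * vj + c * rng.random() * (cj - xj)
                         for vj, xj, cj in zip(v, x, consensus)]
        positions[i] = [xj + vj for xj, vj in zip(x, velocities[i])]
    return consensus

# Usage: a small swarm contracts toward the low-fitness region of a noisy sphere.
sphere = lambda x: sum(v * v for v in x)
rng = random.Random(2)
pos = [[rng.uniform(-5, 5) for _ in range(2)] for _ in range(10)]
vel = [[0.0, 0.0] for _ in range(10)]
for _ in range(30):
    fits = [sphere(p) + rng.gauss(0.0, 0.1) for p in pos]  # noisy fitness
    depso_step(pos, vel, fits, rng=rng)
print(sum(sphere(p) for p in pos) / len(pos))  # mean fitness after 30 steps
```

Dropping gbest/pbest removes the failure mode the slide points at: a single noise-corrupted evaluation can otherwise be memorized as a "personal best" forever, while a per-iteration consensus is re-formed every step and averages the noise across the top-k group.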
Theoretical Analysis (Figure: convergence behavior of PSO vs. DEPSO in noise-free and noisy environments.)