Parallel global optimization with the particle swarm algorithm
| Published in | International Journal for Numerical Methods in Engineering, Vol. 61, No. 13, pp. 2296-2315 |
|---|---|
| Main Authors | |
| Format | Journal Article |
| Language | English |
| Published | Chichester, UK: John Wiley & Sons, Ltd, 07.12.2004 |
Summary: Present day engineering optimization problems often impose large computational demands, resulting in long solution times even on a modern high‐end processor. To obtain enhanced computational throughput and global search capability, we detail the coarse‐grained parallelization of an increasingly popular global search method, the particle swarm optimization (PSO) algorithm. Parallel PSO performance was evaluated using two categories of optimization problems possessing multiple local minima—large‐scale analytical test problems with computationally cheap function evaluations and medium‐scale biomechanical system identification problems with computationally expensive function evaluations. For load‐balanced analytical test problems formulated using 128 design variables, speedup was close to ideal and parallel efficiency above 95% for up to 32 nodes on a Beowulf cluster. In contrast, for load‐imbalanced biomechanical system identification problems with 12 design variables, speedup plateaued and parallel efficiency decreased almost linearly with increasing number of nodes. The primary factor affecting parallel performance was the synchronization requirement of the parallel algorithm, which dictated that each iteration must wait for completion of the slowest fitness evaluation. When the analytical problems were solved using a fixed number of swarm iterations, a single population of 128 particles produced a better convergence rate than did multiple independent runs performed using sub‐populations (8 runs with 16 particles, 4 runs with 32 particles, or 2 runs with 64 particles). These results suggest that (1) parallel PSO exhibits excellent parallel performance under load‐balanced conditions, (2) an asynchronous implementation would be valuable for real‐life problems subject to load imbalance, and (3) larger population sizes should be considered when multiple processors are available. Copyright © 2004 John Wiley & Sons, Ltd.
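The summary describes a coarse‐grained, synchronous master–worker scheme in which every swarm iteration blocks until the slowest fitness evaluation returns. The sketch below illustrates that structure in Python with a local process pool; it is a minimal illustration under assumptions, not the MPI-based Beowulf implementation the paper evaluates, and names such as `parallel_pso`, `sphere`, and the swarm parameters are hypothetical.

```python
# Minimal sketch of a synchronous, coarse-grained parallel PSO: the master
# updates the swarm while a process pool evaluates fitness values in parallel.
# Illustration only; not the authors' MPI implementation.
import numpy as np
from multiprocessing import Pool


def sphere(x):
    """Hypothetical analytical test function with its minimum at the origin."""
    return float(np.sum(x ** 2))


def parallel_pso(fitness, dim, n_particles=128, iters=200,
                 bounds=(-5.0, 5.0), w=0.7, c1=1.5, c2=1.5, workers=4):
    rng = np.random.default_rng(0)
    lo, hi = bounds
    x = rng.uniform(lo, hi, (n_particles, dim))        # particle positions
    v = np.zeros((n_particles, dim))                   # particle velocities
    pbest_x, pbest_f = x.copy(), np.full(n_particles, np.inf)
    gbest_x, gbest_f = None, np.inf

    with Pool(workers) as pool:
        for _ in range(iters):
            # Synchronous step: the iteration cannot proceed until the slowest
            # fitness evaluation returns -- the load-imbalance bottleneck the
            # summary identifies.
            f = np.array(pool.map(fitness, list(x)))

            # Update personal and global bests.
            improved = f < pbest_f
            pbest_x[improved], pbest_f[improved] = x[improved], f[improved]
            best = int(np.argmin(pbest_f))
            if pbest_f[best] < gbest_f:
                gbest_f, gbest_x = pbest_f[best], pbest_x[best].copy()

            # Standard PSO velocity and position update.
            r1 = rng.random((n_particles, dim))
            r2 = rng.random((n_particles, dim))
            v = w * v + c1 * r1 * (pbest_x - x) + c2 * r2 * (gbest_x - x)
            x = np.clip(x + v, lo, hi)

    return gbest_x, gbest_f


if __name__ == "__main__":
    best_x, best_f = parallel_pso(sphere, dim=16, n_particles=32, iters=100)
    print("best fitness:", best_f)
```

The single `pool.map` call per iteration is what makes the algorithm synchronous: when evaluation times are uniform all workers stay busy, whereas imbalanced evaluations leave workers idle, which is consistent with the speedup behaviour reported in the summary.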
Bibliography: ArticleID: NME1149. Funding: NIH National Library of Medicine (No. R03 LM07332); Whitaker Foundation; AFOSR (No. F49620-09-1-0070). E-mail: fregly@ufl.edu
ISSN: 0029-5981, 1097-0207
DOI: 10.1002/nme.1149