Parallel global optimization with the particle swarm algorithm

Bibliographic Details
Published in: International Journal for Numerical Methods in Engineering, Vol. 61, No. 13, pp. 2296-2315
Main Authors: Schutte, J. F., Reinbolt, J. A., Fregly, B. J., Haftka, R. T., George, A. D.
Format: Journal Article
Language: English
Published: Chichester, UK: John Wiley & Sons, Ltd, 7 December 2004

Summary: Present day engineering optimization problems often impose large computational demands, resulting in long solution times even on a modern high‐end processor. To obtain enhanced computational throughput and global search capability, we detail the coarse‐grained parallelization of an increasingly popular global search method, the particle swarm optimization (PSO) algorithm. Parallel PSO performance was evaluated using two categories of optimization problems possessing multiple local minima—large‐scale analytical test problems with computationally cheap function evaluations and medium‐scale biomechanical system identification problems with computationally expensive function evaluations. For load‐balanced analytical test problems formulated using 128 design variables, speedup was close to ideal and parallel efficiency above 95% for up to 32 nodes on a Beowulf cluster. In contrast, for load‐imbalanced biomechanical system identification problems with 12 design variables, speedup plateaued and parallel efficiency decreased almost linearly with increasing number of nodes. The primary factor affecting parallel performance was the synchronization requirement of the parallel algorithm, which dictated that each iteration must wait for completion of the slowest fitness evaluation. When the analytical problems were solved using a fixed number of swarm iterations, a single population of 128 particles produced a better convergence rate than did multiple independent runs performed using sub‐populations (8 runs with 16 particles, 4 runs with 32 particles, or 2 runs with 64 particles). These results suggest that (1) parallel PSO exhibits excellent parallel performance under load‐balanced conditions, (2) an asynchronous implementation would be valuable for real‐life problems subject to load imbalance, and (3) larger population sizes should be considered when multiple processors are available. Copyright © 2004 John Wiley & Sons, Ltd.
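The abstract's key algorithmic point is the per-iteration synchronization barrier of a coarse-grained synchronous PSO: fitness evaluations are farmed out one per worker, and the swarm update cannot proceed until the slowest evaluation returns, which is why load imbalance erodes parallel efficiency. The sketch below illustrates that structure only. It is not the authors' implementation (which ran on a Beowulf cluster); it uses Python's multiprocessing on a single machine, and the function names (parallel_pso, sphere), the inertia-weight update rule, and all parameter values (w, c1, c2, bounds, worker count) are assumptions chosen for demonstration, not taken from the paper.

```python
# Minimal sketch of a synchronous, coarse-grained parallel PSO.
# Hypothetical example: pool.map() is the per-iteration synchronization
# barrier -- every iteration waits for the slowest fitness evaluation.
import numpy as np
from multiprocessing import Pool


def sphere(x):
    """Cheap analytical test function (stand-in for an expensive model)."""
    return float(np.sum(x ** 2))


def parallel_pso(fitness, dim, n_particles=128, n_iters=200,
                 bounds=(-5.0, 5.0), w=0.7, c1=1.5, c2=1.5,
                 n_workers=4, seed=0):
    rng = np.random.default_rng(seed)
    lo, hi = bounds
    x = rng.uniform(lo, hi, size=(n_particles, dim))   # particle positions
    v = np.zeros_like(x)                               # particle velocities
    pbest_x = x.copy()                                 # personal bests

    with Pool(n_workers) as pool:
        # Coarse-grained parallelism: one fitness evaluation per task.
        pbest_f = np.array(pool.map(fitness, list(x)))
        gbest_i = int(np.argmin(pbest_f))
        gbest_x, gbest_f = pbest_x[gbest_i].copy(), pbest_f[gbest_i]

        for _ in range(n_iters):
            r1 = rng.random((n_particles, dim))
            r2 = rng.random((n_particles, dim))
            v = w * v + c1 * r1 * (pbest_x - x) + c2 * r2 * (gbest_x - x)
            x = np.clip(x + v, lo, hi)

            # Synchronization point: map() returns only after the slowest
            # particle's evaluation has completed.
            f = np.array(pool.map(fitness, list(x)))

            improved = f < pbest_f
            pbest_x[improved], pbest_f[improved] = x[improved], f[improved]
            if pbest_f.min() < gbest_f:
                gbest_i = int(np.argmin(pbest_f))
                gbest_x, gbest_f = pbest_x[gbest_i].copy(), pbest_f[gbest_i]

    return gbest_x, gbest_f


if __name__ == "__main__":
    best_x, best_f = parallel_pso(sphere, dim=12)
    print(f"best fitness: {best_f:.3e}")
```

Because pool.map returns only after every particle's fitness has been computed, a single slow evaluation delays the entire swarm, mirroring the load-imbalance behaviour the abstract reports for the biomechanical problems; the asynchronous variant the authors recommend would instead update particles as individual results arrive.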
Bibliography:
ArticleID: NME1149
Grant support: NIH National Library of Medicine (No. R03 LM07332); Whitaker Foundation; AFOSR (No. F49620-09-1-0070)
istex: 2643B5CB021DFD16F12ACDB0DFF347D82D3AFF26
ark:/67375/WNG-ZBHBSWW2-M
E-mail: fregly@ufl.edu
ISSN: 0029-5981
EISSN: 1097-0207
DOI: 10.1002/nme.1149