Novel Asynchronous Algorithms and Software for Large Sparse Systems (HPC Sandpit)

Project: Research

Description

The solution of large sparse systems, both linear and nonlinear, is a key numerical technology underpinning many areas of computational science and engineering, including climate and environmental modelling, nuclear fusion, materials science and computational chemistry. Because these and other application domains rely on sparse system solution, they all face difficulties in achieving extreme scalability, since the underlying algorithms are highly synchronous. This project aims to develop more scalable numerical methods through the use of asynchronous iterative algorithms.

In asynchronous iterations, the order in which components of the solution are updated is arbitrary, and the past values of components that are used in the updates are also selected arbitrarily. This is a model for parallel computation in which different processors work independently and have access to data values in local memory. Coping with fault tolerance, load balancing and communication overheads in a heterogeneous computation environment is a challenging undertaking for software development. In traditional synchronous algorithms, each iteration can only be performed as quickly as the slowest processor permits. If a processor fails, is less capable, or has an unduly heavy load, iteration times increase markedly. Asynchronous methods make it possible to overcome many of the communication, load balancing and fault tolerance issues that we now face and that limit our ability to scale to the extreme.

An important feature of this project is the close coupling, throughout the development of algorithms and software, with the needs of two exemplar applications, along with the deployment and testing of prototypes in these applications. The applications are the design optimization of orthopaedic and dental implants, and SmartGrids within power systems. Both applications need improved algorithms in order to solve their challenging problems on future parallel systems. They also present linear systems with different characteristics, thus providing both a useful test bed for the software and a means to demonstrate, during the project, the benefits of the new algorithms.
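The asynchronous iteration model described above can be illustrated with a small serial simulation. The sketch below is not the project's software: it is a minimal Jacobi-style relaxation in which the component to update is picked at random and each off-diagonal value is read from a randomly chosen recent iterate, mimicking stale data in local memory. The function names (async_jacobi, max_residual), the max_delay parameter and the small test matrix are all hypothetical, chosen for illustration. The matrix is assumed to be strictly diagonally dominant, a standard sufficient condition under which iterations of this kind converge.

```python
import random


def async_jacobi(A, b, steps=5000, max_delay=5, seed=0):
    """Simulate an asynchronous Jacobi-style iteration for A x = b.

    Assumes A is strictly diagonally dominant (illustrative assumption).
    """
    rng = random.Random(seed)
    n = len(b)
    history = [[0.0] * n]  # recent iterates, newest last
    for _ in range(steps):
        i = rng.randrange(n)  # component to update, chosen arbitrarily
        acc = b[i]
        for j in range(n):
            if j != i:
                # Read x_j from a possibly stale iterate (bounded delay).
                delay = rng.randrange(min(max_delay, len(history)))
                acc -= A[i][j] * history[-1 - delay][j]
        x = history[-1][:]
        x[i] = acc / A[i][i]
        history.append(x)
        if len(history) > max_delay:
            history.pop(0)  # keep only the last max_delay iterates
    return history[-1]


def max_residual(A, b, x):
    """Return the largest absolute entry of b - A x."""
    return max(
        abs(b[i] - sum(A[i][j] * x[j] for j in range(len(x))))
        for i in range(len(b))
    )


# Hypothetical strictly diagonally dominant tridiagonal test system.
A = [[4.0, -1.0, 0.0, 0.0],
     [-1.0, 4.0, -1.0, 0.0],
     [0.0, -1.0, 4.0, -1.0],
     [0.0, 0.0, -1.0, 4.0]]
b = [1.0, 2.0, 2.0, 1.0]
x = async_jacobi(A, b)
```

No update in this sketch waits for any other, which is the property that removes the "slowest processor" bottleneck of synchronous sweeps. In a real parallel code the random delays would instead arise from communication latency and load imbalance.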

Key findings

There are high expectations for the application of HPC in numerous areas. However, as a number of similar projects around the world have confirmed, some optimization problems may be very difficult to translate into a parallel form that can benefit from HPC. Nevertheless, new developments and a move towards possibly more decentralised operation may enable better results from the application of HPC in power system operation.

Status: Finished
Effective start/end date: 1/02/11 to 1/10/14

Funding

  • EPSRC (Engineering and Physical Sciences Research Council): £77,547.00

Fingerprint

Fault tolerance
Resource allocation
Computational chemistry
Dental prostheses
Communication
Orthopedics
Materials science
Linear systems
Scalability
Software engineering
Numerical methods
Fusion reactions
Data storage equipment
Testing
Design optimization