Cloud HPC - Optimizing Cost and Performance to Maximize the Results
Public and private clouds are changing the shape of high performance computing (HPC) in many industries.
Cloud brings the ability to choose the size, the shape and the constituent parts of an HPC system dynamically. Paying on a daily or even hourly basis enables people to run simulations on demand, or to accelerate (burst) beyond their own compute resources when the need arises.
Within the cloud, software performance and reliability have direct and immediate cost implications.
It's important to run as efficiently as possible and to be able to resolve problems as they arise.
The HPC industry relies on Allinea tools for HPC and multi-node software development and performance: the tools are available for cloud deployment too.
Optimizing the cost of simulation
The first step that all HPC cloud users must take is to optimize the cost of running simulations.
This is achieved by measuring and analyzing application performance, so that the factors driving the cost of each run are properly understood.
Allinea Performance Reports measures the key parameters inside an application that determine the cost of a simulation:
- Processor performance and utilization
- I/O and communication
- Peak memory requirements
- Application efficiency
It also provides guidance on steps that can improve these parameters.
This information is used to optimize the way an application runs: the instance types, the number of instances and the time required.
It also helps identify which applications are suitable for cloud deployment - for example, by showing whether communication is a dominant factor in execution time.
For one genomic application, using the performance tools Allinea MAP and Allinea Performance Reports led to a 66% reduction in Amazon (AWS) EC2 running costs.
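The trade-off between instance type, number of instances and time required can be sketched with a simple cost model. This is a minimal illustration, not part of any Allinea tool: the `RunOption` type, the instance labels, the prices and the runtimes are all hypothetical, and the model assumes per-instance billing rounded up to a whole billing increment (hourly by default).

```python
import math
from dataclasses import dataclass


@dataclass
class RunOption:
    """One candidate way of running a simulation (all values hypothetical)."""
    instance_type: str     # illustrative label, not a real cloud instance name
    instances: int         # number of instances used for the run
    hourly_price: float    # assumed price per instance-hour
    runtime_hours: float   # measured wall-clock time of the run


def simulation_cost(option, billing_increment_hours=1.0):
    """Cost of one run, assuming time is billed in whole increments."""
    increments = math.ceil(option.runtime_hours / billing_increment_hours)
    billed_hours = increments * billing_increment_hours
    return option.instances * option.hourly_price * billed_hours


def cheapest(options):
    """Return the candidate with the lowest estimated cost."""
    return min(options, key=simulation_cost)


# Made-up measurements for two candidate configurations.
options = [
    RunOption("small", instances=16, hourly_price=0.50, runtime_hours=6.0),
    RunOption("large", instances=8, hourly_price=1.20, runtime_hours=2.5),
]

if __name__ == "__main__":
    for opt in options:
        print(opt.instance_type, round(simulation_cost(opt), 2))
    print("cheapest:", cheapest(options).instance_type)
```

In this made-up example the more expensive instance type is cheaper overall, because the shorter runtime outweighs the higher hourly price. Measured runtimes are what make such a comparison meaningful.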
Development and testing in the cloud
Developers are already using public and private clouds for software development.
- Independent Software Vendors (ISVs) deploy and test their software flexibly - resolving performance issues and bugs by creating right-sized cloud systems on demand.
- Research scientists who do not wish to hog precious internal HPC systems can choose to work on a separate cloud system whilst preparing software.
The development tool kit, Allinea Forge, is designed to work with remote systems and enables developers to work as effectively in the cloud as on a local system.
Debugging with DDT enables the developer to fix application issues such as memory leaks, crashes or incorrect results.
Profiling with MAP enables developers to optimize performance by analyzing the following, down to the function call and source code:
- Communication time cost
- CPU usage - including the impact of hyperthreading if enabled
- Workload balance
- Memory usage
The tools enable the true impact of cloud deployment on application performance to be seen, and the application to be optimized for more results or less cost.
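Two of the quantities listed above, communication time cost and workload balance, can be illustrated with simple ratios. This is a sketch under stated assumptions only: it does not use or reproduce MAP's output or API, the function names are hypothetical, and the per-process timings are made-up numbers standing in for whatever a profiler reports.

```python
def load_imbalance(per_process_seconds):
    """Ratio of the slowest process's compute time to the mean.

    1.0 means perfectly balanced; larger values mean some processes
    wait while the slowest one finishes.
    """
    mean = sum(per_process_seconds) / len(per_process_seconds)
    return max(per_process_seconds) / mean


def communication_fraction(comm_seconds, total_seconds):
    """Share of wall-clock time spent in communication (0.0 to 1.0)."""
    return comm_seconds / total_seconds


if __name__ == "__main__":
    # Hypothetical compute times for four processes, in seconds.
    times = [10.0, 10.0, 10.0, 30.0]
    print("imbalance:", load_imbalance(times))
    print("communication share:", communication_fraction(25.0, 100.0))
```

A high imbalance or a large communication share suggests that adding instances may raise cost without a matching reduction in runtime.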