Allinea - Now part of ARM

Allinea MAP - C/C++ profiler and Fortran profiler for high performance Linux code

Optimize code with clear, low-overhead profiling - in the leading parallel tool suite

Allinea MAP is the profiler for parallel, multithreaded or single-threaded C, C++, Fortran and Fortran 90 codes. It provides in-depth analysis and pinpoints bottlenecks down to the individual source line. Unlike most profilers, it is designed to profile pthreads, OpenMP and MPI in parallel and threaded code.

Using it is easy - there's no need to instrument your code or remember arcane compilation settings. Just compile your code with -g and launch it as you would normally run it:

    $ map my_application.exe

or, for MPI users:

    $ map mpirun -np 128 ./bt_128_C
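The workflow above can be sketched end to end. This is a minimal example, assuming a GCC-style compiler wrapper and an MPI launcher; the source file and executable names are placeholders, not part of the product documentation:

```shell
# Build with -g so MAP can attribute samples to source lines.
# Keep your usual optimization level (-O2 here) so the profile
# reflects realistic, optimized code.
mpicc -g -O2 -o my_app my_app.c

# Serial or single-process run under the profiler:
map ./my_app

# MPI run, profiling 128 ranks:
map mpirun -np 128 ./my_app
```

Note that -g adds debug information without disabling optimization, so there is no need for a separate "profiling build".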


Allinea MAP is part of the Allinea Forge tool suite and is tried and tested on everything from the world’s largest machines to embedded processors.

Results that make sense without a two-day training course

After your program finishes, Allinea MAP shows you the lines of source code that took the longest. With time spent computing in green and communicating in blue, it's the clearest way to see what actually happened during the run.

I took one of our standard cases and ran it through MAP and could see all the bottlenecks. I knew right away where to start chasing things down.

Josh Strodtbeck, Senior Research Engineer
Convergent Science

Because Allinea MAP shares the Allinea Forge interface with the Allinea DDT debugger, many of its views will be familiar - the source code view, the parallel stack view and quick-open file navigation with autocomplete all integrate seamlessly with performance views designed by the same team.

It all just works, as you'd expect from the leaders in HPC development tools.

Performance you can trust, from one core to tens of thousands

Unlike the classic generation of trace-based performance tools, Allinea MAP never drowns you - or your file system - in data.

Adaptive sampling rates combined with Allinea's leading on-cluster merge technology ensure that exactly the right amount of data is recorded - whether you run on a workstation for ten minutes or on a remote supercomputer with tens of thousands of processes for a week.

With Allinea MAP you never need to worry about whether you chose the right metrics or level of instrumentation. Everything is turned on, all the time, with just 5% wall-clock overhead.

Profiling that's refreshingly simple, surprisingly deep

Everything about Allinea MAP is designed to get out of your way and show the performance of your code. From the default selection of views to the adaptive sampling rate, we strive to take the decisions about how to run the profiler off your plate - leaving you free to make all the really interesting decisions about your code.

We haven't confused simplicity of use with shallowness, though. Allinea MAP lets you drill deep down into the performance of your code:

  • Check memory usage, floating-point calculations, OpenMP thread usage, MPI usage and power usage at a glance
  • Flick to the CPU view to see the percentage of vectorized SIMD instructions, including AVX extensions used in each part of the code
  • See how the amount of time spent in memory operations varies over time and processes - are you making efficient use of the cache?
  • Zoom in to any part of the timeline, isolate a single iteration and explore its behaviour in detail
  • Everything shows aggregated data, preferring distributions with outlying ranks labelled over endless lists of processes and threads, so the display scales visually as well as our industry-leading backend scales computationally

Read profiling features in detail

Maximize the return on your HPC investment

Allinea provides a wide range of materials and training courses to maximize the return on your hardware investment.

When coupled with our training programs, Allinea MAP's rapid-result workflow raises performance awareness across your user base and pushes more codes to higher levels of parallelism, increasing large-scale cluster utilization and providing documented justification for existing and future large-scale HPC investments.