Los Alamos Message Passing Interface

Los Alamos Message Passing Interface is an implementation of the Message Passing Interface (MPI).

Los Alamos Message Passing Interface Ranking & Summary


  • License: LGPL
  • Price: FREE
  • Publisher Name: Advanced Computing Laboratory
  • Publisher web site: http://public.lanl.gov/lampi/

Los Alamos Message Passing Interface Description

Los Alamos Message Passing Interface (LA-MPI) is an implementation of the Message Passing Interface (MPI) motivated by a growing need for fault tolerance at the software level in large high-performance computing (HPC) systems.

This need is caused by the vast number of components present in modern HPC systems, particularly clusters. The individual components -- processors, memory modules, network interface cards (NICs), etc. -- are typically manufactured to tolerances adequate for small or desktop systems. When aggregated into a large HPC system, however, system-wide error rates may be too great to successfully complete a long application run. For example, a network device may have an error rate which is perfectly acceptable for a desktop system, but not in a cluster of thousands of nodes, which must run error free for many hours or even days to complete a scientific calculation.

LA-MPI has two primary goals: network fault tolerance and high performance. Network fault tolerance is achieved by implementing a highly efficient checksum/retransmission protocol. The integrity of delivered data is (optionally) verified at the user level using a checksum or CRC. Data that is corrupt (or never delivered) is retransmitted.

As for high performance, LA-MPI's lightweight checksum/retransmission protocol allows us to achieve low-latency messaging. Furthermore, the flexible approach taken to the use of redundant data paths in a network-device-rich system leads to high network bandwidth, since different messages and/or message fragments can be sent in parallel along different paths. Also, since LA-MPI is developed for use on the large systems at Los Alamos National Laboratory, we have verified that LA-MPI is scalable to over 3,500 processes.

An alternative solution to the network fault tolerance problem is to use the TCP/IP protocol. We believe, however, that this protocol -- developed to handle unreliable, inhomogeneous and oversubscribed networks -- performs poorly and is overly complex for HPC system messaging, and that LA-MPI's lightweight checksum/retransmission protocol is a more appropriate choice.

Here are some key features of "Los Alamos Message Passing Interface":

  • Standard compliant (MPI version 1.2 integrated with ROMIO for MPI-IO)
  • Highly portable
  • Open source (LGPL)
  • Thread safe
  • Optimized for SMP systems, including NUMA architectures
  • Network fault tolerant (data integrity checked at user level)
  • Message-fragment striping across multiple network devices

What's New in This Release:

  • Namespace conflicts have been fixed.
  • Error detection and handling of fragments has been improved.
  • Bugs in memory barriers and spinlocks for x86 and x86_64 architectures have been fixed.
  • Profiling and backtracing support have been added.
  • Asynchronous I/O has been disabled by default as a workaround for problems with some filesystems.
  • Minor timeout bugs have been fixed.
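
The checksum/retransmission and fragment-striping ideas in the description can be illustrated with a small simulation. This is only a sketch under stated assumptions, not LA-MPI's actual code or API: the names `LossyPath`, `make_fragments` and `send_message` are hypothetical, the "network" is an in-process object that randomly drops or corrupts data, and the CRC is compared directly against the sender's value rather than travelling in a fragment header as it would in a real protocol.

```python
import random
import zlib
from typing import List, Optional, Tuple


def make_fragments(message: bytes, size: int) -> List[bytes]:
    """Split a message into fixed-size fragments (the last may be shorter)."""
    return [message[i:i + size] for i in range(0, len(message), size)]


class LossyPath:
    """Hypothetical stand-in for one network path that may drop or corrupt data."""

    def __init__(self, rng: random.Random, corrupt_rate: float, drop_rate: float):
        self.rng = rng
        self.corrupt_rate = corrupt_rate
        self.drop_rate = drop_rate

    def transmit(self, payload: bytes) -> Optional[bytes]:
        roll = self.rng.random()
        if roll < self.drop_rate:
            return None  # fragment never delivered
        if roll < self.drop_rate + self.corrupt_rate and payload:
            damaged = bytearray(payload)
            damaged[self.rng.randrange(len(damaged))] ^= 0xFF  # flip one byte
            return bytes(damaged)
        return payload


def send_message(message: bytes, paths: List[LossyPath],
                 fragment_size: int = 8,
                 max_attempts: int = 50) -> Tuple[bytes, int]:
    """Deliver a message fragment by fragment, retransmitting on CRC failure.

    Fragments are striped round-robin across the given paths, and each retry
    moves on to the next path. Returns the reassembled message and the
    number of retransmissions that were needed.
    """
    received = []
    retransmissions = 0
    for index, fragment in enumerate(make_fragments(message, fragment_size)):
        expected_crc = zlib.crc32(fragment)
        for attempt in range(max_attempts):
            path = paths[(index + attempt) % len(paths)]
            delivered = path.transmit(fragment)
            # Accept only data that arrived and whose CRC matches.
            if delivered is not None and zlib.crc32(delivered) == expected_crc:
                received.append(delivered)
                break
            retransmissions += 1
        else:
            raise RuntimeError(f"fragment {index} undeliverable after {max_attempts} attempts")
    return b"".join(received), retransmissions


if __name__ == "__main__":
    rng = random.Random(42)
    paths = [LossyPath(rng, corrupt_rate=0.2, drop_rate=0.1) for _ in range(2)]
    data, retries = send_message(b"a long scientific calculation result", paths)
    print(data, retries)
```

The sketch sends fragments one after another for simplicity. The bandwidth benefit described above would come from sending fragments on different paths in parallel, which this simulation does not attempt to model.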

