High Performance Computing

Commodity Compilers


Alpha Single Node Performance

The 500 MHz DEC Alpha EV56 workstations we have are gigaflop peak machines. On simple benchmarks each one performs as a 100 MFlops system:

  • Linpack (201) delivers 110 MFlops
  • Linpack (200) delivers 97 MFlops
  • Livermore Loops (geom. mean) delivers 103 MFlops
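
To put these figures in context, the short sketch below expresses each measured rate as a fraction of peak. It assumes that "gigaflop peak" means 1000 MFlops; the names PEAK_MFLOPS and fraction_of_peak are illustrative only and are not part of any benchmark suite.

```python
# Peak rate implied by the text: a "gigaflop peak" machine,
# taken here (as an assumption) to mean 1000 MFlops.
PEAK_MFLOPS = 1000.0

# Measured rates quoted in the list above, in MFlops.
measured = {
    "Linpack (201)": 110.0,
    "Linpack (200)": 97.0,
    "Livermore Loops (geom. mean)": 103.0,
}


def fraction_of_peak(mflops, peak=PEAK_MFLOPS):
    """Return a measured rate as a fraction of the peak rate."""
    return mflops / peak


for name, rate in measured.items():
    print(f"{name}: {100 * fraction_of_peak(rate):.1f}% of peak")
```

Under that assumption, all three benchmarks come in at roughly a tenth of peak, which is what the "100 MFlops system" description amounts to.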

These were compiled with Digital Visual FORTRAN and run under Windows NT. The performance is almost identical to that of the same benchmarks run under Digital UNIX (without the KAP preprocessor). The overhead of running under Windows NT rather than UNIX is negligible, which is perhaps a surprising result.

Real-world application performance is similarly impressive. We used the Alpha cluster to partition a 15 million element unstructured tetrahedral grid, a job that requires 2 Gbytes of real memory. Initially, one IBM SP2 node with 256 Mbytes of RAM on the Southampton machine was reconfigured to page off five SCSI disks so that it could handle this job. The job took nine hours to complete and had to be run in the overnight queue, and only on the specific reconfigured node.

The same job took six hours to complete on an AlphaNT node (with 256 Mbytes of RAM) paging off a single EIDE drive. Reconfiguring the swap file took six mouse clicks and a reboot! We were able to run eight partitioning jobs in parallel overnight without having to fight through any queues. This highlights another advantage of commodity systems: they are affordable, and can be installed at the research group level and managed locally.
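
The arithmetic behind this comparison can be sketched as follows. The figures are those quoted above; the conversion of 1 Gbyte to 1024 Mbytes, and the assumption that the eight parallel jobs each take about the same six hours on separate nodes, are ours for illustration.

```python
# Figures quoted in the text above.
JOB_MEMORY_MB = 2 * 1024   # 2 Gbytes, assuming 1 Gbyte = 1024 Mbytes
NODE_RAM_MB = 256          # RAM on both the SP2 node and the AlphaNT node
SP2_HOURS = 9.0            # SP2 node paging off five SCSI disks
ALPHA_HOURS = 6.0          # AlphaNT node paging off one EIDE drive
PARALLEL_JOBS = 8          # partitioning jobs run overnight on the cluster


def oversubscription(job_mb, ram_mb):
    """How many times larger the job is than physical memory."""
    return job_mb / ram_mb


def speedup(baseline_hours, new_hours):
    """Ratio of baseline run time to new run time."""
    return baseline_hours / new_hours


print("Memory oversubscription:", oversubscription(JOB_MEMORY_MB, NODE_RAM_MB))
print("AlphaNT speedup over SP2 node:", speedup(SP2_HOURS, ALPHA_HOURS))

# Assumption: eight equal jobs queued one after another on the single
# reconfigured SP2 node, versus eight Alpha nodes working side by side.
print("Eight jobs, one SP2 node (hours):", PARALLEL_JOBS * SP2_HOURS)
print("Eight jobs, eight Alpha nodes (hours):", ALPHA_HOURS)
```

The job is eight times larger than physical memory on either machine, so both runs are dominated by paging; the cluster's advantage comes from the shorter run time and from having several locally managed nodes available at once.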

Alpha Linux and NT Compiler Performance

At present the only Alpha Linux FORTRAN compiler is egcs/g77. This is not tuned for the Alpha hardware, and has limited optimisation capability. Tests using Livermore Loops show a big difference in performance compared with the Digital Visual FORTRAN compiler.

  • Livermore Loops (geom. mean) delivers 13 MFlops (cf: 102 MFlops using DVF)
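
As a rough illustration of what this figure means, the sketch below shows how a geometric mean summarises a set of per-kernel rates and computes the ratio between the two compilers' results. The per-kernel rates are hypothetical values chosen only to demonstrate the calculation; the two summary figures are the ones quoted above.

```python
import math

DVF_MFLOPS = 102.0   # Digital Visual FORTRAN figure quoted above
G77_MFLOPS = 13.0    # egcs/g77 figure quoted above


def geometric_mean(rates):
    """Geometric mean of positive rates, computed via logarithms."""
    if not rates or any(r <= 0 for r in rates):
        raise ValueError("rates must be a non-empty list of positive numbers")
    return math.exp(sum(math.log(r) for r in rates) / len(rates))


def compiler_ratio(fast_mflops, slow_mflops):
    """How many times faster one compiler's code runs than the other's."""
    return fast_mflops / slow_mflops


# Hypothetical per-kernel rates (MFlops), NOT measured data.
hypothetical_kernel_rates = [10.0, 100.0, 1000.0]
print("Geometric mean:", round(geometric_mean(hypothetical_kernel_rates), 3))
print("Arithmetic mean:", sum(hypothetical_kernel_rates) / 3)

print(f"DVF / g77 ratio: {compiler_ratio(DVF_MFLOPS, G77_MFLOPS):.1f}x")
```

The geometric mean is less dominated by a few very fast kernels than the arithmetic mean. On the quoted figures, the Digital Visual FORTRAN code runs nearly eight times faster than the egcs/g77 code.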

Serial versions of the NAS Parallel Benchmarks (Class W) that we have run show the same trend. Intel Linux compilers, such as those from Portland Group and Absoft, show similar performance gains. A good optimising compiler is essential for achieving good price/performance.

More performance figures will be published soon.

Details of the current state of this work are given on the status page. Please feel free to contact ktakeda@soton.ac.uk regarding this research.