Our Alpha NT cluster was purchased to provide a platform for cost-effective scientific computing, as well as a system for developing underlying cluster technology.
The Beowulf project has focussed on Intel-Linux systems to keep costs low and provide flexibility. For cheap parallel cluster computing, a mixture of Digital UNIX compilation and Linux compute nodes is the most effective route right now. However, it has recently been pointed out that statically linking the necessary DEC UNIX libraries into executables so that they run under Linux is a breach of licensing if the Linux nodes do not also have DEC UNIX licenses. The special effects in the movie Titanic were generated on a cluster of 200 DEC Alphas running Windows NT (as file servers) and Linux (for computations), connected by 100 Mbps Ethernet (http://www.linuxjournal.com/article.php?sid=2494). By definition, Beowulf systems should run open-source operating systems, and in this respect our system is not a Beowulf.
There are several reasons why we have chosen to use Windows NT on our cluster.
The Alpha processor offers good price/performance for scientific applications. However, this is only true when a good, globally-optimising compiler is used. Our tests bear this out, with the Digital Visual FORTRAN compiler significantly outperforming the egcs/g77 compiler, as expected. Unfortunately, there are no good FORTRAN compilers for Alpha Linux at present (Q4 98).
In order to run the best compilers we are forced to choose between Windows NT and Digital UNIX. The latter option is costly, though, both in terms of licensing and the need for specialist (e.g. SCSI) hardware. This largely offsets the gains made by using commodity machines. We are pursuing the long-term goal of delivering an effective remote and local parallel computing service directly under Windows NT.
Microsoft Developer Studio under Windows NT provides an excellent integrated development environment and increases productivity compared with using traditional, command-line tools and editors. Our graduate students find it painful to resort to a basic UNIX programming environment (on our SGI Origin, SP2 and CS2 systems) after long periods of time using MS Developer Studio.
Another major reason for using Windows NT is that industry is moving towards this platform across its enterprises, for a variety of reasons. Sooner or later, companies will expect to run their compute-intensive, parallel applications in this environment. As one of the aims of our groups is to foster academic-industrial partnership, we are happy to help make the transition from UNIX to NT a smooth one for those wishing to take that path.
Of course, nobody can ignore the continuing rise of Linux across the IT world, and we are proponents of the open-source philosophy. We therefore maintain our cluster as dual-boot in order to take advantage of all available technologies and help make High Performance Computing more affordable for all those who want it.
Windows NT 4.0 is certainly quite different to UNIX in many respects. Many of its shortcomings are not surprising considering that it has only been in existence for a few years. We are currently tackling problems with stable remote logins and with running graphical applications across the network. The Microsoft NT Services for UNIX and Windows Terminal Server are aimed at overcoming these limitations and are under test at Southampton. Some security issues are being addressed in NT 5.0, which we are currently testing in beta.
There are a few serious problems with the MPI and PVM implementations. WinMPICH runs under Administrator accounts with full system privileges, as it was originally intended to run in shared-memory mode on a single machine. It can also leave dead processes hanging on remote machines, which must then be killed manually. The commercial version of WinMPICH, MPI/Pro, is far more robust and has security and clean startup and shutdown mechanisms. It also directs I/O sensibly.
PVM currently requires pvmd3 daemons to be started manually on remote machines and fails to redirect I/O properly. As PVM is still in beta, some of these problems may yet be fixed.
We have ported several FORTRAN and C applications to Windows NT. In general this is straightforward, partly due to the comprehensive nature of the Digital Visual FORTRAN compiler, which supports legacy, F90 and F95 source code and provides compatibility with most proprietary unformatted FORTRAN file formats (IEEE, Cray, VAX). For scientific applications, we have used the MS Visual C++ compiler only for standard ANSI C code.
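To illustrate why unformatted file compatibility matters when porting, the following is a minimal sketch in Python of reading a sequential unformatted file written on a machine with a different byte order. It rests on assumptions that are not taken from the text above: that each record is framed by a 4-byte length marker before and after the payload (a common convention, but compiler-dependent), and that the data are 8-byte IEEE reals. The function names `write_record` and `read_records` are hypothetical, and the sketch does not attempt Cray or VAX floating-point conversion.

```python
import struct


def write_record(values, byteorder="<"):
    """Pack a list of floats as one framed record of 8-byte IEEE reals.

    Assumption: records carry a 4-byte length marker before and after the data.
    """
    payload = struct.pack(f"{byteorder}{len(values)}d", *values)
    marker = struct.pack(f"{byteorder}i", len(payload))
    return marker + payload + marker


def read_records(data, byteorder="<"):
    """Yield the raw payload of each framed record in `data`.

    `byteorder` is "<" for little-endian or ">" for big-endian markers.
    """
    offset = 0
    while offset < len(data):
        if offset + 4 > len(data):
            raise ValueError("truncated record marker")
        (length,) = struct.unpack_from(f"{byteorder}i", data, offset)
        end = offset + 4 + length
        if length < 0 or end + 4 > len(data):
            raise ValueError("implausible record length; wrong byte order?")
        (trailer,) = struct.unpack_from(f"{byteorder}i", data, end)
        if trailer != length:
            raise ValueError("leading and trailing markers disagree")
        yield data[offset + 4:end]
        offset = end + 4


# Simulate a file written on a big-endian machine, then read it back.
big_endian_file = write_record([1.0, 2.5], ">") + write_record([3.0], ">")
for payload in read_records(big_endian_file, ">"):
    count = len(payload) // 8
    print(struct.unpack(f">{count}d", payload))
```

Alpha systems under Windows NT are little-endian, whereas machines such as the SGI Origin are big-endian, so reading the same bytes with the wrong `byteorder` trips the length check rather than silently returning garbage. A compiler that handles this conversion itself removes the need for this kind of hand-written code.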
The main problem is with file handling and UNIX Makefiles. Windows NT is not