
Dear committee members:
Many of you are aware that our department's ``compute engine'', Erwin, is starting to show its age. When Erwin was bought, about four years ago, it was quite a fast machine. Computer technology has made a lot of progress since then, and Erwin is now somewhat slower than the average desktop PC. For comparison, a 200 MHz Pentium Pro PC is 1.3 to 2 times as fast as Erwin. In addition, Erwin has begun to have hardware failures. We have seen several mysterious crashes during the last year.
With this in mind, the main item in our HEET request this year is a successor for Erwin. For this purpose we would like to purchase a cluster of PCs (``nodes'') which will form a parallel supercomputer providing our department with over 1 GigaFlop of computing power. The technology behind this cluster was developed at NASA's Center of Excellence in Space Data and Information Sciences (CESDIS) and is currently in use at a number of educational and research institutions. The cluster we propose would increase our department's computing power by a factor of 20 to 30.
This system, called ``Beowulf,'' has several advantages over conventional supercomputers. First, it's cheap: a Beowulf cluster at CESDIS has equaled the performance of an IBM SP2 with a comparable number of nodes, at less than one tenth the price. The use of commodity components allows one to shop around for the best price among a number of vendors.
Second, these systems are easily upgraded. New, faster nodes can simply be dropped into place when they become available, and old nodes can be re-used as desktop PCs.
Besides NASA, other institutions with Beowulf clusters include Los Alamos National Lab, Caltech, Drexel University and Clemson University. Similar clusters, not based on Beowulf, are in use at the Max-Planck-Institut, DESY, Purdue, Cornell, Fermilab and the University of Mannheim. The widespread and apparently growing use of PC clusters provides a large base of experience from which we can draw. (See the references at the end of this memo.)
All of the users in our department would immediately be able to take advantage of such a cluster. Even tasks which are not easily parallelizable would benefit, since each individual node is at least 30% faster than Erwin. Theorists with special interest in this cluster include a number of people who already use PVM (Parallel Virtual Machine), one of the components of Beowulf. Hank Thacker is very interested in these clusters because one of his collaborators at Fermilab has ported the CANOPY package to Beowulf. Hank already has a lot of code written for CANOPY, and would like to be able to run it here in our department. Experimentalists, such as Ralph Minehart, would benefit from the ability to analyze several large data sets in parallel.
We also ask that you consider these further improvements to our computing infrastructure during this HEET cycle:
In the following pages, I list the items requested, along with individual descriptions and cost justifications. The total cost of all items is $75,829.00.

DESCRIPTION:
Note that I have listed the prices of the individual components, rather than the price of an assembled node, since it would be much cheaper to buy components and construct the nodes ourselves. If we were to buy assembled nodes, the price would be about 30% higher.
There are several different ways to construct such a cluster. The prices above are based on the Drexel Beowulf cluster. In this configuration, all sixteen nodes are plugged into a 16-port Ethernet switch (see below). One of the nodes (the frontend) acts as a gateway between the cluster and the outside world, which requires a second Ethernet card. Because of this additional function, we have also requested a slightly faster CPU for the frontend.
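As a rough check on the ``factor of 20 to 30'' quoted in the cover letter, the sketch below multiplies the sixteen nodes by the per-node speed range of 1.3 to 2 times Erwin. It assumes ideal linear scaling with no communication overhead, and that each node performs like the 200 MHz Pentium Pro used for comparison; both are simplifying assumptions, and the function name is illustrative only.

```python
def aggregate_speedup(nodes, per_node_low, per_node_high):
    """Return (low, high) cluster speed relative to Erwin.

    Assumes ideal linear scaling: total speed = nodes * per-node speed.
    """
    return nodes * per_node_low, nodes * per_node_high


# Sixteen nodes, each 1.3 to 2 times as fast as Erwin (figures from the memo).
low, high = aggregate_speedup(16, 1.3, 2.0)
print(f"Estimated speedup over Erwin: {low:.1f}x to {high:.1f}x")
```

This is an upper bound: real workloads would fall below it because of communication overhead and any serial sections of the code.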

DESCRIPTION:
This switch forms the backplane along which the nodes of the Beowulf cluster communicate. The switch and the Ethernet cards in the nodes form an isolated 100 megabit-per-second network. This network communicates with the outside world through a second Ethernet interface in one of the nodes.

DESCRIPTION:
Given the exponential growth of the World Wide Web, and the ever-increasing importance of our department's web presence, our current web server is no longer adequate. The NT server described above would greatly increase the speed and reliability of our web site, and would give us a sophisticated set of web authoring and management tools.

DESCRIPTION:
Our current NT server, Newton, is adequate for the job it currently performs: providing file and print services for the departmental computer labs (rooms 315, 216 and 22). However, it does not have the capacity or reliability required of a departmental file server. We would like to purchase the NT server described above, featuring an advanced RAID disk controller which would give very fast disk access combined with fault tolerance. The automatic restart and temperature monitoring features would ensure that the server is restarted immediately after a crash, and would protect against a failure of environmental control in the computer room. (Such failures have occurred several times during the past few years.) While the fault-tolerant RAID system allows the system to keep running without data loss when a single disk dies, the hot-swap disk system allows us to replace the dead disk without shutting down the server. If our goal is uninterrupted, reliable service, we need such a file server for our department.

DESCRIPTION:
To further improve the performance of the departmental file server, we would like to install a dedicated ``authentication/information'' server. Microsoft recommends that logins to an NT domain be managed by a computer dedicated to this function, not a file server. This computer could also act as an information server, supplying DHCP, DNS and WINS services.

DESCRIPTION:
The purchase of the eight additional computers described above would address two issues: the need for more seats in the microcomputer lab (room 315) and the need for better computers in the room 22 computer lab. There are currently five 75 MHz Pentium computers in room 315. Room 22 has three 75 MHz Pentiums and five other computers of various types, all older and slower than the Pentiums. By purchasing eight new computers for room 315, we could replace the five older PCs in room 22 with the computers currently in room 315, giving rooms 22 and 315 eight matching computers each.
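The reallocation described above can be checked with a few lines of arithmetic. The sketch below uses only the counts given in this description; the variable names are illustrative, and it assumes the five Pentiums from room 315 replace the five older room 22 machines one for one.

```python
# Current inventory, taken from the description above.
room_315_pentiums = 5   # 75 MHz Pentiums now in room 315
room_22_pentiums = 3    # 75 MHz Pentiums now in room 22
room_22_older = 5       # older, slower machines now in room 22
new_computers = 8       # computers requested in this item

# Room 315 receives all of the new computers.
room_315_after = new_computers

# The Pentiums from room 315 move to room 22 and replace the older machines.
room_22_after = room_22_pentiums + room_315_pentiums
replaced = room_22_older

print(f"Room 315: {room_315_after} new computers")
print(f"Room 22: {room_22_after} Pentiums ({replaced} older machines replaced)")
```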
REFERENCES:

