How to Build a Beowulf:
Introduction

by Thomas Sterling, Paul Angelino, Don Becker, Jan Lindheim, & John Salmon

It is possible today to assemble a collection of commodity mass-market hardware components and freely available software packages in a day, and to be executing real-world applications by dinner time at a sustained performance of more than 5 GFLOPS, for a total cost of around $50,000. Furthermore, these numbers are improving on an almost daily basis. All around the country, primarily in small science and engineering labs, groups and departments are building these Pile-of-PCs systems to satisfy real needs in the near term. Some use them to carry out science computations at a fraction of the cost of conventional commercial machines delivering the same performance. Some universities use them as low-cost parallel computers for teaching parallel programming. Yet others use them as in situ experimental testbeds for computer science research. Beowulf is a class of Pile-of-PCs that leverages low-cost mass-market systems supporting Unix-like operating systems that are available at low or no cost and for which source code is readily available. Experience gained by those installing Beowulf-class systems has led to a maturing of the techniques employed in implementing and applying such systems. This document is intended to convey much of the detailed knowledge accrued by Beowulf users, to assist newcomers to the field as they too try their hand at homebrew supercomputing.

A typical Beowulf system may comprise 16 nodes interconnected by 100BASE-T Fast Ethernet. Each node may include a single Intel Pentium II or Digital Alpha 21164PC processor, 128-512 MBytes of DRAM, 3-30 GBytes of EIDE disk, a PCI bus backplane, and an assortment of other devices. At least one node will have a video card, monitor, keyboard, CD-ROM, floppy drive, and so forth. But the technology is evolving so fast, and the price-performance and price-feature curves are changing so quickly, that no two Beowulfs ever look exactly alike. Of course, this is also because the pieces are almost always acquired from a mix of vendors and distributors. The power of de facto standards for interoperability of subsystems has generated an open market that provides a wealth of choices for customizing one's own version of Beowulf, or for simply maximizing cost advantage as prices fluctuate among sources. Such a system will run the Linux operating system, freely available over the net or in low-cost and convenient CD-ROM distributions. In addition, publicly available parallel processing libraries such as MPI and PVM are used to harness the power of parallelism for large application programs. A Beowulf system such as the one described here, taking advantage of appropriate discounts, costs about $40K, including all incidental components such as low-cost packaging and a generous assembly cost of $100/hr.

The Beowulf approach represents a new business model for acquiring computational capabilities. It complements rather than competes with the more conventional approach of buying from commercial systems suppliers. Beowulf is not for everyone. Any site that would include a Beowulf cluster should have a systems administrator already involved in supporting the network of workstations and PCs that inhabit the workers' desks. The same talents and expertise can be used to manage a Beowulf, but a site without such a skill base should probably not follow the Beowulf path. Beowulf is a parallel computer. It will not simply run a uniprocessor "dusty deck" and benefit from all of the computing resources. A site must expect to run parallel programs, either acquired from others or developed in-house. A site without such a skill base should probably not follow the Beowulf path. Beowulf is loosely coupled and is a distributed memory environment. It runs message-passing parallel programs that do not assume a shared address space across processors. Its long latencies require a favorable ratio of computation to communication, and code written to balance the workload across processing nodes. A site without this class of application codes should probably not follow the Beowulf path. But within the constrained regime in which Beowulf is appropriate, the Beowulf path will always provide the best performance-to-cost ratio and often per-node performance comparable to vendor offerings. This means that for restricted computing budgets, more science is done, more engineering problems are solved, and more students acquire experience. Beowulf assists parallel computer vendors by providing a low entry-level cost to parallel systems, expanding the role of parallel computing and the number of people capable of using parallel computers. This will create a larger customer base for vendor parallel computers in the long term.
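The requirement for a favorable ratio of computation to communication and a balanced workload can be made concrete with a rough back-of-the-envelope model. The Python sketch below is an illustration only, not a method taken from these notes. The function names `block_ranges` and `parallel_efficiency` are hypothetical. The cost model assumes perfectly divisible computation and communication that is not overlapped with computation. The latency and message counts in the usage lines are placeholder values, not measurements of any real cluster.

```python
def block_ranges(n_items, n_nodes):
    """Split n_items into n_nodes contiguous blocks whose sizes differ by at most one.

    Returns a list of (start, end) index pairs, one per node, with end exclusive.
    """
    base, extra = divmod(n_items, n_nodes)
    ranges = []
    start = 0
    for rank in range(n_nodes):
        # The first `extra` nodes each take one additional item.
        size = base + (1 if rank < extra else 0)
        ranges.append((start, start + size))
        start += size
    return ranges


def parallel_efficiency(serial_seconds, n_nodes, n_messages, message_bytes,
                        latency_seconds, bandwidth_bytes_per_second):
    """Estimate parallel efficiency under a deliberately simple model.

    Assumptions (hypothetical, for illustration):
      - the computation divides evenly across the nodes;
      - each node sends n_messages messages of message_bytes each;
      - each message costs latency + size / bandwidth;
      - communication does not overlap with computation.
    """
    compute = serial_seconds / n_nodes
    communicate = n_messages * (
        latency_seconds + message_bytes / bandwidth_bytes_per_second
    )
    parallel_seconds = compute + communicate
    return serial_seconds / (n_nodes * parallel_seconds)


if __name__ == "__main__":
    # 16 nodes, as in the typical system described above.
    print(block_ranges(100, 16))

    # Fast Ethernet carries a nominal 100 Mbit/s, i.e. 12.5e6 bytes/s.
    # The 100-microsecond latency and the message counts are assumed values.
    for n_messages in (10, 1_000, 100_000):
        eff = parallel_efficiency(
            serial_seconds=1600.0, n_nodes=16, n_messages=n_messages,
            message_bytes=1024, latency_seconds=100e-6,
            bandwidth_bytes_per_second=12.5e6,
        )
        print(n_messages, round(eff, 3))
```

In this model, efficiency falls as the number of messages per node grows relative to the computation each node performs, which is why codes that exchange many small messages are a poor fit for a loosely coupled system.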

This document comprises a set of notes and briefing charts intended as a tutorial on the implementation and application of Beowulf-class clustered computers. It covers many details related to the selection, acquisition, assembly, installation, use, and programming of Beowulf-class systems. These notes were formulated by practitioners in the field, all of whom have direct hands-on experience, and some of whom have directly contributed to the development of the enabling technologies that have made Beowulf-class systems feasible.

- July, 1998

NEXT: Building a Beowulf System