How to Build a Beowulf:
Introduction

by Thomas Sterling, Paul Angelino, Don Becker, Jan Lindheim, & John Salmon

It is possible today to assemble a collection of commodity mass-market hardware
components and freely available software packages in a day, be executing real-world
applications by dinner time, and achieve sustained performance greater than 5 GFLOPS at
a total cost of around $50,000. Furthermore, on almost a daily basis, these numbers are
improving. All around the country, primarily in small science and engineering labs, groups
and departments are implementing these Pile-of-PCs to satisfy real needs in the near term.
Some are using Pile-of-PCs for carrying out science computations at a fraction of the cost
of conventional commercial machines delivering the same performance. Some universities are
using Pile-of-PCs as low cost parallel computers to teach parallel programming. Yet others
use Pile-of-PCs as an in situ experimental testbed to conduct computer science research.
Beowulf is a class of Pile-of-PCs that leverages those low cost mass market systems that
support Unix-like operating systems at low or no cost and for which source code is readily
available. Experience gained by those installing Beowulf-class systems has led to a
maturing of the techniques employed in implementation and application of such systems.
This document is intended to convey much of the detailed knowledge accrued by Beowulf
users to assist new people in the field as they too try their hand at homebrew
supercomputing.

A typical Beowulf system may comprise 16 nodes interconnected by 100BASE-T Fast
Ethernet. Each node may include a single Intel Pentium II or Digital Alpha 21164PC
processor, 128-512 MBytes of DRAM, 3-30 GBytes of EIDE disk, a PCI bus backplane, and an
assortment of other devices. At least one node will have a video card, monitor, keyboard,
CD-ROM, floppy drive, and so forth. But the technology is evolving so fast, and the
price-performance and price-feature curves changing with it, that no two Beowulfs look exactly
alike. Of course, this is also because the pieces are almost always acquired from a mix of
vendors and distributors. The power of de facto standards for interoperability of
subsystems has generated an open market that provides a wealth of choices for customizing
one's own version of Beowulf or just maximizing cost advantage as prices fluctuate among
sources. Such a system will run the Linux operating system freely available over the net
or in low cost and convenient CD-ROM distributions. In addition, publicly available
parallel processing libraries such as MPI and PVM are used to harness the power of
parallelism for large application programs. A Beowulf system such as described here,
taking advantage of appropriate discounts, costs about $40K including all incidental
components such as low cost packaging and a generous assembly cost of $100/hr.
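As a rough check on the figures quoted so far, the arithmetic can be sketched in a few lines of Python. Only the dollar amounts, the node count, and the GFLOPS figure come from the text; the function names are hypothetical, and the per-node number is a plain even split that ignores the extra peripherals carried by at least one node.

```python
def dollars_per_mflops(total_cost_usd, sustained_gflops):
    """Price-performance in dollars per sustained MFLOPS."""
    return total_cost_usd / (sustained_gflops * 1000.0)


def average_cost_per_node(total_cost_usd, nodes):
    """Even split of the system cost across nodes (a simplification)."""
    return total_cost_usd / nodes


# Figures quoted in the text: more than 5 GFLOPS sustained for about $50,000,
# and about $40K for a 16-node system.
print(dollars_per_mflops(50_000, 5))      # roughly $10 per MFLOPS, or less
print(average_cost_per_node(40_000, 16))  # roughly $2,500 per node
```

Because the text says "greater than 5 GFLOPS", the first figure is best read as an upper bound on cost per MFLOPS rather than an exact value.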

The Beowulf approach represents a new business model for acquiring computational
capabilities. It complements rather than competes with the more conventional commercial
systems supplier approach. Beowulf is not for everyone. Any site that would include a
Beowulf cluster should have a systems administrator already involved in supporting the
network of workstations and PCs that inhabit the workers' desks. The same talents and
expertise can be used to manage a Beowulf, but a site without such a skill base should
probably not follow the Beowulf path. Beowulf is a parallel computer. It will not simply run
a uniprocessor "dusty deck" and benefit from all of the computing resources. A
site must expect to run parallel programs, either acquired from others or developed
in-house. A site without such a skill base should probably not follow the Beowulf path.
Beowulf is loosely coupled and is a distributed memory environment. It runs message
passing parallel programs that do not assume a shared address space across processors. Its
long latencies require a favorable balance of computation to communication and code
written to balance the workload across processing nodes. A site without this class of
application codes should probably not follow the Beowulf path. But within the constrained
regime in which Beowulf is appropriate, the Beowulf path will always provide the best
performance-to-cost ratio and often performance per node comparable to vendor offerings. This
means that, for restricted computing budgets, more science is done, more engineering problems
are solved, and more students acquire experience. Beowulf assists parallel computer vendors by
providing a low entry level cost to parallel systems, expanding the role of parallel
computing and the number of people capable of using parallel computers. This will create a
larger customer base for vendor parallel computers in the long term.
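The requirement for "a favorable balance of computation to communication" and a balanced workload can be made concrete with a toy timing model. The sketch below is illustrative only: the class and function names are hypothetical, the per-node rate and message latency are assumed values rather than measurements, and the model ignores overlap of computation with communication. Only the node count and the Fast Ethernet link speed come from the text.

```python
from dataclasses import dataclass


@dataclass
class Cluster:
    nodes: int = 16                  # node count of the "typical" system in the text
    flops_per_node: float = 2.0e8    # ASSUMED sustained rate per node, in FLOP/s
    latency_s: float = 1.0e-4        # ASSUMED per-message latency, in seconds
    bandwidth_Bps: float = 12.5e6    # 100 Mbit/s Fast Ethernet = 12.5 MB/s peak


def step_time(cluster, work, messages, bytes_sent):
    """Time for one parallel step: the slowest node's compute time plus
    the communication time each node spends (no overlap assumed)."""
    compute = max(work) / cluster.flops_per_node
    comm = messages * cluster.latency_s + bytes_sent / cluster.bandwidth_Bps
    return compute + comm


def efficiency(cluster, work, messages, bytes_sent):
    """Parallel efficiency: one-node time divided by (nodes * parallel time).
    `work` lists the floating-point operations assigned to each node."""
    serial = sum(work) / cluster.flops_per_node
    parallel = step_time(cluster, work, messages, bytes_sent)
    return serial / (cluster.nodes * parallel)


c = Cluster()
coarse = efficiency(c, [2.0e8] * 16, messages=10, bytes_sent=1.25e6)
fine = efficiency(c, [2.0e6] * 16, messages=10, bytes_sent=1.25e6)
skewed = efficiency(c, [2.0e8] * 15 + [4.0e8], messages=0, bytes_sent=0)
print(f"coarse-grained: {coarse:.2f}")
print(f"fine-grained:   {fine:.2f}")
print(f"imbalanced:     {skewed:.2f}")
```

Under these assumptions, the same communication cost is almost negligible when each node computes for about a second between exchanges, but dominates when each node computes for only about ten milliseconds. The third case shows that even with no communication at all, a single overloaded node drags down the efficiency of the whole machine.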

This document comprises a set of notes and briefing charts intended as a tutorial on
the implementation and application of Beowulf-class clustered computers. It covers many
details related to the selection, acquisition, assembly, installation, use, and
programming of Beowulf-class systems. These notes were formulated by practitioners in the
field, all of whom have direct hands-on experience, and some of whom have directly
contributed to the development of the enabling technologies that have made Beowulf-class
systems feasible.
- July, 1998