The Beowulf Project at CACR
Beowulf-class computers utilize cost-effective, mass-market, multi-hyphenated, commodity off-the-shelf technologies to deliver scientific and engineering computational cycles at the lowest possible price. These systems exploit a confluence of trends. Commodity silicon technology, including microprocessor performance and memory density, has improved tremendously in the past few years. Commodity networking, especially Fast Ethernet at 100 megabits/sec, has made it possible to design distributed-memory systems with tolerable bandwidths and latencies. Free operating systems, such as Linux, are available, reliable, and well-supported, and are distributed with complete source code, which encourages the development of additional tools including low-level drivers, parallel file systems, and communication libraries. Industry-standard parallel programming environments, e.g., MPI and PVM, are commonplace across the spectrum of high-end supercomputers, and are also available for and well-suited to Beowulf-class systems.
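To make "tolerable bandwidths and latencies" concrete, a common first-order model estimates the time to send one message as a fixed latency plus the message size divided by the link bandwidth. The sketch below applies that model to Fast Ethernet at 100 megabits/sec. The 100-microsecond latency and the function name transfer_time_seconds are illustrative assumptions, not measurements from any CACR machine.

```python
# First-order message cost model: time = latency + size / bandwidth.
# The bandwidth comes from the text; the default latency is an ASSUMED,
# illustrative value and should be replaced with a measured one.

FAST_ETHERNET_BITS_PER_SEC = 100e6  # 100 megabits/sec


def transfer_time_seconds(message_bytes,
                          latency_seconds=100e-6,
                          bandwidth_bits_per_sec=FAST_ETHERNET_BITS_PER_SEC):
    """Estimate the time to deliver one message over a single link."""
    if message_bytes < 0:
        raise ValueError("message_bytes must be non-negative")
    serialization = (message_bytes * 8) / bandwidth_bits_per_sec
    return latency_seconds + serialization


if __name__ == "__main__":
    for size in (8, 1024, 1_000_000):
        millis = transfer_time_seconds(size) * 1e3
        print(f"{size:>9} bytes: {millis:.3f} ms")
```

Under this model, small messages are dominated by the latency term and large messages by the bandwidth term, so codes that exchange a few large messages tend to suit commodity networks better than codes that exchange many small ones.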

The Center for Advanced Computing Research at Caltech has been at the forefront of developing Beowulf systems and employing these low-cost commodity parallel systems in a variety of scientific and engineering applications. As of August 2001, CACR houses four separate cluster machines.

Naegling

Naegling is CACR's general-use cluster, available for developing basic MPI codes and for getting acquainted with cluster computing in a non-critical resource environment.

Naegling has 64 compute nodes, each with a Venus motherboard, a 200 MHz Pentium Pro, 128 MB of RAM, a 3.1 GB EIDE disk, and a 100 Mb/s Ethernet adapter. The front-end machine has an extra 128 MB of RAM and an extra 40 GB of storage space.

Naegling was decommissioned in 2003.

The ASCI Cluster 

The ASCI Cluster is dedicated to development of the Virtual Test Facility: small scaling runs, long production runs, and validation of the code's robustness and correctness.

The system has two servers: a login/development server with two 1 GHz Pentium III processors, and a file server with a single 1 GHz Pentium III processor and 500 GB in RAID 5. The compute nodes are 100 machines, each with a 1 GHz Intel Pentium III, 1 GB of RAM, and a 30 GB disk. Nine of these computers also have high-performance graphics adapters for experiments in parallel high-performance graphics.

Both servers have Gigabit Ethernet, and the 100 compute nodes are interconnected with 100BASE-T. The system has a total of 102 GB of RAM and 3.5 TB in storage capacity. This new system will be used to host VTF3D development and production work.
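The stated totals can be cross-checked against the per-node figures given above. The short sketch below does the arithmetic. The servers' RAM is not listed in the text, so the 2 GB attributed to them here is only what the 102 GB total implies, and the variable names are illustrative.

```python
# Cross-check of the ASCI Cluster totals using only figures from the text.
# Decimal units are assumed (1 TB = 1000 GB).

COMPUTE_NODES = 100
RAM_PER_NODE_GB = 1
DISK_PER_NODE_GB = 30
FILE_SERVER_RAID_GB = 500
STATED_TOTAL_RAM_GB = 102

compute_ram_gb = COMPUTE_NODES * RAM_PER_NODE_GB
compute_disk_gb = COMPUTE_NODES * DISK_PER_NODE_GB

# Compute-node disks plus the file server's RAID array.
total_storage_tb = (compute_disk_gb + FILE_SERVER_RAID_GB) / 1000

# Server RAM is NOT given in the text; this is inferred from the stated total.
implied_server_ram_gb = STATED_TOTAL_RAM_GB - compute_ram_gb

print(f"Compute-node RAM:   {compute_ram_gb} GB")
print(f"Implied server RAM: {implied_server_ram_gb} GB")
print(f"Total storage:      {total_storage_tb} TB")
```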

More information on Caltech's ASCI Center can be found on the project website.

The Tier2 Prototype

With Caltech's High Energy Physics department, CACR has constructed the prototype "Tier2 Center" cluster. Tier2 centers are part of the distributed data analysis model adopted by the experiments at CERN's Large Hadron Collider (LHC), which is due to start operations in 2006. The Tier2 prototype cluster will be used to evaluate the utility of such a large cluster in the context of computing for the Compact Muon Solenoid (CMS) experiment.

The configuration for the prototype consists of rack-mounted Linux computational nodes, each with dual Pentium III CPUs and dual internal disks; a powerful data server with dual 1 GHz CPUs and 2 GB of RAM, connected to twin RAID disk arrays of 0.5 TB each; and Gigabit network interfaces to CACR's high-performance network. A second server has dual 1 GHz processors, 1 GB of RAM, and an integrated 1.2 TB of disk in ATA RAID. All servers and the switch are interconnected with Gigabit Ethernet. The server provides high-speed access to large samples of simulated, reconstructed, and analyzed events from the CMS detector.

The VizCluster

The VizCluster is currently being implemented to increase the size of volume that can be rendered, by merging results from individual PCs into a single composite image. Images are transferred across the machines not only over regular Ethernet but also over a dedicated pixel bus.
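Merging per-machine results into one composite image can be illustrated in software. The sketch below assumes each PC renders its part of the volume to a premultiplied RGBA image, that the images can be ordered front to back, and that they are combined with the standard "over" operator. The text does not say which compositing operator the VizCluster hardware uses, so this choice and the function names are assumptions for illustration only.

```python
# Illustrative software compositing of partial images from several nodes.
# ASSUMPTIONS: premultiplied RGBA pixels with components in [0, 1], images
# supplied in front-to-back order, and the "over" operator for blending.


def over(front, back):
    """Composite one premultiplied RGBA pixel over another."""
    fr, fg, fb, fa = front
    br, bg, bb, ba = back
    k = 1.0 - fa  # fraction of the back pixel that remains visible
    return (fr + k * br, fg + k * bg, fb + k * bb, fa + k * ba)


def composite_images(partials):
    """Merge same-sized partial images, ordered front to back.

    Each image is a list of rows; each row is a list of RGBA tuples.
    """
    if not partials:
        raise ValueError("need at least one partial image")
    result = partials[0]
    for image in partials[1:]:
        result = [
            [over(f, b) for f, b in zip(front_row, back_row)]
            for front_row, back_row in zip(result, image)
        ]
    return result


if __name__ == "__main__":
    half_red = [[(0.5, 0.0, 0.0, 0.5)]]     # 50% opaque red, premultiplied
    solid_green = [[(0.0, 1.0, 0.0, 1.0)]]  # fully opaque green
    print(composite_images([half_red, solid_green]))
```

Because "over" is associative, the partial images can be merged pairwise in any grouping as long as their front-to-back order is preserved, which is what makes this kind of merge suitable for distribution across several machines.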

The machine is a cluster of eight graphics workstations (Compaq SP750 running Windows 2000) with Sepia-2 boards and VolumePro 500 ray-casting accelerators. A ninth workstation, with a standard OpenGL accelerator, functions as a display for the eight-node cluster. System components are on loan from Compaq.

For further information on the VizCluster, see http://www.cacr.caltech.edu/projects/ldviz/pvr/