Beowulf - A Challenge for Classical High-End Systems

Dr. Thomas Sterling

Center for Advanced Computing Research
California Institute of Technology
and
High Performance Computing Group
NASA Jet Propulsion Laboratory


Abstract

During the last decade of the last millennium, the "killer micro" was heralded as the harbinger of the demise of big iron, as represented by the sequence of Cray Research Inc. supercomputers culminating in the T90. A new generation of distributed shared memory MPPs, such as the HP Exemplar family, exploited the industry investment in COTS microprocessors and memories and largely replaced the custom architectures, until the TOP500 list was dominated by them. But the price-performance advantage was hard won, and the opportunities presented by the microprocessor were, at least in part, offset by custom-designed internal communication and memory systems. Scalability and new performance thresholds were achieved at a dramatic pace, but the cost per unit of performance remained high, partly because of the modest size of the high-end market and the challenge of complex system software. These systems emphasized low latency, high bandwidth, and, in some cases, global address spaces (with or without cache coherency). They did not emphasize low cost. Yet the same killer micros that enabled scalable multiprocessors to reach Teraflops performance also, with the help of an enormous mass market, put PCs on people's desks with the performance of previous-generation supercomputers but at the cost of previous-generation home entertainment equipment. And in the office, laboratory, and industrial environment, these PCs required networked interconnectivity. By the second half of the last decade, a new force in high-end computing had emerged: the Beowulf-class system, exploiting the low cost not only of the parts but of the subsystems they comprised for mass-market computing.
Initial work with clusters of workstations, such as the COW and NOW projects, along with cycle harvesting of local area network resources, suggested that even loosely coupled clusters could perform certain distributed computing tasks well. But it was the emergence of Beowulfs, using very inexpensive PCs and LAN technology combined with the free, Unix-like Linux operating system and community-standard message passing libraries, that broke the price-performance barrier and generated a new family of parallel high-end computing systems. At first, these were "do it yourself supercomputers," but industry, both new and established, has taken on the challenge of building a business model around this mix of resources. Although industry originally resisted the early trends and dismissed the users as hobbyists, both the user and supplier communities have begun to find a way to satisfy both vendor and customer requirements through Beowulf-class systems and other PC cluster families. This presentation will address the opportunities afforded by Beowulfs that are motivating their rapid acceptance, and the challenges industry faces in developing a commercial Beowulf product line. Creative marketing and management are finding that embracing this strategy need not be a threat to current commercial offerings but can instead stimulate rapid growth in market acceptance of parallel computing while opening up an entirely new level of product placement. Equally important, industry participation as the premier supplier of Beowulfs is providing potential customers with an easier, high-confidence path to low-cost distributed computing. A new synergy of technology and manufacturing is resulting in an explosion of user engagement in what had been a highly restricted domain only a few years ago, with a range of new business opportunities resulting as well.