Caltech Center for Advanced Computing Research » Posts for tag 'high-performance computing'

CACR at SC09 in Portland


Visit us at Booth 2135!

At the 2009 Supercomputing (SC) Conference, held in Portland, Oregon, November 14-20, CACR will be highlighting our research in computational biology, computing and networking for high-energy physics, data analysis for neutron scattering experiments, hypervelocity impact simulations, and time-domain astronomy. The SC Conference is the premier international conference for high performance computing (HPC), networking, storage, and analysis.

Among the demonstrations at the CACR exhibit will be the Caltech entry in SC’s Bandwidth Challenge. The Bandwidth Challenge is an annual competition for leading-edge network applications developed by teams of researchers from around the globe. The Caltech entry for this year’s challenge is entitled Moving towards Terabit/sec Scientific Dataset Transfers: the LHC Challenge. This entry will demonstrate storage-to-storage physics dataset transfers of up to 100 Gbps sustained in one direction, and well above 100 Gbps in total bidirectionally, using a total of fifteen 10 Gbps drops at the Caltech booth.
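As a back-of-the-envelope illustration (the figures below are simple arithmetic, not measured results, and the 100-terabyte dataset size is a hypothetical example), fifteen 10 Gbps drops give a theoretical aggregate of 150 Gbps, and a dataset moved at a sustained 100 Gbps drains at 12.5 gigabytes per second:

```python
# Illustrative arithmetic only; actual sustained rates depend on storage,
# protocol, and network conditions.

DROPS = 15           # 10 Gbps network drops at the Caltech booth
GBPS_PER_DROP = 10   # nominal line rate per drop

aggregate_gbps = DROPS * GBPS_PER_DROP   # theoretical ceiling across all drops

# Time to move a hypothetical 100-terabyte dataset at a sustained 100 Gbps:
dataset_bits = 100e12 * 8                # 100 TB expressed in bits
transfer_seconds = dataset_bits / 100e9  # divide by 100 Gbps

print(aggregate_gbps)            # 150 Gbps ceiling
print(transfer_seconds / 3600)   # about 2.2 hours
```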

Caltech’s PSAAP center will be represented in the NNSA exhibit as one of five centers of excellence focusing on predictive science. A talk entitled “UQ Pipeline Ballistic Impact Simulations – Methods and Experiences” will be given by Sharon Brunett in the NNSA exhibit (Booth 735) on Tuesday, November 17 at 5:15PM.

CACR Participation in Energy-Efficient HPC Working Group

Chip Chapman, CACR Facilities Manager, has joined the Energy Efficient High Performance Computing Working Group. This group, founded at SC08 in Austin, TX, also includes participants from several national labs as well as major HPC manufacturers and users.

The activities of this committee include:

  • Market pull strategies (collectively influencing vendors)
  • HPC/SC energy performance metrics and benchmarking
  • Computer center (infrastructure) energy performance metrics and benchmarking
  • Best practices, case studies, and lessons learned in design and operation of supercomputer centers
  • Energy efficient design guidelines and specifications for supercomputer centers
  • Improving software for energy efficiency
  • Integrating energy efficiency into SC09’s technical program and HPCC (energy challenge) – subject to organizer’s approval

CACR’s goals in participating in this initiative are to keep our infrastructure as efficient as possible and to help Caltech’s Facility department make informed choices when preparing upgrades or modifications to the existing and future HPC computer rooms on campus.

NSF Award: Development of a Research Infrastructure for the Multithreaded Computing Community Using the Cray XMT Platform

An award of $994,408 from the National Science Foundation was made to the project entitled “Development of a Research Infrastructure for the Multithreaded Computing Community Using the Cray XMT Platform.” The subcontract for Caltech/CACR (PI Ed Upchurch) will fund the porting of significant science applications to an XMT system. CACR will assess the XMT’s performance and compare it with the performance on other parallel architectures at CACR.

With the advent of MPI and Linux clusters, message-passing architectures are today the dominant approach for parallel computing systems, and the high-end computing community has developed a strong infrastructure to support this. With the trend towards multicore processors, however, the situation is changing. The major processor developers all envision placing tens to hundreds of cores on a single die, each running multiple threads. To take advantage of this, the CS community will need to focus on how to develop efficient multithreaded programs in shared memory. The goal of the project is to bring together, as a community, a diverse group of researchers with extensive experience in shared-memory multithreading, and to jointly develop the shared infrastructure needed to broaden its impact on developing software for the next generation of computer hardware.
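As a rough sketch of the shared-memory multithreaded model the project targets (this is not XMT code; Python's interpreter lock prevents true parallel execution, and the XMT runs thousands of hardware threads, so this only illustrates the programming pattern of many threads coordinating on shared state, not the performance):

```python
import threading

# Shared state visible to every thread in one address space -- the defining
# feature of the shared-memory model, in contrast to message passing.
counter = 0
lock = threading.Lock()

def worker(n):
    """Each thread performs n updates to the shared counter."""
    global counter
    for _ in range(n):
        with lock:       # fine-grained synchronization on shared data
            counter += 1

threads = [threading.Thread(target=worker, args=(1000,)) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter)  # 8000: every update is visible in the shared address space
```

On the XMT, the analogous synchronization is done with lightweight full/empty bits on memory words rather than a software lock, which is what makes thousands of threads practical.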

The first objective of the program is to acquire computer hardware as a shared community resource capable of efficiently running, in experimental and production modes, complex programs with thousands of threads in shared memory. The Cray XMT system, scheduled for delivery in the first half of 2008, is an ideal platform for this.

The second objective of the program is assembling a software infrastructure for developing and measuring the performance of programs running on this hardware. This will include algorithms, data sets, libraries, languages, tools, and simulators to evaluate architectural enhancements for future hardware.

The third objective of the project is building stronger ties between the people themselves, creating ways for researchers at the partner institutions to collaborate and communicate their findings to the broader community.

The academic partners on the team are the University of Notre Dame, Georgia Institute of Technology, University of California, Berkeley, University of California, Santa Barbara, University of Delaware, and the California Institute of Technology. The team will also collaborate with Sandia National Laboratories, which has agreed to host the Cray XMT system and will provide supplementary funding.

For further information on the Caltech subcontract, contact Ed Upchurch.

First Phase of TeraGrid Goes into Production

The first computing systems of the National Science Foundation’s TeraGrid project are in production mode, making 4.5 teraflops of distributed computing power available to scientists across the country who are conducting research in a wide range of disciplines, from astrophysics to environmental science.

The TeraGrid is a multi-year effort to build and deploy the world’s largest, most comprehensive distributed infrastructure for open scientific research. The TeraGrid also offers storage, visualization, database, and data collection capabilities. Hardware at multiple sites across the country is networked through a 40-gigabit per second backplane — the fastest research network on the planet.

The systems currently in production represent the first of two deployments, with the completed TeraGrid scheduled to provide over 20 teraflops of capability. The phase two hardware, which will add more than 11 teraflops of capacity, was installed in December 2003 and is scheduled to be available to the research community this spring.

“We are pleased to see scientific research being conducted on the initial production TeraGrid system,” said Peter Freeman, head of NSF’s Computer and Information Sciences and Engineering directorate. “Leading-edge supercomputing capabilities are essential to the emerging cyberinfrastructure, and the TeraGrid represents NSF’s commitment to providing high-end, innovative resources.”

The TeraGrid sites are: Argonne National Laboratory; the Center for Advanced Computing Research (CACR) at the California Institute of Technology; Indiana University; the National Center for Supercomputing Applications (NCSA) at the University of Illinois, Urbana-Champaign; Oak Ridge National Laboratory; the Pittsburgh Supercomputing Center (PSC); Purdue University; the San Diego Supercomputer Center (SDSC) at the University of California, San Diego; and the Texas Advanced Computing Center at The University of Texas at Austin.

“This is an exciting milestone for scientific computing — the TeraGrid is a new concept and there has never been a distributed computing system of its size and scope,” said NCSA interim director Rob Pennington, the TeraGrid site lead for NCSA. “In addition to its immediate value in enabling new science, the TeraGrid project is a tool for the development of a national cyberinfrastructure, and the cooperative relationships forged through this effort provide a framework for future innovation and collaboration.”

“The TeraGrid partners have worked extremely hard during the two-year construction phase of this project and are delighted that this initial phase of what will be an unprecedented level of computing and data resources is now online for the nation’s researchers to use,” said Fran Berman, SDSC director and co-principal investigator of the TeraGrid project. “The TeraGrid is one of the foundations of cyberinfrastructure that will provide even more computing resources later this year.”

The computing systems that entered production this month consist of more than 800 Itanium-family IBM processors running Linux. NCSA maintains a 2.7-teraflop cluster, which was installed in spring 2003, and SDSC has a 1.3-teraflop cluster. The 6-teraflop, 3,000-processor HP AlphaServer SC Terascale Computing System (TCS) at PSC is also a component of the TeraGrid infrastructure.

“The launch of the National Science Foundation’s TeraGrid project provides scientists and researchers across the nation with access to unprecedented computational power,” said David Turek, vice president of Deep Computing with IBM. “Working with the NSF, IBM is committed to the continued development of breakthrough Grid technologies that benefit our scientific/technical and commercial customers.”

Allocations for use of the TeraGrid were awarded by the NSF’s Partnerships for Advanced Computational Infrastructure (PACI) last October. Among the first wave of researchers to use the TeraGrid are scientists studying the evolution of the universe and the cleanup of contaminated groundwater, simulating seismic events, and analyzing biomolecular dynamics.

Among the allocations awarded was one for Caltech physicist Harvey Newman. Newman leads a team of investigators who are developing codes for CERN’s Compact Muon Solenoid (CMS) collaboration. The CMS experiment will begin operation at the Large Hadron Collider (LHC) in 2007. The Caltech team’s planned use of the TeraGrid will be a valuable and possibly critical factor in the success of several planned “Data Challenges” for CMS. These Challenges are designed to test the readiness of the global Grid-enabled computing system being developed for the experiment, in collaboration with partner projects such as PPDG, GriPhyN, iVDGL, DataTAG, LCG, and others. The TeraGrid will further a program of developing optimized search strategies for the Higgs particles, thought to be responsible for mass in the universe; for supersymmetry; and for new physics processes beyond the Standard Model of particle physics.

To learn more about the TeraGrid, go to www.teragrid.org.

CACR & JPL team members of CASCADE project

Cray Inc. Signs $49.9 Million Agreement for Second Phase of DARPA Petaflops Computing Systems Program

Cray Inc. today announced that it and New Technology Endeavors, Inc., have signed an agreement with the Defense Advanced Research Projects Agency (DARPA) to participate in the second phase of DARPA’s High Productivity Computing Systems Program. The program will provide Cray and its university research partners with $49.9 million in additional funding over the next three years to support an advanced research program aimed at developing a commercially available system capable of sustained performance in excess of one petaflops (a million billion calculations per second).
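The unit arithmetic behind “a million billion calculations per second” can be checked directly (the workload size below is a hypothetical example for illustration):

```python
# One petaflops is 10**15 floating-point operations per second,
# i.e. a million (10**6) times a billion (10**9).
PETAFLOPS = 10**15

million = 10**6
billion = 10**9
assert million * billion == PETAFLOPS

# At a sustained one petaflops, a hypothetical workload of 3.6e18
# operations would complete in about an hour:
seconds = 3.6e18 / PETAFLOPS
print(seconds)  # roughly 3600 seconds, i.e. one hour
```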

Ed Upchurch, Caltech/JPL’s Principal Investigator, leads the team, with Thomas Sterling as Chief Technologist. Other research partners include Stanford University and the University of Notre Dame, and the Principal Investigator of the overall project is Cray’s chief scientist, Burton Smith. “The DARPA HPCS program in general, and the Cray Cascade petaflops computer project in particular, is an important opportunity for Caltech and JPL to directly, significantly, and substantively influence the direction of future supercomputer systems architecture and software,” Sterling says. “Through this program, innovative concepts in advanced PIM architecture developed by scientists at Caltech and JPL, in collaboration with colleagues at the University of Notre Dame, will be transferred from academic research to real-world end users through this industrial partnership. The result may be little less than the next revolution in supercomputing.”

DARPA formed the High Productivity Computing Systems Program to foster development of the next generation of high productivity computing systems for both the national security and industrial user communities.  Program goals are for these systems to be more broadly applicable, much easier to program and more resistant to failure than currently available high performance computing systems.

Five computer-makers, including Cray, were selected for the first phase concept study that was initiated in mid-2002, and all five firms submitted proposals for the second phase. Cray, along with IBM and Sun Microsystems, were selected to continue to the second phase, where further definition and validation of the proposed systems will occur. In mid-2006, DARPA plans to select up to two vendors for the final phase, a full-scale development phase with initial prototype deliveries scheduled for 2010.

More information about the Cascade project can be found at http://www.cray.com/cascade/.