Dieter an Mey
Computing Center
Aachen University of Technology

Stephan Schmidt
Institute for Jet Propulsion and Turbomachinery
Aachen University of Technology

HP's Software Development Environment
- Practice and Experiment -

When preparing an existing Fortran program for parallelization, it is in our experience advantageous to migrate to Fortran90 in order to use its dynamic memory management features. However, transforming a well-vectorized Fortran77 CFD code into a more flexible Fortran90 program that uses a hybrid parallelization model is still an adventure. Here we report on our experiences with PANTA, a 3D Navier-Stokes solver that is used extensively in the modeling of turbomachinery.

During this ongoing process, various software tools are employed:

  • HP: HP Fortran90 autoparallelizing compiler, CXperf Performance Analyzer

  • SGI: MIPSpro Fortran90 autoparallelizing compiler

  • Fujitsu: fjsamp Sampler

  • Simulog: Foresys Fortran Engineering System (http://www.simulog.fr)

  • Etnus: TotalView parallel debugger (on SGI) (http://www.etnus.com)

  • Kuck & Ass. Inc.: Visual KAP for OpenMP, Guide, Assure, GuideView, AssureView, PerView (http://www.kai.com)

  • Pallas: Vampir, VampirTrace (http://www.pallas.com)

  • Awk scripts

Foresys automatically restructures Fortran77/90 code and converts Fortran77 to Fortran90. It also checks the syntactic correctness of the program and generates code that is easily machine-readable, which allows straightforward semi-automatic postprocessing. Awk scripts are used to replace the fixed dimensions of all arrays with allocatable declarations and to generate the corresponding allocate statements.
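The array conversion can be illustrated with a minimal sketch. All names (the module flow_data, the array rho, the sizes ni, nj, nk) are hypothetical and not taken from PANTA, and the commented Fortran77 lines only show an assumed original form. Because allocatable arrays cannot be placed in common blocks, the sketch assumes that the converted array moves into a module.

```fortran
! Hypothetical example; names are illustrative and not taken from PANTA.
!
! Assumed Fortran77 form before conversion (fixed dimensions):
!       PARAMETER (NI=64, NJ=32, NK=16)
!       REAL RHO(NI,NJ,NK)
!       COMMON /FLOW/ RHO
!
! Fortran90 form after conversion (allocatable declaration in a module):
module flow_data
  implicit none
  real, allocatable :: rho(:,:,:)
end module flow_data

program alloc_demo
  use flow_data
  implicit none
  integer :: ni, nj, nk, ierr

  ! Sizes are now runtime values (set here for brevity; they could be read from input).
  ni = 64
  nj = 32
  nk = 16

  ! Allocate statement corresponding to the former fixed declaration.
  allocate(rho(ni, nj, nk), stat=ierr)
  if (ierr /= 0) stop 'allocation of rho failed'

  rho = 1.0
  print *, 'elements allocated:', size(rho)

  deallocate(rho)
end program alloc_demo
```

Checking the stat= result keeps a failed allocation from going unnoticed once the problem size is no longer fixed at compile time.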

Loop-level parallelization uses the automatic parallelization features of HP's f90 compiler as well as manually inserted OpenMP directives, which are translated with the Guide preprocessor. Tools that insert OpenMP directives into the code automatically, such as Visual KAP for OpenMP and the SGI MIPSpro Fortran90 compiler, are also considered. The tuning tools CXperf and GuideView help to improve the code modifications.
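A manually inserted loop-level directive might look like the following sketch. The loop body and the variable names are hypothetical stand-ins, not PANTA code. Without OpenMP support the directives are treated as ordinary comments and the program runs serially.

```fortran
! Hypothetical example of a manually inserted OpenMP directive.
program omp_loop_demo
  implicit none
  integer, parameter :: n = 100000
  real, allocatable :: a(:), b(:)
  real :: total
  integer :: i

  allocate(a(n), b(n))
  b = 1.0
  total = 0.0

  ! The iterations are independent, so they can be shared among threads.
  ! The sum is accumulated through a reduction clause to avoid a data race.
!$omp parallel do private(i) shared(a, b) reduction(+:total)
  do i = 1, n
     a(i) = 2.0 * b(i)
     total = total + a(i)
  end do
!$omp end parallel do

  print *, 'sum =', total
  deallocate(a, b)
end program omp_loop_demo
```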




Foresys inserts comments into the code that can be used to extract a global cross-reference list of common blocks. This list is very helpful for specifying the communication structure of the coarse-grained parallelization. Another approach uses OpenMP directives, which can be analyzed with the OpenMP verification tool Assure. The results are then evaluated for the MPI parallelization.

As current commercial OpenMP compilers do not yet support nested parallelism and MPI is not suited for loop-level parallelization, it is currently necessary to use OpenMP parallelization within MPI tasks. The runtime behavior of the MPI parallelization can be visualized with Vampir after trace files have been written with the VampirTrace library. GuideView is even able to visualize the behavior of the hybrid code.
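The hybrid model, with OpenMP threads working inside each MPI task, can be sketched as follows. This is a minimal illustration under stated assumptions, not the structure of PANTA: the array u and the summation are hypothetical, an MPI library providing mpif.h is assumed, and all MPI calls are kept outside the parallel region.

```fortran
! Hypothetical hybrid sketch: coarse grain via MPI tasks, loop level via OpenMP.
program hybrid_demo
  implicit none
  include 'mpif.h'
  integer, parameter :: n = 1000
  integer :: ierr, rank, nprocs, i
  real :: local_sum, global_sum
  real :: u(n)

  call MPI_Init(ierr)
  call MPI_Comm_rank(MPI_COMM_WORLD, rank, ierr)
  call MPI_Comm_size(MPI_COMM_WORLD, nprocs, ierr)

  ! Each MPI task owns its own block of data.
  u = real(rank + 1)
  local_sum = 0.0

  ! Loop-level parallelism inside the MPI task.
!$omp parallel do private(i) reduction(+:local_sum)
  do i = 1, n
     local_sum = local_sum + u(i)
  end do
!$omp end parallel do

  ! Communication between tasks happens outside the parallel region.
  call MPI_Reduce(local_sum, global_sum, 1, MPI_REAL, MPI_SUM, 0, &
                  MPI_COMM_WORLD, ierr)
  if (rank == 0) print *, 'global sum over', nprocs, 'tasks:', global_sum

  call MPI_Finalize(ierr)
end program hybrid_demo
```

Keeping communication outside the threaded loop means that only one thread per task ever calls the MPI library.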

In our experience, such a combination of various tools for tuning and parallelizing Fortran90 codes frequently exposes problems and deficiencies. For example, the beta version of the widely used parallel debugger TotalView is only now being released for HP-UX, at the end of March. We also observed that privatizing Fortran90 modules with the threadprivate OpenMP directive is not part of the first version of the OpenMP API, although it is already supported by the Guide preprocessor. It also turns out that allocatable arrays lead to drastic performance losses with the HP Fortran90 compiler, which will hopefully be reduced by the new +fastallocatable option (V 2.4).
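Privatization of module data can be sketched as below. The example uses the form that later versions of the OpenMP Fortran API standardized, in which threadprivate may name variables with the save attribute. Whether the Guide preprocessor accepted exactly this syntax is an assumption, and the module work_data and the variable scratch are hypothetical.

```fortran
! Hypothetical example: a module variable privatized per thread.
module work_data
  implicit none
  integer, save :: scratch
!$omp threadprivate(scratch)
end module work_data

program tp_demo
  use work_data
  implicit none
  integer :: i, total

  total = 0
!$omp parallel do private(i) reduction(+:total)
  do i = 1, 8
     scratch = i * i           ! each thread writes to its own copy
     total = total + scratch
  end do
!$omp end parallel do

  print *, 'total =', total
end program tp_demo
```

Without the threadprivate directive, scratch would be shared by all threads, and the concurrent writes inside the loop would race.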

Nevertheless, the first results obtained with the code modifications so far are quite promising. Despite the problems we encountered, converting a static vector code into a dynamic hybrid parallel code seems feasible with reasonable manpower and the help of suitable software tools.