Welcome to

Atlasmaker

Technical Instructions

Atlasmaker is a grid-based workflow manager for building astronomical image atlases.

Atlasmaker can run on a Unix workstation, or on a parallel machine. Parallelism is through MPI (Message Passing Interface), and assumes that each processor can see the same file space. Atlasmaker can build scripts suitable for being queued in a PBS batch system. A future release of Atlasmaker will include a Condor-G job submission to allow dynamic selection of where the job is run on the Grid.

The workflow environment runs the sequence of modules that build catalogs of images, project the relevant ones, determine and compute overlap images, use these to estimate the image backgrounds, then subtract the backgrounds and coadd the images. Atlasmaker also prints timings for the data fetching, for the (parallelized) projection operations, and for the serial computing.

Configuring and Building the Code


Concept: Charts

Montage uses a short text file called a template file, or Chart, to define the area on the sky for which a mosaic is requested. The Chart also defines (through the FITS WCS conventions) the map projection that relates sky position and position on the pixel plane of the mosaic. The script makeChart is available to create Charts, either by specifying all relevant parameters, or by specifying what type of atlas to use for the chart. For more details of atlases, see the Atlasmaker project.
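As an illustration of what a Chart contains, the following Python sketch writes a template in the keyword = value style of a FITS header. It is not the makeChart script: the function name and its arguments are hypothetical, and the choice of a tangent-plane (TAN) projection centered on the requested position is an assumption made for the example.

```python
def make_chart_text(ra_deg, dec_deg, width_pix, height_pix, pixel_deg):
    """Return a FITS-style template for a TAN-projected mosaic.

    Hypothetical helper for illustration; the real makeChart script
    may accept different parameters and write different keywords.
    """
    lines = [
        "SIMPLE  = T",
        "BITPIX  = -64",                      # 8-byte floating-point pixels
        "NAXIS   = 2",
        f"NAXIS1  = {width_pix}",
        f"NAXIS2  = {height_pix}",
        "CTYPE1  = 'RA---TAN'",               # assumed projection
        "CTYPE2  = 'DEC--TAN'",
        f"CRVAL1  = {ra_deg:.6f}",            # sky position of the reference pixel
        f"CRVAL2  = {dec_deg:.6f}",
        f"CRPIX1  = {(width_pix + 1) / 2:.1f}",   # reference pixel at the mosaic center
        f"CRPIX2  = {(height_pix + 1) / 2:.1f}",
        f"CDELT1  = {-pixel_deg:.9f}",        # negative: RA increases to the left
        f"CDELT2  = {pixel_deg:.9f}",
        "CROTA2  = 0.0",
        "EQUINOX = 2000.0",
        "END",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(make_chart_text(210.8, 54.3, 100, 50, 0.001), end="")
```

The sky position and the pixel scale together fix the area covered, while the CTYPE keywords select the map projection.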

Getting Data

Atlasmaker can get data from image services that conform to the protocol of the National Virtual Observatory, specifically the Simple Image Access Protocol (SIAP). There are such services for many astronomical sky surveys, and prototype implementations for the 2MASS and Sloan surveys are listed in the file "siapservers" in the root directory of the Atlasmaker tarball.

The script getData.py reads a Chart and determines the right calling sequence for a SIAP server, then invokes the service to get a list of URLs that point to the necessary images. The getData script knows how to read data from either http or srb URLs.
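The kind of request involved can be sketched as follows. A SIAP query carries the sky position (POS, as right ascension and declination in decimal degrees), the angular size of the region (SIZE), and the wanted image format (FORMAT) as URL parameters. The function name and the server address below are placeholders, and the way getData.py actually derives these values from a Chart is not shown here.

```python
from urllib.parse import urlencode


def siap_query_url(base_url, ra_deg, dec_deg, size_deg, image_format="image/fits"):
    """Build a SIAP query URL (hypothetical helper, not part of getData.py)."""
    params = {
        "POS": f"{ra_deg},{dec_deg}",   # center of the region, decimal degrees
        "SIZE": str(size_deg),          # angular size of the region, degrees
        "FORMAT": image_format,         # ask only for FITS images by default
    }
    # Join to the base URL whether or not it already carries a query string.
    if base_url.endswith(("?", "&")):
        sep = ""
    elif "?" in base_url:
        sep = "&"
    else:
        sep = "?"
    return base_url + sep + urlencode(params, safe=",/")


if __name__ == "__main__":
    # example.org is a placeholder; real services are listed in "siapservers".
    print(siap_query_url("http://example.org/siap", 180.0, 30.5, 0.2))
```

The service answers with a table whose rows describe the matching images, including the URL from which each one can be fetched.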

Running Atlasmaker

Having obtained data from the getData script or in some other way, use the Run script to manage the modules of the Montage pipeline. Images are projected to the correct pixel plane, backgrounds are estimated and subtracted, and the projected images are then co-added to make the final mosaic. When the script is configured for parallelism, the projection part of the algorithm is spread among the worker processors, leaving background estimation and subtraction as serial processes.

Parallel Computing: Manager + Workers

The Atlasmaker application is written with a (non-parallel) manager to do many of the serial parts of the application, including building catalogs of images, background estimation, image coadding, etc. These serial operations take a few percent of the total compute time, the bulk of it being the (parallel) projection operation.
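The effect of the serial part on scaling can be estimated with Amdahl's law. The sketch below assumes that the projection work divides perfectly among the workers and ignores communication and file-system costs; the 3 percent serial fraction is only an illustrative stand-in for "a few percent", not a measured Atlasmaker figure.

```python
def amdahl_speedup(serial_fraction, nprocs):
    """Ideal speedup on nprocs processors when serial_fraction of the run is serial."""
    if not 0.0 <= serial_fraction <= 1.0:
        raise ValueError("serial_fraction must lie between 0 and 1")
    if nprocs < 1:
        raise ValueError("nprocs must be at least 1")
    parallel_fraction = 1.0 - serial_fraction
    return 1.0 / (serial_fraction + parallel_fraction / nprocs)


if __name__ == "__main__":
    # Illustrative serial fraction only; measure a real run to get the true value.
    for n in (1, 4, 16, 64, 256):
        print(n, round(amdahl_speedup(0.03, n), 1))
```

However many workers are added, the speedup under these assumptions cannot exceed 1 divided by the serial fraction.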

A popular parallel architecture is the host+nodes model, where users log into the host and MPI jobs run on the nodes. However, it may be anti-social to run the manager operations on the host node, as others may be trying to work there. Thus, for massively parallel computation, it would be better to run both manager and workers on the nodes. A script pbs.sh is supplied in the root of the tarball that illustrates how such a run can be organized with the PBS job scheduler. The submission would look like:

	% qsub pbs.sh

MPI Parallelism

The MPI code mProjPara is distributed with Atlasmaker in two flavors: mProjPara-roundrobin.c and mProjPara-lockfile.c.

In each case, each file from the input (data) directory is projected using the Montage executable mProject and the Chart that has been chosen for the mosaic. These tasks are distributed among the processors of the parallel computer in an attempt to load-balance the execution.

In the "roundrobin" version, processor number p of N takes all tasks whose task number has remainder p when divided by N. Thus for N=4, processor 1 would take task numbers 1, 5, 9, 13, ...
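The round-robin rule can be written in a few lines of Python. This is a sketch of the assignment rule only, not a transcription of mProjPara-roundrobin.c; the function name is made up for the example, and tasks are assumed to be numbered from 0.

```python
def round_robin_tasks(rank, nprocs, ntasks):
    """Return the task numbers that processor `rank` of `nprocs` would take."""
    return [task for task in range(ntasks) if task % nprocs == rank]


if __name__ == "__main__":
    # With 4 processors and 16 tasks, each processor gets every fourth task.
    for rank in range(4):
        print(rank, round_robin_tasks(rank, 4, 16))
```

Each processor can compute its own list without talking to the others, but the balance is only as good as the assumption that all tasks take similar time.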

In the "lockfile" version, the file system is used to distribute tasks. Each processor iterates through the tasks, checking each for the existence of a lockfile in the shared file system. If there is no lockfile, the task is taken and the lockfile created. As long as this check-and-create operation is atomic, this algorithm shares the work correctly.
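One way to get a combined check-and-create step is to ask the operating system to create the lockfile exclusively, so that creation fails if the file already exists. The Python sketch below shows that idea; it is not a transcription of mProjPara-lockfile.c, the function names and the ".lock" suffix are invented for the example, and whether exclusive creation is truly atomic on a particular shared file system should be checked for that system.

```python
import os
import tempfile


def try_claim(task_name, lock_dir):
    """Return True if this caller created the lockfile for task_name."""
    lock_path = os.path.join(lock_dir, task_name + ".lock")
    try:
        # O_CREAT | O_EXCL makes the open fail if the file already exists,
        # so the existence check and the creation happen in one call.
        fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        return False
    os.close(fd)
    return True


def claim_tasks(task_names, lock_dir):
    """Walk the task list and return the tasks this caller managed to claim."""
    return [name for name in task_names if try_claim(name, lock_dir)]


if __name__ == "__main__":
    lock_dir = tempfile.mkdtemp()
    tasks = ["a.fits", "b.fits", "c.fits"]
    print(claim_tasks(tasks, lock_dir))  # first caller claims every task
    print(claim_tasks(tasks, lock_dir))  # a later caller finds them all taken
```

In a real run each worker would project a task immediately after claiming it, so faster workers naturally pick up more tasks than slower ones.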

The ColorPicture Application

The Atlasmaker scripts can be used as they are, doing any combination of makeChart, getData, Run, and the utilities. There is also an illustrative script, "ColorPicture.py", which shows how the scripts that make up Atlasmaker can be combined in complex ways. This script is intended to be modified by a user.

In the ColorPicture.py script, a Chart is made consistent with a given atlas, then there is a loop over three bandpass filters. For each filter, data are fetched from a suitable NVO server. Then, for each filter, Montage is run and a mosaic file created. These three files are reduced in dynamic range from 8 bytes per pixel to 1 byte per pixel, and a JPEG image is made from them. The JPEG is what shows us that the application ran correctly; but it is the FITS files that are the result.
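The dynamic-range reduction can be pictured as mapping each floating-point pixel onto an integer from 0 to 255 and then interleaving the three bands as red, green and blue. The sketch below uses a simple linear stretch between two chosen limits, which is an assumption; the stretch that ColorPicture.py really applies is not described here, and the function names are invented for the example. Writing the JPEG itself needs an imaging library and is left out.

```python
def stretch_to_bytes(pixels, lo, hi):
    """Linearly map pixel values in [lo, hi] to bytes 0..255, clipping outside."""
    span = hi - lo
    if span <= 0:
        raise ValueError("hi must be greater than lo")
    out = bytearray()
    for value in pixels:
        clipped = min(max(value, lo), hi)
        # Adding 0.5 before truncating rounds to the nearest integer.
        out.append(int((clipped - lo) / span * 255 + 0.5))
    return bytes(out)


def interleave_rgb(red, green, blue):
    """Combine three single-band byte strings into R,G,B,R,G,B,... order."""
    rgb = bytearray()
    for r, g, b in zip(red, green, blue):
        rgb += bytes((r, g, b))
    return bytes(rgb)


if __name__ == "__main__":
    # Made-up pixel values standing in for one row of each mosaic.
    band = stretch_to_bytes([0.0, 1.0, 2.0, 4.0], lo=0.0, hi=4.0)
    print(list(band))
    print(list(interleave_rgb(band, band, band)))
```

Because the stretch discards most of the original precision, the byte images are suitable for viewing only, which is why the FITS mosaics remain the scientific product.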