Technical Instructions
Atlasmaker is a grid-based workflow manager for building astronomical
image atlases.
Atlasmaker can run on a Unix workstation, or on a parallel machine.
Parallelism is through MPI (Message Passing Interface), and assumes that
each processor can see the same file space. Atlasmaker can build scripts
suitable for being queued in a PBS batch system. A future release of Atlasmaker
will include a Condor-G job submission to allow dynamic selection of where
the job is run on the Grid.
The workflow environment runs the sequence of modules that build catalogs
of images, project the relevant ones, determine and compute overlap images,
use these to estimate the image backgrounds, then subtract this
background and coadd the images. Atlasmaker also prints timings
for the data fetching, for the (parallelized) projection operations, and
for the serial computing.
Configuring and Building the Code
- Check that your Montage distribution is in place and compiled.
- Unpack the tarball with "tar xvf".
- Go to the util/jpeg directory and run "./configure"
- Go back to the top level and edit the top-level Makefile: (a) set the
location of the Montage distribution, and (b) if you intend to run
only on a serial machine, with no MPI parallelism, comment out
the line "(cd MPI; make)".
- Type "make". It should compile the MPI code (if wanted),
the utilities, and
the JPEG library. The MPI code requires "mpicc";
you do not need it if you are not making mosaics in parallel.
- Edit the config.xml file to specify where to get and put data, and where the
Montage and Atlasmaker distributions are located.
- Now run the maketiles.sh sample script. It should fetch 7 files
from the Sloan Digital Sky Survey image service (at sdss.org), then compute
a 1024x1024 mosaic of the i-band images.
Concept: Charts
Montage uses a short text file called a template file, or Chart, to define the area on the sky
for which a mosaic is requested. The Chart also defines (through the
FITS WCS conventions) the map projection
that relates sky position and position on the pixel plane of the mosaic. The script
makeChart is available to create Charts, either by specifying all
relevant parameters, or by specifying what type of atlas to use for the chart. For
more details of atlases, see the
Atlasmaker project.
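As an illustration of what a Chart contains, the sketch below writes a FITS-style template header for a square mosaic in the tangent-plane (TAN) projection. The function name make_chart_text and its parameters are hypothetical and are not the interface of the makeChart script; the keywords themselves are standard FITS WCS keywords.

```python
def make_chart_text(ra_deg, dec_deg, npix, pixel_deg):
    """Return a FITS-style template header for a square TAN-projected mosaic.

    Hypothetical helper for illustration only; not the makeChart interface.
    """
    center = (npix + 1) / 2.0  # reference pixel at the center of the image
    lines = [
        "SIMPLE  = T",
        "BITPIX  = -64",
        "NAXIS   = 2",
        f"NAXIS1  = {npix}",
        f"NAXIS2  = {npix}",
        "CTYPE1  = 'RA---TAN'",
        "CTYPE2  = 'DEC--TAN'",
        f"CRVAL1  = {ra_deg:.6f}",       # sky position of the reference pixel
        f"CRVAL2  = {dec_deg:.6f}",
        f"CRPIX1  = {center:.1f}",
        f"CRPIX2  = {center:.1f}",
        f"CDELT1  = {-pixel_deg:.9f}",   # negative: RA increases to the left
        f"CDELT2  = {pixel_deg:.9f}",
        "CROTA2  = 0.0",
        "EQUINOX = 2000.0",
        "END",
    ]
    return "\n".join(lines) + "\n"


# Example values only: a 1024x1024 chart centered at RA=180, Dec=0.
print(make_chart_text(180.0, 0.0, 1024, 0.00011), end="")
```

The sky position (CRVAL), the reference pixel (CRPIX), and the pixel scale (CDELT) together fix the mapping between the sky and the pixel plane.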
Getting Data
Atlasmaker can get data from image services that conform to the protocol of the
National Virtual Observatory, specifically the
Simple Image Access Protocol (SIAP).
There are such services for many astronomical sky surveys, and prototype implementations
for the 2MASS and Sloan surveys are listed in the file "siapservers" in the root directory of
the Atlasmaker tarball.
The script
getData.py reads a Chart and determines the right calling sequence for a SIAP server,
then invokes the service to get a list of URLs that point to the necessary images.
The getData script knows how to read data from either http or srb URLs.
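A SIAP query is an HTTP GET request whose parameters give the sky position (POS, as "ra,dec" in decimal degrees), the size of the region (SIZE, in degrees) and, optionally, the image FORMAT. The sketch below only builds such a URL. The function name and the base URL are placeholders, not taken from getData.py or from the siapservers file.

```python
from urllib.parse import urlencode


def siap_query_url(base_url, ra_deg, dec_deg, size_deg, fmt="image/fits"):
    """Build a SIAP query URL (illustrative helper, not part of getData.py)."""
    params = {
        "POS": f"{ra_deg},{dec_deg}",  # center of the region, decimal degrees
        "SIZE": f"{size_deg}",         # width of the region, decimal degrees
        "FORMAT": fmt,                 # ask for FITS images
    }
    if base_url.endswith(("?", "&")):
        sep = ""
    elif "?" in base_url:
        sep = "&"
    else:
        sep = "?"
    # Keep "," and "/" unescaped so the URL stays readable.
    return base_url + sep + urlencode(params, safe=",/")


# Placeholder service address; real addresses are listed in "siapservers".
print(siap_query_url("http://example.org/siap", 180.0, 0.5, 0.2))
```

The service answers with a table of matching images, from which the image URLs are taken.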
Running Atlasmaker
Having obtained data from the getData script or some other way, use the Run
script to manage the modules of the Montage pipeline. Images are projected to the correct
pixel plane, backgrounds are estimated and subtracted, and the projected images are then co-added
to make the final mosaic. When the script is configured for parallelism, the projection
part of the algorithm is spread among the worker processors, leaving background
estimation and subtraction as serial processes.
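The order of the pipeline stages can be sketched as a list of Montage commands. This is not the Run script itself. The module names are standard Montage modules, but the argument lists, file names and directory layout below are assumptions based on the Montage tutorial and should be checked against the installed Montage version. In a parallel run, mProjPara would take the place of the serial projection step (mProjExec here).

```python
def montage_commands(raw_dir, chart, work_dir, mosaic):
    """Return the pipeline stages as argument lists; nothing is executed.

    File and directory names under work_dir are illustrative assumptions.
    """
    w = work_dir
    return [
        # Catalog the raw images.
        ["mImgtbl", raw_dir, f"{w}/images.tbl"],
        # Project every image onto the pixel plane defined by the chart.
        ["mProjExec", "-p", raw_dir, f"{w}/images.tbl", chart,
         f"{w}/proj", f"{w}/stats.tbl"],
        # Catalog the projected images.
        ["mImgtbl", f"{w}/proj", f"{w}/pimages.tbl"],
        # Determine which images overlap, then compute the overlap images.
        ["mOverlaps", f"{w}/pimages.tbl", f"{w}/diffs.tbl"],
        ["mDiffExec", "-p", f"{w}/proj", f"{w}/diffs.tbl", chart, f"{w}/diff"],
        # Fit the overlaps and model the background corrections.
        ["mFitExec", f"{w}/diffs.tbl", f"{w}/fits.tbl", f"{w}/diff"],
        ["mBgModel", f"{w}/pimages.tbl", f"{w}/fits.tbl",
         f"{w}/corrections.tbl"],
        # Subtract the backgrounds, then coadd into the final mosaic.
        ["mBgExec", "-p", f"{w}/proj", f"{w}/pimages.tbl",
         f"{w}/corrections.tbl", f"{w}/corr"],
        ["mAdd", "-p", f"{w}/corr", f"{w}/pimages.tbl", chart, mosaic],
    ]


for cmd in montage_commands("raw", "chart.hdr", "work", "mosaic.fits"):
    print(" ".join(cmd))
```

Each list could be passed to subprocess.run once the arguments have been verified.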
Parallel Computing: Manager + Workers
The Atlasmaker application is written with a (non-parallel) manager to do many of the serial parts of the application, including building catalogs of images, background estimation, image coadding, etc. These serial operations take a few percent of the total compute time, the bulk of it being the (parallel) projection operation.
A popular parallel architecture is the host+nodes model, where users log into the host and MPI jobs run on the nodes. However, it may be antisocial to run the manager operations on the host node, as others may be trying to work there.
Thus, for massively parallel computation, it would be better to run both manager and workers
on the nodes. A script pbs.sh is supplied in the root of the tarball that illustrates
how such a run can be organized with the PBS job scheduler. The submission would look like:
% qsub pbs.sh
MPI Parallelism
The MPI code mProjPara is distributed with Atlasmaker in two flavors:
mProjPara-roundrobin.c and mProjPara-lockfile.c.
In each case, each file from the input (data) directory is projected, using the Montage executable mProject and the chart that has been chosen for the mosaic. These tasks are distributed among the processors of the parallel computer in an attempt to load-balance the execution.
In the "roundrobin" version, processor number p of N takes all tasks whose task number has remainder p when divided by N. Thus for N=4, processor 1 would take task numbers 1, 5, 9, 13, ...
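The round-robin rule can be sketched in a few lines. This is an illustration in Python, not the C code of mProjPara-roundrobin.c; the function name is hypothetical. In an MPI program, p and N would come from MPI_Comm_rank and MPI_Comm_size.

```python
def roundrobin_tasks(p, n_procs, n_tasks):
    """Task numbers (starting at 0) taken by processor p of n_procs."""
    # Processor p takes every task whose number has remainder p modulo n_procs.
    return list(range(p, n_tasks, n_procs))


# Show the assignment of 14 tasks to 4 processors.
for p in range(4):
    print("processor", p, "takes tasks", roundrobin_tasks(p, 4, 14))
```

No communication between processors is needed, but the load is balanced only if the tasks take similar amounts of time.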
In the "lockfile" version, the file system is used to distribute tasks. Each processor iterates through the tasks, checking each for the existence of a lockfile in the shared file system. If there is no lockfile, the task is taken and the lockfile created. As long as this check-and-create operation is atomic, this algorithm shares the work correctly.
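A minimal sketch of the check-and-create step follows, in Python rather than the C of mProjPara-lockfile.c. It combines the check and the creation in a single open call with the O_CREAT and O_EXCL flags, which fails if the file already exists. The function names and the ".lock" suffix are hypothetical. Note that exclusive creation may not be atomic on some network file systems, such as older NFS versions.

```python
import os
import tempfile


def try_claim(lock_dir, task_name):
    """Try to claim a task; return True if this process created its lockfile."""
    lock_path = os.path.join(lock_dir, task_name + ".lock")
    try:
        # O_CREAT | O_EXCL: create the file, or fail if it already exists.
        fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        return False  # another process already took this task
    os.close(fd)
    return True


def claim_tasks(lock_dir, task_names):
    """Iterate through all tasks; return those claimed by this process."""
    return [name for name in task_names if try_claim(lock_dir, name)]


# Two passes over the same tasks: the second pass finds every task taken.
with tempfile.TemporaryDirectory() as lock_dir:
    tasks = ["a.fits", "b.fits", "c.fits"]
    print(claim_tasks(lock_dir, tasks))
    print(claim_tasks(lock_dir, tasks))
```

Unlike the round-robin rule, a fast processor simply claims more tasks, so uneven task times balance out automatically.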
The ColorPicture Application
The Atlasmaker scripts can be used as they are, doing any combination of
makeChart,
getData,
Run, and the
utilities.
There is also a sample script, "ColorPicture.py",
which illustrates how the scripts that make up Atlasmaker can be combined
in complex ways. This script is intended to be modified by a user.
In the ColorPicture.py script,
a Chart is made consistent with a given atlas, then there is
a loop over three bandpass filters. For each filter, data are fetched from a suitable
NVO server. Then, for each filter, Montage is run and a mosaic file created.
These three files are reduced in dynamic range from 8 bytes per pixel to one,
and a JPEG image made from them.
The JPEG shows that the application ran correctly, but it is the
FITS files that are the actual result.
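The reduction in dynamic range can be illustrated as follows. The sketch maps floating-point pixel values onto one byte each (0 to 255) by a linear stretch between two cut levels. The linear stretch, the function name and the cut levels are assumptions for illustration; the utilities shipped with Atlasmaker may use a different scaling.

```python
def to_bytes(pixels, lo, hi):
    """Map pixel values linearly onto 0..255, clipping values outside [lo, hi]."""
    if hi <= lo:
        raise ValueError("hi must be greater than lo")
    scale = 255.0 / (hi - lo)
    out = bytearray()
    for value in pixels:
        clipped = min(max(value, lo), hi)  # clip to the chosen cut levels
        out.append(int(round((clipped - lo) * scale)))
    return bytes(out)


# One band of a tiny example image, with cut levels 0 and 100.
band = [-5.0, 0.0, 20.0, 100.0, 150.0]
print(list(to_bytes(band, 0.0, 100.0)))
```

Applying this to each of the three mosaics gives the red, green and blue channels of the color picture.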