Authentication and Authorization Policy
All clusters and servers at the Caltech Tier2 Center use the site-wide Grid User Management System (GUMS) to authenticate grid users and grant access to the intended resources based on the user's VO membership. A user with CMS VO membership is automatically mapped to an available individually allocated userid. Users from non-CMS VOs are mapped to group accounts with unique userids. This policy may change in the future as the number of users in other VOs increases.
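As an illustration, the sketch below expresses this mapping policy as plain Python decision logic. The account names, VO labels, pool size, and the map_user helper are all hypothetical; the real mapping is performed by GUMS, not by code like this.

    # Illustrative sketch of the mapping policy above; account names and VO
    # labels are hypothetical, not the real site configuration.
    CMS_POOL = ["uscms001", "uscms002", "uscms003"]            # individually allocated accounts
    GROUP_ACCOUNTS = {"ligo": "grp_ligo", "sdss": "grp_sdss"}  # one shared account per non-CMS VO
    assigned = {}  # DN -> account; the real service persists this mapping

    def map_user(dn, vo):
        """Return the local account a grid identity is mapped to."""
        if dn in assigned:
            return assigned[dn]
        if vo == "cms":
            # CMS members get the next free individually allocated account.
            free = [a for a in CMS_POOL if a not in assigned.values()]
            if not free:
                raise RuntimeError("no free CMS pool accounts")
            assigned[dn] = free[0]
        else:
            # Members of other VOs share a single group account for their VO.
            assigned[dn] = GROUP_ACCOUNTS[vo]
        return assigned[dn]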
User Account Policy
- Interactive logins to Caltech Tier2 resources must comply with the Caltech/CACR computing policies.
- If you wish to get an account at CACR, please fill out this form. Mention that you are affiliated with Caltech Tier2. Either fax the form to (626) 628-3994 or mail it to "Caltech CACR Attn: New Accounts, Mail Stop 158-79, Pasadena, CA 91125".
- Please send email to
to notify the support team about the submitted application.
- After approval by the responsible account manager of the related project, the user will be notified by email and asked to provide an ssh public key. Interactive login access to CACR resources is supported only through ssh public keys.
- See also Accessing Caltech HEP Computing Resources
Processing Policy
The wall-clock limit for running jobs is 24 hours. For the moment, this limit is not actively enforced. Upon discovering a job that has been running for more than 24 hours, a site administrator may contact the individual who submitted this job or simply kill it.
The only batch scheduler supported at the Caltech Tier2 site is Condor.
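As a rough illustration of how an administrator might spot jobs over the wall-clock limit, the sketch below queries the local Condor schedd. It assumes the HTCondor Python bindings are installed on the head node and uses standard job ClassAd attributes; it is not part of the site's actual tooling.

    # Sketch: list running Condor jobs that have exceeded the 24-hour limit.
    import time
    import htcondor  # HTCondor Python bindings (assumed available on the head node)

    LIMIT = 24 * 3600  # wall-clock limit in seconds
    schedd = htcondor.Schedd()

    # JobStatus == 2 means "running"; JobCurrentStartDate is when the current run began.
    for ad in schedd.query(constraint="JobStatus == 2",
                           projection=["ClusterId", "ProcId", "Owner", "JobCurrentStartDate"]):
        runtime = time.time() - ad.get("JobCurrentStartDate", time.time())
        if runtime > LIMIT:
            print("%d.%d owned by %s has been running %.1f hours"
                  % (ad["ClusterId"], ad["ProcId"], ad["Owner"], runtime / 3600.0))

Whether such a job is killed or its owner is contacted first remains a manual decision, as stated above.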
The following is a tentative layout of our priority and preemption policies; they will be implemented in the near future. A sketch of the intended decision logic follows the list.
- CMS Software Installation: These jobs will run on a dedicated machine.
- CMS Analysis: Will preempt CMS production and non-CMS jobs in order to run, but will not preempt CMS production if production is using less than 50% of the cluster resources. The 50% figure is reviewed and adjusted on a regular basis, based on recommendations from USCMS management and on local computing constraints.
- CMS Production: Will preempt CMS analysis only as needed for production to gain control of 50% of the cluster resources. Non-CMS jobs will be preempted if there are no free batch slots available.
- Non-CMS jobs: Will only run on free batch slots, and will be preempted by any CMS job as needed.
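The sketch below restates these tentative rules as plain Python decision logic. The function, job-class labels, and parameters are hypothetical; this is not the Condor configuration that will eventually enforce the policy.

    # Sketch of the tentative preemption rules above (illustrative only).
    PRODUCTION_SHARE_CAP = 0.5  # fraction of the cluster reserved for CMS production

    def may_preempt(new_job, running_job, production_fraction, free_slots):
        """Decide whether new_job may preempt running_job.

        production_fraction: fraction of cluster slots currently held by CMS production
        free_slots: number of idle batch slots
        """
        if new_job == "cms_analysis":
            # Analysis may displace non-CMS work, and production only above the cap.
            if running_job == "non_cms":
                return True
            if running_job == "cms_production":
                return production_fraction > PRODUCTION_SHARE_CAP
            return False
        if new_job == "cms_production":
            # Production may push analysis aside until it holds the cap,
            # and non-CMS work only when no free slots remain.
            if running_job == "cms_analysis":
                return production_fraction < PRODUCTION_SHARE_CAP
            if running_job == "non_cms":
                return free_slots == 0
            return False
        # Non-CMS jobs never preempt anything; they run only in free slots.
        return False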
Storage Policy
Caltech clusters (CIT_CMS_PG and CIT_CMS_T2) provide the following types of storage area.
- $OSG_APP is used for software installation by the various VOs. It is NFS-mounted across all nodes in the cluster. Users can install and modify only their own applications, not those of other users (permission 1777). Applications are installed in this area by the fork jobmanager on an OSG gateway. Compute nodes make local copies of these applications for execution on the available CPUs. Since these applications reside on a shared filesystem with limited capacity, users are expected to be considerate about filling this space.
- $OSG_DATA is used for input and output data. It is NFS-mounted across all cluster nodes and is readable and writeable by all compute and head nodes. Permission is set to 1777.
- $OSG_WN_TMP is used for faster reading and writing of temporary files. This space is local to each compute node and is used exclusively by the local CPUs. There is no user space quota. Files older than 3 days are automatically cleaned up.
- $HOME is used to keep globus-gram logs of users' grid activities. Users are discouraged from storing any files in this area, including executables. This space is NFS-mounted across all nodes. There is no space quota, but users are required to clean up on a regular basis; files older than 10 days are automatically deleted (a sketch of such an age-based cleanup follows this list).
- dCache/SRM is used exclusively by the CMS VO for storing persistent datasets and pileup samples for CMS analysis. Any CMS user must make prior arrangements with the site administrators for access to this space. In the future, this facility will be made available to members of other VOs. There is no user quota.
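The age-based cleanups mentioned above (3 days in $OSG_WN_TMP, 10 days in $HOME) amount to a periodic sweep of the kind sketched below. The purge_older_than helper is hypothetical; the actual cleanup job running on the nodes may work differently.

    # Simplified sketch of an age-based cleanup pass, as applied to $OSG_WN_TMP
    # (3 days) and $HOME (10 days); the real site cleanup job may differ.
    import os
    import time

    def purge_older_than(top, max_age_days):
        """Delete regular files under 'top' not modified within max_age_days."""
        cutoff = time.time() - max_age_days * 86400
        for dirpath, dirnames, filenames in os.walk(top):
            for name in filenames:
                path = os.path.join(dirpath, name)
                try:
                    if os.path.getmtime(path) < cutoff:
                        os.remove(path)
                except OSError:
                    pass  # file vanished or is not removable; skip it

    # Example invocations matching the policy above:
    # purge_older_than(os.environ["OSG_WN_TMP"], 3)
    # purge_older_than(os.path.expanduser("~"), 10)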
There is no backup system in place on any of the clusters. In the future, we will back up a small fraction of the data, such as metadata files for CMS analysis, OSG installation directories, MySQL catalogs, and home directories on the interactive analysis cluster.
Support Policies
As per OSG Operations policy, support requests concerning software or hardware issues should be submitted to the USCMS Grid Support Center. The center will open a trouble ticket and assign it to the appropriate VO support center.
Shutdown Policy
In case of cooling issues in the CACR basement, we will shut down equipment in the following order.
Level 1 - Non-critical systems that should have little effect on local and remote users' ability to do work
- Data backup servers. Backups are only run on Sunday afternoons
- cithep98
- t2-backup
- Network test servers
- la1
- la2
- la3
- hp1
- hp2
- ...more
- socrates.ultralight.org
- plato.cacr.caltech.edu
- osg-discovery.caltech.edu
- deepweb.caltech.edu
- All OSGP machines (Notify kam_k_pang@sbcglobal.net)
- Tier2c cluster - (notify vladimir and marat):
- Headnode
- nas-0-0
- All worker nodes
- cithep97 worker nodes - (notify cms@hep.caltech.edu): compute-0-1 through compute-0-8
- cithep90 cluster - (notify vladimir)
- citgrid3 worker node
- dcache test cluster
- Lambda station test nodes (notify lambda-station-technical@fnal.gov):
- ls10.caltech.edu
- ls01.caltech.edu
- Disk i/o servers:
- cithep6
- cithep7
Level 2 - Important systems for local and remote work
- CIT_CMS_T2 production cluster - (notify cms-t2@fnal, OSG GOC, CMS hypernews)
- headnode, dcache nodes, all worker nodes
- gums, higgs, satabeast
- cithep97 cluster: headnode, nas-0-0, nas-0-1
- citgrid3 headnode used for remote logins to ultralight.org
- All workgroup switches
- EXCEPT Force10 E600
- EXCEPT Cisco 6509
- EXCEPT Cisco 3750 (207 network to lauritsen)
Level 3 - Critical servers for Ultralight, VRVS, and MonALISA
- monalisa (3 servers)
- mgmt.hep.caltech.edu - Ultralight DNS server
- mgmt.caltech.edu
- boson.cacr.caltech.edu
- cms-nagios.ultralight.org
- evo07.caltech.edu
- evo08.caltech.edu
Level 4 - Critical networking equipment
- Cisco 7509
- Force10 E600
- Cisco 3750
- All UPS systems

