
Illinois Campus Cluster Program User Forum






Presentation Transcript


  1. Illinois Campus Cluster Program User Forum
  April 24, 2012, NCSA Room 1030, 10:00 AM - 11:00 AM

  2. Welcome and Agenda
  • Welcome
  • General Updates
  • Upgrades to the Illinois Campus Cluster Program webpage
  • Governance
  • Status of the second instance of the Illinois Campus Cluster Program, Golub
  • "Cluster Basics - Submitting arrays of serial Mathematica jobs", an overview presented by Jonathan Manton
  • Open Floor for User Discussion
  • Adjourn

  3. Illinois Campus Cluster Program Webpage Update
  • User Discussion Forum
  • System Status
  • Usage/Queue Info
  • Exchange calendar for events/outages
  • SSH terminal (Java)
  • Globus Online interface

  4. Governance Update
  • Executive Governance Committee membership finalized:
    • Jonathan Greenberg, Asst. Prof., Geography (Investor Forum chair)
    • Mark Neubauer, Asst. Prof., Physics (Investor Forum representative)
    • Eric Shaffer, Asst. Dir., Computational Science and Engineering (Investor Forum representative)
    • Patty Jones, Assoc. Dir. for Research, Beckman (OVCR representative)
    • Chuck Thompson, Asst. Dean and Dir., Engineering IT Shared Services (IT Governance Research Committee representative)
    • Neal Merchen, Assoc. Dean for Research, ACES (Office of the Chancellor representative)
    • Charley Kline, IT Architect, CITES (CIO's Office representative)
    • John Towns, Dir. of Collaborative eScience Programs, NCSA (ex officio, manager of Illinois Campus Cluster operators)
  • First meeting scheduled for May 1st
  • Next Investor Forum being planned

  5. Golub Status
  • All hardware is on site as of 4/8/2013.
  • Worked through a power design issue with Dell.
  • During the last PM, migrated Torque/Moab over to Golub.
  • End-game schedule being developed.

  6. Cluster Basics - Submitting arrays of serial Mathematica jobs
  Jonathan Manton

  7. Topics
  • Intended Audience
  • Introduction
  • Conceptual Overview
  • Job Preparation
  • PBS File Preparation
  • Submitting and Monitoring
  • Post-Processing
  • Recap

  8. Intended Audience
  • Intended audience is novice users
  • Also applicable to experienced users who have never used PBS job arrays
  • If this is not new to you, please provide feedback to improve this tutorial

  9. Introduction
  • This tutorial covers submitting large numbers of identical serial (vs. parallel) jobs to the cluster
  • Each job does the same work, but on different input parameters or data sets
  • Procedure developed for Mathematica jobs, but can be easily adapted for Matlab, C, etc.
  • This presentation is based on a post from the user forum

  10. Conceptual Overview
  • Create a program that takes arguments from the command line to determine what work to do
  • Prepare a text file that has the appropriate input parameters, each line representing one run of the program
  • Use the PBS job array functionality to launch many jobs, each with different input parameters
  • If necessary, consolidate the data at the end

  11. Job Preparation: Mathematica Script

  #!/usr/local/Mathematica-8.0/bin/math -script
  (* tell the shell to interpret the rest of this file using Mathematica *)
  (* grab the command line arguments and put them in Mathematica variables *)
  i = ToExpression[$CommandLine[[4]]];
  j = ToExpression[$CommandLine[[5]]];
  (* multiply the prime before the first argument by the prime after the second *)
  f[x_, y_] := NextPrime[x, -1] * NextPrime[y, 1]
  (* format result using comma-separated values, to make post-processing easy *)
  fmt[x_, y_, result_] := StringJoin[ToString[x], ",", ToString[y], ",", ToString[result]];
  (* print to standard output *)
  Print[StandardForm[fmt[i, j, f[i, j]]]];

  12. Job Preparation: Script Permissions and Input File
  • Need to set the Mathematica script file to be executable:
  chmod a+x myscriptname.m
  • Input file has one line per program run, arguments separated by spaces
  • Example input text file for an array of 5 jobs:
  4 11
  383 66
  35 93
  234 48844
  9398 238
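Since each input line becomes one array task, it helps to sanity-check that the file's line count matches the intended array size before submitting. A minimal local sketch, using a hypothetical `inputs` file name:

```shell
# Write the example 5-line input file (one job per line).
cat > inputs <<'EOF'
4 11
383 66
35 93
234 48844
9398 238
EOF
# One array task per line, so this count is the job-array size.
NJOBS=$(wc -l < inputs)
echo "$NJOBS"
```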

  13. PBS File Preparation
  • The arguments at the top of the PBS file:
    • Tell the scheduler not to waste resources, and to schedule many (12) jobs per node
    • Define the size of the job array and the starting and ending job array indices
  • The shell script at the bottom of the PBS file:
    • Figures out which job array index is running
    • Grabs the appropriate line from the input data file
    • Executes the program with those arguments
    • Saves output data in a filename made unique using those arguments

  14. PBS File Preparation: Example (1 of 2)

  #!/bin/bash
  #
  # This batch script is an example of running lots of copies of a serial
  # job using a single input file with one line per set of input arguments.
  #
  # Many of our "casual" users want to just run a lot of jobs, not necessarily
  # one job with lots of cores.
  #
  #PBS -l walltime=00:10:00
  ##
  ## these tell the scheduler to use one core per job, and to schedule multiple
  ## jobs per node. On the secondary queue at least, if you don't have the
  ## naccesspolicy=singleuser argument, you will schedule one job per *node*,
  ## wasting 11 out of 12 cores for serial jobs
  #PBS -l nodes=1:ppn=1
  #PBS -l naccesspolicy=singleuser
  ##
  #PBS -N primepair_test
  #PBS -q secondary
  ##
  ## This says to run with job array indices 3 through 7. These indices are used
  ## below to get the right lines from the input file
  #PBS -t 3-7
  ##
  #PBS -j oe
  # CONTINUED NEXT SLIDE…

  15. PBS File Preparation: Example (2 of 2)

  …CONTINUED FROM PREVIOUS SLIDE
  ## grab the job id from an environment variable and create a directory for the
  ## data output
  export JOBID=`echo "$PBS_JOBID" | cut -d"[" -f1`
  mkdir $PBS_O_WORKDIR/"$JOBID"
  cd $PBS_O_WORKDIR/"$JOBID"
  module load mathematica
  ## grab the appropriate line from the input file. Put that in a shell variable
  ## named "PARAMS"
  export PARAMS=`cat $HOME/mexample/inputs | sed -n ${PBS_ARRAYID}p`
  ## grab the arguments, using the linux "cut" command to get the right field
  ## modify to match the number of arguments you have for your program
  export ST1=`echo "$PARAMS" | cut -d" " -f1`
  export ST2=`echo "$PARAMS" | cut -d" " -f2`
  ## Run Mathematica script, directing output to a file named based on the
  ## input parameters. The assumption is the combination of parameters is unique
  ## for the job. Modify for the number of parameters you have by adding $ST#.
  $HOME/mexample/primepair.m $ST1 $ST2 > data$ST1-$ST2
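The line-selection logic above can be exercised without the scheduler by setting PBS_ARRAYID by hand. A local sketch using the example input file (the `inputs` file name is hypothetical):

```shell
# Recreate the example input file locally.
printf '4 11\n383 66\n35 93\n234 48844\n9398 238\n' > inputs
# Pretend to be array task 4 and pull its line, as the PBS script does.
PBS_ARRAYID=4
PARAMS=$(sed -n "${PBS_ARRAYID}p" inputs)
ST1=$(echo "$PARAMS" | cut -d" " -f1)
ST2=$(echo "$PARAMS" | cut -d" " -f2)
echo "$ST1 $ST2"   # prints "234 48844"
```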

  16. Submitting and Monitoring
  • Submit as usual:
  qsub job-inputarray.pbs
  • Monitor as usual:
  qstat | grep $USER
  • Note that job arrays have a different job name format: instead of a job named 123456.cc-mgmt1, it will be 123456[].cc-mgmt1
  • Some commands will behave differently, since brackets are special characters in bash
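Because `[` and `]` are glob characters in bash, array job ids should be quoted when passed to commands. A sketch with a made-up job id, stripping the bracket suffix the same way the PBS script does:

```shell
JOBNAME='123456[].cc-mgmt1'   # made-up array job id
echo "$JOBNAME"               # quoting keeps the brackets literal
# Strip the brackets and host suffix with cut, as in the PBS script:
JOBID=$(echo "$JOBNAME" | cut -d"[" -f1)
echo "$JOBID"                 # prints "123456"
```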

  17. Post-Processing
  • The example PBS script will create a subdirectory with the same name as the job number
  • Each job will have a different output file in that directory
  • It is easy to consolidate the small files into one big file using the Linux cat command:
  cat 123456/data* > outputs.csv
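The consolidation step can be tried with fabricated per-job files (the directory name is hypothetical; 39 and 3007 are what the example Mathematica script would print for those inputs, since NextPrime[4,-1]*NextPrime[11,1] = 3*13 and NextPrime[35,-1]*NextPrime[93,1] = 31*97):

```shell
# Fabricate two per-job CSV output files, then merge them.
mkdir -p 123456
printf '4,11,39\n'    > 123456/data4-11
printf '35,93,3007\n' > 123456/data35-93
cat 123456/data* > outputs.csv
wc -l < outputs.csv
```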

  18. Recap
  • Create your program so it takes input from the command line
  • Put your unique inputs, one per line, in a text file
  • Submit the job: one instance of the program for each line in the input file
  • Consolidate your data at the end

  19. Open Discussion
  • This is your opportunity to share your thoughts with the Illinois Campus Cluster Program team.
  • Next meeting: July timeframe (meet in summer?)

  20. Adjourn
  • Thank you for providing your thoughts.
  • Please feel free to contact us at help@campuscluster.illinois.edu
