Running Madgraph 5 (and aMC@NLO) on Stampede

Basic account setup

First, you need to be a member of our project on Stampede. This determines how TACC accounts for CPU use. If you aren't a member, first get an account at the TACC portal page, then let Peter know and he will add you to the project.

For questions on connecting to the machine and other details of use, check out the Stampede User Guide. Also look at this page on setting permissions on your files and directories.

Running Madgraph manually

I've installed a copy of Madgraph 5 and Fastjet on Stampede under /work/02130/ponyisi/madgraph/. I think all members of the project should have access to this directory. I've made modifications to make a 125 GeV Higgs the default.
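
A quick way to check that you can actually read the shared installation (the path is the one above; if this fails, ask to have the permissions fixed):

ls -ld /work/02130/ponyisi/madgraph/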

Before running Madgraph you should run module swap intel gcc; we need to use the gcc compiler family (in particular gfortran), not the Intel ones.
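
For example (a quick sanity check; the exact module versions shown by module list will vary):

module swap intel gcc
# the gcc module should now be loaded and gfortran should be the GNU compiler
module list
gfortran --version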

After running bin/mg5 from the top-level Madgraph directory, one can set up either an LO or an NLO computation:

  • LO:
    Example: Madgraph ttZ + up to 2 jets, leading order
    generate p p > t t~ z @0
    add process p p > t t~ z j @1
    add process p p > t t~ z j j @2
    output ttZ-LO # output directory
    
    You probably want to edit output_dir/Cards/run_card.dat to change the number of events that will be generated in a run, and to set the ickkw variable to 1 to enable ME+PS (matrix element + parton shower) matching; see the example snippet after this list.
  • NLO:
    aMC@NLO ttZ
    generate p p > t t~ z [QCD]
    output ttZ-NLO # output directory
    
    I haven't fully validated NLO yet.
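
For the LO case above, the relevant lines in output_dir/Cards/run_card.dat look roughly like the following (values are examples, and the trailing comments may differ between Madgraph versions):

 10000 = nevents ! number of unweighted events to generate per run
 1     = ickkw   ! 0 = no matching, 1 = MLM-style ME+PS matching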

Do not run the launch command. We want to submit to the batch queues on our own terms. Stampede uses "SLURM" as its batch system. This is vaguely like any other batch system out there.

Running Madgraph in multicore mode (still works, but Condor is better, see below)

One feature of Stampede is that computing cores are allocated in blocks of 16 (one node), so even a single job will take (and be charged for) 16 slots. We can take advantage of this by submitting Madgraph jobs to a node in multicore mode (the default); each job will then use all 16 cores. (In short, we submit one Madgraph job per run, and that job uses 16 cores.) Create the following script in the output directory above, changing ttZ-LO as appropriate:

batch_script_multicore
#!/bin/bash
#SBATCH -J ttZ-LO
#SBATCH -o ttZ-LO.o
#SBATCH -n 1
#SBATCH -p normal
#SBATCH -t 10:00:00
# For peace of mind, in case we forgot before submission
module swap intel gcc
# Following is needed for Delphes
. /work/02130/ponyisi/root/bin/thisroot.sh
# run event generation; the two "0" lines below answer the interactive prompts
# (accept the defaults)
bin/generate_events <<EOF
0
0
EOF

Then call sbatch batch_script_multicore from the output directory. This will go off and run Madgraph on a node somewhere. You can follow the job output in the file ttZ-LO.o.
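
A typical submit-and-monitor sequence, run from the output directory, might look like this (squeue and tail are standard tools; the job name and output file come from the script above):

sbatch batch_script_multicore   # submit the job; SLURM prints the job ID
squeue -u $USER                 # check whether the job is pending or running
tail -f ttZ-LO.o                # follow the job output once the job starts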

Running Madgraph with Condor

One feature of Stampede is that computing cores are allocated in blocks of 16 (one node). Madgraph doesn't deal with this so well (it wants a batch system with individual slots), so we humor it by booting a small Condor cluster within the node allocation from SLURM.

Create the following script in the output directory above, changing ttZ-LO as appropriate:

batch_script_condor
#!/bin/bash
#SBATCH -J ttZ-LO
#SBATCH -o ttZ-LO.o
# MUST ask for one job per node (so we get one Condor instance per node)
#SBATCH -n 5 -N 5
#SBATCH -p normal
#SBATCH -t 10:00:00
# For peace of mind, in case we forgot before submission
module swap intel gcc
# Following is needed for Delphes
. /work/02130/ponyisi/root/bin/thisroot.sh
# path to Condor installation.  Every job gets a private configuration file, created by our scripts
CONDOR=/work/02130/ponyisi/condor
# create Condor configuration files specific to this job
$CONDOR/condor_configure.py --configure
# update environment variables to reflect job-local configuration
$($CONDOR/condor_configure.py --env)
# start Condor servers on each node
ibrun $CONDOR/condor_configure.py --startup
# Run job
# run event generation, dispatching subjobs through the Condor pool we just started;
# the two "0" lines below answer the interactive prompts (accept the defaults)
bin/generate_events --cluster <<EOF
0
0
EOF
# cleanly shut down the Condor daemons on each node
ibrun $CONDOR/condor_configure.py --shutdown

Then call sbatch batch_script_condor from the output directory. This will go off and run Madgraph across the allocated nodes. You can follow the job output in the file ttZ-LO.o.

File transfer

The programs scp and rsync can be used to move files to and from Stampede. Keep files in $WORK or $HOME on Stampede.
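
For example, to pull a run's output back to your own machine (the username and paths below are placeholders, and the login hostname should be taken from the Stampede User Guide):

# mirror an Events directory from Stampede to the local machine
rsync -avz myuser@stampede.tacc.utexas.edu:/work/02130/myuser/ttZ-LO/Events/ ./ttZ-LO-Events/
# or copy a single file
scp myuser@stampede.tacc.utexas.edu:/work/02130/myuser/ttZ-LO/ttZ-LO.o .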

Running Pythia and Delphes

Pythia and Delphes 3 are part of the Madgraph installation. They will be run automatically as part of bin/generate_events if the cards pythia_card.dat and delphes_card.dat exist in the Cards directory. A Delphes card for 140-pileup ATLAS is included in the template, so it will already be in your Cards directory; you can copy it to delphes_card.dat. However, this will not handle pileup.
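
For example, run from the output directory (the ATLAS card filename below is illustrative; check Cards/ for the actual name in your installation):

# list the Delphes cards shipped with the template
ls Cards/ | grep -i delphes
# copy the ATLAS 140-pileup card into place (filename is a guess)
cp Cards/delphes_card_ATLAS_140PileUp.dat Cards/delphes_card.dat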

Making a gridpack

It's great to have Stampede, but we may need to run the generated Madgraph code in other environments. In particular it appears that the Snowmass effort is trying to collect Madgraph codes for various processes. One way to make a distributable version of a Madgraph run is to create a "gridpack." These are frozen versions of the code (no parameter changes allowed, integration grids already fixed) which can be easily run on Grid sites.

To make a gridpack, ensure that you're happy with the cards for your process, then edit Cards/run_card.dat to set .true. = gridpack. Then run generate_events as normal via a batch job (you probably want to set it to generate very few events). This will produce a file in your output directory called something like run_01_gridpack.tar.gz. Now you can follow the instructions under Submitting Madgraph gridpacks to Panda to run jobs on the Grid.
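
For reference, the relevant lines in Cards/run_card.dat look roughly like this (values are examples, and the trailing comments may differ between Madgraph versions; nevents can be small since this run only freezes the integration grids):

 100    = nevents  ! only a few events are needed for the gridpack run
 .true. = gridpack ! produce a gridpack instead of a normal event sample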
