Name
condor_dagman - meta scheduler of the jobs submitted as the nodes of a
DAG or DAGs
Synopsis
condor_dagman [ -debug level ] [ -rescue filename ] [ -maxidle
numberOfJobs ] [ -maxjobs numberOfJobs ] [ -maxpre NumberOfPREscripts ]
[ -maxpost NumberOfPOSTscripts ] [ -noeventchecks ] [ -allowlogerror ]
[ -usedagdir ] -lockfile filename [ -waitfordebug ] [ -autorescue 0|1 ]
[ -dorescuefrom number ] -csdversion version_string [
-allowversionmismatch ] [ -DumpRescue ] -dag dag_file [ -dag dag_file_2
... -dag dag_file_n ]
Description
condor_dagman is a meta scheduler for the Condor jobs within a DAG
(directed acyclic graph) or within multiple DAGs. In typical usage, a
submitter of jobs that are organized into a DAG submits the DAG using
condor_submit_dag. condor_submit_dag does error checking on aspects
of the DAG and then submits condor_dagman as a Condor job.
condor_dagman uses log files to coordinate the further submission of
the jobs within the DAG.
As part of DaemonCore, the set of command-line arguments common to
DaemonCore programs also works for condor_dagman.
Arguments to condor_dagman are either automatically set by
condor_submit_dag or they are specified as command-line arguments to
condor_submit_dag and passed on to condor_dagman. The method by which
the arguments are set is given in their description below.
condor_dagman can run multiple, independent DAGs. This is done by
specifying multiple -dag arguments. Pass multiple DAG input files as
command-line arguments to condor_submit_dag.
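As a rough illustration of the multiple-DAG form shown in the synopsis,
the following Python sketch builds the repeated -dag portion of an
argument list. The helper name dag_arguments and the DAG file names are
hypothetical; in practice condor_submit_dag constructs the
condor_dagman command line itself.

```python
def dag_arguments(dag_files):
    """Return the repeated '-dag <file>' arguments for DAG input files.

    Hypothetical helper for illustration only; condor_submit_dag
    normally generates these arguments when it submits condor_dagman.
    """
    if not dag_files:
        raise ValueError("at least one DAG input file is required")
    args = []
    for dag_file in dag_files:
        # Each DAG input file gets its own -dag flag, as in the synopsis.
        args.extend(["-dag", dag_file])
    return args


if __name__ == "__main__":
    # Two independent DAGs handled by a single condor_dagman instance.
    print(" ".join(dag_arguments(["first.dag", "second.dag"])))
```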
Debugging output may be obtained by using the -debug level option.
Level values and what they produce are described as follows:
* level = 0; never produce output, except for usage info
* level = 1; very quiet, output severe errors
* level = 2; normal output, errors and warnings
* level = 3; output errors, as well as all warnings
* level = 4; internal debugging output
* level = 5; internal debugging output; outer loop debugging
* level = 6; internal debugging output; inner loop debugging
* level = 7; internal debugging output; rarely used
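The level list above can be captured in a small lookup table. The
following Python sketch is one possible way to validate a level before
passing it along; the names DEBUG_LEVELS and debug_arguments are
hypothetical and are not part of Condor.

```python
# Level descriptions copied from the list above.
DEBUG_LEVELS = {
    0: "never produce output, except for usage info",
    1: "very quiet, output severe errors",
    2: "normal output, errors and warnings",
    3: "output errors, as well as all warnings",
    4: "internal debugging output",
    5: "internal debugging output; outer loop debugging",
    6: "internal debugging output; inner loop debugging",
    7: "internal debugging output; rarely used",
}


def debug_arguments(level):
    """Return ['-debug', '<level>'] after checking that level is 0-7.

    Hypothetical helper; it only mirrors the documented range.
    """
    if not isinstance(level, int) or level not in DEBUG_LEVELS:
        raise ValueError(f"debug level must be an integer from 0 to 7, got {level!r}")
    return ["-debug", str(level)]


if __name__ == "__main__":
    level = 3
    print(debug_arguments(level), "->", DEBUG_LEVELS[level])
```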
Options
-debug level
Sets the level of debugging output. level is an integer from 0 to 7
inclusive, where 7 produces the most verbose output. This
command-line option to condor_submit_dag is passed to condor_dagman,
or defaults to the value 3, as set by condor_submit_dag.
-rescue filename
Sets the file name of the rescue DAG to write in the case of a
failure. As passed by condor_submit_dag, the name of the file will
be the name of the DAG input file concatenated with the string
.rescue. This argument is now optional, and it is generally
preferable not to specify it, which allows condor_dagman to
automatically generate an appropriate rescue DAG name.
-maxidle numberOfJobs
Sets the maximum number of idle jobs allowed before condor_dagman
stops submitting more jobs. Once idle jobs start to run,
condor_dagman will resume submitting jobs. numberOfJobs is a
positive integer. This command-line option to condor_submit_dag is
passed to condor_dagman. If not specified, the number of idle jobs
is unlimited.
-maxjobs numberOfJobs
Sets the maximum number of jobs within the DAG that will be
submitted to Condor at one time. numberOfJobs is a positive
integer. This command-line option to condor_submit_dag is passed to
condor_dagman. If not specified, the default number of jobs is
unlimited.
-maxpre NumberOfPREscripts
Sets the maximum number of PRE scripts within the DAG that may be
running at one time. NumberOfPREscripts is a positive integer. This
command-line option to condor_submit_dag is passed to
condor_dagman. If not specified, the default number of PRE scripts
is unlimited.
-maxpost NumberOfPOSTscripts
Sets the maximum number of POST scripts within the DAG that may be
running at one time. NumberOfPOSTscripts is a positive integer.
This command-line option to condor_submit_dag is passed to
condor_dagman. If not specified, the default number of POST scripts
is unlimited.
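The four throttling options above share the same idea: new work is
held back while a count is at its limit. The following Python sketch
illustrates that idea only. It is not condor_dagman's implementation;
the function names are hypothetical, None is used here to stand for
"unlimited", and the exact boundary behavior (stopping when the count
reaches the limit) is an assumption.

```python
def may_submit_job(idle_jobs, submitted_jobs, maxidle=None, maxjobs=None):
    """Decide whether one more job may be submitted.

    idle_jobs      -- jobs currently idle in the queue
    submitted_jobs -- jobs from the DAG currently submitted to Condor
    maxidle        -- value of -maxidle, or None for unlimited (assumption)
    maxjobs        -- value of -maxjobs, or None for unlimited (assumption)
    """
    if maxidle is not None and idle_jobs >= maxidle:
        return False  # too many idle jobs; wait until some start running
    if maxjobs is not None and submitted_jobs >= maxjobs:
        return False  # the limit on simultaneously submitted jobs is reached
    return True


def may_start_script(running_scripts, limit=None):
    """Same check for -maxpre or -maxpost: None means unlimited (assumption)."""
    return limit is None or running_scripts < limit


if __name__ == "__main__":
    print(may_submit_job(idle_jobs=5, submitted_jobs=20, maxidle=5))
    print(may_submit_job(idle_jobs=2, submitted_jobs=20, maxidle=5, maxjobs=50))
    print(may_start_script(running_scripts=3, limit=3))
```

Calling may_submit_job again after idle jobs start to run corresponds
to the documented behavior that condor_dagman resumes submitting jobs.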
-noeventchecks
This argument is no longer used; it is now ignored. Its
functionality is now implemented by the
DAGMAN_ALLOW_EVENTS configuration macro.
-allowlogerror
This optional argument has condor_dagman try to run the specified
DAG, even in the case of detected errors in the user log
specification.
-usedagdir
This optional argument causes condor_dagman to run each specified
DAG as if the directory containing that DAG file were the current
working directory. This option is most useful when running multiple
DAGs in a single condor_dagman.
-lockfile filename
Names the file created and used as a lock file. The lock file
prevents two instances of the same DAG, as defined by a DAG input
file, from running at the same time. A default lock file ending with
the suffix .dag.lock is passed to condor_dagman by condor_submit_dag.
-waitfordebug
This optional argument causes condor_dagman to wait at startup until
someone attaches to the process with a debugger and sets the
wait_for_debug variable in main_init() to false.
-autorescue 0|1
Whether to automatically run the newest rescue DAG for the given DAG
file, if one exists (0 = false, 1 = true).
-dorescuefrom number
Forces condor_dagman to run the specified rescue DAG number for the
given DAG. A value of 0 is the same as not specifying this option.
Specifying a nonexistent rescue DAG is a fatal error.
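The interaction of -autorescue and -dorescuefrom can be sketched as a
small decision function. This Python example is illustrative only and
is not taken from the condor_dagman source. The function name is
hypothetical, rescue DAGs are represented simply by their numbers, and
two assumptions are made: that -dorescuefrom takes precedence over
-autorescue, and that the highest-numbered rescue DAG is the newest.

```python
def choose_rescue_number(existing, autorescue, dorescuefrom=0):
    """Return the rescue DAG number to run, or None to run the original DAG.

    existing     -- numbers of the rescue DAGs that exist for this DAG file
    autorescue   -- True or False, mirroring -autorescue 1 or 0
    dorescuefrom -- value of -dorescuefrom; 0 is the same as not given
    """
    existing = set(existing)
    if dorescuefrom != 0:
        if dorescuefrom not in existing:
            # The text above calls this case a fatal error.
            raise RuntimeError(f"rescue DAG number {dorescuefrom} does not exist")
        return dorescuefrom
    if autorescue and existing:
        # Assumption: the highest number identifies the newest rescue DAG.
        return max(existing)
    return None


if __name__ == "__main__":
    print(choose_rescue_number([1, 2, 3], autorescue=True))
    print(choose_rescue_number([1, 2, 3], autorescue=False, dorescuefrom=2))
    print(choose_rescue_number([], autorescue=True))
```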
-csdversion version_string
version_string is the version of the condor_submit_dag program. At
startup, condor_dagman checks for a version mismatch with the
condor_submit_dag version in this argument.
-allowversionmismatch
This optional argument causes condor_dagman to allow a version
mismatch between condor_dagman itself and the .condor.sub file
produced by condor_submit_dag (or, in other words, between
condor_submit_dag and condor_dagman). WARNING! This option should
be used only if absolutely necessary. Allowing version mismatches
can cause subtle problems when running DAGs.
-DumpRescue
This optional argument causes condor_dagman to immediately dump a
rescue DAG and then exit, as opposed to actually running the DAG.
(This feature is mainly intended for testing.)
-dag filename
filename is the name of the DAG input file that is set as an
argument to condor_submit_dag, and passed to condor_dagman.
Exit Status
condor_dagman will exit with a status value of 0 (zero) upon success,
and it will exit with the value 1 (one) upon failure.
Examples
condor_dagman is normally not run directly, but submitted as a Condor
job by running condor_submit_dag. See the condor_submit_dag manual page
for examples.
Author
Condor Team, University of Wisconsin-Madison
Copyright
Copyright (C) 1990-2009 Condor Team, Computer Sciences Department,
University of Wisconsin-Madison, Madison, WI. All Rights Reserved.
Licensed under the Apache License, Version 2.0.
See the Condor Version 7.2.4 Manual or
http://www.condorproject.org/license for additional notices.
condor-admin@cs.wisc.edu