pexec - execute commands or shell scripts in parallel on a single host or on remote hosts using a remote shell
This manual page briefly documents the pexec program. pexec executes the given command or shell script (e.g. parsed by /bin/sh) in parallel on the local host or on remote hosts, while some of the execution parameters, namely the redirected standard input, output or error and the environmental variables, can be varied.
The given program or script is executed once for each parameter specified on the command line or read from a given parameter file. Each parameter is a simple string which can either be passed to the program or script as the value of an environmental variable, or be substituted into the format of the file names from or to which the standard input, output or error are optionally redirected.
Moreover, more than one shell command script can also be passed for parallel execution. In this case either no parameters are needed, or the number of parameters taken from the command line (or read from a parameter file) must be the same as the number of distinct shell command scripts.
The program can automatically discard the standard output and error (by redirecting them to /dev/null), or collect them via pipes and dump them to the invoker's standard output or error (with optional line headers or trailers which can be used to distinguish between the outputs of the individual processes).
The execution on remote hosts is done using a remote shell which builds a tunnel between the invoking and the remote host(s), performs the authentication and ensures the security (if a secure remote shell is used). Hence, there is no need to run standalone daemons on the remote side: the remote shell itself executes the pexec program in daemon mode, with the standard input and output of the latter bound to the remote shell to form a (secure and authenticated) tunnel. See the appropriate section below for a more detailed explanation.
In order to avoid unexpected I/O load or to synchronize individual tasks, pexec supports mutual exclusions (mutexes) and atomic command executions. The maximum number of simultaneous tasks can be controlled by a hypervisor daemon: with such a daemon, concurrent pexec instances can start without causing an unexpectedly high load.
General invocation:
pexec [options] [--] command [arguments]
pexec [options] -c [--] script
or
pexec [options] -m [--] 'script1' ['script2'...]
Remote control, mutual exclusions and atomic command execution:
pexec [-j|--remote] [options]
pexec [-j|--remote] [options] [-l|-u <mutex>]
pexec [-j|--remote] [options] -a -m <mutex> [-c] [--] command [arguments]
Hypervisor daemon:
pexec [-H|--hypervisor] [options] start|stop
-h, --help
--version
-s, --shell <full shell path>
/bin/sh) of the shell to be used for script execution.
-c, --shell-command
-s|--shell also) to interpret the command(s) instead of direct execution.
-m, --multiple-command
-l|--list
or -f|--listfile
).
-e, --environment <environmental variable name>
/bin/bash, such environmental variables can easily be read as if they were normal shell variables (see also the examples).
-n, --number <number of parallel processes> | auto | managed | ncpu
-n|--number or specifying -n|--number auto) the program tries to connect to a local hypervisor which keeps track of the resources of the system (see sec. Hypervisor Mode for more details). If the connection to the local hypervisor fails, the program derives the number of available processing units on the local host using the content of /proc/cpuinfo or another system-specific method (on operating systems with a kernel other than Linux). If the argument of -n|--number is managed, pexec searches only for a hypervisor and terminates with a non-zero exit status if the connection fails. If the argument of -n|--number is ncpu, the program does not try to connect to the hypervisor (even if it is running) but uses the available information (/proc/cpuinfo or another system-specific method) to figure out the number of processing units.
-C, --control [<host>:]<port>|<path>
/tmp/pexec.sock.
If the specified port is a single number, pexec connects to the given port on localhost; if a host is specified, the program connects to the given host and port; otherwise, if it is a valid path, pexec connects to that UNIX domain socket. Note that the hypervisor socket does not have a default port number, i.e. the port argument is mandatory after the host name. See sec. Hypervisor Mode for more details.
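The three address forms described above can be told apart with a few pattern matches. The following is only a sketch of the documented rules, not the parser used by pexec itself; the function name classify_control and its output format are hypothetical, and treating anything that contains a slash as a path is an assumption.

```shell
#!/bin/sh
# Hypothetical helper: classify a -C|--control style argument according to
# the rules stated in the text. pexec's real parser may differ in details.
classify_control() {
    arg=$1
    case "$arg" in
        '')         echo "invalid"; return 1 ;;
        *[!0-9]*)   ;;                                   # not a bare number, keep checking
        *)          echo "inet localhost $arg"; return 0 ;;  # single number: port on localhost
    esac
    case "$arg" in
        */*)        echo "unix $arg" ;;                  # assumed: a slash means a socket path
        *:*)
            host=${arg%%:*}
            port=${arg#*:}
            case "$port" in
                ''|*[!0-9]*) echo "invalid"; return 1 ;; # the port is mandatory after the host
                *)           echo "inet $host $port" ;;
            esac ;;
        *)          echo "invalid"; return 1 ;;          # a host name alone has no default port
    esac
}

classify_control 4000
classify_control node1:4000
classify_control /tmp/pexec.sock
```

A host name without a port is rejected here because the text states that no default port exists.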
-p, --list <space separated list of parameters> [-p ...]
-p or --list option, the delimiter whitespace should be escaped somehow. Note that if there is more than one shell command to be executed, the total number of parameters should be the same as the number of the individual shell commands, or no parameters should be declared.
-r, --parameters <list of parameters up to the next switch> [-r ....]
-p|--list and -r|--parameters can be mixed, depending on the actual problem or convenience. See also some notes below at -f|--listfile.
-f, --listfile <file containing the parameters>
-w|--column) or the complete line can be treated as a single parameter (see -t|--complete). If the parameters are read from a single column and some of them need to contain space(s), they can be put between double quotation marks ("..."). Note that if there is more than one shell command to be executed, the total number of parameters should be the same as the number of the individual shell commands, or no parameters should be declared. Note also that the parameters can only be defined either from the command line or from this list file, i.e. -f|--listfile and -p|--list|-r|--parameters cannot be mixed.
-w, --column <column index>
-f|--listfile above). If the given column does not exist in the current line, that line will silently be omitted.
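To preview which parameters a list file would yield for a given column, the selection rule can be imitated with awk. This is a rough emulation under stated assumptions, not pexec's own parser: the helper name select_column is hypothetical, columns are assumed to be counted from 1, and parameters quoted with double quotation marks are not handled.

```shell
#!/bin/sh
# Hypothetical emulation of the column selection: print the requested
# whitespace-separated column of each input line and silently skip the
# lines that do not have that many columns.
select_column() {
    awk -v col="$1" 'NF >= col { print $col }'
}

# The middle line has no second column, so it is dropped.
printf 'img001 120\nimg002\nimg003 95\n' | select_column 2
```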
-t, --complete
-r|--parameters (or -p|--list), the content of a line will not be split into distinct parameters even if it contains whitespace.
-z, --nice <nice>
pexec and all children (executed processes) to the priority defined by the nice value.
--
-r|--parameters or when the command itself begins with a literal '-' (dash) character (the latter is a rare case, as one can expect). This marker can also be used to emphasize the (beginning of the) command itself.
-i, --input <input file format>
-o, --output <output file format>
-i|--input for this case).
If the argument is a single file, all command execution processes write their output to this single file. If the argument is a single dash or -1, all of the standard outputs are gathered to the invoker's standard output. If the argument is -2, the standard outputs are gathered to the standard error of the invoker. If the argument contains the format elements %s or %d, these are replaced by the respective parameter name and the standard output will be a different file for each process. Note that when the output file is a single, non-formatted file name, the outputs are collected via pipes and there is no guarantee of the order of the data: the outputs of different processes can be mixed (moreover, if the output is ASCII text, even parts of lines can be mixed). This also means that the processes will see their standard outputs as pipes, not as regular files. Note also that if only %s is used in the formatted file name, the parameter list should contain unique parameters; otherwise some of the output files will be lost and/or written in parallel, yielding unexpected results.
-u, --error, --output-error <output error file format>
-o|--output above for a more detailed explanation. The only exception is the single dash: specifying a single dash to -u|--error causes the standard errors to be collected to the standard error of the invoker. To redirect the errors to the standard output, use -1 as an argument.
-R, --normal-redirection
--output - and --error - and --input /dev/null. Since redirecting the same standard input to all of the executed commands is nearly meaningless in a parallel environment, this argument implies the behaviour one would expect, i.e. the standard output and error streams of the commands are gathered to the invoker's standard output and error, respectively.
-a, --output-format <output line format>
-i|--input. The line itself without the trailing newline character is represented by %l. Extra characters (e.g. tabulators, newlines) can also be inserted using the well-known escape sequences. Note that the trailing newline is always added implicitly unless it is disabled by -x|--omit-newlines. The line buffering yielded by the simple format of %l can also be useful if all of the standard outputs (or errors) are collected in a single file and the invoker wants to avoid lines of different outputs being mixed (i.e. if this redirection formatting is omitted, no line buffering is done at all).
The printf-like alignment syntax can also be used with %s, %l and %d (both in the post-formatting and in the redirection file name formats): the number before the period indicates the minimum size and its sign refers to the alignment (positive: right alignment, negative: left alignment) while the number after the period indicates the minimum number of digits, padded with leading zeroes, for numerical values. E.g. %5.3d would yield " -042" for -42.
-b, --error-format <error line format>
-a|--output-format for more details.
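Since the alignment syntax described for -a|--output-format follows printf, the shell's own printf can be used to preview a width and precision before putting it into a format string. This only illustrates the printf conventions; it assumes, as the text states, that pexec treats %d the same way, and it does not invoke pexec. The brackets are added just to make the padding visible.

```shell
#!/bin/sh
# Width 5, precision 3: at least three digits, right-aligned in five columns.
printf '[%5.3d]\n' -42     # prints "[ -042]"

# A negative width selects left alignment instead.
printf '[%-5.3d]\n' 7      # prints "[007  ]"
```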
-x, --omit-newlines
-a|--output-format and/or -b|--error-format, the trailing newlines are disabled and only written if specified directly using '\n'.
Note that when no redirection is specified and there is at most one parameter, the executed process will inherit the standard files directly from the invoker. Otherwise, if there is more than one parameter in the list, the redirection will be defined by the -i|--input, -o|--output and -u|--error options. This means that if one of these options is omitted, the respective standard stream will be redirected from/to /dev/null. In other words, if any of these redirection options is specified, the latter rules will define the redirection, independently of the number of the parameters.
The execution on remote hosts is done using a remote shell which builds a tunnel between the invoking and the remote host(s), performs the authentication and ensures the security (if a secure remote shell is used). This means that on the remote host(s) there should be:
pexec
which is started in daemon mode;
-k|--local-files
for more details about
this issue).
-g, --remote-shell "<remote_shell> [<arguments>]"
/usr/bin/ssh with no extra arguments. Note that if additional arguments are defined for the remote shell, the whole argument of the switch should be escaped somehow (e.g. put between quotation marks). The connection and authentication are performed sequentially before executing anything and only once for each host: if the authentication requires interactivity (e.g. typing a password), it is also done before the whole procedure starts.
-n, --number <hostspec>:[<processes>],...,[<processes>]
-n|--number option is used to specify a comma-separated list of names and expected capacities of the remote hosts used for parallel execution. The hostspec argument is the host specification passed directly to the remote shell, which should be able to understand it. In most cases it is simply the name of the peer machine; in the case of ssh, the username can also be passed using the well-known username@hostname form.
The host specification must always be followed by a literal colon (':'). Optionally, the maximum number of processes to run on that host can follow the colon. If it is omitted, the maximum number of processes is determined on the peer side automatically (yielding the same number of processes as determined by --number auto, see above); moreover, the literal auto, managed and ncpu arguments can also be used, as in the case of local host parallelization. The number of processes executed on the invoker's host can simply be specified by a single positive number, or by one of the keywords auto, managed and ncpu (just as in the case of simple local host execution).
Note that the host specifications are additive, i.e. if the same machine (including the local one, when the hostspec and its colon are omitted) is defined more than once, the maximum numbers of processes are added. This can yield unexpected results if the number is omitted after the colon, i.e. if it is determined automatically: in this case the automatically determined maximum numbers of processes are also added, yielding a large load.
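Because the specifications are additive, it can be worth checking a host list before passing it to -n|--number. The sketch below sums the explicit capacities per host; it is a hypothetical helper (sum_hostspec), not part of pexec. The local host is labelled "localhost" purely for display, and entries whose size is only known at run time (an omitted number or one of the keywords) are flagged rather than counted.

```shell
#!/bin/sh
# Hypothetical helper: add up the explicit process counts per host in a
# comma-separated host list. Dynamic entries (empty, auto, managed, ncpu)
# cannot be evaluated here and are only marked with "+dynamic".
sum_hostspec() {
    printf '%s\n' "$1" | tr ',' '\n' | awk -F: '
        NF == 0 { next }                          # ignore empty list entries
        NF == 1 { host = "localhost"; n = $1 }    # no colon: the invoking host
        NF >= 2 { host = $1; n = $2 }             # hostspec:processes
        {
            if (n ~ /^[0-9]+$/) total[host] += n
            else { total[host] += 0; dyn[host] = " +dynamic" }
        }
        END { for (h in total) print h, total[h] dyn[h] }' | sort
}

# node1 is listed twice, so its capacities add up to 5.
sum_hostspec 'node1:2,node2:,node1:3,2'
```

A host flagged with "+dynamic" more than once is the case the note above warns about, since each automatic value is added again.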
-k, --local-files [TBD]
Note that if the redirected files are parameter-specific and tunneled to/from the remote hosts, then these files are seen (1) as regular files on the invoker's host and (2) as pipes on the remote hosts. Otherwise, if the redirection is done from/to a single file, both the local and remote hosts will see their standard outputs and errors as pipes, but the standard input is still a regular file on the local side and a pipe on the remote side.
-P, --pexec <pexec-path>
pexec program on the remote hosts. If this option is omitted, the invoker tries to figure it out from the invoking syntax (see argv[0]) and the current path. This can be a pitfall if the program is installed differently on the hosts, since the remote shells execute their commands in non-interactive and/or non-login modes, which might result in different paths.
-T, --tunnel
pexec, the program will start in tunnel daemon mode. This parameter is not needed during regular usage but is used by pexec itself to start daemons via the remote shell tunnels.
Running instances of pexec can be controlled remotely to gather some status information about the parallel execution and to implement mutual exclusions.
-y, --bind inet|unix|<port>|/<path>
pexec to be remote controlled via internet (AF_INET family sockets) or via UNIX domain sockets. If the literal inet or unix is specified as an argument for this switch, the port or the path of the named socket will be assigned randomly; but both of them can be specified directly by a single integer number (referring to an INET port) or by an absolute path (beginning with a literal slash, referring to a UNIX domain named socket). In all cases, the currently assigned port or path will be reported in the logs and will be exported as an environmental variable with the name of PEXEC_REMOTE_PORT (by default, see also -E|--pexec-connection-variable). This environmental variable is inherited by all processes executed on the local host, is tunneled to the remote hosts too, and is inherited by all of the processes executed by the pexec daemons.
-E, --pexec-connection-variable <environment_variable_name>
-p|--connect auto combination to determine the control socket with which the running pexec instance can be controlled or polled. Note that in practice there is no need to change this variable since separate pexec jobs use different environment spaces (i.e. a process which changes an environment variable affects the variables of its children only).
-j, --remote
pexec, then the program can be used to control and poll the status of other running instances of pexec if these other instances were started with the remote control enabled by -y|--bind (see above).
-p, --connect auto|[<host>:]<port>|/<path>
pexec instance which is to be remote controlled. Since connecting to something is mandatory for doing remote control, omitting this option is equivalent to --connect auto, and in this case the program gets the remote port information from the environmental variable PEXEC_REMOTE_PORT (by default, see also -E|--pexec-connection-variable). If this environmental variable does not exist, the connection will fail and pexec exits with an error. The pexec instance to be remote controlled can also be specified directly by specifying either the INET host and port (in this case, if the host is omitted, localhost is used as default, but the port number is mandatory since there is no default port) or the absolute path of the UNIX domain socket. Note that in most shells --connect auto is equivalent to --connect $PEXEC_REMOTE_PORT (by default, see also -E|--pexec-connection-variable) since the environmental variables can be referred to as normal shell variables.
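The lookup performed for --connect auto can be written out in a few lines of shell, which may help when debugging a job that cannot find its control socket. This is a sketch of the documented behaviour only; the helper name resolve_connect is hypothetical, and the second argument stands in for a variable name chosen with -E|--pexec-connection-variable.

```shell
#!/bin/sh
# Hypothetical helper: resolve a --connect argument to a control address.
# "auto" (or no argument) means "read the address from the connection
# variable", which is PEXEC_REMOTE_PORT unless another name is given.
resolve_connect() {
    target=${1:-auto}
    var=${2:-PEXEC_REMOTE_PORT}
    if [ "$target" = auto ]; then
        eval "target=\${$var:-}"
        if [ -z "$target" ]; then
            echo "no control address found in \$$var" >&2
            return 1
        fi
    fi
    echo "$target"
}

# An explicit address is passed through unchanged.
resolve_connect /tmp/job.sock
```

Inside a job started with -y|--bind, calling resolve_connect without arguments would print the address exported by the main pexec instance; outside such a job it fails, mirroring the error described above.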
-t, --status
-l, --lock, --mutex-lock <mutex-name>
-u, --unlock, --mutex-unlock <mutex-name>
-u|--unlock|--mutex-unlock with the mutex of the same name).
-m, --mutex <mutex-name> -d, --dump <filename> | -s, --save <filename>
-d|--dump or -s|--save options are specified, the program prints the content of the file to standard output or stores the data read from the standard input to the specified file, respectively (like cat or tee, with the difference that -s|--save does not copy the content to standard output as tee does). If a mutex is specified by -m|--mutex, pexec locks the mutex before the dump/save operation and unlocks it after it is done, i.e.
pexec -j -d something.txt -m mymutex
is equivalent to
( pexec -j -l mymutex && cat something.txt && pexec -j -u mymutex )
-m, --mutex <mutex-name> -a, --atomic [-c|--shell-command] [--] <command>
-m|--mutex) of the command can be performed. For example,
pexec -j -m mymutex -a cat something.txt
is equivalent to
( pexec -j -l mymutex && cat something.txt && pexec -j -u mymutex )
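The lock, run, unlock sequence shown above can also be wrapped in a small shell function when several commands have to share one mutex by hand. The wrapper below is a hypothetical sketch, not a replacement for -a|--atomic: it uses only the -j -l and -j -u calls shown in the text, and, unlike the one-line form, it releases the mutex even when the command fails and returns the command's exit status.

```shell
#!/bin/sh
# Hypothetical wrapper around the documented lock/unlock calls.
# Usage: with_mutex <mutex-name> <command> [arguments...]
# It must run inside a job whose main pexec was started with -y|--bind,
# so that the control address is available in the environment.
with_mutex() {
    mutex=$1
    shift
    pexec -j -l "$mutex" || return   # give up if the lock cannot be taken
    "$@"
    rc=$?
    pexec -j -u "$mutex"             # always release, even after a failure
    return "$rc"
}
```

Inside a parallel job, `with_mutex mymutex cat something.txt` then behaves like the parenthesized command above, except for the unlock-on-failure behaviour.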
Note that if the lock and unlock operations are in the same pipeline and these operations use the same mutex, the invoker should ensure that the locking call exits before the unlock request can start (otherwise the whole parallel execution blocks indefinitely). Likewise, if a dump and a save operation with the same mutex are present in the same pipeline, the intermediate programs should delay the data propagation: the save part must not get any data until the dump part has flushed everything to its standard output (otherwise even this single pipeline blocks indefinitely).
Note also that the whole remote controlling procedure is transparent to the remote host execution, i.e. every necessary parameter, environmental variable and mutex lock/unlock request will propagate via the remote shell tunnel. Therefore the end user will not see any difference between purely local and remote execution, and does not have to bother with these details in the final invocation.
The program pexec can also run in hypervisor mode. The hypervisor daemon acts as a resource controller, i.e. other running instances of pexec ask the hypervisor whether there are resources available. The main purpose of the hypervisor daemon is to balance the usage between concurrently running pexec instances in order to avoid unexpectedly high load.
-H, --hypervisor [start|stop]
pexec in hypervisor mode. By default, the hypervisor is not detached from the terminal. If start is given, the daemon is detached and put into the background. Such running daemons can be stopped using the stop argument.
-C, --control <port>|/<path>
/tmp/pexec.sock. If the specified port is a single number, pexec creates an INET server socket; otherwise, if it is a valid path, a UNIX domain socket is created.
-n, --number <number of parallel processes> | auto | ncpu
pexec programs together. By default, pexec uses /proc/cpuinfo (or another system-dependent way) to figure out the number of available processing units and uses this number as the maximum number of parallel processes.
-l, --load, --use-load <load>
-n|--number. The normalized load is the actual load (averaged over 1, 5 or 15 minutes) divided by the maximum number of parallel processes. The argument of this switch can be 0, 1 or 2, or 1min, 5min or 15min, respectively. The pexec hypervisor uses the load averaged over the specified time.
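The normalized load defined above is a simple ratio and can be checked by hand. The sketch below assumes a Linux system for the live reading (/proc/loadavg and the nproc utility); the helper name normalized_load is hypothetical, and nproc is only a stand-in for the maximum number of parallel processes that the hypervisor was started with.

```shell
#!/bin/sh
# Hypothetical helper: normalized load = load average / maximum processes.
normalized_load() {
    awk -v load="$1" -v n="$2" 'BEGIN { printf "%.2f\n", load / n }'
}

# A load of 6 with at most 4 parallel processes gives 1.50.
normalized_load 6 4

# On Linux, the 1, 5 and 15 minute averages are the first three fields
# of /proc/loadavg; here the 5 minute value is normalized.
if [ -r /proc/loadavg ] && command -v nproc >/dev/null 2>&1; then
    read -r one five fifteen _ < /proc/loadavg
    echo "5 minute normalized load: $(normalized_load "$five" "$(nproc)")"
fi
```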
-L, --log <log file>
-W, --log-level <log level>
-L|--log is omitted, the log will be written to the invoker's standard error (maybe in parallel with other messages gathered by -u|--error -). If neither -L|--log nor -W|--log-level is specified, logging does not occur.
-V, --verbose
-W|--log-level) by one. For example, --log-level 2 is equivalent to -V -V.
If all options are omitted and only the command and its arguments are specified after pexec, the program simply runs the given process, as it would happen without pexec:
pexec ssh -X -l user host
Using the output of the command seq directly, let us calculate the square roots of the first ten positive integers and store the results in separate files. For the calculation itself, we use the program bc, a command-line driven arbitrary precision calculator:
pexec -o sqrt-%s.dat -p "$(seq 10)" -e NUM -n 4 -c -- \
    'echo "scale=10000;sqrt($NUM)" | bc'
Here we explicitly used 4 processors. The number itself is passed to the shell script via the environmental variable NUM.
This example sorts some files which match a pre-defined search pattern. The sorted versions of the files are stored in files with the same names but with the suffix .sort appended.
pexec -p "$(ls myfiles*.ext)" -i %s -o %s.sort -- sort
Here we used explicit redirection from the input files to the output files. Since the command itself is very simple, it is wise to put the dash-dash before the command to make the whole command easier to read.
In this example we assume that we have a list file of base names of astronomical images, i.e. the images themselves have the base name followed by the extension *.fits. Since we do not expect many errors, the standard errors are collected in a single file (namely star.log):
pexec -f image.list -n auto -e B -u star.log -c -- \
    'fistar $B.fits -f 100 -F id,x,y,flux -o $B.star'
The program fistar can be tuned further, depending on the actual problem. The base name of the image implies the base name of the star detection output (*.star). Here, too, an environmental variable, B, was used to pass the varied information, i.e. the base names of the images.
If all of our images (*.png) have names which do not start with a dash, we can use the -r|--parameters switch too:
pexec -r *.png -e IMG -c -o - -- \
    'convert $IMG ${IMG%.png}.jpeg ; echo "$IMG: done"'
For the conversion the ImageMagick tool convert is used, which simply figures out the format from the extensions. The trailing echo just reports the images which are ready after conversion; these "reports" are collected via the standard output and printed for the invoker due to -o -. Another realization, using the NetPBM package:
pexec -r *.jpg -i %s -o %s.png -c 'jpegtopnm | pnmtopng'
In this example a simple usage of mutexes is demonstrated. The usage of mutexes prevents the high (peak-like) disk load which would occur if many processes tried to read or write the same disk simultaneously:
pexec -n 8 -r *.jpg -y unix -e IMG -c \
    'pexec -j -m blockread -d $IMG | \
     jpegtopnm | pnmscale 0.5 | pnmtojpeg | \
     pexec -j -m blockwrite -s th_$IMG'
In the above example a UNIX domain socket is used for the communication between the main pexec program and the remote control calls.
-k|--local-files
option is still not implemented.
pexec
prints an error to standard error and exits,
therefore unless the standard error is gathered somehow (see -u|--error
),
the invoker won't be informed in such cases. And even if so, the invoker
is unable to distinguish this from the case when the successfully
executed process prints directly the same message to its standard error.
This is not a real bug but this behavior is planned to be changed in the
future.
This manual page describes pexec version 1.0rc5.
This software was written by Andras Pal. The core part was written while working for the Hungarian-made Automated Telescope (HAT) project to make the data processing easier and therefore find many, many extrasolar planets. See more information about this project: http://hatnet.hu. Other internal libraries (e.g. numhash.[ch]) were primarily written for other projects. Send bug reports, comments and remarks to apal@szofi.elte.hu or apal@cfa.harvard.edu.
This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
This documentation: copyright (C) 2007, 2008; Andras Pal. Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation; with no Invariant Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license is included in the section entitled "GNU Free Documentation License".