The GNU Parallel homepage is at https://www.gnu.org/software/parallel/.
Simply Run Something Multiple Times
Here is my normal trick for loading up all the CPUs on a machine (for power testing, for example). This runs 8 jobs, each of which uses as much CPU as it can:
seq 8 | parallel "python -c 'while 1:1/3'"
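If you don't want to hard-code the job count, the same trick can be driven by nproc (from coreutils), which prints the number of available cores; a minimal variant:

seq "$(nproc)" | parallel "python -c 'while 1:1/3'"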
As A Batch Job Scheduler
This is a cool example from the man page:
echo >jobqueue; tail -f jobqueue | parallel
To submit your jobs to the queue:
echo my_command my_arg >> jobqueue
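Each line appended to jobqueue is run as one job by the waiting parallel process. For instance (the gzip commands and filenames here are just placeholders):

echo gzip -9 logs-2013.tar >> jobqueue
echo gzip -9 logs-2014.tar >> jobqueue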
Use -S to add some remote hosts and you're done!
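A minimal sketch of the remote version, where server1 and server2 are placeholder hostnames reachable over ssh:

echo >jobqueue; tail -f jobqueue | parallel -S server1,server2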
A Good Example Of An Effective Use
GNU Parallel adds per-job overhead that often makes it not worthwhile. But if each job does a serious amount of CPU-intensive work, it starts to pay off. Here is a demonstration: finding products of two large primes in a range of integers, on a machine with 48 cores.
Without parallel
:-> [forty8][~]$ time python -c \
"for x in xrange(249999903000009000,249999903000009300):print x" | \
factor | awk 'NF==3{print $2,$3}' | wc -l
32
real 1m20.155s
user 1m20.134s
sys 0m0.026s
With parallel
:-> [forty8][~]$ time python -c \
"for x in xrange(249999903000009000,249999903000009300):print x" | \
parallel -j20 "factor {}" | awk 'NF==3{print $2,$3}' | wc -l
32
real 0m6.512s
user 1m21.735s
sys 0m1.375s
Notice the "user" time here shows that a lot of CPU time was used, but
with parallel
the wall clock "real" time was greatly reduced.
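The same pattern extends to the remote hosts mentioned earlier; a sketch, where server1 is a placeholder ssh host with factor installed, and : in the -S list tells parallel to also use the local machine:

python -c \
"for x in xrange(249999903000009000,249999903000009300):print x" | \
parallel -j20 -S server1,: "factor {}" | awk 'NF==3{print $2,$3}' | wc -l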