GNU Parallel

From LinuxReviews

Parallel is a handy command-line tool for running multiple commands, or jobs, in parallel. It is especially useful for running a large number of single-threaded jobs on a multi-core system.

Consider this really meaningless example:

parallel echo 'Job for:' ::: *.png

This simple command, when run in a folder with some PNG images, will produce output similar to:

Job for: Guix-installing-software-fs8.png
Job for: guix.png
Job for: KDE_Katie.png
Job for: kdenlive-wjsn.png
Job for: lives-2.10.2-fs8.png

You may wonder what the point of using parallel is when you can do the exact same thing, echo a filename, by running

for f in *.png; do echo "Job for: $f"; done

The answer is simple: 1000 echo jobs finish in an instant, so it does not matter whether they run one at a time or side by side. It does matter when you want to do real work with a command like convert[1] and that work is single-threaded.

Running a mostly single-threaded program sequentially or in parallel can make a big difference on a 4, 6 or 8-core system. There is a very real and clear advantage in running 8 or 10 single-threaded jobs at once instead of just one if you have 8 cores. This is what GNU parallel can do for you.

parallel pngquant ::: *.png

That would be noticeably faster than using for f in *.png; do on a large collection of images, since pngquant[2] can take up to half a second per image.
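To get a feel for the difference without needing any images, the following sketch uses sleep 1 as a stand-in for a slow single-threaded job (an assumption made purely for illustration) and times four of them, first in a loop and then with parallel. The variable names are hypothetical, and date +%s is used only as a coarse stopwatch.

```shell
# sleep 1 stands in for a slow single-threaded job such as pngquant.

# Four jobs one after another: roughly 4 seconds.
start=$(date +%s)
for i in 1 2 3 4; do sleep 1; done
seq_time=$(( $(date +%s) - start ))

# The same four jobs in four job slots: roughly 1 second plus startup overhead.
start=$(date +%s)
parallel -j4 sleep ::: 1 1 1 1
par_time=$(( $(date +%s) - start ))

echo "sequential: ${seq_time}s, parallel: ${par_time}s"
```

Since sleep uses no CPU, this shows the effect even on a single core. With real CPU-bound work, the gain is limited by the number of cores.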

Use the -jNUMBER argument to specify how many jobs you want to run in parallel. Without it, parallel defaults to one job per CPU. -j7 would be a good choice if you have 6 cores. Thus we get:

parallel -j7 pngquant ::: *.png

The nproc command from GNU coreutils prints the number of processing units available to the current process, which on most systems is the number of hardware threads rather than physical cores. Try running nproc to see the figure for your machine. With the shell's command substitution, $(nproc), the result can be passed to parallel to run one job per available thread. Our pngquant example would then be:

parallel -j$(nproc) pngquant ::: *.png

For more advanced use cases, parallel reads input from standard input and accepts many of the same options as xargs[3].

Parallel replaces {} with the input file name and {.} with the input file name minus its extension. This opens the door to using tools like convert from ImageMagick without having to do anything special to replace the target file extension. The echo in the following command only prints the convert commands that would be run; remove it to actually convert the files:

parallel -j$(nproc) echo convert {} {.}.webp ::: *.png

The parallel manual page is really long. It is worth reading, but if you would rather not, there is a short video introduction series on YouTube. If you do look at the manual page and think "This is as long as a book", you would be wrong: there is a paperback book, and it is 112 pages long.

GNU Parallel is packaged in the repositories of most distributions.

You can read more about GNU Parallel at http://www.gnu.org/s/parallel/

Footnotes

  1. ImageMagick's convert can convert, resize and otherwise manipulate images. See its manual page to learn how to use it.
  2. pngquant is a tool for compressing PNG images. See the pngquant manual page for details.
  3. xargs is a tool for building and running commands from standard input. See its manual page if you do not know how to use it.
