pdsh - A Sysadmin's Secret Weapon

In my experience working with computing grids and the cloud, the ability to run commands across a large set of servers quickly becomes a necessity. From forcing a Puppet run to gathering hardware statistics, these tasks become non-trivial and even painful once your server count climbs into the hundreds and beyond. There comes a point where Bash loops no longer suffice.

I spent a bit of time researching various solutions and came across Parallel Distributed Shell (pdsh), an open-source project from Lawrence Livermore National Laboratory. It is packaged for most Linux distributions and can easily be compiled from source otherwise. It lets you run commands on remote hosts in parallel, with the target hostgroup expressed through an external library such as libgenders.
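To make that concrete, here is a minimal sketch of what an invocation might look like. The host names (web01 through web04) and the genders attribute "web" are made up for illustration, and I'm assuming pdsh was built with the ssh and genders modules, which is not true of every package.

```shell
# Hypothetical host range; replace with your own servers.
HOSTS='web[01-04]'

if command -v pdsh >/dev/null 2>&1; then
    # -R ssh selects the ssh remote command module; -w takes the target hosts.
    # pdsh expands the bracketed range itself, so quote it to keep the shell
    # from treating it as a glob. dshbak -c groups hosts with identical output.
    pdsh -R ssh -w "$HOSTS" uptime | dshbak -c

    # -g targets every host carrying a genders attribute
    # (assumes the genders module and a populated genders file).
    pdsh -R ssh -g web uptime | dshbak -c
else
    echo "pdsh is not installed on this machine" >&2
fi
```

Without dshbak, pdsh prefixes each line of output with the name of the host it came from, which is handy for grepping but noisy when a few hundred servers all say the same thing.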

I highly encourage taking a look at the documentation and seeing how powerful this little tool is: http://code.google.com/p/pdsh/wiki/UsingPDSH

I use it heavily on the operations team at Acquia, and it has served me extremely well when I’m in a tight spot and need to run a command across a large set of servers quickly. A quick usage tip: I tend to run pdsh with the environment variable setting below, since servers are commonly relaunched in a cloud environment and I don’t want to deal with stale entries in my SSH known_hosts file:

PDSH_SSH_ARGS_APPEND="-q -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -o PreferredAuthentications=publickey"
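As a rough sketch of how that setting might fit into a session, the snippet below exports it and then runs a command. The host range is hypothetical, and I'm again assuming the ssh module is available.

```shell
# Export so that pdsh, started from this shell, picks the options up and
# appends them to every ssh command it launches.
export PDSH_SSH_ARGS_APPEND="-q -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -o PreferredAuthentications=publickey"

if command -v pdsh >/dev/null 2>&1; then
    # Hypothetical hosts; prints the kernel version on each, grouped by dshbak.
    pdsh -R ssh -w 'web[01-04]' 'uname -r' | dshbak -c
else
    echo "pdsh is not installed on this machine" >&2
fi
```

Keep in mind that turning off host key checking also turns off ssh's protection against connecting to an impostor host, so this is a trade-off I would only make on a network I trust.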

Ask me any questions about pdsh in the comments!
