Bash You Don’t Want To Use

In September 2014 a very serious undocumented feature in Bash, now known as Shellshock, was published. Here’s how to check for it.

env x='() { :;}; echo vulnerable' bash -c "echo this is a test"

If this prints the word "vulnerable", then your Bash is vulnerable; if not, it isn’t. Patches cleanly fix this, but it’s a big problem for CGI-enabled web servers and DHCP clients.
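For context, the mechanism being abused is a legitimate (if obscure) feature: Bash can export function definitions to child shells through the environment, and the bug was in the code that re-imports them. Here is the legitimate feature by itself (the function name greet is just an example):

```shell
#!/bin/bash
# Bash can export functions to child shells via the environment.
# Shellshock abused the parser that re-imports such definitions.
greet() { echo "hello from an exported function"; }
export -f greet
bash -c greet   # the child shell inherits and runs the function
```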

Bash User Interactive Keys

I often forget useful Bash keyboard tricks because I have other ways of doing things. Here are some worth remembering.

Immediate Special Functionality

^ls^rm^

Run the previous command with rm substituted for ls.

ALT-.

Immediately substitute the last argument of the previous command.

ALT-B

Back one word.

ALT-F

Forward one word.

ALT-D

Delete forward to the end of the current word.

ALT-T

Toggles the order of two words - "A B" becomes "B A" when the cursor is between them.

ALT-U

Turns a word into upper case - "sample" becomes "SAMPLE".

ALT-L

Turns a word into lower case - "SAMPLE" becomes "sample".

ALT-C

Capitalizes a word - "SAMPLE" or "sample" becomes "Sample".

ALT-*

Immediately substitute expansion of a star glob.

ALT-?

Propose results of expansion of a star glob.

CTL-]

Jump forward to the next occurrence of the next typed character.

CTL-ALT-]

Jump backward to the previous occurrence of the next typed character. You can also type ESC, then CTL-], then the character to find.

CTL-_

Undo the last command line edit operation. (The underscore needs the Shift key.)

ALT-#

Comment out the line and press Enter. Handy for preserving a command’s visibility without executing it.

Special Meaning Syntax

!$

Substitute the last argument of the previous command.

!*

Substitute all arguments of the previous command.

!!

Substitute last command (I use up arrow for normal cases).

Tips

This has a lot of sensible tips about organizing and running more complex Bash programs.

Debugging

Run bash with bash -x to have Bash print out everything it’s doing. To localize this effect, use set -x just before the code you want to examine and set +x where you want the tracing to stop.
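A minimal sketch of localized tracing; the trace lines go to stderr with a + prefix while normal output is unaffected:

```shell
#!/bin/bash
echo "quiet part"
set -x              # start printing each command before it runs
X=$((1+2))          # this assignment is echoed to stderr with a + prefix
set +x              # stop tracing (the set +x itself still gets traced)
echo "quiet again: X is $X"
```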

I often do things like this.

CMD="${RMCMD} ${FLAGS} ${DIRTOKILL}"
echo "About to do this:"
echo ${CMD}
read -p "If this doesn't look right, press CTRL-C."
eval ${CMD}

Bash’s Crazy Config Files

In the /etc directory are some default bash config files that affect bash’s behavior unless countermanded by user-set config files.

Other people have also tried to explain this.

/etc/profile

This file is read and executed at initial login by all users. I think that subshells ignore it. It also seems to get executed before the xterm is ready, so output from this script, like a message of the day, doesn’t make it to the screen (it will make it to a log file, however). This means that if you print a message or set the PS1 prompt variable here, it won’t carry over when X subsequently opens an xterm.

/etc/bashrc

This file is read and executed by all Bash initialization activity, such as any user logging in from anywhere and any subshell the user spawns. I think it is referred to in a lot of unexpected places too, so it’s not smart to have anything too busy here. This file might actually be a fake: it isn’t mentioned in any documentation and seems to be called only from ${HOME}/.bashrc. Therefore, if that file doesn’t exist, this one is useless. It’s not really a global catch-all if users can pull the reference from their ${HOME}/.bashrc, so I don’t really see the point of it. Although it would take a trivial amount of extra disk space, a better way would be to put the things you want here into a skeleton used to create ${HOME}/.bashrc files for new users.
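The usual connection, when it exists at all, is a stanza like this in ${HOME}/.bashrc; this is a sketch of the common distro convention, not something Bash does on its own:

```shell
# Pull in system-wide settings, but only if the file actually exists.
if [ -f /etc/bashrc ]; then
    . /etc/bashrc
fi
```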

I add plenty of personal optimization to my .bashrc but one of the more popular tricks is my prompt which keeps the user apprised of the exit status of the previous command. Like this.

:-> [host][~]$ true
:-> [host][~]$ false
:-< [host][~]$ true
:-> [host][~]$
My Happy Prompt
# xed- Fancy prompt      :->
BLU="\[\033[0;34m\]" #Blue.
LGY="\[\033[0;37m\]" #Light Gray.
LGR="\[\033[1;32m\]" #Light Green.
LBU="\[\033[1;34m\]" #Light Blue.
LCY="\[\033[1;36m\]" #Light Cyan.
YEL="\[\033[1;33m\]" #Yellow.
WHT="\[\033[1;37m\]" #White.
RED="\[\033[0;31m\]" #Red.
OFF="\[\033[0m\]"    #None.
LASTSTAT=":-\$(if [ \$? -eq 0 ] ; then echo -n '>' ; else echo -n '<' ; fi )"
if (( UID > 0 ))
then # Normal User
    PS1="${LASTSTAT}${BLU}[${LGR}\H${OFF}${BLU}][${LCY}\w${BLU}]${LBU}\$${OFF} "
else # Root UID=0
    PS1="${LASTSTAT}${BLU}[${RED}\H${OFF}${BLU}][${LCY}\w${BLU}]${LBU}${SD}#${OFF} "
fi

$HOME/.bash_profile

This is the user’s chance to redo the initial login settings established by /etc/profile, if there is one. It runs only once, at login. If the user wants something run every time he logs in, like a backup, this is a good place for it. Be aware that this file has no fewer than two functionally equivalent synonyms, 1.) ${HOME}/.bash_login and 2.) ${HOME}/.profile, which are searched for in that order.

$HOME/.bashrc

This file is executed for all the owning user’s new shell activity. This means that these commands will be reissued for each and all subsequent subshells. This is where users can put custom aliases that should be very persistent. This file seems to execute quite frequently - 3 times on initial login for me and once for each subshell.

Pattern-Matching Operators:

  • ${variable#pattern} if pattern matches beginning, del shortest part

  • ${variable##pattern} if pattern matches beginning, del longest part

  • ${variable%pattern} if pattern matches end, del shortest part

  • ${variable%%pattern} if pattern matches end, del longest part

Here’s an example - this renames a long list of files like x001.gif, x002.gif, etc. to b001.gif, b002.gif, etc.

for ce in x???.gif; do mv $ce b${ce#x}; done

Another example - this converts a series of tif images with names like tb01.tif, tb02.tif, etc. to tb01.gif, tb02.gif, etc. It also changes the images to 15-color grayscale.

[~/xfile/project/pix]$ for ce in tb*.tif; \
> do convert -colorspace GRAY -colors 15 $ce ${ce%tif}gif; done

Arithmetic Expansion

I don’t know how long Bash has had this, but it wasn’t a useful feature when I learned Bash a long time ago. These days it is quite handy and elevates Bash to the status of a sensible general purpose language. In the distant Unix past one had to use bc, dc, and expr to get any kind of arithmetic done; even trivial things like incrementing a variable involved spawning a new process and converting text strings every time. Now you can pretty much use the (( expression )) syntax and the expression will be evaluated in a sensible way. If you need the expression to stand by itself and produce its own exit code (like the old expr, though I’m not sure a new process is actually spawned), you can do something like this:

if (($RANDOM%2)); then echo Tails; else echo Heads; fi

If you need to produce the result for further consideration, you need to expand the expression like this:

echo $(($RANDOM%6+1)) # 6 sided dice
# Print a random line from a file:
F=file; sed -n "$(($RANDOM%`sed -n '$=' $F`+1))p" $F

Note that the performance of these operations can be suboptimal. Sometimes spawning new processes for subcommands beats arithmetic expansion.

$ X=0; time while (( X < 100000 )); do true $((X++)); done
real    0m2.794s
$ X=0; time for X in `seq 100000`; do true; done
real    0m1.176s

Oh and check out this:

$ X=0;time while (( X < 100000 )); do true $(( X++ )); done
real    0m2.875s
$ X=0;time while ((X<100000)); do true $((X++)); done
real    0m2.510s

Normally in Bash, it’s good to help the tokenizer by being explicit about where things are, but in this case, spaces just waste time.

Escaping

Do the action 10 times concurrently:

for X in `seq 10`; do ( time python binpack.py & ) ; done

SSH Escaping

Sometimes running complex things over SSH can be very annoying. Here is a complex example that has a redirect and glob and variables which is run in the background on several hosts at the same time.

for H in ${TheHosts[@]}; do
    echo "Processing ${TargetDir} on ${H}..."
    RCMD="${TheScript} ${TargetDir}/*.gz > ${TargetDir}/output.${H}.sdf;"
    RCMD="${RCMD} echo Done with ${H}"
    echo ${RCMD} | ssh ${H} $(</dev/stdin) &
done

For simple things, something like this works:

for H in ${TheHosts[@]}; do echo ${H};\
( time ssh ${H} bzip2 /tmp/zinc/thechosen1s.${H}.sdf & ) ; done

Arrays

Bash has perfectly good support for Arrays. Search for /^ *Arrays in the man page for complete information.

This will print out a random file from the current directory:

F=(*);echo ${F[$((RANDOM%${#F[@]}))]}

Here is an example program that copies a large collection of files in a source directory to many different hosts (say in a cluster for processing) such that each host will receive the same number of files (to the extent possible):

# Where are the files coming from:
SourceDir="/data/zinc/zinc"
# What hosts will the files go to:
TheHosts=( c39 c40 c41 c42 c43 c44 c45 c47 c84 c85 c86 c87 c88 )

NofH=${#TheHosts[@]}
TheFiles=( ${SourceDir}/* )

C=0
for F in ${TheFiles[@]}; do
    echo "rsyncorwhatever ${F} ${TheHosts[C]}:/thedir/"
    C=$(( (C+1)%NofH ))
done

Mass updating of files that contain similar things that need changing

Here is a method to massively update lots of files that all contain some bad thing and replace it with some good thing. There might be a smoother way to use the sed command, but redirecting sed’s output back to its own input file truncates the file before sed can read it, so the contents simply disappear. Using a tempfile works just fine.

$ for cxe in project??.type ; do cp $cxe temp.temp; \
sed s/bad/good/g temp.temp > $cxe; done

Here, all files that match the project??.type pattern (project01.type, project1a.type, etc) will be copied into the temporary file temp.temp and then they’ll be pulled back out into their real name line by line by sed which will also make substitutions as requested. In this case, it will change all references of "bad" to "good". Don’t forget to erase the residual temp.temp file when you’re done.
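On systems with GNU sed, the tempfile dance can be skipped entirely with in-place editing; note that -i is a GNU extension, so the tempfile method above remains the portable way:

```shell
# GNU sed only: edit the file in place (sed manages its own tempfile).
printf 'this is bad\n' > sample.txt
sed -i 's/bad/good/g' sample.txt
cat sample.txt   # → this is good
rm sample.txt
```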

Another note - if the text contains spaces, you need quotes around it so bash doesn’t get wise with it: sed s/"This is very bad."/Nowitsgood/g file

If the text has quotes in it:

$ for X in *.html; do cp $X temp.temp;\
> sed s/height=\"66\"/height=\"100\"/ temp.temp > $X; done

Here’s another example-

$ for XXX in *html ; do cp $XXX temp.temp; \
> sed -e "/---/,/---/s/<table>/<table align=center>/" temp.temp > $XXX; done

This only does the substitution on lines between a line containing three dashes and the next line containing three dashes.

Functions

Bash functions are quite useful. They are like bash aliases with the added ability to take parameters. Here’s an example of function use that I have in my ~/.bashrc:

# Break a document into words.
function words { cat $1 | tr ' ' '\n' ; }
function wordscounted { cat $1 | tr ' ' '\n' | sort | uniq -c | sort -n ; }

Another syntax that can be used is this.

words () { cat $1 | tr ' ' '\n' ; }

This is what the set command reports and is probably preferable even though it seems kind of "fake C" to me.

The automatic variable ${FUNCNAME} is available in the body of a function. The parameters supplied when the function was called can be accessed with $@, with $1 being the first one, etc.
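A quick sketch showing both; the function name showargs and its arguments are just examples:

```shell
#!/bin/bash
showargs() {
    echo "Running function: ${FUNCNAME}"   # name of this function
    echo "First argument: $1"
    echo "All $# arguments: $@"            # $# is the argument count
}
showargs alpha beta
```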

Redirection

How to redirect input from the output of a process
mount | grep sd[a-z]
grep sd[a-z] <(mount)

These both produce the same result.
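Process substitution really pays off when a command wants two file arguments, since you can hand it two pipelines at once. A small sketch comparing the output of two commands without temporary files:

```shell
# diff normally wants two files; <( ) lets us feed it two commands.
diff <(printf 'a\nb\n') <(printf 'a\nc\n')
```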

How to pipe those nasty errors off to Neverneverland.
grep hattrick * 2> /dev/null

This will do the expected thing with stdout, but stderr will head off to the trash.

To get all of the garbage a command spits out to go to a file:

make &> compile.error
How to pipe those handy errors off to another command.
cmd1 2>&1 | cmd2
cdparanoia -Q 2>&1 | grep "no   no"

Much fancier things can be done with pipes and redirection. It is possible to use the test command (which is related to the [] syntax, since [ is an alias for test) to check where a file descriptor comes from. That’s confusing, but this makes it clear:

    if [ -t 0 ]; then
        echo Interactive
    else
        echo File descriptor 0 coming from a pipe
    fi

The exec command replaces the current shell process with the specified command (no new process is created), as in exec takeoverthisjob -a args. An interesting effect of exec given only a redirection is that the redirection applies to the shell itself, as the following example demonstrates.

echo This is being sent to standard output like normal.
exec > logfile.txt
echo This is now being put in the logfile.

Complete Chat Utility Using Named Pipes

Here is a nice example of how to use named pipes to implement a complete private chat system. Imagine your user name is "primary" and your friend’s is "guest".

Put this in /home/primary/.bashrc
TOPIPE='/home/guest/toguest'

This is the named pipe which will contain your messages to the guest. Note that it goes in the guest’s home directory which you’ll need access to. These could go anywhere (somewhere in /tmp might be good).

Put this in /home/guest/.bashrc
TOPIPE='/home/guest/toprimary'

Here’s the named pipe which you will look at to see the guest’s chat. After you specify where these will go, you need to create these named pipes once with this command:

$ mkfifo /home/guest/toprimary; mkfifo /home/guest/toguest

Next put these macros into both .bashrc files.

Chat macros
SEDSUB="s/^/ [$USER]: /"
alias listen='/bin/cat </home/guest/to$USER &'
alias chat="/usr/bin/sed -u '$SEDSUB' >$TOPIPE"

Now to chat both parties simply log in (SSH makes the whole thing pretty secure) and each types listen to start listening for the other’s text. And then they type chat to start sending. Now both parties can type and read the other’s typing.

Loops

Often you’ll want to use xargs or GNU Parallel but you will be frightened by very messy syntax. One way to get around such things is to use a bash loop to receive the output. Here’s how it would work:

find . | while read N; do md5sum $N; done
find . | while read N; do echo -n $N; md5 $N |sed -n 's/^.* / /p'; done

This will produce a list of check sums for all files in the current directory (and below). This command can be useful to create an inventory of what’s in a directory tree and compare with a directory tree elsewhere to see what has changed.

Here’s an example showing how to avoid xargs:

cat excludes | while read X; do sed -i "/$X/d" mybiglist ; done

This takes a big list of things and a smaller list of things you wish were not in the big list and it looks through each item in the excludes list and removes it in place from the big list (if you’re using GNU sed, otherwise make your own temp file).

Logging

If you ever need to log something anywhere or do any kind of error monitoring or debugging, check out man logger. It is very handy and universally available as an old-school program, yet I had never heard of it.

Here’s a clever way to turn off logging selectively (credit to DaveTaylorOnline.com):

if [ "$DEBUG" ] ; then
    # Real logging; also append to a debug file if one is set.
    logger() { /usr/bin/logger "$@" ; echo "$@" >> "${DEBUG_FILE:-/dev/null}" ; }
else
    logger() { : ; }   # Quietly discard messages.
fi

Signal Trapping

Trapping a signal is kind of exotic but when it’s a good idea, it’s a good idea. It can be good for things like cleaning up a mess before letting a user cancel an operation with Ctrl-C. For example (more from Dave Taylor):

trap '{ echo "You pressed Ctrl-C"; rm $TEMPFILE ; exit 1; }' INT
for C in 1 2 3 4 5 6 7 8 9 10; do echo $C; sleep 5; done

It can also be used as a timeout enforcer. Basically spawn another background process that just waits the maximum allowed time and then raises an ALRM signal. Then have the main thing that’s being limited trap that signal. Here’s how:

trap '{ echo Too slow. ; exit 1 ; }' SIGALRM
( sleep 5 ; kill -s SIGALRM $$ > /dev/null )&
echo "What is the airspeed velocity of an unladen swallow?"
read OK
trap '' SIGALRM
echo "${OK} sounds reasonable to me."

Note that this little script will have problems if the main script exits cleanly and its process ID is taken by something else, something important that should not get a SIGALRM. Just keep that in mind.

To turn off a trap for the INT signal (as used in the previous example):

trap '' INT

General Bash Syntax

Syntax for using the "for" structure.
for variablename [in <list of textstrings or files>]
    do
        commands that take advantage of $variablename go here
    done
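A concrete instance of that skeleton (the list items are just examples):

```shell
for F in alpha beta gamma
do
    echo "Processing ${F}"
done
```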

Here’s another example. This little routine calculates (via exit code) whether a number is a power of 2 or not. Apparently sometimes big search engine companies care about such things.

$ for N in `factor 16384 |cut -d: -f2-`;do if expr $N \!= 2 >/dev/null;\
then break;fi; done; expr $N == 2 >/dev/null
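Using the arithmetic expansion described earlier, the same test can be done without factor or expr: a positive power of two has exactly one bit set, so N & (N-1) is zero.

```shell
N=16384
if (( N > 0 && (N & (N-1)) == 0 )); then
    echo "power of 2"
else
    echo "not a power of 2"
fi
```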
Using Here Documents
#!/bin/bash
# Heredocs are a way to put a lot of arbitrary text into a command's
# standard input stream. Can be good for CGI scripts and the like.
# Here is a sample program that illustrates the idea.
cat > test1 <<HEREDOC
<html>
<head><title>Here Doc Fun!</title></head>
<body>
HEREDOC
# Note that the heredoc closer needs to be alone on the line.
echo Here are commands that aren\'t part of the heredoc
date
# Note the quotes in the following. These do what they should.
cat << "EOF_SIMPLE"
I owe $0 to you.
EOF_SIMPLE
cat << EOF_FANCY
This program is called: $0
EOF_FANCY
# Heredocs can send the contents of variables to stdin.
MESSAGE=".sdrawkcab si sihT"
rev <<<${MESSAGE}
# If you see "<<-", the dash means strip literal \t (tabs) from
# the beginning. This is for more natural indentation. I don't use it
# because I hate tabs' non-obvious behavior.

if/then/fi

IF/THEN constructions
#cxe- Here are two good examples - the first one just waits for a
#cxe- confirmation message.
read REP
if [ CXE${REP} != "CXEy" ]; then
  echo "Ok, maybe next time."
  exit
fi

#cxe- The second one makes some new directories if they need to be made.
if [ ! -d ${TND} ]; then
  echo "Creating the directories:"
  echo "mkdir "${WSD}" "${TND}
  mkdir ${WSD} ${TND}
fi

It can get very tedious to use and remember the cryptic "CONDITIONAL EXPRESSIONS" (that’s a hint of what to search for in man bash). It can often be nicer to wrap them in functions with readable names. Here is a reference of some of these tests in a format that is pleasant to use.

function is_empty { [[ -z "$1" ]] ;}
function is_not_empty { [[ -n "$1" ]] ;}
function is_file { [[ -e "$1" ]] ;} # or -a
function is_readable_file { [[ -r "$1" ]] ;}
function is_writable_file { [[ -w "$1" ]] ;}
function is_executable_file { [[ -x "$1" ]] ;}
function is_regular_file { [[ -f "$1" ]] ;}
function is_non_empty_file { [[ -s "$1" ]] ;}
function is_dir { [[ -d "$1" ]] ;}
function is_symlink { [[ -h "$1" ]] ;} # or -L
function is_set_var { [[ -v "$1" ]] ;} # Send "Name", not "${Val}"
function is_zero_len_string { [[ -z "$1" ]] ;}
function is_non_zero_len_string { [[ -n "$1" ]] ;}
function is_mod_since_read { [[ -N "$1" ]] ;}
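With wrappers like these, conditionals read almost like English. A small usage sketch (redefining two of the functions so it stands alone):

```shell
#!/bin/bash
function is_dir { [[ -d "$1" ]] ;}
function is_not_empty { [[ -n "$1" ]] ;}
# "." is always a directory and $HOME is normally set, so this prints.
if is_dir . && is_not_empty "$HOME"; then
    echo "looks sane"
fi
```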

Option Handling

This code shows two approaches to option handling. The first function uses getopts and is quite robust when dealing with single letter options. It can handle things like ./oh -caALPHA -bBETA -- dog -dashedarg. The second strategy can’t easily do this but it has the advantage of not using getopts at all and being able to match any kind of option, long or short. This makes input like this possible ./oh -b BETA --alpha ALPHA -c js c py. The caveat is that the input must have spaces to parse the options from the option arguments (not -bBETA). If you need a custom option handling scheme which can do something like -long1dash=bash this can probably be made to work by just replacing the = and looking for -long1dash. Another strategy that is not implemented would be to look for long options in the option string and replace them with short options and send on to getopts as normal.

The variable PARGS contains the "program arguments" which are not options and not option arguments. In the long option case, this starts off as the same list as the original input and non-qualifying components are removed explicitly with unset. The approach shown here is also a decent example of modularity and good code organization.

Option Handling With getopts
#!/bin/bash
# oh - Option Handling Examples - Chris X Edwards

function show_usage {
cat << EOUSAGE
Usage: $0 [-h] [-a alpha] [-b beta] [-c] <arguments>
       -h = usage message
       -a = set alpha value [default 'First']
       -b = set beta value [default not set]
       -c = set gamma flag [default not set]
EOUSAGE
} # End function: show_usage

function handle_options_getopts {
    # Option letters followed by ":" mean has an OPTARG.
    while getopts "a:b:ch" OPTION
    do
        case ${OPTION} in
            a) ALPHA=${OPTARG};;
            b) readonly BETA=${OPTARG};;
            c) readonly GAMMA="true";;
            h) show_usage && exit 0;;
        esac
    done
    shift $((OPTIND - 1)) # Leave behind remaining arguments.
    PARGS=( $@ )
} # End function: handle_options_getopts

function handle_options_long {
    local ARGV=(${ARGS}) j=0
    PARGS=( ${ARGV[@]} ) # Program arguments (not options or option arguments).
    for OPTION in ${ARGV[@]}
    do
        i=$((j++))
        case ${OPTION} in
            -a|--alpha) ALPHA=${ARGV[j]}; unset PARGS[$i] PARGS[$j];;
            -b|--beta) BETA=${ARGV[j]}; unset PARGS[$i] PARGS[$j];;
            -c|--gamma) GAMMA="True"; unset PARGS[$i];;
            -h|--help) show_usage && exit 0;;
        esac
    done
} # End function: handle_options_long

function display_options {
    [ "$ALPHA" ] && echo "Alpha is set to: $ALPHA"
    [ "$BETA" ] && echo "Beta is set to: $BETA"
    [ "$GAMMA" ] && echo "Gamma is set."
    echo -n "Arguments: "
    for OP in ${PARGS[@]} ; do echo -n "$OP " ; done
    echo
} # End function: display_options

readonly ARGS="$@"
ALPHA="DefaultValue"     # Set default value.
#handle_options_long
handle_options_getopts ${ARGS}
display_options