Command Line Fu (Part 1)

Almost every developer needs to get out to the command line once in a while. Some of us almost live there. So, here are a few tips and tricks that I’ve picked up over the years. May they help speed you along to your final command line destination.

Find Exec

Have you ever wanted to execute a command against every file in a directory? Or every file that matched a search? Well then, find with exec is your friend. For example, say we wanted to change the permissions on every file within a directory.

$>find . -type f -exec chmod 744 {} \;

The first argument, the ‘.’, tells find to start the search from the current directory. The ‘-type f’ restricts the results to regular files. The -exec option tells find to execute the following command against every search result. Each result is substituted in the place specified by the {}. The ‘\;’ terminates the command. So, basically this command quickly executes a ‘chmod 744’ against every file found in the current directory and everything beneath it. You could also add -name '*.sh' (quoted, so the shell doesn’t expand the wildcard before find sees it) in order to further limit the search to files ending in ‘.sh’.
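To make the -name filter concrete, here is a minimal sketch that runs in a throwaway directory so nothing real gets touched. The directory and file names are made up for illustration.

```shell
# Build a scratch directory (all names here are hypothetical).
demo=$(mktemp -d)
mkdir -p "$demo/scripts/nested"
touch "$demo/scripts/run.sh" "$demo/scripts/nested/build.sh" "$demo/scripts/notes.txt"
chmod 600 "$demo/scripts/run.sh" "$demo/scripts/nested/build.sh" "$demo/scripts/notes.txt"

# Quote the pattern so find, not the shell, expands the wildcard.
find "$demo/scripts" -type f -name '*.sh' -exec chmod 744 {} \;

# List what matched: the two .sh files, including the nested one.
matched=$(find "$demo/scripts" -type f -name '*.sh' | sort)
echo "$matched"
```

The notes.txt file is left alone because it does not match the pattern, while the .sh file in the nested directory is picked up because find descends into subdirectories.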

$>find . -type f -ls

This version of the command (the one with the -ls) simply shows you which files the find will operate on. It is good practice to run this version first, using all the same options but with -ls in place of -exec. The -exec option can be particularly destructive, so know what files you are operating on. Performing a ‘rm -rf’ as a result of a find against the root filesystem will send your day in the wrong direction.
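Another way to preview, sketched here as a variation rather than anything from the original workflow, is to put echo in front of the command inside -exec. That prints each command find would run without running it. The scratch files below are hypothetical.

```shell
# Scratch directory with one file to keep and two to remove (hypothetical names).
demo=$(mktemp -d)
touch "$demo/keep.txt" "$demo/old.tmp" "$demo/older.tmp"

# Step 1: dry run. Prefixing the command with echo prints what would be executed.
preview=$(find "$demo" -type f -name '*.tmp' -exec echo rm {} \; | sort)
echo "$preview"

# Step 2: once the list looks right, run the real command.
find "$demo" -type f -name '*.tmp' -exec rm {} \;

# Only keep.txt should be left.
remaining=$(find "$demo" -type f)
echo "$remaining"
```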

Right Sed Fred

“Who’s sed? Sed’s dead baby. Sed’s dead.”1 Forgotten maybe, but not dead. Sed stands for “stream editor” and it’s one of those old school neckbeard command line utilities that should be in your arsenal. Sed is perfect for picking through large amounts of text and twisting it to your will. Take this oversimplified csv file as an example:

$>cat numbers.csv
one,two,three
four,five,six
seven,eight,nine

With sed you can do crazy cool stuff like the following:
# Delete lines
$>cat numbers.csv | sed -e '2d'
one,two,three
seven,eight,nine

# Print lines (without -n, the selected lines show up twice)
$>cat numbers.csv | sed -e '1,2p'
one,two,three
one,two,three
four,five,six
four,five,six
seven,eight,nine

# Substitute text
$>cat numbers.csv | sed -e 's/one/zero/g' | sed -e 's/,/:/g'
zero:two:three
four:five:six
seven:eight:nine

The -e on the sed command line tells it to accept what follows as the command. You can also write entire files of sed commands but I never do this. Instead I tend to chain successive commands together using the unix pipe in order to get what I’m after (see the Substitute text example).
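For comparison, sed also accepts several -e expressions in one invocation and applies them in order to each line. The sketch below recreates the sample file in a scratch directory (the path is an assumption for illustration) and checks that both styles give the same result.

```shell
# Recreate the sample file from above in a scratch directory.
demo=$(mktemp -d)
printf 'one,two,three\nfour,five,six\nseven,eight,nine\n' > "$demo/numbers.csv"

# Piped version, as in the Substitute text example.
piped=$(cat "$demo/numbers.csv" | sed -e 's/one/zero/g' | sed -e 's/,/:/g')

# Single sed process with two -e expressions, applied in order to each line.
combined=$(sed -e 's/one/zero/g' -e 's/,/:/g' "$demo/numbers.csv")

echo "$combined"
```

Which style to use is mostly taste. The piped form makes it easy to build a command up one stage at a time, while the single invocation starts fewer processes.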

A sed command takes the following structure: [address[,address]]function[arguments], where [address] can be a line number or a regex style pattern. Here are a few more examples that use variations of addresses:
# Add the following text after lines where the pattern is found
$>cat numbers.csv | sed -e '/one/a\
token'
one,two,three
token
four,five,six
seven,eight,nine

# Insert the following text before lines where the pattern is found
$>cat numbers.csv | sed -e '/four/i\
token'
one,two,three
token
four,five,six
seven,eight,nine

# Read in the file abc.csv and add it after lines where the pattern is found
$>cat numbers.csv | sed -e '/four/r abc.csv'
one,two,three
four,five,six
a,b,c
d,e,f
g,h,i
seven,eight,nine

# Delete lines 2 through 4
$>cat numbers.csv | sed -e '/four/r abc.csv' | sed -e '2,4d'
one,two,three
g,h,i
seven,eight,nine

# Delete the lines found between the two patterns
$>cat numbers.csv | sed -e '/four/r abc.csv' | sed -e '/four/,/e.f/d'
one,two,three
g,h,i
seven,eight,nine
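The address forms above combine nicely with the -n option, which turns off sed’s automatic printing so that the p function prints only the addressed lines. A minimal sketch, again using a scratch copy of the sample file (the path is hypothetical):

```shell
# Recreate the sample file in a scratch directory.
demo=$(mktemp -d)
printf 'one,two,three\nfour,five,six\nseven,eight,nine\n' > "$demo/numbers.csv"

# -n suppresses automatic printing, so p prints only the addressed lines.
second=$(sed -n '2p' "$demo/numbers.csv")            # line-number address
from_four=$(sed -n '/four/,$p' "$demo/numbers.csv")  # from a pattern to the last line ($)
last=$(sed -n '$p' "$demo/numbers.csv")              # last line only

echo "$second"
echo "$from_four"
echo "$last"
```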

And that’s just the beginning of the power of sed. Getting comfortable with the basic commands and structure of sed can allow you to whip out some minor change or pick through some complicated text in a flash. It’s a quick and powerful tool sitting there on your command line just waiting to be utilized. Man up the man pages (man sed) and check out some of the other cool stuff sed can do.

Before we leave sed, here are some powerful sed one-liners:

sed -n '$='                              # Count lines (similar to wc -l)
sed -e 'G'                               # Double space a file
sed -e 'n;d'                             # Undo double spacing (assumes every other line is blank)
sed 's/^M$//'                            # Convert DOS newlines to Unix format
                                         # (Press CTRL-V then CTRL-M to get ^M)
sed = numbers.csv | sed 'N;s/\n/     /'  # Number the lines in the specified file
sed -e 's/^[ \t]*//'                     # Delete leading whitespace from each line (\t is a GNU sed extension)
sed -e 's/[ \t]*$//'                     # Delete trailing whitespace from each line (\t is a GNU sed extension)
sed -e '/^$/d'                           # Delete blank lines
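As a quick sanity check on a few of these, the sketch below counts the lines in a scratch copy of the sample file and confirms that double spacing followed by its undo gives back the original text. The scratch path is an assumption for illustration.

```shell
# Recreate the sample file in a scratch directory.
demo=$(mktemp -d)
printf 'one,two,three\nfour,five,six\nseven,eight,nine\n' > "$demo/numbers.csv"

# Count lines. The -n keeps sed from also echoing the file itself.
count=$(sed -n '$=' "$demo/numbers.csv")

# Double space the file, then undo it again.
roundtrip=$(sed -e 'G' "$demo/numbers.csv" | sed -e 'n;d')

echo "$count"
echo "$roundtrip"
```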

Awkland

Awk is the pretty cousin to sed. When you have any semi-well formatted data with a known separator awk is your man (or gal). Awk’s power lies in printing various portions of columns of data. Strictly speaking they don’t need to be real columns of data but rather just have a known separating character. For example:

$>echo "one two three" | awk '{print $2}'
two

Awk’s default separator is conveniently whitespace: any run of spaces or tabs counts as a single separator. The ‘{print $2}’ is the awk command and tells it, in this case, to print the 2nd column of data as separated by whitespace. You can also change the separating character with the -F option:

$>echo "one,two,three" | awk -F, '{print $2}'
two

In this case the comma was specified as the separator. Whatever immediately follows the -F will be used as the separator; it can be a single character or a regular expression. Like sed, awk can select which rows of text it operates on, using patterns that work much like sed’s addresses.
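Two details about separators are easy to see in a small sketch. The input lines here are made up for illustration: one mixes a comma and a colon, the other uses irregular runs of spaces and a tab.

```shell
# A made-up line that mixes two separators.
line='one,two:three'

# -F accepts a regular expression; here either a comma or a colon splits fields.
third=$(printf '%s\n' "$line" | awk -F'[,:]' '{print $3}')
echo "$third"

# Default splitting: a run of spaces and tabs counts as one separator.
second=$(printf 'one   \ttwo  three\n' | awk '{print $2}')
echo "$second"
```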

$>cat numbers.csv | awk -F, '/four/ {print $2}'
five

$>cat numbers.csv | awk -F, '/four/,/seven/ {print $2}'
five
eight

In the first example, the line with the word ‘four’ in it is selected and the second column is printed (‘five’). In the second example, the lines from the one that contains the word ‘four’ through the one that contains the word ‘seven’ are selected, both ends included, and their respective second columns are printed (‘five’ and ‘eight’). You can also have awk read a file directly rather than using cat.
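Awk patterns are not limited to regexes. As a rough sketch, the built-in NR variable (the current line number) and comparisons on individual fields can select rows as well. The scratch file path is hypothetical.

```shell
# Recreate the sample file in a scratch directory.
demo=$(mktemp -d)
printf 'one,two,three\nfour,five,six\nseven,eight,nine\n' > "$demo/numbers.csv"

# Select by line number, much like a numeric sed address.
by_line=$(cat "$demo/numbers.csv" | awk -F, 'NR == 2 {print $2}')

# Select by exact field value instead of a regex over the whole line.
by_field=$(cat "$demo/numbers.csv" | awk -F, '$1 == "seven" {print $3}')

echo "$by_line"
echo "$by_field"
```

Matching on a specific field is handy when the same word could appear in more than one column.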

$>awk -F, '/four/ {print $2}' numbers.csv
five

However, I almost never do this since it limits my ability to chain commands together, which is where the real power of sed and awk lies.

To round out awk, here are some powerful awk one-liners:

awk '{print $NF}'             # Print the last field of each line
awk 'NF > 3'                  # Print each line with more than 3 fields
awk '{print $2, $1}'          # Print the first two fields in opposite order
awk '!/regex/'                # Print only the lines which do not match the regex
awk '/regex/{getline;print}'  # Print the line immediately following the regex

Knock Out Blow

Sed and awk work together to land a powerful one-two knockout blow. Together they can chew through even the thorniest text munging problems. For example, I recently needed a script to determine the process id (pid) of a running process so that I could kill it. Easy with sed and awk at your side.

PID=$(ps -aef | grep "unicorn master" | sed -e '/grep/d' | awk '{print $2}')

What’s going on here? The ps command lists all processes. This is piped into the grep command, which picks out the one that I am looking for (the awesome unicorn Ruby HTTP server). The ps and grep together print out something like this:

root 19092 1 0 Feb10 ? 00:00:03 unicorn master -c /etc/unicorn/sites/test/unicorn.rb -E production -D
ubuntu 22562 7121 0 15:42 pts/0 00:00:00 grep --color=auto unicorn master

The pid I want is right there, 19092, but how do I get at it? Easy. Pipe the output into sed to delete the line that picks up our own running grep command (sed -e '/grep/d') and then pipe that into awk to print out the second column of text, our process id.

$>ps -aef | grep "unicorn master" | sed -e '/grep/d' | awk '{print $2}'
19092
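Since awk can match patterns on its own, the grep and sed stages can also be folded into a single awk call. The sketch below is a variation, not the script described above. It runs against the two sample ps lines shown earlier, stored in a variable, so it behaves the same on any machine.

```shell
# Canned ps output copied from the example above, so the sketch runs anywhere.
ps_output='root 19092 1 0 Feb10 ? 00:00:03 unicorn master -c /etc/unicorn/sites/test/unicorn.rb -E production -D
ubuntu 22562 7121 0 15:42 pts/0 00:00:00 grep --color=auto unicorn master'

# Original pipeline shape: grep, drop the grep line with sed, print column 2.
pid_pipeline=$(printf '%s\n' "$ps_output" | grep "unicorn master" | sed -e '/grep/d' | awk '{print $2}')

# Same result with awk alone: match the process and skip any line mentioning grep.
pid_awk=$(printf '%s\n' "$ps_output" | awk '/unicorn master/ && !/grep/ {print $2}')

echo "$pid_awk"
```

Against live ps output the awk process would show up in the listing too, but its own command line contains the word grep, so the same filter drops it. On systems that ship pgrep, something like pgrep -f "unicorn master" is another option worth checking in the man page.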

As you can see, with sed and awk at your side, crunching through nearly any text is possible. And it’s all done right there on the command line.

In the next installment I will move away from sed and awk to explore remote command execution, command looping, and some powerful kung fu master techniques to utilize your command history.


1. A riff on “Zed’s dead, baby. Zed’s dead.” from Pulp Fiction.


This post is a part of the iDevBlogADay group of Indie development blogs. Thanks to @mysterycoconut for managing such a great site.