Fun with Pipes
Written by Sam Moffatt   
Wednesday, 17 October 2007 01:00

This article follows up on a discussion of pipes at a LUG meeting. It uses a real-world automation task as its example: automatically generating manifest files for libraries in Joomla! 1.5.

As a small follow-up to the pipes covered on Tuesday night, here is a short script that combines several individual tools and demonstrates the usage of pipes:

 

for i in `find * -type d -maxdepth 0`;
do DIRNAME=`basename $i`;
echo Processing $DIRNAME;
cat template.head > ~/manifest-$DIRNAME;
cd $DIRNAME;
find | grep -v svn | grep php | sed -e 's/.\/\(.*\)/<file>\1<\/file>/g' >> ~/manifest-$DIRNAME;
cd ..;
cat template.foot >> ~/manifest-$DIRNAME;
done

Or in one line:

for i in `find * -type d -maxdepth 0`; do DIRNAME=`basename $i`; echo Processing $DIRNAME; cat template.head > ~/manifest-$DIRNAME; cd $DIRNAME; find | grep -v svn | grep php | sed -e 's/.\/\(.*\)/<file>\1<\/file>/g' >> ~/manifest-$DIRNAME; cd ..; cat template.foot >> ~/manifest-$DIRNAME; done

So let's take this apart; we've got a few things happening here.

First line: for i in `find * -type d -maxdepth 0`
This is a BASH construct to allow looping in the console. In this case we're looping over the output of the command "find * -type d -maxdepth 0". The find command is used for things like recursively finding all files in a directory and running commands against them, among other things. In this case we're hunting for directories (-type d) in the present folder: the shell expands * to every non-hidden entry in the folder, and -maxdepth 0 stops find from descending below those entries (paths come before predicates with find). This will grab all of the folders in a mixed file/folder directory but will ignore hidden folders. Using . instead of * would include hidden folders as well, but the depth would have to be 1. In this case it works for me because I'm ignoring hidden files.
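To see the difference between the two forms of find, here is a minimal sketch that builds a throwaway directory and runs both. The names lib_a, lib_b, .hidden and notes.txt are made up for illustration, and -maxdepth is written before -type only because GNU find prints a warning when it comes later.

```shell
#!/bin/bash
# Build a scratch directory with two visible folders, one hidden folder and a file.
scratch=$(mktemp -d)
mkdir "$scratch/lib_a" "$scratch/lib_b" "$scratch/.hidden"
touch "$scratch/notes.txt"
cd "$scratch" || exit 1

# The shell expands * to the visible entries only, so .hidden never reaches find.
visible=$(find * -maxdepth 0 -type d | LC_ALL=C sort)

# Starting from . sees hidden folders too; -mindepth 1 leaves out . itself.
all=$(find . -mindepth 1 -maxdepth 1 -type d | LC_ALL=C sort)

echo "With *:"
echo "$visible"
echo "With .:"
echo "$all"
```

The second form prints each name with a leading ./ in front of it, so the names need tidying before they can be used in a file name.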

Second line: do
do is the keyword that marks the start of the loop body. Similar grammar elements are common to many programming languages (Ada, Pascal and Visual Basic use a similar syntax).

Third line: DIRNAME=`basename $i`;
basename is a standard Unix command that returns the last component of a path. In this case it's handy for stripping out gunk, such as the leading ./ I would get if I ran find with . instead of *, and it makes the name safe to use later.
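As a quick illustration of what basename strips, this sketch runs it over a few paths. The paths are invented examples, not taken from the Joomla! tree.

```shell
#!/bin/bash
# Paths as find might print them, plus one with a trailing slash.
for path in ./libraries ./libraries/joomla libraries/; do
    name=$(basename "$path")
    echo "$path -> $name"
done
```

In each case only the last path component is left, with no ./ prefix and no trailing slash.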

Fourth line: echo Processing $DIRNAME;
Simply prints that line to the console.

Fifth line: cat template.head > ~/manifest-$DIRNAME
This cats a file (template.head) into a file in my home directory (the ~) called manifest-$DIRNAME, where DIRNAME is the directory name we set above. The > creates the file, or empties the existing file if there is one. This ensures the file is clean when we come to use it.
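The difference between > and >> is easy to check in isolation. This is a small sketch using a scratch file from mktemp in place of ~/manifest-$DIRNAME; the text written to it is placeholder content, not the real templates.

```shell
#!/bin/bash
out=$(mktemp)    # scratch file standing in for ~/manifest-$DIRNAME

echo "left over from an earlier run" > "$out"
echo "header line" > "$out"     # > empties the file first, so the old line is gone
echo "footer line" >> "$out"    # >> appends and keeps what is already there

cat "$out"
```

This is why the script uses > once, for template.head, and >> for everything written after it.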

Sixth line: cd $DIRNAME
This moves us into the directory we're interested in.

Seventh line: find | grep -v svn | grep php | sed -e 's/.\/\(.*\)/<file>\1<\/file>/g' >> ~/manifest-$DIRNAME;
This command does a lot of work, and a lot of piping. The crux is that it's finding all files and directories, stripping out the ones with "svn" anywhere in their path, and keeping the ones with "php" in their path. Once it's done this it uses the sed expression 's/./(.*)/<file>\1</file>/g' (the line above has extra escaping: the slashes are escaped because / is also sed's delimiter, and the parentheses because sed's basic regular expressions need \( and \) for a capture group). What it is looking for is ./something, which it turns into <file>something</file>. Basically we're generating an XML file on the fly. The last part is the >> operator which, unlike >, appends to the file that we referred to before (retaining its contents).
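The pipeline can be tried without a real source tree by feeding it a hand-written list in place of find's output. This is a sketch under two assumptions: the sample paths are invented, and the opening tag is taken to be <file>, matching the closing tag in the script.

```shell
#!/bin/bash
# Hypothetical stand-in for the output of find; these paths are made up.
sample_paths() {
    printf '%s\n' \
        . \
        ./.svn/entries \
        ./loader.php \
        ./utilities/date.php \
        ./utilities/.svn/text-base/date.php.svn-base \
        ./README.txt
}

# Same filters and substitution as the seventh line, fed from the sample list.
result=$(sample_paths | grep -v svn | grep php |
    sed -e 's/.\/\(.*\)/<file>\1<\/file>/g')
echo "$result"
```

Only loader.php and utilities/date.php survive both greps. Because grep matches anywhere in the line, a path such as ./php-notes/README.txt would also get through; matching on '\.php$' instead would be stricter.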

Eighth line: cd ..
Moves back up to the parent directory.

Ninth line: cat template.foot >> ~/manifest-$DIRNAME
This appends the template.foot file to the file we've been using.

Tenth line: done
This ends the for loop.

 

The end result is a nice full XML document generated on the fly with a few lines of BASH scripting. In this example there are only a few files; another file generated by this has 173. With a few modifications the files are ready to be imported into another system. This is an example of pipes, BASH scripting (a for loop), some different commands (basename, sed, grep, find and cat) and output redirection (> and >>) in a real-world environment. The key to all of this is learning each little part of the equation and building each up separately into the final component. For example, the sed expression was the last part of the system to be written, and the initial find was working before it was put in a loop.
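For anyone wanting to reuse the idea, here is one possible, more defensive variant of the loop. It is a sketch and not the script that was actually run: the function name make_manifests and its two arguments are hypothetical, the opening <file> tag is assumed, and it swaps in a shell glob for the outer find, a subshell for cd .., and quoted variables so that directory names containing spaces survive.

```shell
#!/bin/bash
# make_manifests SRC OUT: hypothetical helper, not part of the original script.
# Writes OUT/manifest-<dir> for every visible directory in SRC, wrapping the
# file list in SRC/template.head and SRC/template.foot.
make_manifests() {
    local src=$1 out=$2 dir name
    for dir in "$src"/*/; do
        [ -d "$dir" ] || continue      # skip the literal pattern if nothing matched
        name=$(basename "$dir")
        echo "Processing $name"
        {
            cat "$src/template.head"
            # The subshell keeps the cd local, so no "cd .." is needed afterwards.
            (cd "$dir" && find . -type f | grep -v svn | grep php |
                sed -e 's/^\.\/\(.*\)/<file>\1<\/file>/' | sort)
            cat "$src/template.foot"
        } > "$out/manifest-$name"
    done
}
```

Running make_manifests . ~ from the directory that holds the templates would write the manifests to the home directory, as the original loop does.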

Sam

Last Updated on Tuesday, 11 December 2007 18:18