Unix pipelines
Categories: OS X | Written By: thedoctor
Experimenting with pipelines is one of the best ways to exploit the power of Unix/Linux. A pipe feeds the output of one command-line program to another program as its input. By combining commands in this way, you can carry out complex processing to answer the sort of real-world questions that Web developers regularly face.
For example, suppose we have two sitemap files in XML format generated a day apart and we want to quickly identify the new links from the second day’s file. XML files are not particularly human-readable but a Unix pipeline can be used to extract the data we’re after:
diff liammccormicktumblr_old.xml liammccormicktumblr.xml | grep "<loc>" | sort
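- The diff command compares the two files and outputs any lines of text that differ between them.
- This output is then fed into the grep command, which in this case acts as a filter that keeps only the lines containing a <loc> location link.
- Finally, the sort command sorts the remaining lines alphabetically. Since the post IDs in these URLs have the same number of digits and increase with each post, the result is a list in the order in which the posts were added.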
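To try the three stages without real sitemap files, the following is a minimal sketch that builds two tiny sitemaps in a scratch directory and runs the same pipeline over them. The file names, the example.com URLs and the workdir and new_links variables are assumptions made up for illustration. They are not taken from the sitemaps discussed here.

```shell
# Hypothetical sample data: two tiny sitemaps in a scratch directory.
workdir=$(mktemp -d)

cat > "$workdir/sitemap_old.xml" <<'EOF'
<urlset>
<url>
<loc>http://example.com/post/101/first-post</loc>
</url>
</urlset>
EOF

cat > "$workdir/sitemap_new.xml" <<'EOF'
<urlset>
<url>
<loc>http://example.com/post/101/first-post</loc>
</url>
<url>
<loc>http://example.com/post/103/third-post</loc>
</url>
<url>
<loc>http://example.com/post/102/second-post</loc>
</url>
</urlset>
EOF

# Same three stages as above: compare the files, keep the <loc> lines, sort them.
# Note: diff exits with status 1 when the files differ, which is expected here.
new_links=$(diff "$workdir/sitemap_old.xml" "$workdir/sitemap_new.xml" | grep "<loc>" | sort)
printf '%s\n' "$new_links"

# Remove the scratch directory again.
rm -r "$workdir"
```

The unchanged 101 link is dropped by diff, the surrounding <url> lines are dropped by grep, and sort should place the 102 link ahead of the 103 link even though the newer file lists them the other way round.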
For the two sitemap files above, the command produces the following output:
> <loc>http://liammccormick.tumblr.com/post/136973585/love-this-via-likeahobbitinthespace</loc>
> <loc>http://liammccormick.tumblr.com/post/137014528/brettjohn-colorware-stealth-macbook-pro</loc>
> <loc>http://liammccormick.tumblr.com/post/137029186/so-small-yet-already-incredible-tout-petits-et</loc>
> <loc>http://liammccormick.tumblr.com/post/137029407/more-evian-babies</loc>
> <loc>http://liammccormick.tumblr.com/post/137031979/pastel-houses</loc>
> <loc>http://liammccormick.tumblr.com/post/137034314/marvel-vs-capcom-2-hulk-vs-zangief</loc>
As with almost all Unix commands, further details on how to use these programs are available by issuing, for example, either man diff or info diff at the command line.
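One possible refinement of the sitemap pipeline, offered as a sketch and not as part of the original command: diff marks lines that exist only in the older file with "<", and those lines would also pass a plain grep for "<loc>" if links had been removed. The hypothetical helper extract_new_urls below keeps only the lines marked ">" and uses sed to strip the marker and the <loc> tags, leaving bare URLs. The function name and the example.com URLs are assumptions for illustration.

```shell
# Hypothetical helper: reads diff output on standard input and prints
# the URLs that appear only in the newer file, one per line, sorted.
extract_new_urls() {
    grep '^> *<loc>' |
        sed -e 's|^> *<loc>||' -e 's|</loc> *$||' |
        sort
}

# Intended use with the file names from the command above:
#   diff liammccormicktumblr_old.xml liammccormicktumblr.xml | extract_new_urls

# Small self-check on hand-written lines in diff's output format.
printf '%s\n' \
    '2c2,3' \
    '< <loc>http://example.com/post/100/removed-post</loc>' \
    '---' \
    '> <loc>http://example.com/post/103/third-post</loc>' \
    '> <loc>http://example.com/post/102/second-post</loc>' |
    extract_new_urls
```

In the self-check, the removed 100 link is filtered out and the two added URLs are printed without any markup, 102 first.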


