2. sed basics

Contents of this section

2.1 What is sed?

Sed is affectionately known as the 'Stream Editor', which really means that it reads standard input and writes the result to standard output. Typically the input comes from the file that you are editing, and the output is splashed onto your screen or redirected to a file. The command instructions that do the work tend to be collected in a script that sed reads. Sed is a good student: it reads what it is told and follows the instructions verbatim. Better put, it takes raw information, applies a script that processes it, and burps the result to the screen with the revisions that you requested. If you do not understand vi or ed, it is suggested that you learn one of them before you attempt sed, as I will not be reviewing or explaining much that I assume you already know. To sum up, sed automates routine editing duties on the fly. Repetitive work is reduced, allowing for more effective use of one's time when manipulating data in files. It also lends itself well to conversions and renovations that you might have considered doing on a large number of files.

2.2 Why should I use it?

Although it will take some time to learn the characteristics of sed, the reward, as alluded to previously, is faster handling of routine editing tasks. This is especially true when you are working on a great many files that have common elements that you wish to alter in one way or another. This will leave you more time to play 'Half Life' and other equally noble pursuits;-) ... or if you're anything like me, you can hardly wait to write that next experimental server from scratch:) just for the fun of watching it blow up real good and hopefully learning something along the way!

2.3 What do I need to do to prepare myself for life with sed?

Sed requires that one get up to speed on:

Think of the benefits of watching, with not a dry eye, as WYSIWYG HTML programmers go through the paces trying to make that deadline... Then you simply type your sed string and, just like magic, the whole darn revision is complete. That is something that can't be learned in drag and drop classes, no matter how much they might wish it. This will take practice though, so devote the cycles and sed will be your friend.

2.4 Can you explain the basic syntax for multiple command processing?

Sed uses a basic pattern of:

address-that-starts-off-chain-reaction{
first command string
second command string
third command string
}

The above example illustrates how much can be achieved on a per-line basis with sed: every command between the braces is applied, in order, to each line that matches the address. As you get better, and if it doesn't get too confusing for you, you can put multiple commands on a single line provided that each command is delimited by a ';'. If you decide to get fancy and use multiple commands, then it is important that you learn to '#' out a detailed description so that you remember later on what the script does.
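
A minimal sketch of the pattern above, assuming a POSIX shell and sed. The sample lines and the variable names (input, grouped, one_line) are made up for illustration; the first call spells the commands out one per line with a '#' comment, the second packs the same commands onto one line with ';'.

```shell
# Hypothetical input: two of the three lines contain "apple".
input='apple pie
banana split
apple tart'

# One command per line; only lines matching /apple/ enter the braces.
grouped=$(printf '%s\n' "$input" | sed '
# swap the fruit, then shout about the pie
/apple/{
s/apple/cherry/
s/pie/PIE/
}')

# The same script on a single line, each command delimited by ";".
one_line=$(printf '%s\n' "$input" | sed '/apple/{s/apple/cherry/;s/pie/PIE/;}')

printf '%s\n' "$grouped"
```

Both calls should leave "banana split" alone, since that line never matches the address.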

2.5 How does sed work?

For more extensive information, please refer to the sed and awk man files.

Sed works by reading the input one line at a time. Each line is copied into the pattern space, the modifications that you requested are applied to it, and the result is written to standard out. This all occurs on a per-line basis, so you can think of sed as a version of vi/ed in which every command is global by default. It is important to realize that, like all good editors, it can switch its characteristics at will. In this case this means that sed can output either the whole file or only the lines that are being edited (the -n option turns off the automatic printing), depending on your requirements. Its output can also be redirected to a different file. Since the author of this document is a sed head of sorts, we had better stop now so that the contents herein continue to remain accurate.
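
To make the "whole file or only the edited lines" point concrete, here is a small sketch, assuming a POSIX sed. The three-line input and the variable names are invented for the example.

```shell
input='alpha
beta
gamma'

# Default behaviour: every line is printed after the script has run on it.
whole=$(printf '%s\n' "$input" | sed 's/beta/BETA/')

# With -n nothing is printed automatically; the p flag then prints
# only the lines on which the substitution actually happened.
edited_only=$(printf '%s\n' "$input" | sed -n 's/beta/BETA/p')

printf 'whole file:\n%s\n' "$whole"
printf 'edited lines only:\n%s\n' "$edited_only"
```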

2.6 How do I substitute one object for another object in my file(s)?

Use the following string, altering it as required by your file(s):

s/string that starts chain reaction/my replacement killer string/$flag

The flag(s) can be switched on as follows, and remember that your mileage may vary...
if $flag == n (a number) then the replacement occurs only on the nth occurrence of the pattern in the line.
elseif $flag == g then the replacement is applied to every occurrence in the line rather than only the first.
elseif $flag == p then the pattern space is printed if a replacement was made.
elseif $flag == w file then the pattern space is written to the specified file if a replacement was made.
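
The four flags can be tried side by side. This is only a sketch, assuming a POSIX sed and a mktemp that supports -d; the input text, the variable names, and the file name changed.txt are made up.

```shell
workdir=$(mktemp -d)
line='one one one'

# A number: replace only that occurrence on the line.
second=$(printf '%s\n' "$line" | sed 's/one/1/2')            # one 1 one

# g: replace every occurrence on the line.
global=$(printf '%s\n' "$line" | sed 's/one/1/g')            # 1 1 1

# p (with -n): print only the lines where a replacement was made.
printed=$(printf '%s\n' "$line" 'two' | sed -n 's/one/1/p')  # 1 one one

# w file: write the changed lines to a file instead of the screen.
printf '%s\n' "$line" 'two' | sed -n "s/one/1/w $workdir/changed.txt"
written=$(cat "$workdir/changed.txt")                        # 1 one one

printf '%s\n' "$second" "$global" "$printed" "$written"
rm -rf "$workdir"
```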

2.7 What are the replacement meta-characters?

Just as '*' has a special meaning in a regular expression, so do the meta-characters &, \n (where n is a digit from 1 to 9), and \ in the replacement. Each can be used in the replacement killer string as a sort of powerful shorthand. The '&' stands for the text that the regular expression actually matched, so you can reuse the match in the replacement without retyping it. \n is a bit more interesting: it stands for the text matched by the nth sub-expression that you wrapped in "\(" and "\)" in the regular expression, which lets you control the order in which the pieces are put back. This might throw you off, as it did me, if not for an example. So here goes an example to hopefully assist.

Simply create a file called 'myfile', using ':' as a delimiter, containing:
one:two:three
a:b:c
bouncing:a ball
Then simply massage the keyboard into echoing the following string to standard out.
$ sed 's/\(.*\):\(.*\):\(.*\)/\3:\2:\1/' myfile
This will produce a readout to standard out of:
three:two:one
c:b:a
bouncing:a ball
The last line has only one ':', so the pattern does not match it and the line passes through unchanged.

I thought I'd save the easiest for last, so here goes. The '\' has a special meaning in the replacement: it escapes an '&' or another '\', and it escapes the substitution command's '/' delimiter when you need a literal slash. One of the most clever ways of using it is to put a '\' immediately before a newline in the replacement, which inserts that newline into the output and so splits one line into several. The meta-characters in the above example will save you a great deal of typing for sure, and sometimes they are simply the best way to get the desired end result. The more complex handlings are left to the reader, for practice makes perfect.
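
A short sketch of '&' and the '\' escapes in the replacement, assuming a POSIX sed. All of the input strings and variable names are invented for illustration.

```shell
# & stands for whatever text the pattern matched.
wrapped=$(printf '%s\n' 'hello world' | sed 's/world/[&]/')    # hello [world]

# \& is a literal ampersand.
literal=$(printf '%s\n' 'salt and pepper' | sed 's/and/\&/')   # salt & pepper

# \/ is a literal slash inside a command that uses / as its delimiter.
moved=$(printf '%s\n' '/usr/bin' | sed 's/\/usr/\/opt/')       # /opt/bin

# A backslash right before a newline puts that newline into the output,
# so one input line becomes two.
split=$(printf '%s\n' 'left:right' | sed 's/:/\
/')

printf '%s\n' "$wrapped" "$literal" "$moved" "$split"
```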

2.8 How do I delete a line?

Probably the easiest command to remember, it is also one of the most dangerous if you are asleep while using sed. The 'd' command deletes the entire line that matches the address, not just the matched text. Remember that sed by default works on every line of the input, so take the 'd' for delete seriously. Here is an example as food for thought.

$ sed '/^$/d' myfile

One of my favorites, it will blow away all lines that are completely empty; lines that contain only tabs or spaces are left alone. If you are not paying attention here you can end up deleting lines you wanted to keep along with all the other lines that match the search pattern. When you only want to remove part of a line, use the substitution command, as it is a whole lot safer and far more granular.
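
The difference between "empty" and "looks empty" is easy to see in a sketch like the following, which assumes a POSIX sed. The input, with one truly empty line and one line holding a space and a tab, is made up; the [[:space:]] variant is my addition, not something from the text above.

```shell
# Four lines: "first", an empty line, a line with a space and a tab, "last".
input=$(printf 'first\n\n \t\nlast')

# /^$/d removes only the line that is completely empty.
strict=$(printf '%s\n' "$input" | sed '/^$/d')

# An assumed variant: also remove lines made up of spaces and tabs only.
loose=$(printf '%s\n' "$input" | sed '/^[[:space:]]*$/d')

printf '%s\n' "$loose"
```

A cautious habit is to run the same address with -n and p first (sed -n '/^$/p') to see which lines would go before deleting anything.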

2.9 How do I insert before the regular expression line?

If I had to add a couple of lines to a document just before the <BODY> line, I might do a:

/<BODY>/i\
Hello World\
This one has happened b4.

This can be quite useful, especially for repetitive tasks. It is up to you to take the foundation of anything and expand its consciousness:) The above works provided that <BODY> appears on a line in your file, so experiment and learn.
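
Here is the insert command run from the shell, as a sketch only. It assumes a POSIX sed and a tag written as <BODY> with no spaces; the three-line page and the inserted text are invented.

```shell
# Each line of inserted text except the last ends with a backslash.
inserted=$(printf '%s\n' '<HTML>' '<BODY>' '</HTML>' | sed '/<BODY>/i\
Hello World\
This one comes before.
')

printf '%s\n' "$inserted"
```

The two new lines should show up between <HTML> and <BODY>.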

2.10 How do I append after the regular expression line?

This is very similar to the insert example illustrated above. Just do a:

/<BODY>/a\
Hello World\
This one comes after.

So at this point you are able to alter text on a specific line, and to add text before or after a specific line.
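
The append command can be tried the same way. Again this is a sketch under the same assumptions (POSIX sed, a <BODY> tag without spaces, made-up sample lines).

```shell
appended=$(printf '%s\n' '<HTML>' '<BODY>' '</HTML>' | sed '/<BODY>/a\
Hello World\
This one comes after.
')

printf '%s\n' "$appended"
```

This time the new lines should land between <BODY> and </HTML>.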

2.11 How do I change lines that match a specific pattern?

Change is very powerful. Consider that you needed to make all your file's header tags a bit smaller, or more accurately, deeper subheadings. How would you do that so they all went from <H1> to <H3>? Well, this is how, although I leave the advanced logic to the reader's imagination and workmanship.

/<H1>/c\
<H3>

Be aware that change works on whole lines: every line that contains <H1> is replaced by a line containing only <H3>, so this is only what you want when the tag sits on a line of its own. To swap the tag and keep the rest of the line, use the substitution command instead.
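
A sketch comparing the change command with a plain substitution, assuming a POSIX sed and tags written without spaces. The two-line page is invented, and the substitution variant is my own illustration rather than part of the script above.

```shell
page='<H1>Title</H1>
<P>text</P>'

# c replaces the whole matching line with the text that follows it.
changed=$(printf '%s\n' "$page" | sed '/<H1>/c\
<H3>
')

# s touches only the tags and keeps the rest of the line.
retagged=$(printf '%s\n' "$page" | sed 's/<H1>/<H3>/;s/<\/H1>/<\/H3>/')

printf 'c command:\n%s\n' "$changed"
printf 's command:\n%s\n' "$retagged"
```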

2.12 How do I transform certain characters throughout a document to completely different characters?

Here is another sed command that I would use with caution. The transform command works on every line and replaces each character from the first list with the character in the same position in the second list. Note that both lists must contain the same number of characters, as the command is not intelligent: it maps single characters, not patterns or words. Here is an example...

sed 'y/abc/ABC/' myfile

would transform every lowercase 'a', 'b', and 'c' to a capital throughout the document. I screwed up with this one when trying to be too clever in converting my HTML document tags to all caps; remember that the command converts the text between the tags too. All caps is, by the way, the original method and standard for labelling tags in HTML. Say what??? Well, this was back in the days when Netscape was Alpha and the web had no pretty pictures, period. Talk about dating myself;'{
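
A one-line sketch shows why the transform command needs care, assuming a POSIX sed. The sample line is made up.

```shell
# y maps a->A, b->B, c->C on every character of every line,
# inside the tags and in the text between them alike.
upper=$(printf '%s\n' '<b>a cab</b>' | sed 'y/abc/ABC/')   # <B>A CAB</B>

printf '%s\n' "$upper"
```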

2.13 How might one utilize the print command to advantage?

By example, of course. This is a sed script that I whipped up in a few minutes to demonstrate the use one might get from it on the web. Suppose you have determined that you have to replace some tags in a file without affecting any other strings. This is one way of doing it that does work. I am sure that you will find much better uses by amplifying the example. If you do and wish to show me, email me at the supplied address anytime. Well, so much for buildup; here goes a real short example.

#run with sed -n so that only the p commands print
/<TITLE>/{
#here comes my first print...
p
#here comes my first real change to pattern...
s/<TITLE>/<BODY>/p
#now here goes my end tag conversion to </BODY>
s/<\/TITLE>/<\/BODY>/p
}

#not bad for a short piece of code, but it could be way better, as it could replace any tags at any time if one just wrote it
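
To watch the print command trace each step, the script can be fed a sample line. This is a sketch assuming a POSIX sed run with -n, with the address written as /<TITLE>/, the block closed by '}', and the slashes in the closing tag escaped; the two input lines are invented.

```shell
traced=$(printf '%s\n' '<TITLE>Home</TITLE>' '<P>other</P>' | sed -n '/<TITLE>/{
p
s/<TITLE>/<BODY>/p
s/<\/TITLE>/<\/BODY>/p
}')

printf '%s\n' "$traced"
```

The matching line should appear three times, once untouched and once after each substitution, while the <P> line is never printed because of -n.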

2.14 Where might the next command come into play?

Let us visualize for a moment... OK, time's up. How about a blank line that is messing up your output, when you would rather waste space somewhere else? In the above code example, add a blank line after the original <TITLE> tags, then add some content; whatever it is really doesn't matter. Now when you add the command below to your program, sed will print the current line, read the next line of input, and carry on with the rest of the script without going back to the top. Since this next line is blank, the delete command below matches, and the blank line is removed. It goes like this:

n
/^$/d

Like I said before, realize that a delete always takes out an entire line. Nonetheless, the next command can be as useful as the next hyperlink:-)
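
A sketch of the next command at work, assuming a POSIX sed. The five-line document is made up, and wrapping the two commands in a /<TITLE>/ block is my assumption about where they would sit in the script.

```shell
doc='<TITLE>Home</TITLE>

content

more'

# On the <TITLE> line: n prints it and loads the following line,
# then /^$/d deletes that line if it is empty.
trimmed=$(printf '%s\n' "$doc" | sed '/<TITLE>/{
n
/^$/d
}')

printf '%s\n' "$trimmed"
```

Only the blank line right after the title should go; the blank line further down is kept because it does not follow a matching line.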

2.15 Is there a method to call files much like libraries so that one file can grab from another file's resources as needed?

The answer is yes, which is real cool, and what I really get excited about is the fact that you can do real useful stuff with this feature. Take for instance 'DREAMWVR.COM': say our company wanted to add to many files a bit of information on what 'DREAMWVR.COM' does. How might you do this so that it was manageable? First you would need to construct a file called, conveniently, 'DREAMWVR.COM', since this is as good an example as any. Then, to let people know what we could provide them given the right opportunity, we might add the following:

Design -
Development -
Integration -
Security -

Then let's say that we had a tag called <DREAMWVR.COM> to mark where we want to input text. In order to read the file we called 'DREAMWVR.COM', we would put that tag in the file that refers to our DREAMWVR.COM file. Insert into a file the text immediately below:

DREAMWVR.COM - Email Us, Call Us, Contact Us...
<DREAMWVR.COM>
We Thank You for Your Viewership...

Then in our handy sed script we would do a:

/^<DREAMWVR.COM>/r DREAMWVR.COM
/^<DREAMWVR.COM>/d

Or, if you like, try the following example if the above does not work well for your copy of sed.

/^<DREAMWVR.COM>/{
r DREAMWVR.COM
/DREAMWVR.COM/d
}

The logic is that as the script runs it scans for the line marked by the tag named <DREAMWVR.COM>, then reads the file called 'DREAMWVR.COM' and appends its contents to the output at the location where <DREAMWVR.COM> lived in the file. It then deletes the line that called the 'read', namely <DREAMWVR.COM>. This will produce a standard out that would look like:

DREAMWVR.COM - Email Us, Call Us, Contact Us...
Design -
Development -
Integration -
Security -
We Thank You for Your Viewership...
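
The whole read-and-delete round trip can be reproduced in a scratch directory. This is a sketch, assuming a POSIX sed, a mktemp that supports -d, and a marker line that reads <DREAMWVR.COM> with no spaces; the file name page.txt is made up.

```shell
workdir=$(mktemp -d)

# The "library" file and the page that refers to it.
printf '%s\n' 'Design -' 'Development -' 'Integration -' 'Security -' \
    > "$workdir/DREAMWVR.COM"
printf '%s\n' 'DREAMWVR.COM - Email Us, Call Us, Contact Us...' \
    '<DREAMWVR.COM>' 'We Thank You for Your Viewership...' \
    > "$workdir/page.txt"

# r queues the file for output after the marker line; d then drops the marker.
merged=$(sed -e "/^<DREAMWVR.COM>/r $workdir/DREAMWVR.COM" \
             -e '/^<DREAMWVR.COM>/d' "$workdir/page.txt")

printf '%s\n' "$merged"
rm -rf "$workdir"
```

The first line of the page is left alone because it does not start with '<'.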

2.16 How might I put the 'w' command to good use?

Well, here you go again asking all the right questions:-) To really see how one might use the power of sed's 'w' command, one simply applies the principle of the above example to a more serious problem. Our company has grown so stellar that we need to divide it into bona fide divisions. In the past it was easy to keep track of things like names and workgroups, but all that has changed. So what do we do? See the enormous file we are getting all excited about.

Joyce H. Design
Nicole W. Design
Dennis Development
Paul J. Development
Ken C. Integration
Diane L. Integration
Michael B. Development
Klaus K. Development
Curtis B. Integration
U.N. Owen Security
Richard M. Security

So what does one do with this enormous file to make it more manageable? Do a fast $ rm ... The solutions are endless, but to keep your job as the organizer it would be better if you separated all the employees by department, which will be renamed division to make growing that much easier. There are many ways we could handle this animal that I know of, and more, I am sure, if you take yourself more seriously.

/Design$/w Design.DREAMWVR.COM
/Development$/w Development.DREAMWVR.COM
/Integration$/w Integration.DREAMWVR.COM
/Security$/w Security.DREAMWVR.COM

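Below is a method to scan for the pattern, then remove it before writing the line to a file. It is a little more programmish: the empty pattern in s/// reuses the last regular expression, here /Design$/, and replaces the match with nothing.
/Design$/{
s///
w Design.DREAMWVR.COM
}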
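
A scaled-down run of the four-line 'w' script can be sketched like this, assuming a POSIX sed and a mktemp that supports -d. The file name staff.txt and the trimmed list of four employees are made up; only two of the divisions are written out to keep it short.

```shell
workdir=$(mktemp -d)
printf '%s\n' 'Joyce H. Design' 'Dennis Development' \
    'Ken C. Integration' 'Nicole W. Design' > "$workdir/staff.txt"

# Run inside the scratch directory so the w file names stay short.
# -n keeps sed quiet on the screen; the w commands do all the work.
(
    cd "$workdir" || exit 1
    sed -n -e '/Design$/w Design.DREAMWVR.COM' \
           -e '/Development$/w Development.DREAMWVR.COM' staff.txt
)

design=$(cat "$workdir/Design.DREAMWVR.COM")
development=$(cat "$workdir/Development.DREAMWVR.COM")

printf 'Design:\n%s\n' "$design"
printf 'Development:\n%s\n' "$development"
rm -rf "$workdir"
```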

Then once we have successfully migrated the info on our people to other files, we could just create a script to remove all names that do not have a workgroup. This should logically be zippo once you're done with the transfer. Most importantly, create a script that you can follow. I will leave this up to you as an exercise, which is doable. Now do you see the logic? Simply put, the workgroup is at the end of each line, so that is where we want to search. This is logical as it allows us to create divisions that are humanly readable. Then we merely write to a new file which we declare, and the show is over. I sincerely hope this helped you become more productive. Good luck, and remember: what is 'sed' is never forgotten;-)

