The [N]ext command is used for multiline pattern space manipulation: it allows you to juggle
text that is spread over multiple lines. Where the [n]ext command outputs the contents
of the pattern space and then reads the next line, the [N]ext command reads the contents of the
next line and appends it to the pattern space, separating the first line from the second with a
"\n" character. Let's just suppose that we were about to pump out a "Surfing the Internet Guide"
and wanted to alter it so that all instances of "Surfing the Internet Guide" were more
personalized, as "DREAMWVR.COM Surfing the Internet Guide". The pattern we are scanning for is
spread over multiple lines and we would like to call upon the great powers of sed. How might we
use the [N]ext command to achieve the desired results with the following content?
We suggest using the Surfing the Internet
Guide that is provided at http://www.dreamwvr.com/webframe.htm
Checkout the Surfing the Internet Guide we provide for your usage.
The over 7 different methods to use this website include the Surfing the
Internet Guide we continue to mention here so that you might adventure here.
The Surfing the Internet Guide is provided to assist you in your travels.
Our first attempt at solving the problem is as follows:
/Internet$/{
N
s/Surfing the Internet\nGuide/DREAMWVR.COM Surfing the Internet Guide\
/
}
This method scans for occurrences of "Internet" at the end of a line. Then
it reads the next line of content and appends it to the pattern space. Then we do our
substitution, replacing "Surfing the Internet Guide" with "DREAMWVR.COM Surfing
the Internet Guide". Note the "\n" that occurs in our search pattern, as this is
important in this example. Next we escape the end of line in the replacement with "\",
which gives us a "\n" after the substitution; otherwise we would have one
very long altered text line once our rules took effect. The rules are applied top
down, as sed always likes to read like me:-)
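To watch this first attempt at work, here is a small shell sketch that feeds the first two sample lines through the script. It assumes a sed that understands "\n" in the search pattern (GNU sed does); the variable names input and out are only there for illustration.

```shell
# Two sample lines where the phrase is split right after "Internet".
input='We suggest using the Surfing the Internet
Guide that is provided at http://www.dreamwvr.com/webframe.htm'

# N joins the two lines; the backslash at the end of the replacement
# puts a newline back after the altered phrase.
out=$(printf '%s\n' "$input" | sed '/Internet$/{
N
s/Surfing the Internet\nGuide/DREAMWVR.COM Surfing the Internet Guide\
/
}')

printf '%s\n' "$out"
```

The second output line starts with a space, since the space that followed "Guide" is still there after the inserted newline.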
The above [N]ext example is good but not good enough for us to sleep at night, so let's
see if we can improve the situation and apply the alterations to all of the above content.
What we wish to do is read the file line by line and convert every instance of
"Surfing the Internet Guide" to "DREAMWVR.COM Surfing the Internet Guide". Here is how...
/Surfing/{
N
s/Surfing *\n*the *\n*Internet *\n*Guide/DREAMWVR.COM Surfing the Internet Guide/
}
The above example starts the ball rolling by searching for instances of "Surfing", which
is the common denominator here as it occurs in every instance we care about.
The substitution pattern allows for 0 or more spaces followed by 0 or more '\n' between each
pair of words, so it matches the phrase whether or not it is broken across the two lines. It
then replaces the pattern the usual way. As before, the 'N' is what allows us to search over a
multi-line pattern space. This changes most of what we want and does a pretty good job to boot!
Could we take the example further? Of course we could. As you will notice if you try the example,
the first line comes out very long and could be shortened, and an instance can be missed when
[N]ext has already swallowed the line it starts on. How would you fix it? Let me know ;-) as
there are a number of ways that will work that I am aware of... At any rate this will get you
started... there are at least 3 more improvements that could be made; let me know if you get
around to solving them.
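One possible improvement, offered only as a sketch and not as the author's own answer: join lines only when a line ends part way through the phrase, and remember the separators so the original line breaks survive. The label name join is made up for the example, and the script assumes a sed that accepts "\n" in patterns and "#" comment lines.

```shell
input='We suggest using the Surfing the Internet
Guide that is provided at http://www.dreamwvr.com/webframe.htm
Checkout the Surfing the Internet Guide we provide for your usage.
The over 7 different methods to use this website include the Surfing the
Internet Guide we continue to mention here so that you might adventure here.
The Surfing the Internet Guide is provided to assist you in your travels.'

out=$(printf '%s\n' "$input" | sed '
:join
# Does the pattern space end part way through the phrase?
/Surfing\( the\( Internet\)\{0,1\}\)\{0,1\} *$/{
$!{
N
b join
}
}
# \1, \2 and \3 remember whatever separated the words (spaces or a newline).
s/Surfing\( *\n*\)the\( *\n*\)Internet\( *\n*\)Guide/DREAMWVR.COM Surfing\1the\2Internet\3Guide/g
')

printf '%s\n' "$out"
```

Because [N]ext is only used when a line stops in the middle of the phrase, a complete instance never drags the following line along with it, and the last line is never asked for a line that does not exist.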
There are two different methods that I have found to vary in how well they perform, and I will
share both with you. The first method assumes that your version of sed allows multiple commands
inside curly braces. If your particular flavour of sed does not allow for this, I provide the
second method, which is in any case a good exercise in interpreting how your version of sed
operates. So with no further verbiage here is the scoop.
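# Method 1: all of the commands live inside the curly braces.
s/Surfing the Internet Guide/DREAMWVR.COM Surfing the Internet Guide/g
/Surfing/{
N
s/ *\n/ /
s/Surfing the Internet Guide/DREAMWVR.COM Surfing the Internet Guide\
/
# Undo the doubled prefix on an instance that the first command already changed.
s/DREAMWVR\.COM DREAMWVR\.COM/DREAMWVR.COM/
}

# Method 2: only the commands that need the curly braces are inside them.
s/Surfing the Internet Guide/DREAMWVR.COM Surfing the Internet Guide/g
/Surfing/{
N
s/ *\n/ /
}
s/Surfing the Internet Guide/DREAMWVR.COM Surfing the Internet Guide\
/
# Undo the doubled prefix on an instance that the first command already changed.
s/DREAMWVR\.COM DREAMWVR\.COM/DREAMWVR.COM/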
Both will achieve the same result, I have found, with your mileage varying depending on how
your version of sed wants to play. What occurs here is that the first substitution scans
for any lines that contain "Surfing the Internet Guide", replacing each instance with... can you
guess??? Of course you can. Once this global substitution has occurred things get interesting, as
we look for the pattern "Surfing", read the [N]ext line of content, and strip off the embedded
'\n' character, replacing it with a space. Once that has been done we naturally have a double
long line to deal with; remember that 'N' causes the next input line to be appended to the line
preceding it. Having gotten this far, it is up to us to replace the occurrences of
"Surfing the Internet Guide" in the joined line with the killer replacement string
"DREAMWVR.COM Surfing the Internet Guide", which is what we do. We use two slightly different
methods because I have found that some versions of sed don't like playing nice with me when I put
all the sed commands in curly braces. Solution??? I simply don't place the commands in
curly braces except those that need to be there ;-)
When using the [d]elete command the entire pattern space is deleted, then the next line
of data is read and the script starts over from the top. So what happens if your intention is
to have more granular control over what is deleted from the pattern space? That is where the
[D]elete command comes into play: it deletes only up to the first embedded newline and then
restarts the script on what is left, without reading a new line. Let's say you want to squeeze
runs of blank lines down to a single blank line. If you used the normal [d]elete, as in
/^\n$/d after an N, it would not work as you might expect on a run of blank lines. The script
locates the first blank line and reads the following line into the pattern space, and since
[d]elete applies to the entire pattern space, both lines are removed absolutely. An even number
of blank lines disappears completely and only an odd number leaves one behind. This might be
alright if you were willing to run the script a number of times, redirecting the result to
another file each time. But how about a situation where the request is for an output file with
exactly one blank line between blocks of text? Here is how, using the new fangled [D]elete
command...
/^$/{
N
/^\n$/D
}
The multi-line [D]elete command gives us better control of successive blank lines: when the
pattern space holds 2 consecutive blank lines, only the first of them is deleted. The script then
starts over on the remaining blank line without reading new input. On the second pass, upon
reading the [N]ext line and finding that it is not a blank line, the /^\n$/ test fails and sed
simply outputs the entire 2 line pattern space. Hence our problem of varying runs of blank lines
is solved and each run is reduced to a single blank line. This is one that I often forget how to
do, which is why I put it here in the first place:-)
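The difference between the two delete commands is easiest to see side by side. The sketch below uses a made-up input ("a", three blank lines, "b", two blank lines, "c") and runs the same script once with d and once with D; the variable names are only for illustration.

```shell
# "a", three blank lines, "b", two blank lines, "c".
input=$(printf 'a\n\n\n\nb\n\n\nc')

# Lowercase d throws away both blank lines in the pattern space.
with_d=$(printf '%s\n' "$input" | sed '/^$/{
N
/^\n$/d
}')

# Uppercase D throws away only the first one and restarts the script.
with_D=$(printf '%s\n' "$input" | sed '/^$/{
N
/^\n$/D
}')

printf 'with d:\n%s\n\nwith D:\n%s\n' "$with_d" "$with_D"
```

With d the run of three blank lines leaves one blank line behind and the run of two leaves none; with D both runs come out as exactly one blank line.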
The [P]rint command prints the contents of the pattern space up to the first embedded newline.
Unlike [p]rint it leaves the rest alone, so the script can continue to apply other commands to
the multi-line pattern space, and whatever remains is printed when the end of the script is
reached (unless it has been deleted or output is suppressed). Where [p]rint is
very useful for situations where you wish to debug output from the pattern space, [P]rint is
used most often with other advanced commands like [N]ext and [D]elete, so that the first line of
the pattern space, once it has received the once over, can be splashed
to the screen. This example may help explain things better or make them muddier depending on
how you look at the topic. Below is the input text I began with:
Linux has gone a long ways from the early
days of the System. These days the
system can claim some unique company.
Companies such as Intel, Netscape, IBM, Oracle,
HP, Toshiba, NEC, Packard Bell, SGI, Corel,
Informix, and SUN have entered alliances with Linux.
This all obtained from a System that began
as the academic exercise of a student named
"Linus" who spearheaded it less than a decade
ago. The beauty of the System is that it is
very different than the marketing hype generated
by some less genuine companies because it is not
a company but rather a mindset umbrella for a revolutionary
System that is constantly improving as it is the sum of
the minds involved creating a Greater Product that can be
shared by all. Its popularity has now entered the boardroom
and there is no stopping this baby!
Now here is how I initially solved the problem. Notice there is a peculiar way that I worked
around an issue I wished to avoid the first time around. Can you spot it? The debugging was
over the span of a minute or two, so it is far from perfect I am sure, but hey it works!
Nonetheless it is not perfect and that is where you come in... fix the issue and email me the
solution and I promise to add it to this faq:-) Here goes the code that allows me to change every
instance of [sS]ystem to 'Operating [sS]ystem' throughout the document and spread the word
literally about Linux all at the same time!
N
/[sS]ystem/{
s//Operating &/
s/Operating Operating/Operating/
P
D
}
Note, as I mentioned before, that the use of multiple advanced commands takes some getting used
to but allows for more control of the text flow, since the script applies over more than one
line. Above I used the [N]ext, [P]rint, and [D]elete commands, in that order. Learn to do this
on the fly and you're over half way there!
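A tiny sketch may make the difference between [P]rint and [p]rint concrete, and show the [N]ext, [P]rint, [D]elete trio working as a sliding two-line window. The inputs are made up, and "$!N" (only fetch a next line when there is one) is a common precaution that is not part of the script above.

```shell
# With two lines in the pattern space, P prints only the first one...
first_only=$(printf 'first\nsecond\n' | sed -n 'N;P')

# ...while p prints the whole pattern space, embedded newline and all.
both=$(printf 'first\nsecond\n' | sed -n 'N;p')

# N, P and D together slide a two-line window over the input.
# $!N avoids asking for a next line when we are already on the last one.
window=$(printf 'a\nb\nc\n' | sed '$!N;P;D')

printf '%s\n---\n%s\n---\n%s\n' "$first_only" "$both" "$window"
```

In the window version every line is printed exactly once by P, and D hands the remaining line back to the top of the script so it can be paired with the line that follows it.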
The [h]old command takes what you have in the pattern space and copies it into a special
area called the hold space, replacing whatever was there. When you issue the [H]old command
you are appending, preceded by a newline, to the contents of the hold space. With the [g]et
command you are getting the entire contents of the hold space and replacing the entire pattern
space. Whereas with the [G]et command you are grabbing the contents of the hold space
and appending this, preceded by a newline, to the contents of the pattern space. [x]change
literally switches the contents of the pattern space with the contents of the hold space.
Here is an example of content and how you might apply this knowledge:
hi
ho
hi
ho
it's off to switcheroo..
i go..
Now suppose I were so bold as to create a file called 'hiho.test' and populate it with
the contents above. Creating a sed script called 'switcheroo.sed' to drive the above
'hiho.test' file, we might get some humorous results. So let's do that.. 'switcheroo.sed'
is the program that does all the work for us so we can get
all the credit:-)
/hi/ {
h
d
}
/ho/{
G
}
This produces the results that we would expect of a switcheroo. IOW the
program was built to detect 'hi', copy it from the pattern space into the hold
space, and delete it from the output. Then for every 'ho' it detects, it [G]ets a
copy of the 'hi' from the hold space and appends it to the 'ho' line, first appending
a "\n" character and following this with the 'hi' word pattern. Since [H]old and [g]et
do not do what we want cleanly here, they are left for you to play around with.
Here is the output:
ho
hi
ho
hi
it's off to switcheroo..
i go..
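For playing around with the hold space commands, here are two well known one-liners, shown as a sketch on a made-up three line input. The first pairs [G]et with [h]old, the second pairs [H]old with e[x]change; the variable names are only for illustration.

```shell
# G and h together: each line gets the earlier lines appended after it,
# so the last line ends up holding the whole input in reverse order.
reversed=$(printf 'hi\nho\ngo\n' | sed -n '1!G;h;$p')

# H and x together: collect every line in the hold space, then swap it
# into the pattern space on the last line and turn the newlines into commas.
joined=$(printf 'hi\nho\ngo\n' | sed -n 'H;${x;s/\n/,/g;s/^,//;p;}')

printf '%s\n%s\n' "$reversed" "$joined"
```

The "s/^,//" in the second one-liner is needed because [H]old always puts a newline in front of what it appends, even when the hold space is still empty.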
At times it is convenient to have the ability to transform select lowercase areas into UPPER
CASE characters for emphasis, or make up your reasons and justify them later;-) Where might
this be a handy ability? Say that you were reading email and noticed that you would routinely
like to receive certain Subject matter in CAPS for your attention. Since you are hopefully
aware that Internet mail follows standards, unlike a certain large software giant whose mail
systems routinely lose attachments;-} (which is why you lose attachments b.t.w.), you might
proceed as follows.. Naturally this is only an example, so you might need to
adjust the script accordingly to make it work in the real world:-)
Subject: FYI using capitals can be attention grabbing
here is my content which is not saying much:-)
Notice the above 'Subject: FYI'. We will be using the 'FYI' as a flag to do something hopefully
useful. What we want to have occur is that every time someone sends us an email with a subject
line of 'Subject: FYI...', the ... content is changed to all CAPS to draw attention to
the email being sent to us. If we just scanned for 'Subject:', although useful, all email subject
lines would be in CAPS, which would defeat the purpose. Here goes..
/Subject: FYI.*/{
#First we copy the pattern space to the hold space so we can use it below.
h
#Then we strip the 'Subject: FYI' prefix, leaving only the rest of the line in the pattern space.
s/Subject: FYI\(.*\)/\1/
#Then we use the [y] transform command to change all lowercase letters to UPPERCASE.
y/abcdefghijklmnopqrstuvwxyz/ABCDEFGHIJKLMNOPQRSTUVWXYZ/
#Now we [G]et the original line from the hold space and append it, preceded by a '\n', to the pattern space.
G
#This last one looks a bit confusing but it works. \1 matches everything up to the newline, which is the
#uppercased text. \2 matches the 'Subject: FYI' that starts the original line, and the trailing .* swallows
#the rest of the original lowercase text so that it is dropped. The replacement then puts \2 in front of \1,
#which leaves us in the position where we wanted to live in the first place. Now you know:-)
s/\(.*\)\n\(Subject: FYI\).*/\2\1/
}
For additional verbiage, you might consider that this is an example that can be adjusted for your needs. You
might take it further by automating your incoming mail so that people who send you mail use 'Subject: FYI' to
really get your attention. Some tweaking will be required to do this in a production environment but not much.
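Here is the same idea as a sketch you can paste into a shell. The 'Subject: lunch on friday' line is made up to show that subjects without the FYI flag are left alone, and the address is anchored with "^" as a small precaution that the script above does not take.

```shell
input='Subject: FYI using capitals can be attention grabbing
Subject: lunch on friday
here is my content which is not saying much:-)'

out=$(printf '%s\n' "$input" | sed '/^Subject: FYI/{
h
s/^Subject: FYI//
y/abcdefghijklmnopqrstuvwxyz/ABCDEFGHIJKLMNOPQRSTUVWXYZ/
G
s/\(.*\)\n\(Subject: FYI\).*/\2\1/
}')

printf '%s\n' "$out"
```

Only the first line comes back in CAPS; the other two lines never match the address, so the commands in the braces are skipped for them.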
Now here is one that I could go on for miles about, since I am one of those you-know-whos that never bothered to learn
how to WYSIWYG. Enough harping, here is the lowdown. Say you had a text only document, as below, that you had never got
around to converting over to html. For example...
Here is the best paragraph i could come up with under the circumstances. Seems that i am definately speechless and perhaps
could even be accused of being the silent type. How do i get what i am writing from this ASCII stuff to a more popular
web enabled version. hmmm... tough call there could be far better ways and naturally this is only a template but once you
finish with this text you will be at least half way there.
Now somewhere else create a script with the content inscribed below:
${
/^$/!{
H
s/.*//
}
}
#So far you have a script that happily scans to the bottom of the content. Then it
#takes the last line, if not blank, and appends it to the [H]old space, substituting
#an empty line for whatever was there. It then spirals down to the next routine..
/^$/!{
H
d
}
#Then we gobble up every line that is not blank, appending it to the hold space and deleting
#whatever was there in the pattern space.
/^$/{
x
s/^\n/<HTML><BODY><P>/
s/$/<\/P><\/BODY><\/HTML>/
G
}
This produces the desired result of allowing us to transform the above straight ascii text into
something far nobler, namely pure html:-) The 'x' switches the contents held in the hold space for the contents of
the pattern space. We then do our substitutions, which should be pretty self explanatory. Then we use the 'G' to append
the blank line now in the hold space to the newly switched pattern space.
<HTML><BODY><P>Here is the best paragraph i could come up with under the circumstances. Seems that i am definately speechless and perhaps
could even be accused of being the silent type. How do i get what i am writing from this ASCII stuff to a more popular
web enabled version. hmmm... tough call there could be far better ways and naturally this is only a template but once you
finish with this text you will be at least half way there.</P></BODY></HTML>
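To try the conversion without creating any files, the script can be fed to sed straight from the shell. This is a sketch on a made-up two line paragraph; the variable names are only for illustration.

```shell
input='line one of the paragraph
line two of the paragraph'

out=$(printf '%s\n' "$input" | sed '${
/^$/!{
H
s/.*//
}
}
/^$/!{
H
d
}
/^$/{
x
s/^\n/<HTML><BODY><P>/
s/$/<\/P><\/BODY><\/HTML>/
G
}')

printf '%s\n' "$out"
```

The opening tags land at the front of the first line and the closing tags at the end of the last one, because the whole paragraph sits in the pattern space as a single multi-line string when the substitutions run.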