Tuesday, February 4, 2014

Playing with sed

I wanted to modify /etc/apt/sources.list so that lines are commented in or out depending on startup conditions (whether or not I am on a local intranet).  I decided to use sed and took it as a good opportunity to learn a little more about regular expressions (good tutorial here).  Here is a version of my code:
#! /bin/sh
debianString="deb http://IPADDRESS/debian keyword main"
sourceFile="sources.list"
# Removes any lines that contain #nocell or keyword,
# then adds "#cell " to the beginning of any line that has 2 or more characters,
# then appends debianString to the end of the file
workWithCell()
{
    sed -i -e '/#nocell/d' "$sourceFile"
    sed -i -e '/keyword/d' "$sourceFile"
    sed -i -e 's/\(.\{2,\}.*\)/#cell \1/g' "$sourceFile"
    echo "$debianString" >> "$sourceFile"
}
# Adds "#nocell " to the beginning of any line that contains keyword
# Removes "#cell " anywhere it is found in the file
workWithEthernet()
{
    sed -i -e '/keyword/ s/^/#nocell /' "$sourceFile"
    sed -i -e 's/#cell //g' "$sourceFile"
}
if grep -qF "$debianString" "$sourceFile"; then
    if grep -q "#nocell" "$sourceFile"; then
        workWithCell
    else
        workWithEthernet
    fi
else
    workWithCell
fi
I want to step through some of this and explain it, mostly so I can remember how I did it.  I am sure there are better ways, but this works for now.  First, I set sourceFile to plain sources.list (a file in the current directory) rather than /etc/apt/sources.list, so you hopefully won't touch the real file until you are sure.  I recommend making a copy so you can play without consequence.  If you look at the if statements, you will see that I use grep to check whether a string exists in sourceFile.  If debianString is present together with a #nocell marker, or if debianString is missing altogether, workWithCell is called; otherwise workWithEthernet is called.  For me the interesting part is the sed statements.  Here is the first one in workWithCell():
sed -i -e '/#nocell/d' "$sourceFile"
Here -i tells sed to make the changes in sourceFile itself instead of printing the result to standard output.  -e is followed by the script that sed runs against each line of sourceFile.  In this case, '/#nocell/d' deletes any line in the file that contains #nocell.  The next line in the function deletes any line that contains keyword (a safety feature in case something gets left behind).
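To see what the d command does without touching any file, here is a minimal sketch that runs the same expression on a few invented lines (the example.org entries are placeholders, not my real sources.list).  Because -i is left out, sed only prints the result:

```shell
# Hypothetical sample input; the example.org lines are made up for illustration.
sample='deb http://example.org/debian stable main
#nocell deb http://IPADDRESS/debian keyword main
deb http://example.org/security stable main'

# No -i here, so sed reads from the pipe and prints the surviving lines.
deleted=$(printf '%s\n' "$sample" | sed -e '/#nocell/d')

# Only the two lines without #nocell should remain.
printf '%s\n' "$deleted"
```

Piping text through sed like this is a handy way to try an expression before adding -i.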
sed -i -e 's/\(.\{2,\}.*\)/#cell \1/g' "$sourceFile"
This line gets a little uglier.  This time we are doing a search and replace ('s/whatToMatch/ReplaceWithThis/').  In the search term, .\{2,\}.*, the . matches any single character (sed works one line at a time, so it never sees the end of line) and \{2,\} requires two or more of them, so empty lines and one-character lines are left alone.  The .* then matches zero or more further characters, that is, the rest of the line; because .\{2,\} is already greedy, this part is redundant but harmless.  The \( \) saves the match contained inside, which in this case is the entire line.  Then we replace the match with #cell \1, where \1 is whatever was saved by \( \) previously, in this case the entire line.  The final g tells sed to replace every match on a line rather than only the first.  sed already applies the command to every line of the file, and since this pattern consumes the whole line, the g makes no difference here.
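As a rough check of how this expression behaves, the sketch below runs it on an invented sample that includes a one-character line and an empty line.  It also tries a shorter variant that I believe is equivalent: & stands for the whole match, so the \( \) group and the trailing .* are not needed.  The sample lines and variable names are assumptions for illustration only.

```shell
# Hypothetical sample: two normal lines, a one-character line, and an empty line.
sample='deb http://example.org/debian stable main
x

deb-src http://example.org/debian stable main'

# The expression from the script, run without -i so nothing on disk changes.
with_group=$(printf '%s\n' "$sample" | sed -e 's/\(.\{2,\}.*\)/#cell \1/g')

# Assumed-equivalent shorter form: & is the whole match, and there is no g flag.
with_amp=$(printf '%s\n' "$sample" | sed -e 's/.\{2,\}/#cell &/')

# Lines with 2 or more characters gain "#cell "; the short and empty lines do not.
printf '%s\n' "$with_group"
```

Both variables should hold the same text, which suggests the simpler form could be swapped in if you prefer it.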
The final line in workWithCell, echo "$debianString" >> "$sourceFile", just appends debianString to the end of the file.

The first line in workWithEthernet:
sed -i -e '/keyword/ s/^/#nocell /' "$sourceFile"
This selects every line that contains keyword and, on those lines only, replaces the start of the line (^) with "#nocell ", or better put, adds #nocell to the beginning of the line.
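Here is a small sketch of that address-plus-substitute form on made-up input, again without -i so it only prints.  The first sample line is an invented placeholder; the second mirrors debianString from the script.

```shell
# Hypothetical sample: only the second line contains the word keyword.
sample='deb http://example.org/debian stable main
deb http://IPADDRESS/debian keyword main'

# /keyword/ selects the lines; s/^/#nocell / then runs only on those lines.
marked=$(printf '%s\n' "$sample" | sed -e '/keyword/ s/^/#nocell /')

# The first line is printed unchanged; the second gains the "#nocell " prefix.
printf '%s\n' "$marked"
```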

The second and final line in workWithEthernet:
sed -i -e 's/#cell //g' "$sourceFile"
This searches for the string "#cell " and replaces it with nothing (thus the //).  The g indicates that every occurrence on a line should be removed, not just the first.
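To show what the g flag changes, this sketch uses an invented line that has picked up the "#cell " prefix twice (something that could happen if workWithCell ran two times in a row) and compares the substitution with and without g.  The sample line and variable names are assumptions for illustration.

```shell
# Hypothetical line that carries the "#cell " prefix twice.
sample='#cell #cell deb http://example.org/debian stable main'

# Without g, only the first "#cell " on the line is removed.
first_only=$(printf '%s\n' "$sample" | sed -e 's/#cell //')

# With g, every "#cell " on the line is removed.
every=$(printf '%s\n' "$sample" | sed -e 's/#cell //g')

printf '%s\n' "$first_only"
printf '%s\n' "$every"
```

Either way sed visits every line of its input; the flag only controls how many matches are replaced within each line.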