What characters do I need to escape when using sed in a sh script?

Take the following script:

sed 's/(' [some file]

If I try to run this in sh (dash here), it’ll fail because of the parentheses, which need to be escaped. But I don’t need to escape the backslashes themselves (between the octets, or in the \s or \1). What’s the rule here? What about when I need to use {...} or [...]? Is there a list of what I do and don’t need to escape?

Asked By: detly


Unless you want to interpolate a shell variable into the sed expression, use single quotes for the whole expression because they cause everything between them to be interpreted as-is, including backslashes.

So if you want sed to see s/( put single quotes around it and the shell won’t touch the parentheses or backslashes in it. If you need to interpolate a shell variable, put only that part in double quotes. E.g.

sed 's/('"$ip"'/'

This will save you the trouble of remembering which shell metacharacters are not escaped by double quotes.
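A minimal sketch of this quoting pattern, using a made-up input line and a hypothetical value for ip (neither comes from the question):

```shell
# Hypothetical value, for illustration only.
ip='10.0.0.5'

# The sed syntax stays inside single quotes; only "$ip" is expanded by the shell.
result=$(printf '%s\n' 'host=OLD' | sed 's/OLD/'"$ip"'/')
printf '%s\n' "$result"   # prints: host=10.0.0.5
```

Note that the value of ip is pasted into the sed expression as-is, so it must not contain the delimiter or characters that are special in replacement text.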

Answered By: Kyle Jones

There are two levels of interpretation here: the shell, and sed.

In the shell, everything between single quotes is interpreted literally, except for single quotes themselves. You can effectively have a single quote between single quotes by writing '\'' (close single quote, one literal single quote, open single quote).
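As a small illustration of that quoting trick (the input text is made up), the following replaces every single quote with a double quote:

```shell
# The shell joins 's/' + \' + '/"/g', so sed receives: s/'/"/g
quoted=$(printf '%s\n' "it's 5 o'clock" | sed 's/'\''/"/g')
printf '%s\n' "$quoted"   # prints: it"s 5 o"clock
```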

Sed uses basic regular expressions. In a BRE, in order to have them treated literally, the characters $.*[\^ need to be quoted by preceding them by a backslash, except inside character sets ([…]). Letters, digits and (){}+?| must not be quoted (you can get away with quoting some of these in some implementations). The sequences \(, \), \n, and in some implementations \{, \}, \+, \?, \| and other backslash+alphanumerics have special meanings. You can get away with not quoting $^ in some positions in some implementations.
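The difference between quoted and unquoted special characters can be seen in a short sketch. The input string is invented for the example; the commands use only BRE features.

```shell
input='a.b*c axbbc'

# Escaped: . and * are literal, so only the exact text "a.b*c" matches.
literal=$(printf '%s\n' "$input" | sed 's/a\.b\*c/X/g')
printf '%s\n' "$literal"   # prints: X axbbc

# Unescaped: "a", any character, zero or more "b", then "c".
regex=$(printf '%s\n' "$input" | sed 's/a.b*c/X/g')
printf '%s\n' "$regex"     # prints: a.b*c X

# Parentheses are ordinary characters in a BRE and must not be quoted.
parens=$(printf '%s\n' 'f(x)' | sed 's/(x)/[x]/')
printf '%s\n' "$parens"    # prints: f[x]
```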

Furthermore, you need a backslash before / if it is to appear in the regex outside of bracket expressions. You can choose an alternative character as the delimiter by writing, e.g., s~/dir~/replacement~ or \~/dir~p; you’ll need a backslash before the delimiter if you want to include it in the BRE. If you choose a character that has a special meaning in a BRE and you want to include it literally, you’ll need three backslashes; I do not recommend this, as it may behave differently in some implementations.
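A brief sketch of the delimiter choices, using made-up paths: the same substitution written with escaped slashes and with ~ as the delimiter, plus an address that uses ~.

```shell
# Default delimiter: every / in the regex and replacement needs a backslash.
slash=$(printf '%s\n' '/usr/local/bin' | sed 's/\/usr\/local/\/opt/')
printf '%s\n' "$slash"   # prints: /opt/bin

# Alternative delimiter: no escaping of / is needed.
tilde=$(printf '%s\n' '/usr/local/bin' | sed 's~/usr/local~/opt~')
printf '%s\n' "$tilde"   # prints: /opt/bin

# In an address, a custom delimiter is introduced with a backslash.
addr=$(printf '%s\n' '/usr/bin' '/etc/hosts' | sed -n '\~/etc~p')
printf '%s\n' "$addr"    # prints: /etc/hosts
```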

In a nutshell, for sed 's/…/…/':

  • Write the regex between single quotes.
  • Use '\'' to end up with a single quote in the regex.
  • Put a backslash before $.*/[\]^ and only those characters (but not inside bracket expressions). (Technically you shouldn’t put a backslash before ] but I don’t know of an implementation that treats ] and \] differently outside of bracket expressions.)
  • Inside a bracket expression, for - to be treated literally, make sure it is first or last ([abc-] or [-abc], not [a-bc]).
  • Inside a bracket expression, for ^ to be treated literally, make sure it is not first (use [abc^], not [^abc]).
  • To include ] in the list of characters matched by a bracket expression, make it the first character (or first after ^ for a negated set): []abc] or [^]abc] (not [abc]] nor [abc\]]).
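The bracket-expression rules above can be checked with a small sketch (the input strings are invented): ] goes first, ^ goes anywhere but first, and - goes first or last.

```shell
# "]" first, "^" not first, "-" last: all three are literal members of the set.
brackets=$(printf '%s\n' 'a]b-c^d' | sed 's/[]^-]/_/g')
printf '%s\n' "$brackets"   # prints: a_b_c_d

# "-" first is also literal; this set matches "-" and "x".
dash=$(printf '%s\n' 'x-y' | sed 's/[-x]/_/g')
printf '%s\n' "$dash"       # prints: __y
```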

In the replacement text:

  • & and \ need to be quoted by preceding them by a backslash,
    as do the delimiter (usually /) and newlines.
  • \ followed by a digit has a special meaning. \ followed by a letter has a special meaning (special characters) in some implementations, and \ followed by some other character means \c or c depending on the implementation.
  • With single quotes around the argument (sed 's/…/…/'), use '\'' to put a single quote in the replacement text.
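A short sketch of how these replacement-text rules play out, on made-up input:

```shell
# Unquoted & stands for the whole match.
amp=$(printf '%s\n' 'cats' | sed 's/cat/[&]/')
printf '%s\n' "$amp"       # prints: [cat]s

# \& is a literal ampersand.
lit_amp=$(printf '%s\n' 'cats' | sed 's/cat/\&/')
printf '%s\n' "$lit_amp"   # prints: &s

# \\ is a literal backslash; \/ in the regex is a literal slash.
back=$(printf '%s\n' 'a/b' | sed 's/\//\\/')
printf '%s\n' "$back"      # prints: a\b

# \1 and \2 refer to the groups captured by \( \) in the regex.
swap=$(printf '%s\n' 'key=value' | sed 's/\(.*\)=\(.*\)/\2=\1/')
printf '%s\n' "$swap"      # prints: value=key
```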

If the regex or replacement text comes from a shell variable, remember that

  • The regex is a BRE, not a literal string.
  • In the regex, a newline needs to be expressed as \n (which will never match unless you have other sed code adding newline characters to the pattern space). But note that it won’t work inside bracket expressions with some sed implementations.
  • In the replacement text, &, \ and newlines need to be quoted.
  • The delimiter needs to be quoted (but not inside bracket expressions).
  • Use double quotes for interpolation: sed -e "s/$BRE/$REPL/".
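One way to apply these points when the variable holds a literal string rather than a regex is to escape it first. The sketch below is an illustration, not part of the answer: escape_bre and escape_repl are hypothetical helper names, the sample strings are made up, and it assumes single-line values with / as the delimiter of the final command.

```shell
# Hypothetical helper: backslash-escape ] [ \ . * ^ $ and / for use in a BRE.
# A comma is used as the delimiter here so that / needs no special care.
escape_bre() {
  printf '%s\n' "$1" | sed 's,[][\.*^$/],\\&,g'
}

# Hypothetical helper: backslash-escape \ & and / for use in replacement text.
escape_repl() {
  printf '%s\n' "$1" | sed 's,[\&/],\\&,g'
}

BRE=$(escape_bre '1.5*2')     # becomes 1\.5\*2
REPL=$(escape_repl 'A&B/C')   # becomes A\&B\/C

out=$(printf '%s\n' 'x = 1.5*2' | sed -e "s/$BRE/$REPL/")
printf '%s\n' "$out"          # prints: x = A&B/C
```

Values containing newlines are not handled by this sketch, since command substitution and line-by-line sed processing would alter them.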

The problem you’re experiencing isn’t due to shell interpolation or escaping; it’s because you’re attempting to use extended regular expression syntax without passing sed the -r or --regexp-extended option.

Change your sed line from

sed 's/(' [some file]

to

sed -r 's/(' [some file]

and it will work as I believe you intend.

By default sed uses basic regular expressions (think grep style), which would require the following syntax:

sed 's/([ t]/1/' [some file]
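
To see the two syntaxes side by side, here is a small sketch on made-up input. It assumes a sed that accepts -r, as GNU sed does.

```shell
# Basic regular expressions: groups are written \( \).
bre=$(printf '%s\n' 'hello world' | sed 's/\(hello\) \(world\)/\2 \1/')
printf '%s\n' "$bre"   # prints: world hello

# Extended regular expressions (-r): groups are written ( ).
ere=$(printf '%s\n' 'hello world' | sed -r 's/(hello) (world)/\2 \1/')
printf '%s\n' "$ere"   # prints: world hello
```
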
Answered By: R Perrin

I think it’s worth mentioning that, while sed is based on the POSIX standard, which specifies support only for basic regular expressions (BRE), two different versions of the sed command actually exist: BSD (Mac OS) and GNU (Linux distros). Each version implements its own extensions to the POSIX standard, some shared and some unique, and these can affect how sed behaves across platforms. As a result, a sed command that works as expected on one system might produce completely different results on another. This can lead to unexpected behavior in the handling of escaped and special characters.

These extensions to the POSIX standard tend to be more prevalent in the GNU version of sed, often providing the convenience of less strict formatting, especially in comparison to the BSD version. However, while GNU sed does allow some special characters to work, they are still not POSIX-compliant. Additionally, the only real difference between basic and extended regular expressions (ERE) within GNU sed is the behavior of the following special characters:

‘?’, ‘+’, parentheses, braces (‘{}’), and ‘|’

While this may be the case, some special characters have limited or no support at all on BSD sed, such as ‘|’, ‘?’, and ‘+’, as it more closely adheres to the POSIX syntax standards. Using those characters the way GNU sed allows will often cause portability and functionality problems in scripts that rely on sed. It’s also worth noting that POSIX BRE syntax does not define a meaning for some escape sequences, most notably: \|, \+, \?, \`, \', \<, \>, \b, \B, \w, and \W.

For those running the BSD/Mac OS version of sed, emulating the behavior of some special characters can be a bit tricky, but it can be done in most cases. For example, + could be emulated in a POSIX-compliant fashion like this: \{1,\} and ? would look like this: \{0,1\}

Control character sequences, however, are typically not supported. If at all possible, it’s certainly easiest to use GNU sed, but if you need functionality on both platforms, remember to use POSIX features only, to ensure portability. If you’re a Mac user and would like to take advantage of GNU sed as opposed to BSD sed, you might try installing Homebrew and downloading GNU sed from the command line with: brew install gnu-sed.
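As a quick sketch of those POSIX-style interval expressions (the input words are made up), these commands avoid \+ and \? entirely:

```shell
# "One or more a" written as a\{1,\} instead of a\+.
plus=$(printf '%s\n' 'caaat ct' | sed 's/ca\{1,\}t/X/g')
printf '%s\n' "$plus"       # prints: X ct

# "Optional u" written as u\{0,1\} instead of u\?.
optional=$(printf '%s\n' 'color colour' | sed 's/colou\{0,1\}r/X/g')
printf '%s\n' "$optional"   # prints: X X
```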

To wrap things up, differences in version can really dictate what the proper syntax might look like, or what characters are necessary to escape. I hope this provides some additional context for the initial question as well as the accepted answer, and helps others consider how they should proceed, based on the end goal of their script and command usage.

Answered By: forthelulz