What is `^M` and how do I get rid of it?
When I open the file in Vim, I see strange ^M characters.
Unfortunately, the world’s favorite search engine does not do well with special characters in queries, so I’m asking here:
- What is this ^M character?
- How could it have got there?
- How do I get rid of it?
The ^M is a carriage-return character. If you see this, you’re probably looking at a file that originated in the DOS/Windows world, where an end-of-line is marked by a carriage return/newline pair, whereas in the Unix world, end-of-line is marked by a single newline.
Read this article for more detail, and also the Wikipedia entry for newline.
This article discusses how to set up vim to transparently edit files with different end-of-line markers.
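The difference between the two conventions is a single extra byte per line. A small sketch (the sample text is made up) that makes it visible from the shell:

```shell
# A DOS/Windows line ends in CR LF, a Unix line in LF only,
# so the same text is one byte longer in the DOS form.
dos_bytes=$(printf 'hi\r\n' | wc -c)
unix_bytes=$(printf 'hi\n' | wc -c)
echo "DOS line: $dos_bytes bytes, Unix line: $unix_bytes bytes"

# od -c prints the carriage return as \r and the newline as \n.
printf 'hi\r\n' | od -c
```

The \r that od shows here is the same byte that Vim displays as ^M.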
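If you have a file with ^M at the end of some lines and you want to get rid of them, use this in Vim:
:%s/^M$//
(Press Ctrl+V Ctrl+M to insert that ^M. The leading % applies the substitution to every line; without it, only the current line is changed.)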
Most UNIX operating systems have a utility called dos2unix that will convert the CRLF to LF. The other answers cover the “what are they” question.
You can clean this up with sed:
sed -e 's/^M$//' < infile > outfile
The trick is how to enter the carriage return properly. Generally, you need to type C-v C-m (Ctrl+V, then Ctrl+M) to enter a literal carriage return. You can also have sed work in place with
sed -i.bak -e 's/^M$//' infile
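If typing the literal carriage return is awkward, one option is to let printf produce it and pass it to sed through a variable. A minimal sketch, assuming a POSIX shell; strip_cr is a made-up helper name, not a standard command:

```shell
# Hold a real carriage-return byte in a variable. Command substitution
# only strips trailing newlines, so the CR survives.
cr=$(printf '\r')

# strip_cr (hypothetical helper): delete one CR at the end of each line.
# Inside double quotes, \$ reaches sed as the end-of-line anchor $.
strip_cr() {
    sed -e "s/${cr}\$//"
}

# od -c shows that no \r is left in the output.
printf 'first\r\nsecond\r\n' | strip_cr | od -c
```

Because the pattern is anchored with $, a carriage return in the middle of a line is left alone.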
Another way to get rid of carriage returns is with the tr command.
I have a small script that looks like this:
#!/bin/sh
tmpfile=$(mktemp)
tr -d '\r' <"$1" >"$tmpfile"
mv "$tmpfile" "$1"
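One thing to be aware of with the script above: mktemp creates its file with mode 0600, so after the mv the target ends up with the temp file's permissions. A hedged variant that writes the result back into the original file instead, and only when tr succeeded (strip_cr_in_place is a hypothetical name):

```shell
# strip_cr_in_place (hypothetical helper): remove every CR from file $1.
strip_cr_in_place() {
    tmpfile=$(mktemp) || return 1
    # Only overwrite the original if tr succeeded.
    if tr -d '\r' <"$1" >"$tmpfile"; then
        # Writing back with cat keeps the original file's owner and mode;
        # mv would replace the file with the 0600 temp file.
        cat "$tmpfile" >"$1"
    fi
    rm -f "$tmpfile"
}

# Try it on a throwaway file with DOS line endings.
demo=$(mktemp)
printf 'alpha\r\nbeta\r\n' >"$demo"
strip_cr_in_place "$demo"
od -c "$demo"
```

Note that tr -d '\r' removes every carriage return, including any in the middle of a line.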
A simpler way to do this is to use the following command:
dos2unix filename
This command works with path patterns as well, e.g.
dos2unix path/name*
If it doesn’t work, try using a different mode:
dos2unix -c mac filename
-c CONVMODE: Set conversion mode, where CONVMODE is one of ascii, 7bit, iso, mac, with ascii being the default. (The mac mode converts lone carriage returns to newlines.)
What is this ^M?
The ^M is a carriage-return character. If you see this, you’re probably looking at a file that originated in the DOS/Windows world, where an end-of-line is marked by a carriage return/newline pair, whereas in the Unix world, end-of-line is marked by a single newline.
How could it have got there?
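When the file format changes, for example when a file written with DOS/Windows line endings is opened on a Unix system.
How do I get rid of it?
Open your file in binary mode with
vim -b FILE_PATH
remove the carriage returns with the following command (type the ^M as Ctrl+V Ctrl+M)
:%s/^M//g
and then save the file with :wq.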
You can use Vim in Ex mode:
ex -bsc '%s/\r//|x' file
- -b binary mode
- % select all lines
- s substitute
- \r carriage return
- x save and close
This worked for me
:e ++ff=dos
The :e ++ff=dos command tells Vim to read the file again, forcing dos file format. Vim will remove CRLF and LF-only line endings, leaving only the text of each line in the buffer.
then
:set ff=unix
and finally
:wq
In my case, nothing above worked. I had a CSV file copied to a Linux machine from my Mac, and I tried all the above commands, but only the one below helped:
tr "\015" "\n" < inputfile > outputfile
I had a file in which ^M characters were sandwiched between lines, something like below:
Audi,A4,35 TFSi Premium,,CAAUA4TP^MB01BNKT6TG,TRO_WBFB_500,Trico,CARS,Audi,A4,35 TFSi Premium,,CAAUA4TP^MB01BNKTG0A,TRO_WB_T500,Trico,
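The reason the tr command helps here is that such a file has carriage returns as its only line breaks, so line-oriented tools see one long line. A small sketch with made-up sample rows (cr_only is a hypothetical helper that just prints them):

```shell
# cr_only (hypothetical helper): three records separated by CR only.
cr_only() {
    printf 'row1,a\rrow2,b\rrow3,c\r'
}

# wc -l counts newline characters: none before, three after the conversion.
before=$(cr_only | wc -l)
after=$(cr_only | tr '\015' '\n' | wc -l)
echo "lines before: $before, after: $after"
```

Here \015 is the octal code of the carriage return, so tr maps every CR to a newline.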
In the past, I have seen configuration files that were not parsed properly, with the parser complaining about whitespace. If you open the file in vi and do a set list, it won’t show the whitespace, but grep filename [[space]] will show you the ^M. That’s when dos2unix file helps.
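To check whether a file contains carriage returns at all before converting it, one option is to count the lines that match a literal CR. A sketch under the assumption of a POSIX shell and grep; count_cr_lines and the sample config are made up:

```shell
# A real carriage-return byte to use as the grep pattern.
cr=$(printf '\r')

# count_cr_lines (hypothetical helper): number of lines in file $1
# that contain at least one carriage return.
count_cr_lines() {
    grep -c "$cr" "$1"
}

# Sample config: two lines with DOS endings, one with a Unix ending.
conf=$(mktemp)
printf 'key=value\r\nother=thing\nlast=one\r\n' >"$conf"
count_cr_lines "$conf"
```

A count of 0 means the file has no carriage returns, and grep then also exits with a non-zero status.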
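Add the following line to your ~/.vimrc (it relies on a Preserve() helper function, which is not built into Vim and must be defined in your vimrc as well):
command! Tounix :call Preserve('1,$s/^M//')
Then, when you have a file with Windows line endings, run the command :Tounix.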
Sed in-place solution without needing to type the special character (you can copy this and it works):
sed -i -e "s/\r//g" filename
Explanation:
-i: in-place
-e: the sed expression (script) to run
\r: escaped carriage return (GNU sed understands this escape)
/g: replace globally
If your file uses mixed line breaks, i.e. \r\n and \r, you can use this sed script (this is an all-in-one solution and of course, you can also use it if your file merely has \r\n or \r line breaks):
sed -i[SUFFIX] ':read; N; $!b read; s/\r\n/\n/g; s/\r/\n/g' file
-i tells sed to replace your file with the result of the script, and if you supply a SUFFIX, a backup will be created with that suffix.
You also may omit -i and redirect the output to an arbitrary file.
How the script works:
Special commands:
- N: Adds a newline to the pattern space, then appends the next line of input (with any trailing \n removed) to the pattern space.
- $!: Don’t execute the following command on the last line.
- b: Branches unconditionally to the specified label.
So when the cycle starts (we have just one cycle here), sed reads the first line of input, removes any trailing \n and places it in the pattern space, then it processes the script: N adds a newline to the pattern space, then appends the next line of input (with any trailing \n removed) to the pattern space.
This is repeated until the last line is reached, then the substitutions operate on the pattern space, which contains the whole file as a single line, which is the reason why we need the g flag for the substitutions.
The first substitution replaces each \r\n with \n.
After this, the remaining ("lonesome") \r characters are replaced with \n.
Why do we need the N loop?
Since sed 's/<regexp>/<replacement>/[flags]' file reads a line from the file, performs the substitution on it, prints the result, reads the next line and so on, \n cannot be used within the <regexp> part.
But if we have the whole file as a single line in the pattern space, we will be able to use \n within the <regexp> part, and this is necessary to distinguish between the sequence \r\n and "lonesome" \r characters.
Example:
$ sudo file /var/log/apt/term.log
/var/log/apt/term.log: UTF-8 Unicode text, with CRLF, CR, LF line terminators, with escape sequences, with overstriking
$ sudo sed -i.bak ':read; N; $!b read; s/\r\n/\n/g; s/\r/\n/g' /var/log/apt/term.log
$ sudo file /var/log/apt/term.log
/var/log/apt/term.log: UTF-8 Unicode text, with escape sequences, with overstriking
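The same script can be tried on a small throwaway file first. A sketch assuming GNU sed (the \r and \n escapes inside the regex, and the semicolon after the label, are GNU behaviour); the sample text is made up:

```shell
# Sample with all three terminators: CRLF, a lone CR, and LF.
mixed=$(mktemp)
printf 'one\r\ntwo\rthree\nfour\n' >"$mixed"

# Same script as in the answer, without -i, so the result goes to stdout.
fixed=$(sed ':read; N; $!b read; s/\r\n/\n/g; s/\r/\n/g' "$mixed")

# Each of the four words now sits on its own LF-terminated line.
printf '%s\n' "$fixed"
rm -f "$mixed"
```

One caveat, as far as I can tell: if the very last line of the file ends in \r\n, sed strips that final \n before the script runs, so the trailing \r is caught by the second substitution and the output gains one empty line at the end.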
I have to dissent from the answers here. What Vim displays as ^M is the carriage return character from DOS/Windows. To remove it from all lines in Vim, the command is:
:%s/\r//
Trying to remove ^M by typing the two characters ^ and M, as in s/^M//, does not work: in a regex, ^ means the start of the line, so ^M matches any line starting with a capital M. (It only works if the ^M is a literal carriage return entered with Ctrl+V Ctrl+M.)
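The two readings of ^M can be compared from the shell. A minimal sketch using sed rather than Vim (the sample word is made up): the first command uses the two typed characters ^ and M, the second a real carriage-return byte.

```shell
# A real carriage-return byte for the second command.
cr=$(printf '\r')

# Typed literally, ^M is the anchor ^ followed by the letter M:
# it deletes the leading M and leaves the carriage return in place.
anchored=$(printf 'Mango\r\n' | sed 's/^M//')

# With an actual CR in the pattern, the carriage return is removed.
real_cr=$(printf 'Mango\r\n' | sed "s/${cr}//")

# od -c makes the remaining \r in the first result visible.
printf '%s\n' "$anchored" | od -c
printf '%s\n' "$real_cr" | od -c
```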