How to merge all (text) files in a directory into one?
I’ve got 14 files that are all parts of one text, and I’d like to merge them into one file. How do I do that?
This is technically what cat ("concatenate") is supposed to do, even though most people just use it for outputting files to stdout. If you give it multiple filenames, it outputs them all sequentially, and you can redirect that into a new file. To take every file in the directory, just use ./* (or /path/to/directory/* if you’re not in the directory already) and your shell will expand it to all the filenames (excluding hidden ones by default).
$ cat ./* > merged-file
Make sure you don’t use the csh or tcsh shells for that, since they expand the glob after opening merged-file for output, and make sure merged-file doesn’t exist beforehand, or you’ll likely end up with an infinite loop that fills up the filesystem.
The list of files is sorted lexically. If using zsh, you can change the order (to numeric, or by age, size…) with glob qualifiers.
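With 14 numbered parts, lexical order can matter: a name ending in 10 sorts before one ending in 2. The sketch below shows the difference on three hypothetical files (part1, part2, part10) in a scratch directory. It assumes a sort that supports the -V (version sort) option, which is a GNU extension rather than POSIX, and file names without whitespace.

```shell
# Scratch area; the part* names are made up for this demo.
work=$(mktemp -d)
mkdir "$work/parts"
cd "$work/parts" || exit 1
for n in 1 2 10; do
    printf 'part %s\n' "$n" > "part$n"
done

# Plain glob: lexical order, so part10 is read before part2.
cat ./part* > "$work/lexical.txt"

# Numeric order: list the names, version-sort them, then hand them to cat.
# sort -V is an assumption (GNU extension); names must not contain whitespace.
printf '%s\n' part* | sort -V | xargs cat > "$work/numeric.txt"
```

Zero-padding the names (part01, part02, …) avoids the problem without any sorting step.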
To include files in sub-directories, use:
find . ! -path ./merged-file -type f -exec cat {} + > merged-file
Beware, though, that the list of files is not sorted and hidden files are included. -type f here restricts the match to regular files, as it’s unlikely you’ll want to include other types of files. With GNU find, you can change it to -xtype f to also include symlinks to regular files.
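If a predictable order is wanted from find, one possible approach is to sort its output before handing it to cat. This is only a sketch: -print0, sort -z and xargs -0 are common GNU/BSD extensions rather than POSIX, the tree and file names are invented for the demo, and the output is written outside the tree so no exclusion test is needed.

```shell
# Hypothetical tree with a nested file and a hidden file.
work=$(mktemp -d)
mkdir -p "$work/tree/sub"
printf 'a\n'      > "$work/tree/a.txt"
printf 'b\n'      > "$work/tree/sub/b.txt"
printf 'hidden\n' > "$work/tree/.hidden"
cd "$work/tree" || exit 1

# NUL-separated names survive spaces and newlines in file names.
# LC_ALL=C makes the sort order byte-wise and therefore reproducible.
find . -type f -print0 | LC_ALL=C sort -z | xargs -0 cat > "$work/merged-file"
```

In the C locale the paths sort as ./.hidden, ./a.txt, ./sub/b.txt, so the hidden file comes first here.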
With the zsh shell,
cat ./**/*(-.) > merged-file
would do the same ((-.) achieving the equivalent of -xtype f) but give you a sorted list and exclude hidden files (add the D qualifier to bring them back). zargs can be used there to work around "argument list too long" errors.
The command
$ cat * > merged-file
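can have the undesired side effect of including merged-file in the concatenation, creating a runaway file. This happens when merged-file already exists in the directory, or in shells such as csh that create it before expanding the glob. To get around this, either write the merged file to a different directory: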
$ cat * > ../merged-file
or use a pattern that will not match the merged file:
$ cat *.txt > merged-file
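When the output has to live in the same directory and might be left over from an earlier run, another option is to skip it explicitly and build the result in a temporary file first. The following is a minimal sketch under those assumptions; the file names and the out/tmp variables are made up for illustration.

```shell
# Scratch directory with two inputs and a stale output from a previous run.
work=$(mktemp -d)
cd "$work" || exit 1
printf 'one\n'   > 1.txt
printf 'two\n'   > 2.txt
printf 'stale\n' > merged-file

out=merged-file
tmp=$(mktemp)        # created outside $work, so ./* can never match it

for f in ./*; do
    [ "$f" = "./$out" ] && continue   # never read the previous output
    [ -f "$f" ] || continue           # skip directories and other non-files
    cat "$f"
done > "$tmp" && mv "$tmp" "$out"
```

Because the old merged-file is never read and is only replaced at the end, running this repeatedly gives the same result each time.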
If your files aren’t in the same directory, you can use the find command before the concatenation:
find /path/to/directory/ -name '*.csv' -print0 | xargs -0 -I file cat file > merged.file
Very useful when your files are already ordered and you want to merge them to analyze them.
More portably:
find /path/to/directory/ -name '*.csv' -exec cat {} + > merged.file
This may or may not preserve file order.
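The pattern given to -name should be quoted so that find, not the shell, interprets it. The sketch below illustrates why, using an invented layout: a data directory with two CSV files and a stray CSV file in the current directory.

```shell
# Hypothetical layout: data/a.csv, data/b.csv and ./stray.csv.
work=$(mktemp -d)
mkdir "$work/data"
printf 'x,1\n' > "$work/data/a.csv"
printf 'y,2\n' > "$work/data/b.csv"
printf 'z,3\n' > "$work/stray.csv"
cd "$work" || exit 1

# Unquoted: the shell expands *.csv to stray.csv before find starts,
# so find searches data/ for a file named stray.csv and matches nothing.
unquoted=$(find data -name *.csv)

# Quoted: find receives the pattern itself and matches both files.
find data -name '*.csv' -exec cat {} + > merged.file
```

Here unquoted ends up empty, while merged.file holds the contents of both files in data (in whatever order find visits them).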
You can specify a filename pattern and merge all matching files as follows (>> appends, so an existing mergedfile is extended rather than overwritten):
cat *pattern* >> mergedfile
As the other answers say, you can use cat. Let’s say you have:
~/file01
~/file02
~/file03
~/file04
~/fileA
~/fileB
~/fileC
~/fileD
And you want only file01 to file03 and fileA to fileC:
cat ~/file01 ~/file02 ~/file03 ~/fileA ~/fileB ~/fileC > merged-file
Or, using brace expansion:
cat ~/file0{1..3} ~/file{A..C} > merged-file
Or, using fancier brace expansion:
cat ~/file{0{1..3},{A..C}} > merged-file
Or you can use a for loop:
for i in file0{1..3} file{A..C}; do cat ~/"$i"; done > merged-file
Another option is sed:
sed r 1.txt 2.txt 3.txt > merge.txt
Or…
sed h 1.txt 2.txt 3.txt > merge.txt
Or…
sed -n p 1.txt 2.txt 3.txt > merge.txt # -n is mandatory here
Or without redirection …
sed wmerge.txt 1.txt 2.txt 3.txt
Note that the last command also writes to merge.txt (not wmerge.txt!), in addition to printing to stdout. You can use w"merge.txt" to avoid confusion with the file name, and -n for silent output.
Of course, you can also shorten the file list with wildcards. For instance, in case of numbered files as in the above examples, you can specify the range with braces in this way:
sed -n w"merge.txt" {1..3}.txt
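A quick way to convince yourself that the sed variants behave like cat is to compare their outputs byte for byte. The sketch below does that for the p and w forms on three small invented files; it assumes each input ends with a newline, since sed implementations may differ on how they treat a missing final newline.

```shell
# Three small newline-terminated inputs in a scratch directory.
work=$(mktemp -d)
cd "$work" || exit 1
printf 'alpha\n' > 1.txt
printf 'beta\n'  > 2.txt
printf 'gamma\n' > 3.txt

cat 1.txt 2.txt 3.txt > by-cat.txt
sed -n p 1.txt 2.txt 3.txt > by-sed.txt     # print every line once
sed -n 'w by-w.txt' 1.txt 2.txt 3.txt       # write every line to by-w.txt

# cmp is silent and returns 0 when the files are identical.
cmp by-cat.txt by-sed.txt && cmp by-cat.txt by-w.txt && echo identical
```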