How to remove unneeded languages from video files using ffmpeg?
I have a number of video files (500+) with lots of audio and subtitle streams for languages that I don’t need and would thus like to remove to conserve storage space.
I tinkered around with ffmpeg, but removing streams by processing one file after another turned out to be very time-consuming. I had no luck with scripting either, as the video files contain different streams in different orders, which makes removal by index difficult and error-prone.
There must be a solution that is both faster and also works for files containing different streams, right? Any help would be much appreciated.
You could use the following ffmpeg command line:
ffmpeg -i video.mkv -map 0:v -map 0:m:language:eng -codec copy video_2.mkv
Explanation:
-i video.mkv              input file (identified as '0:' in mappings below)
-map 0:v                  map video streams from input to output file
-map 0:m:language:eng     map streams for language 'eng' from input to output file
                          (may be specified multiple times for multiple languages)
-codec copy               copy streams without reencoding
video_2.mkv               output file
The resulting file will retain streams for English only (except for video streams which are copied regardless).
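Before removing anything, it can help to check which language tags a file actually carries. Here is a minimal sketch using ffprobe. The function names list_stream_languages and summarize_languages are my own, and so is the choice to label untagged streams as 'und'. The summary assumes ffprobe prints one line per stream in the form index,codec_type,language:

```shell
#!/usr/bin/env bash

# Print one CSV line per stream: index, stream type and language tag
# (the language field is missing when a stream has no language tag).
list_stream_languages() {
    ffprobe -v error \
        -show_entries stream=index,codec_type:stream_tags=language \
        -of csv=p=0 "$1"
}

# Read such CSV lines from stdin and count streams per type and language.
# Streams without a language tag are reported as 'und' (my own convention).
summarize_languages() {
    awk -F, '{ key = $2 ":" ($3 == "" ? "und" : $3); n[key]++ }
             END { for (k in n) print k, n[k] }' | sort
}

# Example (not run automatically):
#   list_stream_languages video.mkv | summarize_languages
```

Running this on a file before and after processing should show whether only the intended languages remain.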
I actually even created a script for this a while ago (GitHub Gist: remove-unneeded-languages.sh):
#!/usr/bin/env bash
# -------------------------------------------------------------------------
# -
# Remove unneeded language(s) from video file -
# -
# Created by Fonic <https://github.com/fonic> -
# Date: 04/14/22 - 08/26/22 -
# -
# Based on: -
# https://www.reddit.com/r/ffmpeg/comments/r3dccd/how_to_use_ffmpeg_to_ -
# detect_and_delete_all_non/ -
# -
# -------------------------------------------------------------------------
# Print normal/hilite/good/warn/error message [$*: message]
function printn() { echo -e "$*"; }
function printh() { echo -e "\e[1m$*\e[0m"; }
function printg() { echo -e "\e[1;32m$*\e[0m"; }
function printw() { echo -e "\e[1;33m$*\e[0m" >&2; }
function printe() { echo -e "\e[1;31m$*\e[0m" >&2; }
# Set up error handling
set -ue; trap 'printe "Error: an unhandled error occurred on line ${LINENO}, aborting."; exit 1' ERR
# Process command line
if (( $# != 3 )); then
    printn "\e[1mUsage:\e[0m ${0##*/} LANGUAGES INFILE OUTFILE"
    printn "\e[1mExample:\e[0m ${0##*/} eng,spa video.mkv video_2.mkv"
    printn "\e[1mExample:\e[0m for file in *.mkv; do ${0##*/} eng,spa \"\${file}\" \"out/\${file}\"; done"
    printn "\e[1mNote:\e[0m LANGUAGES specifies language(s) to KEEP (comma-separated list)"
    exit 2
fi
IFS="," read -a langs -r <<< "$1"
infile="$2"
outfile="$3"
# Sanity checks
[[ -f "${infile}" ]] || { printe "Error: input file '${infile}' does not exist, aborting."; exit 1; }
command -v "ffmpeg" >/dev/null || { printe "Error: required command 'ffmpeg' is not available, aborting."; exit 1; }
# Run ffmpeg
printh "Processing file '${infile}'..."
lang_maps=(); for lang in "${langs[@]}"; do lang_maps+=("-map" "0:m:language:${lang}"); done
ffmpeg -i "${infile}" -map 0:v "${lang_maps[@]}" -codec copy -loglevel warning "${outfile}" || exit $?
exit 0
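One caveat about the -map arguments used in the script above: as far as I know, ffmpeg does not deduplicate mappings, so a video stream that is itself tagged with one of the kept languages may end up in the output twice, and a language that matches no stream at all may make ffmpeg abort. The sketch below is a possible variation, not part of the original gist. It restricts the language maps to audio and subtitle streams and marks them optional with a trailing '?'. The function name build_map_args and the array map_args are hypothetical, and the combined specifiers assume a reasonably recent ffmpeg:

```shell
#!/usr/bin/env bash

# Fill the global array map_args with -map options that keep all video
# streams plus the audio and subtitle streams of the given languages.
# $1: comma-separated list of languages to keep (e.g. "eng,spa")
build_map_args() {
    local langs lang
    IFS="," read -r -a langs <<< "$1"
    map_args=(-map 0:v)
    for lang in "${langs[@]}"; do
        # Trailing '?' makes a map optional, so a missing language is skipped
        map_args+=(-map "0:a:m:language:${lang}?" -map "0:s:m:language:${lang}?")
    done
}

# Example (not run automatically):
#   build_map_args eng,spa
#   ffmpeg -i video.mkv "${map_args[@]}" -codec copy video_2.mkv
```

Keeping the arguments in an array and expanding them quoted prevents the shell from treating the '?' as a glob character.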
Use the script like this (e.g. to process all .mkv files and only keep streams for English and Spanish):
mkdir out; for file in *.mkv; do remove-unneeded-languages.sh eng,spa "${file}" "out/${file}"; done
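With 500+ files it may be worth running several files at once. The following is a rough sketch using xargs -P (available in GNU and BSD xargs). The function name process_all and the REMOVER variable are assumptions of mine, and the script is assumed to be on your PATH. Since streams are copied rather than re-encoded, the job is mostly disk-bound, so the gain from parallelism may be modest:

```shell
#!/usr/bin/env bash

# Process all .mkv files in the current directory in parallel.
# $1: languages to keep, $2: output directory, $3: number of parallel jobs
# REMOVER (hypothetical variable) overrides the command run for each file.
process_all() {
    local langs="$1" outdir="$2" jobs="${3:-2}"
    local files=(*.mkv)
    [[ -e "${files[0]}" ]] || { echo "No .mkv files found" >&2; return 1; }
    mkdir -p "${outdir}"
    # NUL-separated names keep files with spaces or quotes intact
    printf '%s\0' "${files[@]}" |
        xargs -0 -P "${jobs}" -I{} \
            "${REMOVER:-remove-unneeded-languages.sh}" "${langs}" {} "${outdir}/{}"
}

# Example (not run automatically):
#   process_all eng,spa out 4
```

Setting REMOVER=echo gives a dry run that only prints the arguments each invocation would receive.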