Batch rename image files by age plus add date and variable to filename

The question is completely rewritten due to things learned from the first two answers and comments that I was unaware of when first asking.


After a photo shoot I come home with files that look like _DSC1234.NEF. NEF is Nikon’s camera raw file format, so EXIF data is present in the files.

I would like to automatically rename them with three parts:

  1. Creation date in the format YYYYMMDD
  2. Name of Shoot
  3. Number of Image

So the final file name should look like this

20140707_NameOfShoot_0001.NEF

There are a few issues:

ad 1. Creation date
Sometimes I can only copy and rename the files a few days after shooting, so the date should reflect when the picture was taken, not when it was copied. mtime seems like the best bet here or, if possible, the creation date from EXIF.

ad 2: Name of Shoot
This would ideally be a variable I could set as a parameter on calling the script.

ad 3. Number of Image
This should reflect the age of the image, the oldest having the lowest number. The problem is that cameras usually restart numbering at 0000 after they hit 9999. So 9995-9999 can potentially be older than 0000-0004. I am looking for a solution that reflects file age and in this special case would rename

  • _DSC0000.NEF -> 20140707_FOO_0004.NEF
  • _DSC0001.NEF -> 20140707_FOO_0005.NEF
  • _DSC0002.NEF -> 20140707_FOO_0006.NEF
  • _DSC9997.NEF -> 20140707_FOO_0001.NEF
  • _DSC9998.NEF -> 20140707_FOO_0002.NEF
  • _DSC9999.NEF -> 20140707_FOO_0003.NEF

Again, either mtime or if possible creation date from EXIF seem right.


From here I have a working solution which renames all .NEF-files in a folder by date:

find -name '*.NEF' |
gawk 'BEGIN{ a=1 }{ printf "mv %s %04d.NEF\n", $0, a++ }' |
bash

Hard coding file-modification-date and the shootname works well, but it would be great if it could be automated.

For the timestamp I find articles on strftime(), mktime(), systime() but I don’t understand how to use them to return file modification date. I also tried to add DATE=$(date +"%Y%m%d") and add $DATE to the gawk-line, which leads to deletion of all files in current folder (and probably is systime anyway and not change time).

For the variable, I tried

gawk 'BEGIN{ a=1 }{ printf "mv %s $1_%04d.NEF\n", $0, a++ }' |

and call the script with ./rename FOO, but FOO is ignored in renaming.

Asked By: Seul

stat --printf='}" "%z_${SN}_${LINENO}"\n' -- * |
nl -nln -w1 -s '' |
sort -k2,3 |
sed 's| ..:[^_]*||;s|-||g;s|^|echo mv "${|' |
SN=SHOOTNAME sh -s -- *

The above command should do what you need. It is perhaps more general-purposed than you have requested – but please see the bottom of this answer for a more specific example.

It works by *globbing all the files in the current directory and printing out their change times. For just files in the current directory containing the suffix .NEF you’ll want to change the *globstar at the ends of both lines 1 and 5 to *.NEF. It appends some shell variables and quotes to the end – the names for which only exist at the other end of the pipeline in the sh subshell.

Also, because we specify the filenames just by their glob order – or the ${1} shell type parameters this works fine with any filename – whatever weird characters it may contain.

For now the command includes an echo – it’s nerfed. Running it is essentially a no-op – it just shows you what it wants to do. Here’s its output from my home directory before feeding it to sh:

echo mv "${2}" "20140611_${SN}_${LINENO}"
echo mv "${4}" "20140614_${SN}_${LINENO}"
echo mv "${11}" "20140617_${SN}_${LINENO}"
echo mv "${7}" "20140622_${SN}_${LINENO}"
echo mv "${8}" "20140622_${SN}_${LINENO}"
echo mv "${1}" "20140624_${SN}_${LINENO}"
echo mv "${10}" "20140704_${SN}_${LINENO}"
echo mv "${5}" "20140704_${SN}_${LINENO}"
echo mv "${9}" "20140704_${SN}_${LINENO}"
echo mv "${12}" "20140705_${SN}_${LINENO}"
echo mv "${3}" "20140705_${SN}_${LINENO}"
echo mv "${13}" "20140706_${SN}_${LINENO}"
echo mv "${6}" "20140706_${SN}_${LINENO}"

Here it is after sh:

mv Desktop-1 20140611_SHOOTNAME_1
mv Library 20140614_SHOOTNAME_2
mv target.txt 20140617_SHOOTNAME_3
mv script.sh 20140622_SHOOTNAME_4
mv script.sh~ 20140622_SHOOTNAME_5
mv Desktop 20140624_SHOOTNAME_6
mv shot-2014-06-22_17-11-06.jpg 20140704_SHOOTNAME_7
mv Terminology.log 20140704_SHOOTNAME_8
mv shot-2014-06-22_17-10-16.jpg 20140704_SHOOTNAME_9
mv test 20140705_SHOOTNAME_10
mv Downloads 20140705_SHOOTNAME_11
mv test.tar 20140706_SHOOTNAME_12
mv new
file 20140706_SHOOTNAME_13

You might notice that my output shows a few image files already named for their creation time, but their newly assigned names do not match. This is not an effect of the sort – which works as prescribed – but rather that those files last had a status change on that date. Nevertheless, since you specified in the comments on this question that ctime is the property you’re looking for, that is the sort and name property offered here. Still, here is stat’s output with filenames attached:

stat -c '%z %n' -- *

2014-06-24 16:50:09.110283839 -0700 Desktop
2014-06-11 23:34:02.981981145 -0700 Desktop-1
2014-07-05 01:00:43.213344635 -0700 Downloads
2014-06-14 10:32:13.537014418 -0700 Library
2014-07-04 23:02:25.079690701 -0700 Terminology.log
2014-07-06 11:24:05.398936386 -0700 new
file
2014-06-22 11:26:53.658004123 -0700 script.sh
2014-06-22 11:26:53.658004123 -0700 script.sh~
2014-07-04 13:34:00.063296353 -0700 shot-2014-06-22_17-10-16.jpg
2014-07-04 13:34:00.066629687 -0700 shot-2014-06-22_17-11-06.jpg
2014-06-17 19:59:38.475358571 -0700 target.txt
2014-07-05 23:53:39.097065292 -0700 test
2014-07-06 00:38:57.060521397 -0700 test.tar

The above output should also help me demonstrate what the whole pipeline is doing.

  • So stat --printf='}" "%z_${SN}_${LINENO}"\n' prints out lines that look like:

    }" "YYYY-MM-DD HH:MM:SS.NS -TZ_${SN}_${LINENO}"

… where YMDHMS.NS -TZ are the various strftime components they represent for ctime. Its output format is identical for %w – time of file birth – %x – time of last access – or %y – time of last modification, and so substituting any one of these for %z in the statement above will expand instead to their values. As we’ve already discussed in the comments though, %w – file birth time – is not a reliable attribute, and where it is unsupported GNU stat prints only - for it (the epoch-seconds form %W prints 0).

It does this for every file the shell globs for it in * or whatever shell glob you provide it – such as *.NEF for only files in the current directory with the .NEF suffix.

  • That list is handed to nl which numbers each line incremented by 1. Its line -numbers are ln left-justified and not zero-padded with a minimum -width of 1 and only a '' null-string to -separate them from the contents of the line. It outputs:

    I}" "YYYY-MM-DD HH:MM:SS.NS -TZ_${SN}_${LINENO}"

… where I is the number of each line.

  • sort sorts its input from the -k2,3 second field through the third – or on YYYY-MM-DD HH:MM:SS.NS. Since at this point the only unique quality of any line is either I or the date and I is skipped, there is no need to be any more specific than that. This also addresses the comment you made regarding files named by number and not by date. I should have done the sort before sed in the first place but it didn’t occur to me to sort on minutes and seconds and the like.

My test base for this was generated like:

for s in 9 8 7 6 5 4 3 2 1; do touch $s && sleep 1; done

If I sort after sed – as I did before this edit – then this would rename files so that 9 became ${DATE}.SHOOTNAME.9 because each file’s YMD fields are identical and sort would not affect their line order. But with this adjustment this command is sort specific to the nanosecond and so 9 is renamed to ${DATE}.SHOOTNAME.1 and vice versa for 1. Thanks, @Seul, for bringing that to my attention.

  • sed then removes the first string that looks like <space><any char><any char>:<colon> and all characters that follow in sequence that are ^not an _underscore. So at this point the line looks like:

    I}" "YYYY-MM-DD_${SN}_${LINENO}"

…next it removes all -dashes. And finally it inserts echo mv "${ at the ^head of each line, so it looks like this:

    echo mv "${I}" "YYYYMMDD_${SN}_${LINENO}"

  • Last, a shell is invoked with the environment variable $SN declared – here its value is SHOOTNAME. POSIX specifies that the shell increment the var $LINENO for every line it reads in, so for each line we feed it, that value in the filename should expand to one more than the last. If – as your comments indicate – for some reason this does not occur, a perfectly valid substitute is $((i=i+1)) as first printed by stat in line 1 of the pipeline and as I’ve provided in a specific example below.

The shell is also invoked with its positional parameters set to our glob – here the *globstar for all files in the current directory. As already mentioned, *.NEF in this line and in the first would serve to only operate on files in the current directory with the filename suffix of .NEF.

So long as its glob is the same as that in the first line, it will glob them in the same order as nl numbered them. So no matter the line on which it occurs, "${1}" will expand to the same filename we’ve assigned it according to nl’s output. This way you can rename the files in date order quickly and safely.

  • As also already mentioned, I have nerfed the command with echo here. But if you run the echo and find it suits you after all you’ll need to remove the echo.

Like this:

stat --printf='}" "%z_${SN}_${LINENO}"\n' -- * |
nl -nln -w1 -s '' |
sort -k2,3 |
sed 's| ..:[^_]*||;s|-||g;s|^|mv "${|' |
SN=SHOOTNAME sh -s -- *

Or maybe:

export SN=SHOOTNAME SUFX=.NEF
stat --printf='}" "%z_${SN}_$((i=i+1))${SUFX}"\n' -- *$SUFX |
nl -nln -w1 -s '' |
sort -k2,3 |
sed 's| ..:[^_]*||;s|-||g;s|^|mv "${|' |
sh -s -- *$SUFX
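
The mechanism these pipelines rely on can be tried in isolation. The following is only a sketch with made-up names (demo_positional and the three sample arguments stand in for a real glob such as *.NEF): sh -s reads its commands from standard input, while the words after -- become "${1}", "${2}" and so on, and $((i=i+1)) hands out the running number in the order the lines arrive.

```shell
#!/bin/sh
# Hypothetical demo: the three quoted words stand in for a real glob like *.NEF.
demo_positional() {
    # Commands arrive on stdin; the words after "--" become $1, $2, $3.
    # The lines are deliberately out of glob order, as they would be after sort.
    printf '%s\n' \
        'echo "${3} -> $((i=i+1))"' \
        'echo "${1} -> $((i=i+1))"' \
        'echo "${2} -> $((i=i+1))"' |
    sh -s -- 'first file' 'second' 'third'
}

demo_positional
```

Because the file names travel as arguments and never as text inside the generated commands, a name containing spaces (like 'first file' here) needs no extra quoting.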

And here it is worked into a shell function:

_batch_date_rename () ( # a big one
    ERR= # for error reporting
    # args 1,2 must be dirname and file suffix; arg 3 is the required name string
    export "DIR=$1" "SUFX=$2" \
        "NAME=${3-${ERR:?no rename string specified}}" \
        "TIME=${4-%y}" INT=$((${INT:-25}*3)) ${NOCONFIRM+NOCONFIRM=}
        #all above vars are exported to all points below
    _path_chk () { #run once at start - fn quits if any below test fails
        [ -d "$1" ] && [ -w "$1" ] && set -- "$1"/*"$2" && [ -e "$1" ]
    } # chks for user writable dirname and resolvable $1/*$2 glob
    _print_fmt () { #shell printf now not stat - last field zero padded
        printf 'mv "${%d}" "${DIR}/%d_${NAME}_%04d${SUFX}"\n' "$@"
    }
    _print_mv () { #prints copy of mv action before attempting
        echo '(set -x' #uses shells debug printer to show expanded vals
        printf ': ${0+%s}\n' "$@" \
            ${NOCONFIRM-'Key "ENTER" to accept or "CTRL+C" to quit'}
        echo ) #above can be disabled by declaring NOCONFIRM at invocation
    } #by default fn batches 25 mvs at a time, displays them, and confirms
    _read_loop () { #parses piped in with IFS, batches in INTerval of 25
        argc=${1-$argc} ; ${1+shift} #total globbed files - quit point
        while IFS=' -' read nl y m d na ; do #split on -
            set -- "$@" "$nl" "$y$m$d" "$((i=i+1))" #build array until
            [ "$#" -ge "$INT" ] && break #hit interval
        done ; IFS='
';      set -- $(_print_fmt "$@") && unset IFS #finalize array in _print_fmt
        _print_mv "$@" #do the debug out
        ${NOCONFIRM+:} read < /dev/tty #if $NOCONFIRM not set confirm
        printf '%s\n' "$@" #now print the actual command
        [ $((argc>i)) -eq 1 ] || echo 'exit 0' #check if quit point
        _read_loop #if not quit repeat
    }
    _pipeline () { #this is mostly same - no sed though
        stat -c "$TIME" -- "$@" | nl -nln -w1 -s ' ' | sort -k2,3 | {
            _read_loop $# || echo 'exit 1' #read loop stands in for sed
        } | sh -s -- "$@" #sh still evaluates on args
    } #only two calls from main function below
    _path_chk "$1" "$2" || ${ERR:?Invalid pathname parameters specified}
    _pipeline "$DIR"/*"$SUFX" #if _path_chk do _pipeline
) #that's all folks

That uses the shell to do some things I was doing with other utilities. The concept is the same – glob a file list, sort in different ways and store the sort order. What’s really different about this though is that it batches move operations by interval, displays to the user what it is about to do and awaits a prompt before continuing. I recorded myself using it here so you can watch a terminal session of how it works.

Answered By: mikeserv

Most unices don’t track a file’s creation date¹. “Creation date” is ill-defined anyway (does copying a file create a new file?). You can use the file’s modification time, which is by a reasonable interpretation the date at which the latest version of the data was created. If you make copies of the file, make sure to retain the modification time (e.g. cp -p or cp -a if you use the cp command, not bare cp).
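
The effect of cp -p on the modification time can be checked with a small experiment. This is only a sketch: the function name mtime_demo and the file names are made up, and date -r FILE (print a file’s modification time) is assumed to behave as in GNU date.

```shell
#!/bin/sh
# Sketch: compare a bare cp with cp -p. All names here are hypothetical.
mtime_demo() {
    dir=$(mktemp -d) || return 1
    # Give the "original" a known modification time: 2014-07-07 12:00 local time.
    touch -t 201407071200 "$dir/orig.NEF"
    cp    "$dir/orig.NEF" "$dir/plain.NEF"       # mtime becomes the time of the copy
    cp -p "$dir/orig.NEF" "$dir/preserved.NEF"   # mtime is carried over
    for f in orig.NEF plain.NEF preserved.NEF; do
        # date -r FILE formats the file's mtime (GNU date assumed)
        printf '%s %s\n' "$f" "$(date -r "$dir/$f" +%Y%m%d)"
    done
    rm -rf "$dir"
}

mtime_demo
```

Only the bare copy should show today’s date; the original and the cp -p copy both keep 20140707, which is the value a rename script based on mtime would use.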

A few file formats have a field inside the file where the creator application fills in a creation date. This is often the case for photos, where the camera will fill in some Exif data in JPEG or TIFF images, including the creation time. Nikon’s NEF image format wraps around TIFF and supports Exif as well.

There are ready-made tools to rename image files containing Exif data to include the creation date in the file name. The question “Renaming images to include creation date in name” shows two solutions, with exiftool and exiv2.

I don’t think either tool lets you include a counter in the file name. You can do your renaming in two passes: first include the date (with as high resolution as possible to retain the order) in the file name, then number the files according to that date part (and chuck away the time). Since modern DSLRs can fire bursts of images (Nikon’s D4s shoots at 11fps) it is advisable to retain the original filename as well in the first phase, as otherwise it would potentially lead to several files with the same file name.

exiv2 mv -r %Y%m%d-%H%M%S:basename: *.NEF
# exiv2 uses `strftime(3)`, so `%Y%m%d-%H%M%S` returns YYYYMMDD-hhmmss
# :basename: is a naming variable exiv2's `-r`-handle provides. See `exiv2 -h` for more  
# Now you have files with names like 20140630-235958_DSC1234.NEF.
# Note that chronological order and lexicographic order agree with this naming format.
i=10000
for x in *.NEF; do
  i=$((i+1))
  mv "$x" "${x%-*}_FOO_${i#1}.NEF"
done

${x%-*} removes the part after the - character. The counter variable i counts from 10000 and is used with the leading 1 digit stripped; this is a trick to get the leading zeroes so that all counter values have the same number of digits.
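
The padding trick can be compared side by side with printf. Both helper names below (pad_with_offset, pad_with_printf) are invented for this illustration.

```shell
#!/bin/sh
# Two ways to get a fixed-width, zero-padded counter; helper names are hypothetical.

# The offset trick from above: add 10000, then strip the leading "1".
pad_with_offset() {
    i=$((10000 + $1))
    printf '%s\n' "${i#1}"
}

# The same result with printf's zero-padding format.
pad_with_printf() {
    printf '%04d\n' "$1"
}

pad_with_offset 7     # prints 0007
pad_with_printf 42    # prints 0042
```

The offset form needs no external command or format string, but it only pads correctly for counters below 10000; at 10000 the sum no longer starts with 1, so nothing is stripped.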

The question “Rename files by incrementing a number within the filename” has other solutions for renaming a bunch of files to include a counter.

If you want to use a file’s timestamp rather than Exif data, see “Renaming a bunch of files with date modified timestamp at the end of the filename?”


As a general note, don’t generate shell code and then pipe it into a shell. It’s needlessly convoluted. For example, instead of

find -name '*.NEF' |
gawk 'BEGIN{ a=1 }{ printf "mv %s %04d.NEF\n", $0, a++ }' |
bash

you can write

find -name '*.NEF' |
gawk 'BEGIN{ a=1 }{ system(sprintf("mv %s %04d.NEF\n", $0, a++)) }'

Note that both versions could lead to catastrophic results if a file name contained shell special characters (such as spaces, ', $, `, etc.) since the file name is interpreted as shell code. There are ways to turn this into robust code, but this isn’t the easiest approach, so I won’t pursue that approach.
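
A plain shell loop sidesteps the quoting problem, because file names stay in shell variables and are never re-parsed as code. The following is a minimal sketch rather than a finished tool: rename_by_mtime is a made-up name, it assumes GNU stat and GNU date, it orders files by modification time with one-second resolution (ties keep their original name order), and it hard-codes the .NEF suffix.

```shell
#!/bin/sh
# rename_by_mtime DIR SHOOTNAME
# Hypothetical helper; assumes GNU stat (-c '%Y') and GNU date (-d @SECONDS).
rename_by_mtime() {
    dir=$1 name=$2
    set -- "$dir"/*.NEF               # file names live only in "$@"
    [ -e "$1" ] || return 1           # the glob matched nothing
    # Emit "mtime index" pairs; no file name ever enters the pipeline.
    stat -c '%Y' -- "$@" | awk '{ print $1, NR }' |
    sort -n -k1,1 -k2,2 |
    {
        n=0
        while read -r mtime idx; do
            eval "file=\${$idx}"      # idx is a number generated above, not user data
            day=$(date -d "@$mtime" +%Y%m%d)
            n=$((n + 1))
            mv -- "$file" "$dir/${day}_${name}_$(printf '%04d' "$n").NEF"
        done
    }
}
```

Called as rename_by_mtime . FOO, this would number the files oldest first, so a _DSC9999.NEF shot before _DSC0000.NEF gets the lower number. The only text passed to eval is a positional-parameter index produced by awk, which keeps the file names out of any parsing step.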


¹ Note that there is something called the “ctime”, but the c isn’t for creation, it’s for change. The ctime changes every time anything changes about the file, either in its content or in its metadata (name, permissions, …). The ctime is pretty much the antithesis of a creation time.
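
The difference between the two timestamps is easy to observe. The sketch below assumes GNU stat, where %y prints the modification time and %z the change time; ctime_demo is a made-up name.

```shell
#!/bin/sh
# Sketch: a metadata-only change moves ctime but leaves mtime alone.
# Assumes GNU stat; the function name is hypothetical.
ctime_demo() {
    f=$(mktemp) || return 1
    touch -t 201407071200 "$f"            # set mtime to a known past moment
    m_before=$(stat -c '%y' "$f"); c_before=$(stat -c '%z' "$f")
    sleep 1
    chmod 644 "$f"                        # permissions change, content untouched
    m_after=$(stat -c '%y' "$f"); c_after=$(stat -c '%z' "$f")
    [ "$m_before" = "$m_after" ] && echo "mtime changed: no" || echo "mtime changed: yes"
    [ "$c_before" = "$c_after" ] && echo "ctime changed: no" || echo "ctime changed: yes"
    rm -f "$f"
}

ctime_demo
```

This is why a rename based on ctime can pick up the date of a later chmod, mv or copy rather than the date the picture was taken.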
