# How could a sequence of random dates be generated, given a year interval?

What is needed here is a command that generates six dates given a range of years (1987 to 2017). For example:

``````12/10/1987
30/04/1998
22/02/2014
17/08/2017
19/07/2011
14/05/2004
``````

How could this be done with `sed`, `gawk`, etc.?

If the range spans fewer than about 90 years, you can use the `$RANDOM` variable in bash to give you an offset in days and let the `date` command do the calendar arithmetic. The limit comes from `$RANDOM` itself: it tops out at 32767, and 32767 days is a little under 90 years.

``````#!/bin/bash
s=$(date +%s -d "1/1/$1")           # start in seconds since 1 Jan 1970
e=$(date +%s -d "1/1/$(($2+1))")    # start of end year +1 in seconds
days=$(((e-s)/(24*3600)))           # number of days from start to end
factor=$((32767/days))              # RANDOM is 0 to 32767. See how many
toobig=$((factor*days))             # exact multiples of days fit;
                                    # if RANDOM is too large, draw again
for ((i=0; i<${3:-1}; i++))         # produce $3 random dates
do
    r=$RANDOM                       # find a random number < toobig
    while (( r >= toobig ))         # if too big, then loop.
    do r=$RANDOM
    done
    offset=$((r/factor))            # get (0,days) from (0,factor*days)
    # output horrible day/month/year for N days past start date
    date -d "$offset days 1/1/$1" +%d/%m/%Y
done
``````

The inner loop that selects the random number corrects for bias. Suppose a random number source gives you each of 0 through 9 with equal probability and you want a random number between 0 and 3 inclusive. If you get 0 or 1 from the source you report 0, for 2 or 3 you report 1, for 4 or 5 you report 2, and for 6 or 7 you report 3. If you get 8 or 9, you discard it and draw again; otherwise some results would come up more often than others.
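The 0 to 9 example above can be checked with a small deterministic sketch. The helper name `map_draw` and its three arguments are made up for illustration; it only mirrors the `factor` and `toobig` arithmetic of the script, with the raw draw passed in rather than taken from `$RANDOM`.

```shell
#!/bin/bash
# Hypothetical helper: map one raw draw onto 0..want-1, or report a rejection.
#   $1 = raw draw, $2 = number of distinct source values, $3 = number of wanted values
map_draw() {
    local r=$1
    local source_values=$2
    local want=$3
    local factor=$((source_values / want))   # source values per reported value
    local toobig=$((factor * want))          # first raw value that must be rejected
    if [ "$r" -ge "$toobig" ]; then
        echo reject
    else
        echo $((r / factor))
    fi
}

# The source gives 0..9 and we want 0..3: factor is 2, so 8 and 9 are rejected.
for r in 0 1 2 3 4 5 6 7 8 9; do
    printf '%s -> %s\n' "$r" "$(map_draw "$r" 10 4)"
done
```

Each of the four results is reached by exactly two source values, which is what keeps the outcome uniform.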

Here’s one way, mostly in awk:

``````#!/bin/sh

[ "$2" -ge "$1" ] || exit 1

awk -v start="$1" -v end="$2" '
BEGIN {
    srand();
    # days 1-28, months 1-12, years from start to end inclusive
    for (i = 1; i <= 6; i++)
        printf "%02d/%02d/%d\n", 1 + rand() * 28, 1 + rand() * 12, start + rand() * (end - start + 1);
}
' < /dev/null
``````

The shell script takes two parameters and passes them as variables to awk, which reads no input and does all the work in the `BEGIN` block.

After seeding the random number generator, it loops 6 times over a `printf` statement. That print statement selects a subset of the possible dates in the range by generating a random number between 1 and 28 for the day (being safe for February), between 1 and 12 for the month, and between the given start and end years. It’s random, but it’s not full coverage — it’ll never print days 29-31 for months that have them.
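To reach days 29-31 as well, the upper bound for the day would have to depend on the month and the year. The following is only a sketch of that lookup; `days_in_month` is a hypothetical helper, not part of the answer above, and it assumes the month is given without a leading zero.

```shell
#!/bin/bash
# Hypothetical helper: print the number of days in month $1 (1-12) of year $2.
days_in_month() {
    local month=$1
    local year=$2
    case $month in
        4|6|9|11) echo 30 ;;
        2)
            # Gregorian rule: leap if divisible by 4, except centuries not divisible by 400
            if [ $((year % 4)) -eq 0 ] && { [ $((year % 100)) -ne 0 ] || [ $((year % 400)) -eq 0 ]; }; then
                echo 29
            else
                echo 28
            fi ;;
        *) echo 31 ;;
    esac
}

days_in_month 2 2016    # leap year
days_in_month 2 2017    # common year
days_in_month 4 1998    # 30-day month
```

Note that picking the year, month and day independently still does not weight every calendar date equally, since a day in February would be chosen slightly more often than a day in January.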

Another way, using GNU date and bash features:

``````#!/bin/bash

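start=$1
end=$2

[[ "$start" -le "$end" ]] || exit 1

startsec=$(date -d "1/1/$start" +%s)
endsec=$(date -d "1/1/$((end + 1))" +%s)    # first second after the end year
span=$((endsec - startsec))
for ((i = 1; i <= 6; i++))
do
    # RANDOM yields only 15 bits, so combine three draws into a 45-bit value
    r=$(( (RANDOM << 30 | RANDOM << 15 | RANDOM) % span ))
    date -d "@$((startsec + r))" +%d/%m/%Y
done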
``````

It works by computing the seconds-since-the-epoch of Jan 1st of the start year, then for each of the loops, it comes up with a random number of offset seconds to add; the random number is limited to the number of seconds spanning the given range. GNU date then manipulates that date into the desired format.
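One detail worth spelling out: a single `$RANDOM` only covers 0 to 32767, far fewer values than the number of seconds in a multi-year range, so the offset needs more random bits. A minimal sketch, assuming bash; the helper name `rand_below` is made up for illustration.

```shell
#!/bin/bash
# Hypothetical helper: print a random integer in the range 0 .. $1 - 1.
# Assumes bash, where each expansion of RANDOM yields a fresh 15-bit value.
rand_below() {
    local limit=$1
    # Three draws shifted together give a 45-bit value. The outer parentheses
    # matter: % and * share precedence, so the limit must be computed first.
    echo $(( (RANDOM << 30 | RANDOM << 15 | RANDOM) % limit ))
}

span=$(( (1 + 2017 - 1987) * 365 * 24 * 60 * 60 ))   # 31 years of 365 days, in seconds
rand_below "$span"
```

The modulo step leaves a very small bias towards low values; with 45 bits against a span of about a billion seconds it is negligible for this purpose, but a rejection loop would remove it entirely.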

You can turn the problem into generating a random number between a number representing the first possible date and a number representing the instant right after the last possible date, both in Unix epoch format. Everything else is handled by standard date conversions. `gawk` has better random number resolution than `bash` (a float versus a 15-bit integer), so I’ll be using `gawk`. Note that the `rand()` result N is a float such that 0 <= N < 1. That is why the upper limit is pushed to the start of the following year below: it’s a limit that can’t be rolled. There’s an optional third parameter for the number of results.

``````#!/usr/bin/gawk -f
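BEGIN {
    # epoch seconds at the start of the first year and just past the last year
    first = mktime(ARGV[1] " 01 01 00 00 00")
    last = mktime(ARGV[2]+1 " 01 01 00 00 00")
    if (ARGC == 4) { num = ARGV[3] } else { num = 1 }
    ARGC = 1    # so the arguments are not taken as input file names
    range = last - first
    # seed from the clock and the process ID so that quick reruns differ
    srand(sprintf("%d%06d", systime(), PROCINFO["pid"]))
    for (i = 1; i <= num; i++) {
        print strftime("%d/%m/%Y", range*rand()+first)
    }
}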
``````

For example:

``````./randomdate.gawk 1987 2017 6
26/04/1992
28/04/2010
21/04/2005
17/02/2010
06/10/2016
04/04/1998
``````

With `date`, `shuf` and `xargs`:

Convert start and end date to “seconds since 1970-01-01 00:00:00 UTC” and use `shuf` to print six random values in this range. Pipe this result to `xargs` and convert the values to the desired date format.

Edit: If you want dates of the year 2017 to be included in the output, you have to add one year minus one second to the end date (giving `2017-12-31 23:59:59`). `shuf -i` generates random numbers including both start and end.

``````shuf -n6 -i$(date -d '1987-01-01' '+%s')-$(date -d '2017-01-01' '+%s') \
| xargs -I{} date -d '@{}' '+%d/%m/%Y'
``````

Example output:

``````07/12/1988
22/04/2012
24/09/2012
27/08/2000
19/01/2008
21/10/1994
``````
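To cover the whole end year as described in the edit note, the end of the range can be set to the last second of that year. A sketch, assuming GNU `date` and GNU `shuf`; the wrapper name `random_dates` and its three arguments are made up for illustration.

```shell
#!/bin/bash
# Hypothetical wrapper: print $3 random dates from the years $1..$2 inclusive.
# Assumes GNU date and GNU shuf.
random_dates() {
    local first last
    first=$(date -d "$1-01-01 00:00:00" '+%s')
    last=$(date -d "$2-12-31 23:59:59" '+%s')    # last second of the end year
    shuf -n "$3" -i "$first-$last" |
        xargs -I{} date -d '@{}' '+%d/%m/%Y'
}

random_dates 1987 2017 6
```

`shuf -n` draws distinct seconds, but two of them can still fall on the same day, so repeated dates are possible.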

Perl:

``````perl -MTime::Piece -sE '
my $t1 = Time::Piece->strptime("$y1-01-01 00:00:00", "%Y-%m-%d %H:%M:%S");
my $t2 = Time::Piece->strptime("$y2-12-31 23:59:59", "%Y-%m-%d %H:%M:%S");
do {
    my $time = $t1 + int(rand() * ($t2 - $t1));
    say $time->dmy;
} for 1..6;
' -- -y1=1987 -y2=2017
``````

Sample output:

``````10-01-1989
30-04-1995
10-12-1998
02-01-2016
04-06-2006
11-04-1987
``````