# How can I print records with difference in maximum and minimum value greater than a certain threshold?

I have data like this in two columns:

``````[id1] 09:51:07,175
[id1] 09:51:07,215
[id2] 10:09:47,550
[id2] 10:09:47,588
[id2] 10:09:47,942
[id2] 10:09:47,947
[id3] 10:05:25,945
[id3] 10:05:26,001
``````

The first column is IDs and the other is time.
Now, I want to print IDs if the difference between their minimum and maximum times is greater than x.

If x = 100, the desired output would be:

``````id2
``````

Because max(id2) = 10:09:47,947 and min(id2) = 10:09:47,550
and their difference is 397. If x=30, the desired output would be:

``````id1
id2
id3
``````

because

``````max(id1) - min(id1) = 40
max(id2) - min(id2) = 397
max(id3) - min(id3) = 56
``````

if x is 50 then output would be:

``````id2
id3
``````

How can I approach this?

A typical job for `perl`:

``````perl -MList::Util=min,max -lsne '
push @{\$v{\$1}}, \$5 + 1000 * (\$4 + 60 * (\$3 + 60 * \$2)) if
m{^[(.*)] (dd):(dd):(dd),(d+)\$};
END{
for (keys %v) {
print if max(@{\$v{\$_}}) - min(@{\$v{\$_}}) > \$threshold
}
}' -- -threshold=100 < your-file
``````

If could be made more efficient and less memory at the expense of more complicated code hungry by computing the min and max as we read the lines rather than storing all the values and calling `min()` and `max()` in the end. But unless the input is very large, it’s not likely to make much difference.