
On Tue, 28 Apr 2015 23:03:33 +1200, Daniel Lawson wrote:
Drives that are in a good condition and are not failing quite simply do *not* have increasing SMART counters ...
Interpreting SMART that way is statistical prediction. Like any statistical prediction, it produces some percentage of false positives: the counters climb to a value that you interpret as anomalous, but the drive continues to operate fine for something close to its usual life span.
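
To put rough numbers on that, here is a minimal sketch of the arithmetic. Every figure in it (fleet size, annual failure rate, detection rate, false-alarm rate) is a made-up assumption for illustration, not a measurement from Backblaze or anyone else, and the function name is hypothetical.

```python
def flagged_breakdown(n_drives, annual_failure_rate, detection_rate, false_alarm_rate):
    """Split the drives flagged by a SMART threshold into true and false positives.

    All four inputs are hypothetical illustration values, not measured data.
    """
    failing = n_drives * annual_failure_rate   # drives that really will fail this year
    healthy = n_drives - failing               # drives that will not
    true_pos = failing * detection_rate        # failing drives the threshold catches
    false_pos = healthy * false_alarm_rate     # healthy drives the threshold also flags
    return true_pos, false_pos


# Assumed: 10,000 drives, 5% fail per year, threshold catches 60% of those,
# and also trips on 3% of drives that would have kept working.
tp, fp = flagged_breakdown(10_000, 0.05, 0.60, 0.03)
share_needless = fp / (tp + fp)
print(f"flagged and failing: {tp:.0f}")
print(f"flagged but healthy: {fp:.0f}")
print(f"share of flagged drives that were fine: {share_needless:.0%}")
```

With those assumed rates, nearly half of the drives you pull on the strength of the counters would have carried on working. Different assumptions move the split a lot, which is the point: the answer depends on the false-alarm rate, not just on whether the counters are going up.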
I re-read your first email on this subject and you even acknowledged that Backblaze make the same point I am, but you don’t put any weight on avoiding downtime.
I assume there are already systems in place for coping with failed drives--whether RAID or something more advanced like btrfs/ZFS, or some storage-management scheme built on top of that, whatever. With thousands of drives, you will continually have failed drives somewhere, so you have to be able to keep operating regardless. The effort and expense come in actually replacing them. So anything that increases this expense, without actually improving reliability, is not going to be welcome.