
So SMART is fine for telling you a drive is *not healthy*. On the other hand, SMART is no use for telling you a drive is *healthy*: a clean bill of health from SMART today doesn’t mean the drive won’t fail catastrophically tomorrow. This is the conclusion that the Google paper drew, but a lot of people seem to misinterpret it.
This is why I don’t bother with SMART. Sit down and work out the maths: relying on SMART greatly increases your rate of replacement of drives, without a corresponding increase in the reliability of your data.
I’m struggling to understand your reasoning here. If you never check your drives’ SMART data, you will absolutely have more unexpected disk failures, quite simply because the failures that proactive SMART monitoring could have caught early weren’t caught. That might mean multiple drive failures in a single RAID set, which might mean data loss. Of course, you’re backed up (right? right), so there’s no loss in data *reliability*. But there is a loss in *availability*. SMART is not a perfect tool, but when something tells you something is going wrong, you should pay attention to it. I’ll take imperfect health prediction and being able to proactively replace drives I know are failing over operating completely blind any day.
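If anyone wants to actually "work out the maths" for their own setup, here is a minimal sketch of the trade-off being argued over: extra replacements caused by SMART warnings on drives that would have been fine, versus failures that stop being a surprise. The function name and all four parameters are my own assumptions for illustration, not figures from the Google paper or any vendor, and the model deliberately ignores RAID layout and rebuild times.

```python
def compare_policies(n_drives, annual_failure_rate,
                     smart_sensitivity, smart_false_alarm_rate):
    """Expected yearly counts with and without acting on SMART warnings.

    All inputs are hypothetical; plug in figures for your own fleet.
      smart_sensitivity:      fraction of failing drives SMART flags in time
      smart_false_alarm_rate: fraction of healthy drives SMART flags anyway
    """
    failures = n_drives * annual_failure_rate
    healthy = n_drives - failures
    caught_early = failures * smart_sensitivity
    false_alarms = healthy * smart_false_alarm_rate
    return {
        # Ignoring SMART: every failure is a surprise, and you only
        # replace drives that have actually died.
        "ignore_smart": {
            "replacements": failures,
            "unexpected_failures": failures,
        },
        # Acting on SMART: you also replace falsely flagged drives,
        # but the failures SMART caught are no longer unexpected.
        "act_on_smart": {
            "replacements": failures + false_alarms,
            "unexpected_failures": failures - caught_early,
        },
    }


# Made-up inputs: 1000 drives, 5% fail per year, SMART catches half
# of those in time and wrongly flags 2% of the healthy drives.
result = compare_policies(1000, 0.05, 0.5, 0.02)
for policy, counts in result.items():
    print(policy, {k: round(v, 1) for k, v in counts.items()})
```

With those made-up inputs, acting on SMART costs about 19 extra replacements a year and avoids about 25 unexpected failures. Different inputs can tip it either way, which is really what this disagreement comes down to.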