
Hello Daniel,
Thank you very much for your thoughtful analysis and comments. I'll probably take your suggestion and use md5sums to sort the file archive.
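For anyone following along, here is a rough sketch of what sorting by md5sum could look like. It assumes the goal is to order the archive by content digest so that identical files end up next to each other; the function names and the chunk size are my own illustrative choices, not anything from the earlier discussion.

```python
import hashlib
from pathlib import Path


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to bound memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def sorted_by_md5(root):
    """Return (digest, path) pairs for every regular file under root, sorted by digest."""
    pairs = [(md5_of_file(p), str(p)) for p in Path(root).rglob("*") if p.is_file()]
    # Tuples sort by digest first, so files with identical content become adjacent.
    return sorted(pairs)
```

Calling `sorted_by_md5` on the archive directory gives a list in which duplicates show up as consecutive entries with the same digest. MD5 is fine for spotting accidental duplicates, but it is not collision-resistant against deliberately crafted files.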
What were the ZFS features you thought you'd like to use?
ZFS includes its own software RAID implementation (RAID-Z), but it goes beyond that and checksums every block it stores, which goes a long way toward protecting you against silent data corruption. It also has some nice SSD acceleration options: it natively supports an SSD as a read cache (L2ARC) and as a separate log device for synchronous writes, giving you fairly decent read/write acceleration with minimal effort and cost. I haven't really looked into it much, because I'm limited to Linux (for reasons that aren't worth going into, because they can't be overcome), so it's just not an option.
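To make the checksum idea concrete, here is a minimal sketch of how stored checksums expose silent corruption. This is only an illustration of the principle, not how ZFS implements it: the block size, the use of SHA-256, and the function names are all assumptions of mine.

```python
import hashlib

BLOCK_SIZE = 4096  # illustrative block size, not taken from ZFS


def checksum_blocks(data, block_size=BLOCK_SIZE):
    """Return one SHA-256 hex digest per fixed-size block of data."""
    return [
        hashlib.sha256(data[i:i + block_size]).hexdigest()
        for i in range(0, len(data), block_size)
    ]


def find_corrupt_blocks(data, expected, block_size=BLOCK_SIZE):
    """Return indices of blocks whose current checksum differs from the stored one."""
    current = checksum_blocks(data, block_size)
    return [i for i, (old, new) in enumerate(zip(expected, current)) if old != new]
```

The checksums are recorded when the data is written and compared again on every read. A single flipped bit changes the digest of the block that contains it, so the damage is detected instead of being returned as good data. Detection alone does not repair anything; repair needs a redundant copy to read from, which is what the RAID side provides.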