
Hello all,
I was reading an article in Linux User magazine [1] about Sabayon Linux [2] and their use of the ZFS file system [3].
Supposedly the file system supports deduplication, which I thought could be useful when manually tidying up a photograph archive.
Has anyone on the list used the ZFS system or Sabayon Linux?
Any recommendations or experience?
Cheers,
[1] http://www.linuxuser.co.uk/
[2] http://www.sabayon.org/
[3] http://zfsonlinux.org/

Deduplication in this sense won't help you tidy up your photo archive. The filesystem will determine (in some manner) that the contents of block X are the same as block Y, free up the space occupied by block Y, and replace it with a pointer to block X. The files remain unchanged: file A (with block X) and file B (with block Y) will still appear as two separate, identical files, and you can still choose to delete or modify them individually. If you modify file B, the new blocks written out will contain the full contents, no longer deduplicated because they are now unique.

The net result for you is that you will save the space (or at least some of it) occupied by duplicated photographs, but if at some point in the future you copy those photos elsewhere, e.g. to a filesystem that doesn't support dedupe, that space will be expanded out again.

If you do want to tidy up a photograph archive, you're better off doing something like creating a hash (md5sum, sha256, etc.) of all the files and then looking for duplicate hashes. Or you could use a tool like the ones mentioned in this thread: http://askubuntu.com/questions/4072/how-can-i-find-duplicate-photos.

In a more general sense, I'm wary of the various efforts to get ZFS into Linux. It's very unlikely to be included in the mainline kernel in a hurry, due to licensing issues, and while some distros may offer it as a feature, it is limited in many ways (e.g. with Sabayon, you can't easily install to ZFS). Further, once you've got your system onto a ZFS filesystem, you're reliant on the continued support of whoever got it there. With an in-kernel filesystem, you have far more support. It's a shame; I'd really like to be able to use some of ZFS's features, but I don't think it's ready on Linux.
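
To make the block-level behaviour described above concrete, here is a minimal sketch in Python. It is a toy model only, not how ZFS stores data on disk: the `DedupStore` class, its method names, and the 4-byte block size are all made up for illustration. It shows identical blocks being stored once and shared by "pointer", and a modified file getting its own new block.

```python
import hashlib

BLOCK_SIZE = 4  # hypothetical, tiny block size so the example is easy to follow


class DedupStore:
    """Toy block store: identical blocks are kept once and shared between files."""

    def __init__(self):
        self.blocks = {}    # digest -> block contents (stored once)
        self.refcount = {}  # digest -> number of file blocks pointing at it
        self.files = {}     # file name -> list of digests (the "pointers")

    def _release(self, digest):
        # Drop one reference; free the block when nobody points at it any more.
        self.refcount[digest] -= 1
        if self.refcount[digest] == 0:
            del self.refcount[digest]
            del self.blocks[digest]

    def delete(self, name):
        for digest in self.files.pop(name):
            self._release(digest)

    def write(self, name, data):
        # Overwriting a file first releases the blocks of the old version.
        if name in self.files:
            self.delete(name)
        pointers = []
        for i in range(0, len(data), BLOCK_SIZE):
            block = data[i:i + BLOCK_SIZE]
            digest = hashlib.sha256(block).hexdigest()
            if digest not in self.blocks:
                self.blocks[digest] = block  # first copy: actually store it
            self.refcount[digest] = self.refcount.get(digest, 0) + 1
            pointers.append(digest)
        self.files[name] = pointers

    def read(self, name):
        return b"".join(self.blocks[d] for d in self.files[name])

    def stored_bytes(self):
        return sum(len(b) for b in self.blocks.values())


if __name__ == "__main__":
    store = DedupStore()
    store.write("a.jpg", b"AAAABBBBCCCC")
    store.write("b.jpg", b"AAAABBBBCCCC")  # duplicate: no extra space used
    print(store.stored_bytes())            # 12
    store.write("b.jpg", b"AAAABBBBDDDD")  # modified: one new unique block
    print(store.stored_bytes())            # 16
```

Both files stay visible and readable throughout, which is why this kind of deduplication saves space but does nothing to tidy the archive itself.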


On Wed, Nov 7, 2012 at 12:55 PM, mailinglist <mailinglist(a)blahdeblah.co.nz> wrote:
In a more general sense, I'm wary of the various efforts to get ZFS into Linux. It's very unlikely to be included in the mainline kernel in a hurry, due to licensing issues, and while some distros may offer it as a feature, it is limited in many ways (e.g. with Sabayon, you can't easily install to ZFS). Further, once you've got your system onto a ZFS filesystem, you're reliant on the continued support of whoever got it there. With an in-kernel filesystem, you have far more support. It's a shame; I'd really like to be able to use some of ZFS's features, but I don't think it's ready on Linux.

Hello Daniel,
Thank you very much for your thoughtful analysis and comments. I'll probably use your suggestion of using md5sums for sorting the file archive.
What were the ZFS features you thought you'd like to use?
Cheers,
Chris
Hi Chris.
The RAID features look very good, making it a robust filesystem. From a management point of view it eliminates a layer to manage, as the file system is itself the RAID provider. If you want to have a look, I believe FreeNAS (BSD based) offers it as a storage option; obviously you would need to dedicate a machine or virtual machine to being a NAS.
Greg

ZFS has its own "software" RAID implementation (RAID-Z), but it also goes beyond that and does checksumming, which goes a long way towards protecting you against silent data corruption. It also has some nice cache acceleration options with SSDs: it natively supports SSDs as read and write caches, giving you fairly decent read/write acceleration with minimal effort and cost. I haven't really looked too much into it, because I'm limited (for reasons that aren't worth going into, because they can't be overcome) to Linux, so it's just not an option.
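
As a rough illustration of why checksums help against silent corruption, here is a small Python sketch. It is not ZFS's actual on-disk format or algorithm; the function names and the dict-based "block" are assumptions made for the example. Each block is stored with a checksum, a read verifies it, and a bad copy is repaired from a mirror copy, which is the general idea of combining checksums with redundancy.

```python
import hashlib


def write_block(data: bytes) -> dict:
    """Store the data together with a checksum computed at write time."""
    return {"data": bytearray(data), "checksum": hashlib.sha256(data).hexdigest()}


def verify_block(block: dict) -> bool:
    """Recompute the checksum and compare it with the stored one."""
    return hashlib.sha256(bytes(block["data"])).hexdigest() == block["checksum"]


def read_with_repair(primary: dict, mirror: dict) -> bytes:
    """Return verified data, repairing the primary from the mirror if needed."""
    if verify_block(primary):
        return bytes(primary["data"])
    if verify_block(mirror):
        # The primary is silently corrupt: overwrite it with the good copy.
        primary["data"] = bytearray(mirror["data"])
        primary["checksum"] = mirror["checksum"]
        return bytes(primary["data"])
    raise IOError("both copies failed checksum verification")


if __name__ == "__main__":
    primary = write_block(b"photo data")
    mirror = write_block(b"photo data")
    primary["data"][0] ^= 0x01                # simulate a flipped bit on disk
    print(verify_block(primary))              # False
    print(read_with_repair(primary, mirror))  # b'photo data'
```

Without the checksum, a plain mirror has no way to tell which of two differing copies is the good one.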

One of the comments on the URL I linked suggested a tool called fslint. Probably worth looking into this first, as it implements an md5sum match for you, with minimal effort :) (Apparently at least, I haven't used it...)
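
For reference, the hash-and-compare approach can be sketched in a few lines of Python. This is only an illustration of the general idea, not a description of how fslint works internally; the function names are made up, and SHA-256 is used here where md5sum would do the same job. Files are grouped by size first so that only same-sized files need to be hashed.

```python
import hashlib
import os
from collections import defaultdict


def file_digest(path, chunk_size=1 << 20):
    """Hash a file in chunks so large photos don't need to fit in memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_duplicates(root):
    """Return lists of paths under root whose contents hash identically."""
    # Pass 1: group by size; files of different sizes cannot be identical.
    by_size = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.isfile(path) and not os.path.islink(path):
                by_size[os.path.getsize(path)].append(path)

    # Pass 2: hash only the files that share a size with another file.
    by_hash = defaultdict(list)
    for paths in by_size.values():
        if len(paths) < 2:
            continue
        for path in paths:
            by_hash[file_digest(path)].append(path)

    return [sorted(paths) for paths in by_hash.values() if len(paths) > 1]


if __name__ == "__main__":
    import sys

    for group in find_duplicates(sys.argv[1] if len(sys.argv) > 1 else "."):
        print("\n".join(group))
        print()
```

The script only reports groups of identical files; deciding which copy to keep is still a manual step.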

You might want to take a look here <http://en.wikipedia.org/wiki/ZFS#Lightweight_filesystem_creation> if you want to know more about ZFS.
Methinks ZFS might become a successor to BTRFS, which in turn is slated to eventually replace ext4. But if I read it all correctly, I won't have enough data to make ZFS worthwhile in a hurry, or even necessary...
Wolfgang

On 07/11/12 21:27, Daniel Lawson wrote:
One of the comments on the URL I linked suggested a tool called fslint. Probably worth looking into this first, as it implements an md5sum match for you, with minimal effort :) (Apparently at least, I haven't used it...)
_______________________________________________
wlug mailing list | wlug(a)list.waikato.ac.nz
Unsubscribe: http://list.waikato.ac.nz/mailman/listinfo/wlug

The Solaris guys got a massive jump on the Linux community when it comes to modern filesystems. BTRFS is pretty much a GPL catch-up effort, with some internal disk structure improvements. It currently doesn't have dedup though. SSD caches are also being worked on for Linux, in a filesystem-agnostic way via bcache, but that still needs a lot of work before it's ready. ZFS is also very memory hungry, with a recommended 1GB of RAM per TB of disk space (in the pool).

Other nice features of both ZFS and btrfs are the copy-on-write snapshots and the checksums mentioned above, which combine really nicely with RAID (though btrfs doesn't have RAID5 or RAID-Z). On-the-fly compression is pretty nice too.

ZFS won't be a successor to btrfs, and ZFS won't ever be integrated into the Linux kernel (licence issues; Oracle, which started btrfs, hasn't shown any indication of re-licensing ZFS, if that is even possible). Without integration it will always be a second-class citizen in Linux. So far there has been no work to port btrfs to other UNIX systems either.

I've been running / on my laptop on btrfs for about 9 months now and have been pretty happy with it. It still has a few issues to iron out, but it's getting there. I'm thinking about moving my home fileserver to btrfs soon too.
participants (5)
- Daniel Lawson
- Gregory Machin
- mailinglist
- Ronnie Collinson
- Wolfgang