Hi Robbie,

> Sorry, I wanted to clarify regarding dedup.

Thanks for the clarification. You made me think again about my setup. When I originally set up my NAS I did some guessing/planning and enabled dedup on those filesystems where I expected to benefit. Now that I've been filling it up for over a year, I had a good look at how much I actually benefit. It turns out my data is rather unique: my dedup factor is only 1.1, and part of even that comes from tests I ran to verify that dedup works :-( Going by the calculations from the link you sent me, my dedup tables are almost 8GB. I hadn't realized this because my used swap space is still zero bytes. Reading further, I expect a great deal of my L2ARC SSD partition is now holding dedup tables, which probably saved me from noticeable delays as the tables grew.
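For anyone wanting to do the same check, the numbers can be pulled from a live pool roughly like this ("tank" stands in for the pool name, and the ~320 bytes per in-core entry is the figure from the article, not something I measured):

    # dedup ratio for the whole pool (DEDUP column)
    zpool list tank

    # DDT summary: entry counts plus on-disk and in-core sizes per entry
    # (use -DD instead for a full histogram)
    zdb -D tank

    # rough sizing from the article: in-core DDT ~= entries * ~320 bytes,
    # so for example ~25 million deduped blocks works out to roughly 8GB;
    # this is also where the GB-of-RAM-per-TB rules of thumb come from,
    # since smaller blocks mean many more entries per TB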
It seems that in my case dedup indeed isn't worth it, and switching it off altogether is a good idea. I will be migrating my current pool to a new pool with bigger disks soon anyway, so that looks like the easiest way to get rid of dedup for good.
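The move I have in mind is roughly this (a sketch, not tested yet; "tank" and "newtank" stand in for the old and new pool names):

    # turn dedup off first so the property travels with the
    # replication stream
    zfs set dedup=off tank

    # snapshot everything and replicate to the new pool; the received
    # blocks are written fresh, so no DDT entries are created
    zfs snapshot -r tank@migrate
    zfs send -R tank@migrate | zfs receive -Fdu newtank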
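For the Time Machine filesystem itself I'll redo the setup the way you describe below; something like this (again just a sketch, and the names, sizes, and paths are placeholders):

    # on the NAS: hard limit on the backup filesystem
    # (quota counts snapshots too; refquota would count only live data)
    zfs create -o quota=300g tank/timemachine

    # on the Mac: cap the sparsebundle Time Machine created, to match
    # (the path is a placeholder for wherever the backup volume mounts)
    hdiutil resize -size 300g /Volumes/timemachine/mymac.sparsebundle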
Thanks again for the info!

Jaco

> It's not just Time Machine that will suffer. It's everything, regardless of
> whether the filesystem the relevant data is on is being deduped or not.
>
> When your dedup tables outgrow the amount of RAM allocated to them (25% of
> the ARC, IIRC), they start swapping. I had ~400GB of deduped data on a 10TB
> pool that was about 60% full; dedup tables were about 24GB. I have 32GB of
> RAM, but only 25% of that was for the dedup tables. That meant I was
> constantly swapping about 16GB, and by constantly I mean actually
> constantly, as in any activity on the server would cause I/O wait on the
> disks. Before I nuked the whole thing and started from scratch, I was
> getting about 5MB/sec writes and 7-10MB/sec reads. It became impossible to
> do things the server should have handled with ease, like streaming video.
>
> From what I remember when calculating things and testing them against my
> server, you need about 4GB of RAM for every TB of data in order to use
> dedup. Otherwise, everything falls apart. And that was for my file server,
> which is predominantly video, so large files and a smaller dedup table. If
> you're talking strictly about TM backups, you're going to have an even
> bigger table, because after the initial backup all TM backups are smaller
> files, usually less than 10MB. Smaller files means more files, which means
> more info needs to be stored in your dedup table.
>
> When I was figuring out what was killing my system I spent a bunch of time
> in #ZFS on Freenode; we came to the conclusion that for my usage patterns
> I'd need at least 96GB of RAM to keep things deduped as they were and not
> want to kill myself every time I wanted to watch a video. Dedup is VERY
> hungry, and gets VERY cranky when it can't eat.
>
> There's a good article that breaks things down a bit better than I have,
> if you want to look into it further:
> http://constantin.glez.de/blog/2011/07/zfs-dedupe-or-not-dedupe
>
> On Fri, Aug 17, 2012 at 9:11 AM, Jaco Schoonen <[email protected]> wrote:
>
>>> I have both the sparsebundle size and the ZFS FS set. So the FS was
>>> created with a quota of 300GB and the sparsebundle then created in
>>> there with 300GB.
>>
>> OK, thanks for the tip. I'll give it a go.
>>
>>> Unless you're backing up multiple Macs, don't use dedup; the performance
>>> hit you'll take will be huge after a few months unless you've got an
>>> obscene amount of RAM, like 64+GB.
>>
>> I don't use dedup for everything, but for Time Machine I think it may
>> write more or less the same data quite often, so there I have it enabled.
>> Performance for Time Machine doesn't really matter too much. Besides, my
>> NAS (5GB RAM) is only connected over a single gigabit link, so I don't
>> need it to be faster than that.
>>
>>> Dedup and compression give you more free space, and that's what Time
>>> Machine sees.
>>>
>>> Works great for me. I'm backing up 5 Macs.
>>
>> Cool, thanks!
>>
>> Jaco
>>
>>>> To limit TM disk usage I have found at least 5 different options:
>>>> 1) Use "zfs quota" to limit the size of the filesystem.
>>>> 2) Use "zfs refquota" to limit the referenced amount of data in the fs.
>>>> 3) Use the netatalk option "volsizelimit".
>>>> 4) Give the sparsebundle that TM uses a maximum size
>>>>    (hdiutil resize -size 100g).
>>>> 5) Limit Time Machine at the application level (defaults write
>>>>    /Library/Preferences/com.apple.TimeMachine MaxSize -integer XXXX).
>>>>
>>>> What are you all using and how does it work out for you? What would
>>>> you recommend?
>>>>
>>>> Best regards,
>>>>
>>>> Jaco
>
> --
> Seconds to the drop, but it seems like hours.
>
> http://www.openmedia.ca
> https://robbiecrash.me
