Self Managed Storage

February 4, 2017
Storage

As our digital legacies continue to grow with advanced capture systems, it’s important to consider taking some control over how that legacy is perpetuated. There are many enterprise-grade technologies within easy reach of the individual who wants to start building out a home lab. This article is written with a couple of assumptions:

  • You’re looking into local storage for yourself, and you’ve outgrown your desktop/external strategy.
  • The ‘cloud’ is not big or ‘private’ enough, so you feel obligated to roll your own solution.
  • You’ve become aware of bit-rot, ghost writes, cosmic rays, and other real-world failure modes that are creeping you out.
  • You’ve become aware of spanning disks into a RAID scheme.
  • You’ve heard people raving about checksums and COW filesystems.

Intent #

Rather than send monolithic emails to people who inquire about this topic, it seemed more appropriate to write out my opinions openly and maintain them as more experience is gained. This is written from the approach of accomplishing this task ‘properly’ as I personally see it. People have different initial objectives when considering this project. I’ll focus exclusively on storage and backup while ignoring things like streaming/transcoding and data acquisition, with the clear assumption that these can be added later.

Before going further I’d encourage a side voyage to the first section of this article about next-generation filesystems, which gives a good background on why the choice of filesystem is likely the single most important aspect of this endeavor.

Principles #

I couldn’t locate a good definitive reference on storage principles in particular, so I’ll venture a few of my own:

  • RAID is not a backup
  • You can’t trust your hardware
  • You can’t trust your vendor
  • You can leverage the community

Correctness #

It is assumed that if you’re storing data… you actually want that data to be available in its intended form without corruption or loss.

  • Filesystem: with strong conviction I will tell you that it is critical to choose a filesystem that is resistant to hardware failure.
  • Duration: we’ve all traversed hardware/software lock-in (e.g. VHS) with our data; it’s important to design your storage system with an awareness of how long you want to keep that data (likely your, or your family’s, lifetime).
  • Backup: this must be done off-site, or it is not a backup. Copies or snapshots are potentially valuable internally, but off-site is the only way to ensure recovery from catastrophic failures.

I’ll just assume that speed isn’t a key consideration here. It’s safe to say that if you end up doing anything with more than four drives in a mirror or parity scheme you’ll be able to fully saturate a gigabit network connection, which will likely remain the standard in home environments for many years to come.
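For rough numbers: gigabit Ethernet tops out at 125 MB/s in theory and roughly 110–118 MB/s in practice after protocol overhead, while a single modern 7200 RPM drive can sustain somewhere in the neighborhood of 150–250 MB/s on sequential reads, so even a small mirror or RAIDZ vdev will bottleneck on the network long before it bottlenecks on the disks.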

Choice of File-system #

This is the most important technical choice one makes. The core conceptual shift is that these file-systems take full responsibility for everything below them. It is ideal to not force any abstractions, such as a hardware RAID volume, below them. Let these file-systems handle as much of the metal as possible on their own.

There are currently only two ‘mature’ solutions that satisfy the above principles and correctness: ZFS and BTRFS. There is some complexity in the choice here, as ZFS is much more mature but is mired in a deliberate licensing model that prevents it from direct inclusion in the Linux kernel. BTRFS is less mature and natively in-kernel, but seems to generally move slower as a project than ZFS. Both filesystems are/were under significant influence by Oracle, which might be one of the most instrumentally talented, yet evil, corporations in existence when it comes to open source ideals.

At the time of this writing, the only real choice is ZFS, for a few reasons (a concrete sketch follows below):

  • maturity (previously mentioned)
  • mature implementation of parity via RAIDZ
  • mature file system check/repair via scrub mechanism

The major downside of ZFS is that it’s not a first class citizen in the Linux kernel, which means you have to shim it in, or consider using another operating system for native support. This adds some complexity, or time, when you’re rolling your own solution.
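To make the RAIDZ and scrub points concrete, and to show roughly what ‘shimming it in’ looks like on Linux, here’s a minimal sketch. The pool name tank, the six example device paths, and the Ubuntu packaging are assumptions for illustration; adapt them to your distribution and drives.

    # Install ZFS on Linux (Ubuntu ships it as a DKMS-backed package; other
    # distributions differ) and load the kernel module.
    sudo apt install zfsutils-linux
    sudo modprobe zfs

    # Create a 6-drive RAIDZ2 pool, addressing disks by stable /dev/disk/by-id
    # paths rather than sdX names, with ashift=12 for 4K-sector drives.
    sudo zpool create -o ashift=12 tank raidz2 \
        /dev/disk/by-id/ata-EXAMPLE_DRIVE_1 /dev/disk/by-id/ata-EXAMPLE_DRIVE_2 \
        /dev/disk/by-id/ata-EXAMPLE_DRIVE_3 /dev/disk/by-id/ata-EXAMPLE_DRIVE_4 \
        /dev/disk/by-id/ata-EXAMPLE_DRIVE_5 /dev/disk/by-id/ata-EXAMPLE_DRIVE_6

    # Kick off a scrub (the check/repair mechanism) and watch its progress.
    sudo zpool scrub tank
    sudo zpool status tank

A scheduled scrub (monthly is a common cadence for home arrays) via cron or a systemd timer is what actually catches silent corruption before you need the data back.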

I believe there is more attention being given to BTRFS recently because the container movement wants strong native snapshot support. systemd-nspawn and machinectl will likely advance the adoption of BTRFS natively in most Linux distributions, but that doesn’t necessarily mean that parity support will mature at the same rate.
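For context on why the container crowd cares, this is roughly what native snapshot support looks like with BTRFS; the mount point and subvolume names are hypothetical.

    # Create a subvolume for container images, then take a cheap,
    # read-only snapshot of it (copy-on-write, near-instant).
    sudo btrfs subvolume create /mnt/pool/machines
    sudo btrfs subvolume snapshot -r /mnt/pool/machines /mnt/pool/.snapshots/machines-2017-02-04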

Whichever choice you make between BTRFS and ZFS, make sure that you’re planning to integrate it on an open stack. You don’t want to run some vendor’s patched version of a filesystem only to find out, when they cease to exist, that you can’t export/import that filesystem into another environment.
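The portability you’re protecting here is the ability to walk a pool from one box (or operating system) to another. With stock ZFS that’s a short affair; the pool name is again a placeholder.

    # On the old system: cleanly detach the pool.
    sudo zpool export tank

    # On the new system: scan attached disks for importable pools, then import.
    sudo zpool import
    sudo zpool import -d /dev/disk/by-id tank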

Build or Buy #

This will be the first hurdle anyone considering this venture runs into. Honestly, most won’t pass this stage in the article because to go beyond buying an integrated system requires more than money, specifically, it requires time, aptitude, and a desire to learn the ‘scene’ of the storage world.

There are currently (as of this writing) very few options for buying an integrated system that fulfills the principles and correctness outlined above. The only option I am personally aware of (please let me know otherwise) is FreeNAS, which conveniently aims at making the task of managing your storage solution less complex by bundling technologies and providing an intuitive user experience. The community around FreeNAS is fantastic towards beginners in this realm, and you can buy a device with it integrated or, if that’s too small, consider larger systems. The key behind their partnership with iXsystems is that it eliminates the complexity you would experience if you had issues with hardware support (which totally happens when you roll your own).

So, for many of you, I’d suggest you stop here and seriously consider FreeNAS as a solution. You’d learn immensely from digging into their ecosystem, and you’d get exposure to BSD. Their community is welcoming to beginners, they have great user interfaces that abstract you from underlying complexities, and due to the large user base it’s a highly ’tested’ system.

Build Considerations #

If you’re committed to truly rolling your own because you want to run Linux, or you want to use unique hardware, then luckily you have a great option thanks to the work of LLNL and the community on ZoL/OpenZFS. I’ll write this section with a Linux bias, as that’s what I’ve been integrating lately.

The first thing to consider is going with ECC memory, as both the BTRFS and ZFS community members/designers highly suggest it. This means you’re in the market for a server board of some kind, which is proving to be quite easy nowadays with the advent of Xeon D system boards. Also, if you plan to use advanced features such as deduplication, get a lot of memory.
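If you’re weighing deduplication, ZFS can estimate what you’d gain before you commit memory to it; the commonly cited rule of thumb is on the order of several gigabytes of RAM per terabyte of deduplicated data, so check first. The pool name below is a placeholder.

    # Simulate deduplication on an existing pool: prints a DDT histogram and
    # an estimated dedup ratio without actually enabling dedup.
    sudo zdb -S tank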

Another consideration is a drive controller, or ‘Host Bus Adapter’ (HBA), as it’s unlikely that, unless you’re sticking to 6-8 drives, you’ll have enough connectivity on your main board. If you’re designing a 10+ drive system you’ll want to consider either a dedicated HBA card or a mainboard that has SAS connectors (e.g. SFF-8087) on it. Always lean towards locking connectors if you get the chance. I’ve personally had arrays drop drives because someone bumped a case and a SATA connector came loose. SAS backplanes and locking connectors are ideal. If you examine the Backblaze and Supermicro designs you’ll see they even mount the backplanes vertically so gravity assists in maintaining a connection.

Edit: nowadays you should also consider matching up an HBA with a SAS expander if you’re going for more than 8 drives in a system.

Get a battery backup, and make sure your server can talk to it. ZFS and BTRFS are resistant to the RAID write hole, but don’t press your luck with your physical drives during power loss. A graceful shutdown is always preferred; tools like apcupsd and nut can help.
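As a sketch of the UPS piece with apcupsd (nut works similarly), assuming a USB-connected APC unit; the thresholds are illustrative, not recommendations.

    # /etc/apcupsd/apcupsd.conf (excerpt)
    UPSCABLE usb
    UPSTYPE usb
    DEVICE
    BATTERYLEVEL 20     # begin shutdown at 20% battery remaining
    MINUTES 5           # ...or when roughly 5 minutes of runtime are left

    # Enable the daemon and verify it can see the UPS.
    sudo systemctl enable --now apcupsd
    apcaccess status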

Examine reliability metrics on particular models of hard drives before you buy. A great resource over the last couple of years has turned out to be Backblaze, as they buy a lot of drives and have been willing to post their failure statistics regularly.

Always do initial tests of drives before integrating them into an array. I prefer to run badblocks overnight before I initialize any array. This typically gives me a strong indication of whether I’m about to experience infant mortality.
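A minimal burn-in pass, assuming the drive holds no data yet (the write-mode test is destructive) and that /dev/sdX is the new disk:

    # Destructive four-pattern write/read test; -b 4096 keeps badblocks happy
    # on large (>2TB) drives, and -s/-v show progress and errors.
    sudo badblocks -wsv -b 4096 /dev/sdX

    # Follow up with an extended SMART self-test and review the attributes.
    sudo smartctl -t long /dev/sdX
    sudo smartctl -a /dev/sdX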

Always seek out clarification from the community about hardware compatibility. In this class of hardware you’ll typically find yourself searching for parts pulled from enterprise-class systems on eBay, and you’ll want to ensure that the hardware works before you commit to an auction.

And finally, sit down and really plan out your filesystem layout. Both ZFS and BTRFS let you carve a pool into datasets/subvolumes, and you’ll want to use them at some level of granularity so you can selectively enable things like compression and deduplication.
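On the ZFS side that granularity comes from datasets; the pool and dataset names below are hypothetical, but the idea is to set properties where they make sense rather than pool-wide.

    # Cheap, always-worth-it compression on the photo archive...
    sudo zfs create -o compression=lz4 tank/photos

    # ...and dedup only on the dataset that actually benefits from it.
    sudo zfs create -o compression=lz4 -o dedup=on tank/vm-images

    # Confirm what is inherited vs. set locally.
    sudo zfs get compression,dedup tank/photos tank/vm-images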

Backup #

You’ll absolutely need an off-site backup. The only other approach I’ve seen is burying/sealing a storage target inside the home under concrete, or placing it external to the house (if you’ve got a separate structure on-site). This is, uncomfortably, where I have to suggest a service, as it’s unlikely that you’re planning to build two or more of these systems to replicate between. If you do, good on you, but there are some cheaper options if you’re willing to sweat a bit about whether you truly have privacy.

Edit: When I originally wrote this article I was using CrashPlan Pro, but since they’ve discontinued a large swath of their services I see that as a warning sign for their viability going forward. Also, their client is insanely frustrating to work with. I’ve modified my recommendation below based on where I went upon leaving CrashPlan.

Hands down the best experience I’ve had for doing backups is a combination of rclone with your choice of cloud backend. I’ve personally chosen Backblaze B2 due to their transparency with both their Storage Pod designs and their quarterly reporting on drive failure statistics.
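A minimal sketch of that workflow, assuming you’ve already run rclone config to define a B2 remote (here called b2remote); the bucket and paths are placeholders.

    # One-way sync of a dataset to B2; --fast-list cuts API calls on large trees.
    rclone sync --fast-list --transfers 8 /tank/photos b2remote:my-backup-bucket/photos

    # Periodically verify that what's remote still matches what's local.
    rclone check /tank/photos b2remote:my-backup-bucket/photos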

Community Resources #

In all of this, attempt to talk to community members from ZoL, OpenZFS, or FreeNAS. There is also a lot of storage-specific discussion at hardforums that would be beneficial to peruse. Figure out how to use mailing lists and IRC. Figure out how to submit meaningful debugging information to issue trackers. And finally, don’t log into a chat or send an email in a panic; people are willing to help you, but it’s not something any of them are doing professionally all day long, so you’ll have to give folks time to respond.

Wrapping Up #

Rolling your own storage is, in my opinion, one of the best projects to embark upon to learn about operating systems, networking, storage, system administration, and the wealth of the open source community. Much of what I’ve learned about ‘systems’ that I put into application daily came from a basis in rolling custom storage systems for myself and my family.

Please feel free to reach out. This was written as an attempt to provide a jumping-off point that funnels people into the build or buy lanes, although I realize how biased my ‘buy’ pitch was. It just seems that, at this time, there are no good off-the-shelf options that integrate filesystems satisfying the above principles, and even when they do exist, would you really trust them? Personal data acquisition devices (e.g. phones, cameras, wearables) are becoming more prevalent, entertainment media is ballooning (e.g. 4K, 8K, DSD), and not everyone can live with their lives being in ‘the cloud’. Plus, how cool do you get to feel when someone sees your box of blinking lights in the basement?