Hardware RAID .. emptied your pockets ..

Yes, not only is a hardware RAID controller going to cost you, 9 times out of 10 it's actually going to give you worse performance than if you didn't have it in the first place. There is an argument that says a RAID controller with battery backup is less likely to corrupt your data in the event of a crash or power outage, but realistically what you lose is going to be governed by how your applications cache data.

So, if you don't need a hardware RAID controller, what should you use?

Easy. Linux has had software RAID for nigh on 20 years, and generally it knocks spots off all the hardware implementations you see. Doesn't sound intuitive, does it? Why do people make these things, and often charge a lot for them, if there's no point in having them?

I have no easy answer, except to say that in the days before Linux and before software RAID, we used to use Mylex RAID controllers with SCO Unix, and they were a godsend. I guess once you have a product, you keep making it while people keep buying it ...

An easy way to explain it: what sort of processor does your hardware RAID controller have on it? I'll lay odds it's a cheap MC68000 or ARM type processor running at 500MHz, with a fraction of the beef of the multi-core Intel processor that's driving the main box, which is probably running at 3GHz. The question is, how is the software running on this tiny dedicated processor supposed to compete with Linux's RAID software running on the main processor? Well, of course it can't, so at the expense of a trickle of CPU power from one of your cores, you will generally get much better performance from Linux software RAID.

Next, stick three disks in a Linux box, set each disk up with a single partition, and create a RAID device like this;

mdadm --create /dev/md0 --level=10 -n3 -pf2 /dev/sda1 /dev/sdb1 /dev/sdc1

Wait! I hear you cry (no really, everyone says this ...), you can't create a RAID10 array with an odd number of disks, you need to have 2, 4 or 8!

Ok, gather all the limitations you've grown up with regarding RAID, stick them all in a bag, then chuck the bag out of the window. You can create a RAID array on any number of disks greater than 1.
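Want to convince yourself it took? Just ask mdadm (this assumes the /dev/md0 from the example above); the output should show a raid10 level, a far=2 layout and 3 raid devices;

# mdadm --detail /dev/md0

From there it's a normal block device, so you can put a filesystem straight on it, and on Debian or Ubuntu you'd typically persist the array so it assembles at boot with something like;

# mkfs.ext4 /dev/md0
# mdadm --detail --scan >> /etc/mdadm/mdadm.conf
# update-initramfs -u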

Moreover, you don't need any special RAID controller drivers that mean recompiling the kernel; you get a native interface to the RAID subsystem via /proc, so to watch a RAID rebuild for example;

# watch cat /proc/mdstat 
Personalities : [raid10] 
md0 : active raid10 sdd1[1] sda1[0]
      488254464 blocks super 1.2 512K chunks 2 far-copies [2/2] [UU]
      [=>...................]  resync =  8.4% (41267328/488254464) finish=131.6min speed=56578K/sec
      bitmap: 4/4 pages [16KB], 65536KB chunk

And of course you can fail, add and remove drives from within the OS (again with no special 3rd party software), and you can run hot spares, standbys etc. Even better, you can control the resync resource usage, so if you need to do a rebuild while the system is in use, you can limit the sync bandwidth so the machine doesn't grind to a halt under your users. See the sketch below.
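By way of example (assuming /dev/sdb1 is a member of /dev/md0, and /dev/sde1 is a hypothetical replacement drive .. swap in your own device names), swapping out a drive looks like this;

# mdadm /dev/md0 --fail /dev/sdb1
# mdadm /dev/md0 --remove /dev/sdb1
# mdadm /dev/md0 --add /dev/sde1

The --add kicks off a resync, and if that's happening during the working day you can throttle it via the kernel's md sysctls (the figures are in KB/sec, pick numbers to suit);

# sysctl -w dev.raid.speed_limit_max=10000
# sysctl -w dev.raid.speed_limit_min=1000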

One last titbit: when you install the "mdadm" package, or at least when your system installs it for you, it will set up a scheduled job that periodically (typically monthly) runs a check on the array to make sure it's "sane". Check /etc/default/mdadm on Debian or Ubuntu systems .. you might either want to change the scheduling to ensure it happens off-peak, or indeed disable this feature.
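If you want to poke at it, on Debian-derived systems the job lives in /etc/cron.d/mdadm and calls the checkarray script; underneath it all is just the kernel's md sync_action interface, so I believe these two are equivalent ways of kicking off a check by hand (assuming your array is md0);

# /usr/share/mdadm/checkarray /dev/md0
# echo check > /sys/block/md0/md/sync_action

Either way you can watch the progress in /proc/mdstat, exactly as you would a rebuild.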