3Ware RAID rebuilding

I’ve had the dubious honour of seeing some RAID failures and rebuilds lately. It’s the kind of thing the manuals don’t cover very well, in particular what your RAID will report when it’s having trouble. So here are a couple of examples from a 3Ware RAID controller, using the tw_cli utility. This is what tw_cli /c4 show displays when we have a dead drive:

Unit  UnitType  Status         %RCmpl  %V/I/M  Stripe  Size(GB)  Cache  AVrfy
------------------------------------------------------------------------------
u0    RAID-1    DEGRADED       -       -       -       149.05    ON     -      

Port   Status           Unit   Size        Blocks        Serial
---------------------------------------------------------------
p0     OK               u0     149.05 GB   312581808     G2109NHG            
p1     DEGRADED         u0     149.05 GB   312581808     G20X1BWG            

So, we swap the drive, and it looks like this while rebuilding:

Unit  UnitType  Status         %RCmpl  %V/I/M  Stripe  Size(GB)  Cache  AVrfy
------------------------------------------------------------------------------
u0    RAID-1    REBUILDING     89      -       -       149.05    ON     -

Port   Status           Unit   Size        Blocks        Serial
---------------------------------------------------------------
p0     OK               u0     149.05 GB   312581808     G2109NHG
p1     DEGRADED         u0     149.05 GB   312581808     G209Y0HG

and after a little while…

Unit  UnitType  Status         %RCmpl  %V/I/M  Stripe  Size(GB)  Cache  AVrfy
------------------------------------------------------------------------------
u0    RAID-1    OK             -       -       -       149.05    ON     -

Port   Status           Unit   Size        Blocks        Serial
---------------------------------------------------------------
p0     OK               u0     149.05 GB   312581808     G2109NHG
p1     OK               u0     149.05 GB   312581808     G209Y0HG

There are plenty of obvious strings to match in this output, so it’s a reasonable thing to base a monitoring script on (tw_cli has many other reports available, too).

It’s nice to see it actually work, and it makes me extremely grateful that I bothered getting RAID in the first place. This would be a much unhappier post if I hadn’t.

Dell RAID firmware and lockfile on Ubuntu

Ian P. Christian ran into this problem a while ago:

On a seperate note, anyone know how to upgrade firmware using Dell’s software
on a non-RH system?

# ./RAID_FRMW_LX_R107404.BIN
/root
which: no lockfile in
(/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/opt/bin:/usr/i686-pc-linux-gnu/gcc-bin/3.3.6)
spsetup.sh: Cannot find utilities on the system to execute package.
Make sure the following utilities are in the path: sed lockfile tail rm mkdir
chmod ls basename
/root

‘lockfile’ is missing – whatever that is!

Lockfile just doesn’t seem to exist outside of Red Hat (e.g. lockfile-progs on Debian doesn’t include it). However, you can of course find it on a Red Hat system, and I happen to have one handy. I just copied the binary to my Ubuntu installation, where it appeared to run just fine and allowed the firmware updaters to run OK. Thought someone might like to know.
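If you want to check up front whether a box has everything the updater complains about, a small sketch like the following would do it. The utility list is taken from the error message quoted above. The function name missing_utilities is just an illustration, and this only checks that each name is on the PATH, not that it behaves the way the Dell script expects.

```python
import shutil

# Utilities named in the spsetup.sh error message quoted above.
REQUIRED = ["sed", "lockfile", "tail", "rm", "mkdir", "chmod", "ls", "basename"]


def missing_utilities(names=REQUIRED, path=None):
    """Return the names that cannot be found on the search path."""
    return [name for name in names if shutil.which(name, path=path) is None]
```

Printing missing_utilities() before running the .BIN package shows which tools still need to be installed or copied over.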