cciss_vol_status - show status of logical drives attached to HP Smartarray controllers

NAME

       cciss_vol_status  -  show  status  of  logical  drives  attached  to HP
       Smartarray controllers

SYNOPSIS

       cciss_vol_status [OPTION] [DEVICE]...

DESCRIPTION

       Shows  the  status  of  logical  drives  configured  on  HP  Smartarray
       controllers.

OPTIONS

       -p, --persnickety
              Without  this  option,  device  nodes  which can’t be opened, or
              which are not found  to  be  of  the  correct  device  type, are
              silently   ignored.    This   lets   you  use  wildcards,  e.g.:
              cciss_vol_status /dev/sg* /dev/cciss/c*d0, and the program  will
              not complain as long as all devices which are found to be of the
              correct type are found to be  ok.   However,  you  may  wish  to
              explicitly  list  the  devices  you  expect  to be there, and be
              notified if they are not there (e.g.  perhaps  a  PCI  slot  has
              died,  and  the  system  has  rebooted,  so  that  what was once
              /dev/cciss/c1d0 is no longer there at all).   This  option  will
              cause the program to complain about any device node listed which
              does not appear to be the right device type, or is not openable.
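
              The difference between the two modes can be sketched from a
              monitoring script's point of view.  The Python fragment below is
              only an illustration: build_argv is a hypothetical helper, not
              part of cciss_vol_status, and it assumes the program is on the
              PATH.  Without -p it expands wildcards roughly the way a shell
              would (an unmatched pattern is left as a literal); with -p it
              passes the explicitly listed nodes through untouched so that a
              missing node gets reported.

```python
import glob


def build_argv(devices, persnickety=False):
    """Build a cciss_vol_status command line (hypothetical helper)."""
    argv = ["cciss_vol_status"]
    if persnickety:
        # Explicit device list: pass the names verbatim so that -p can
        # complain about any node that is missing or of the wrong type.
        argv.append("-p")
        argv.extend(devices)
        return argv
    for pattern in devices:
        # Wildcard mode: expand like a shell would. An unmatched pattern is
        # kept as-is; without -p, nodes that cannot be opened are ignored.
        matches = sorted(glob.glob(pattern))
        argv.extend(matches if matches else [pattern])
    return argv


# Example calls (not executed here):
#   build_argv(["/dev/sg*", "/dev/cciss/c*d0"])
#   build_argv(["/dev/cciss/c0d0", "/dev/cciss/c1d0"], persnickety=True)
```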

       -C, --copyright
              If stderr is a terminal, print out a copyright message and
              exit.

       -q, --quiet
              This option doesn’t do anything.  Previously, without this
              option and if stderr was a terminal, a copyright message
              preceded the normal program output.  Now, the copyright
              message is only printed via the -C option.

       -u, --try-unknown-devices
              If  a  device has an unrecognized board ID, normally the program
              will not attempt to communicate with it.  In case you have  some
              Smart  Array  controller  which  is newer than this program, the
              program may not recognize it.  This option permits  the  program
              to  attempt  to interrogate the board even if it is unrecognized
              on the assumption that it is in fact a Smart Array of some kind.

       -v, --version
              Print the version number and exit.

       -x, --exhaustive
              Deprecated.  Previously, it "exhaustively" searched for logical
              drives because, under some circumstances, some logical drives
              might otherwise be missed.  This option no longer does anything,
              as the algorithm for finding logical drives was changed to
              obviate the need for it.

DEVICE

       The  DEVICE  argument indicates which RAID controller is to be queried.
       Note that it indicates which RAID controller, not which logical drive.

       For the cciss driver, the "d0" nodes matching "/dev/cciss/c*d0" are the
       nodes which correspond to the RAID controllers.  (See note  1,  below.)
       It  is  not  necessary to invoke cciss_vol_status on each logical drive
       individually, though if you do this,  each  time  it  will  report  the
       status of ALL logical drives on the controller.

       For the hpsa driver, or for fibre attached MSA1000 family devices, or
       for the hpahcisr software RAID driver which emulates Smart Arrays, the
       RAID controller is accessed via the scsi generic driver, and the device
       nodes will match "/dev/sg*".  Some variants of the "lsscsi" tool will
       easily identify which device node corresponds to the RAID controller.
       Some variants may only report the SCSI nexus (controller/bus/target/lun
       tuple).  Some distros may not have the lsscsi tool.

       Executing  the  following  query to the /sys filesystem and correlating
       this with the contents of /proc/scsi/scsi or output of lsscsi can  help
       in finding the right /dev/sg node to use with cciss_vol_status:

       wumpus:/home/scameron # ls -l /sys/class/scsi_generic/*
       lrwxrwxrwx 1 root root 0 2009-11-18 12:31 /sys/class/scsi_generic/sg0 -> ../../devices/pci0000:00/0000:00:02.0/0000:02:00.0/0000:03:03.0/host0/target0:0:0/0:0:0:0/scsi_generic/sg0
       lrwxrwxrwx 1 root root 0 2009-11-18 12:31 /sys/class/scsi_generic/sg1 -> ../../devices/pci0000:00/0000:00:1f.1/host2/target2:0:0/2:0:0:0/scsi_generic/sg1
       lrwxrwxrwx 1 root root 0 2009-11-19 07:47 /sys/class/scsi_generic/sg2 -> ../../devices/pci0000:00/0000:00:05.0/0000:0e:00.0/host4/target4:3:0/4:3:0:0/scsi_generic/sg2
       wumpus:/home/scameron # cat /proc/scsi/scsi
       Attached devices:
       Host: scsi0 Channel: 00 Id: 00 Lun: 00
         Vendor: COMPAQ   Model: BD03685A24       Rev: HPB6
         Type:   Direct-Access                    ANSI  SCSI revision: 03
       Host: scsi2 Channel: 00 Id: 00 Lun: 00
         Vendor: SAMSUNG  Model: CD-ROM SC-148A   Rev: B408
         Type:   CD-ROM                           ANSI  SCSI revision: 05
       Host: scsi4 Channel: 03 Id: 00 Lun: 00
         Vendor: HP       Model: P800             Rev: 6.82
         Type:   RAID                             ANSI  SCSI revision: 00
       wumpus:/home/scameron # lsscsi
       [0:0:0:0]    disk    COMPAQ   BD03685A24       HPB6  /dev/sda
       [2:0:0:0]    cd/dvd  SAMSUNG  CD-ROM SC-148A   B408  /dev/sr0
       [4:3:0:0]    storage HP       P800             6.82  -

       From  the  above  you  can  see that /dev/sg2 corresponds to SCSI nexus
       4:3:0:0, which corresponds to the HP P800  RAID  controller  listed  in
       /proc/scsi/scsi.
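
       The same correlation can be automated.  The following Python sketch
       is an illustration only, written against the sample output shown
       above; the function names are hypothetical, and the regular
       expressions assume that the sysfs link targets and /proc/scsi/scsi
       have exactly the layout seen in that sample.  It extracts the SCSI
       nexus from each scsi_generic link and keeps the nodes whose
       /proc/scsi/scsi entry has Type "RAID".

```python
import glob
import os
import re


def read_sg_links(sysfs_dir="/sys/class/scsi_generic"):
    """Map sg node names (e.g. 'sg2') to their sysfs symlink targets."""
    links = {}
    for path in glob.glob(os.path.join(sysfs_dir, "sg*")):
        if os.path.islink(path):
            links[os.path.basename(path)] = os.readlink(path)
    return links


def nexus_from_sysfs_target(target):
    """Extract 'host:channel:id:lun' from a scsi_generic link target."""
    m = re.search(r"/(\d+:\d+:\d+:\d+)/scsi_generic/sg\d+$", target)
    return m.group(1) if m else None


def raid_nexuses(proc_scsi_text):
    """Return the nexus of every /proc/scsi/scsi entry whose Type is RAID."""
    found = []
    current = None
    for line in proc_scsi_text.splitlines():
        m = re.match(
            r"\s*Host: scsi(\d+) Channel: (\d+) Id: (\d+) Lun: (\d+)", line)
        if m:
            # "scsi4 Channel: 03 Id: 00 Lun: 00" becomes "4:3:0:0"
            current = ":".join(str(int(g)) for g in m.groups())
        elif current and re.match(r"\s*Type:\s+RAID\b", line):
            found.append(current)
    return found


def find_raid_sg_nodes(sg_links, proc_scsi_text):
    """Return the /dev/sg* nodes whose nexus belongs to a RAID device."""
    raid = set(raid_nexuses(proc_scsi_text))
    return sorted("/dev/" + name for name, target in sg_links.items()
                  if nexus_from_sysfs_target(target) in raid)


# On a live system (not executed here):
#   with open("/proc/scsi/scsi") as f:
#       print(find_raid_sg_nodes(read_sg_links(), f.read()))
```

       Fed the sample data above, find_raid_sg_nodes returns only /dev/sg2.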

EXAMPLE

            [root@somehost]# cciss_vol_status -q /dev/cciss/c*d0
            /dev/cciss/c0d0: (Smart Array P800) RAID 0 Volume 0 status: OK.
            /dev/cciss/c0d0: (Smart Array P800) RAID 0 Volume 1 status: OK.
            /dev/cciss/c0d0: (Smart Array P800) RAID 1 Volume 2 status: OK.
            /dev/cciss/c0d0: (Smart Array P800) RAID 5 Volume 4 status: OK.
            /dev/cciss/c0d0: (Smart Array P800) RAID 5 Volume 5 status: OK.
            /dev/cciss/c0d0: (Smart Array P800) Enclosure MSA60 (S/N: USP6340B3F) on Bus 2, Physical Port 1E status: Power Supply Unit failed
            /dev/cciss/c1d0: (Smart Array P800) RAID 5 Volume 0 status: OK.
            /dev/cciss/c1d0: (Smart Array P800) RAID 5 Volume 1 status: OK.
            /dev/cciss/c1d0: (Smart Array P800) RAID 5 Volume 2 status: OK.
            /dev/cciss/c1d0: (Smart Array P800) RAID 5 Volume 3 status: OK.
            /dev/cciss/c1d0: (Smart Array P800) RAID 5 Volume 4 status: OK.
            /dev/cciss/c1d0: (Smart Array P800) RAID 5 Volume 5 status: OK.
            /dev/cciss/c1d0: (Smart Array P800) RAID 5 Volume 6 status: OK.
            /dev/cciss/c1d0: (Smart Array P800) RAID 5 Volume 7 status: OK.

            [root@someotherhost]# cciss_vol_status -q /dev/sg0 /dev/cciss/c*d0
            /dev/sg0: (MSA1000) RAID 1 Volume 0 status: OK.   At least one spare drive.
            /dev/sg0: (MSA1000) RAID 5 Volume 1 status: OK.
            /dev/cciss/c0d0: (Smart Array P800) RAID 0 Volume 0 status: OK.
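
       Output like the above is regular enough to be parsed by a script.
       The Python sketch below is an assumption-laden illustration: the line
       format is inferred only from the samples shown here, and the names
       LINE_RE, StatusLine, parse_line and problems are hypothetical.  It
       splits each line into device, controller model, subject (volume or
       enclosure) and status text, and collects the entries whose status
       does not begin with "OK.".

```python
import re
from typing import NamedTuple

# Format inferred from the sample output:
#   <device>: (<model>) <subject> status: <status text>
LINE_RE = re.compile(
    r"^(?P<device>\S+): \((?P<model>[^)]*)\) "
    r"(?P<subject>.*?) status: (?P<status>.*)$")


class StatusLine(NamedTuple):
    device: str
    model: str
    subject: str
    status: str


def parse_line(line):
    """Parse one output line; return None if it does not fit the format."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    return StatusLine(m.group("device"), m.group("model"),
                      m.group("subject"), m.group("status").strip())


def problems(output):
    """Return the parsed lines whose status does not start with 'OK.'."""
    parsed = (parse_line(line) for line in output.splitlines())
    return [p for p in parsed
            if p is not None and not p.status.startswith("OK.")]
```

       For the first sample run above, problems() would return a single
       entry, the MSA60 enclosure with the failed power supply unit.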

DIAGNOSTICS

       Normally,  a logical drive in good working order should report a status
       of "OK."  Possible status values are:

       "OK." (0) - The logical drive is in good working order.

       "FAILED." (1) - The logical drive has failed,  and  no  i/o  to  it  is
       possible.

       "Using interim recovery mode." (3) - One or more drives have failed,
              but not so many that the logical drive can  no  longer  operate.
              The failed drives should be replaced as soon as possible.

       "Ready for recovery operation." (4) -  Failed drive(s) have been
              replaced,  and  the  controller  is  about  to  begin rebuilding
              redundant parity data.

       "Currently recovering." (5) - Failed drive(s) have been replaced,
              and the controller  is  currently  rebuilding  redundant  parity
              information.

       "Wrong physical drive was replaced." (6) - A drive has failed, and
              another (working) drive was replaced instead.

       "A physical drive is not properly connected." (7) - There is some
              cabling or backplane problem in the drive enclosure.

       (From fwspecwww.doc, see cpqarray project on sourceforge.net):
              Note:  If  the  unit_status value is 6 (Wrong physical drive was
              replaced) or 7 (A physical drive is not properly connected), the
              unit_status  of  all  other  configured  logical  drives will be
              marked as 1 (Logical drive failed). This is to force the user to
              correct  the  problem  and  to  insure  that once the problem is
              corrected, the data will not have been  corrupted  by  any  user
              action.

       "Hardware is overheating." (8) - Hardware is too hot.

       "Hardware was overheated." (9) - At some point in the past,
              the hardware got too hot.

       "Currently expannding." (10) - The controller is currently in the
              process of expanding a logical drive.

       "Not yet available." (11) - The logical drive is not yet finished
              being configured.

       "Queued for expansion." (12) - The logical drive will be expanded
              when the controller is able to begin working on it.
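
       For scripts that want the numeric value rather than the text, the
       list above can be turned into a lookup table.  This Python sketch is
       illustrative only: STATUS_MESSAGES and status_code are hypothetical
       names, the message strings are copied as written in this page
       (including spelling and the trailing period), and whether the program
       prints them in exactly this form in every case is an assumption.

```python
# Status values and messages as listed in this page. Value 2 is not
# described here, so it is left out rather than guessed.
STATUS_MESSAGES = {
    0: "OK.",
    1: "FAILED.",
    3: "Using interim recovery mode.",
    4: "Ready for recovery operation.",
    5: "Currently recovering.",
    6: "Wrong physical drive was replaced.",
    7: "A physical drive is not properly connected.",
    8: "Hardware is overheating.",
    9: "Hardware was overheated.",
    10: "Currently expannding.",  # spelling reproduced from the list above
    11: "Not yet available.",
    12: "Queued for expansion.",
}


def status_code(status_text):
    """Return the status value whose message begins status_text, or None.

    Prefix matching is used because spare drive messages may follow the
    status message on the same line.
    """
    for code, message in STATUS_MESSAGES.items():
        if status_text.startswith(message):
            return code
    return None
```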

       Additionally,  the  following messages may appear regarding spare drive
       status:

            "At least one spare drive designated"
            "At least one spare drive activated and currently rebuilding"
            "At least one activated on-line spare drive is completely rebuilt on this logical drive"
            "At least one spare drive has failed"
            "At least one spare drive activated"
            "At least one spare drive remains available"

       For each logical drive, the total number of failed physical drives,  if
       more than zero, will be reported as:

                   "Total of n failed physical drives detected on this logical drive."

       with "n" replaced by the actual number, of course.

       Additionally, failure conditions of disk enclosure fans, power supplies,
       and temperature are reported as follows:

            "Fan failed"
            "Temperature problem"
            "Door alert"
            "Power Supply Unit failed"

FILES

       /dev/cciss/c*d0 (Smart Array PCI controllers using the cciss driver)
       /dev/sg*  (Fibre  attached  MSA1000   controllers   and   Smart   Array
       controllers using the hpsa driver or hpahcisr software RAID driver.)

EXIT CODES

       0 - All configured logical drives queried have status of "OK."

       1  -  One  or  more configured logical drives queried have status other
       than "OK."
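
       A monitoring job can rely on the exit code alone.  The Python sketch
       below is a minimal illustration under stated assumptions: volumes_ok
       is a hypothetical helper, the command line is supplied by the caller,
       and any non-zero exit code is treated as "not OK" even though only 0
       and 1 are documented here.

```python
import subprocess


def volumes_ok(argv):
    """Run the given command; return (all_ok, combined_output).

    all_ok is True only when the exit code is 0, i.e. all configured
    logical drives queried reported a status of "OK.".
    """
    result = subprocess.run(argv, stdout=subprocess.PIPE,
                            stderr=subprocess.STDOUT,
                            universal_newlines=True)
    return result.returncode == 0, result.stdout


# Example call (not executed here; requires cciss_vol_status on the PATH):
#   ok, output = volumes_ok(["cciss_vol_status", "/dev/cciss/c0d0"])
#   if not ok:
#       print(output)
```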

AUTHOR

       Written by Stephen M. Cameron

REPORTING BUGS

       MSA500 G1 logical drive numbers may not be reported correctly.

       I’ve seen enclosure serial numbers contain garbage.

       Report bugs to <steve.cameron@hp.com>

COPYRIGHT

       Copyright © 2007 Hewlett-Packard Development Company, L.P.
       This is free software; see the source for copying conditions.  There is
       NO warranty; not even for MERCHANTABILITY or FITNESS FOR  A  PARTICULAR
       PURPOSE.

NOTE 1

       The  /dev/cciss/c*d0  device  nodes of the cciss driver do double duty.
       They serve as an access point to both the RAID controllers, and to  the
       first   logical   drive   of  each  RAID  controller.   Notice  that  a
       /dev/cciss/c*d0 node will be present for each  controller  even  if  no
       logical  drives are configured on that controller.  It might be cleaner
       if the driver had a  special  device  node  just  for  the  controller,
       instead  of making these device nodes do double duty.  It has been like
       that since the 2.2 linux kernel timeframe.  At that time, device  major
       and  minor nodes were statically allocated at compile time, and were in
       short supply.  Changing this behavior at this point would break lots of
       userland programs.
