Linux: Check Disks Behind an Adaptec RAID Controller with smartctl

Posted by admin on August 26, 2024

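I can use the "smartctl -d ata -a /dev/sdb" command to read the health status of hard disks attached directly to my system. But how do I use the smartctl command from the Linux shell prompt to check SAS or SCSI disks that sit behind an Adaptec RAID controller?

Adaptec RAID controllers typically present one (logical) disk to the operating system for each array of (physical) disks, so the physical SATA or SAS disks need the syntax shown here. The SCSI generic devices /dev/sgX can be used as pass-through I/O controls that give direct access to each physical disk behind the Adaptec RAID controller.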

Tutorial details
Difficulty level	Intermediate
Root privileges	Yes
Requirements	Linux with the smartctl command and an Adaptec RAID card
Est. reading time	5 minutes

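Checking disks behind an Adaptec RAID controller with smartctl on Linux

Use the following commands to find out whether the RAID card is detected and to get information about each disk.

Does Linux detect my Adaptec RAID card?

Type the following command:
# lspci | grep -Ei 'raid|adaptec'
Sample outputs: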

81:00.0 RAID bus controller: Adaptec AAC-RAID (rev 09)

We can also use the cat command to see which disks behind the RAID controller are visible to Linux:
# cat /proc/scsi/scsi
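
To relate those entries to the /dev/sgX names used later, the listing can be numbered in order. The following is a minimal sketch, not a definitive mapping: the function name list_scsi_entries is made up for this example, and it assumes that the Nth entry (counting from 0) corresponds to /dev/sgN, which is typical but not guaranteed. Running `lsscsi -g`, where the lsscsi package is installed, is a way to confirm the real mapping.

```shell
#!/bin/sh
# Number the entries of /proc/scsi/scsi-style text read from stdin.
# ASSUMPTION: entry N (counting from 0) corresponds to /dev/sgN.
list_scsi_entries() {
    awk '
        /^Host:/ { n++ }                        # each entry starts with a Host: line
        /Vendor:/ {
            line = $0
            sub(/^.*Vendor:[ \t]*/, "", line)   # keep the text after "Vendor:"
            vendor = line
            sub(/[ \t]*Model:.*$/, "", vendor)  # cut the vendor at "Model:"
            model = line
            sub(/^.*Model:[ \t]*/, "", model)   # keep the text after "Model:"
            sub(/[ \t]*Rev:.*$/, "", model)     # cut the model at "Rev:"
            printf "sg%d %s %s\n", n - 1, vendor, model
        }
    '
}

# Usage on a real system:
#   list_scsi_entries < /proc/scsi/scsi
```

Each output line has the form "sgN vendor model", which makes it easier to spot the logical RAID device and the physical disks behind it.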

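Download and install Adaptec Storage Manager

You need to install the Adaptec Storage Manager build that matches your installed RAID card and your Linux distribution. The software is available from Adaptec's download page.

SATA disk health check syntax

To scan for disks, enter:
# smartctl --scan
Sample outputs: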

/dev/sda -d scsi # /dev/sda, SCSI device
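
If the scan result needs to be used in a script, the device name and device type can be pulled out of each line. This is a small sketch that assumes the scan output keeps the "device -d type # comment" layout shown above; the function name scan_devices is hypothetical.

```shell
#!/bin/sh
# Print "<device> <type>" for every `smartctl --scan` line read from stdin.
# ASSUMPTION: each line looks like "/dev/sda -d scsi # /dev/sda, SCSI device".
scan_devices() {
    awk '$2 == "-d" { print $1, $3 }'
}

# Usage on a real system (as root):
#   smartctl --scan | scan_devices
```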

In this example, /dev/sda is reported as a SCSI device. This RAID device is made up of 4 disks, available as /dev/sg{1,2,3,4}. Type the following smartctl commands to check the SATA disks behind the /dev/sda RAID device:
# smartctl -d sat --all /dev/sgX
# smartctl -d sat --all /dev/sg1

To ask the device to report its SMART health status or pending TapeAlert message (if any), run:
# smartctl -d sat --all /dev/sg1 -H

For SAS disks, use the following syntax:
# smartctl -d scsi --all /dev/sgX
# smartctl -d scsi --all /dev/sg1
### Ask the device to report its SMART health status or pending TapeAlert message ###
# smartctl -d scsi --all /dev/sg1 -H
Sample outputs:

smartctl version 5.38 [x86_64-redhat-linux-gnu] Copyright (C) 2002-8 Bruce Allen
Home page is http://smartmontools.sourceforge.net/
 
Device: SEAGATE  ST3146855SS      Version: 0002
Serial number: xxxxxxxxxxxxxxx
Device type: disk
Transport protocol: SAS
Local Time is: Wed Jul  7 04:34:30 2010 CDT
Device supports SMART and is Enabled
Temperature Warning Enabled
SMART Health Status: OK
 
Current Drive Temperature:     24 C
Drive Trip Temperature:        68 C
Elements in grown defect list: 0
Vendor (Seagate) cache information
  Blocks sent to initiator = 1857385803
  Blocks received from initiator = 1967221471
  Blocks read from cache and sent to initiator = 804439119
  Number of read and write commands whose size <= segment size = 312098925
  Number of read and write commands whose size > segment size = 45998
Vendor (Seagate/Hitachi) factory information
  number of hours powered up = 13224.42
  number of minutes until next internal SMART test = 42
 
Error counter log:
           Errors Corrected by           Total   Correction     Gigabytes    Total
               ECC          rereads/    errors   algorithm      processed    uncorrected
           fast | delayed   rewrites  corrected  invocations   [10^9 bytes]  errors
read:   58984049        1         0  58984050   58984050       3151.730           0
write:         0        0         0         0          0   9921230881.600           0
verify:     1308        0         0      1308       1308          0.000           0
 
Non-medium error count:        0
No self-tests have been logged
Long (extended) Self Test duration: 1367 seconds [22.8 minutes]
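
For monitoring scripts, the one line that matters most in this report is the health status. The sketch below extracts it from smartctl output on stdin. It assumes the SAS/SCSI line reads "SMART Health Status: ..." as in the sample above and that SATA disks report "SMART overall-health self-assessment test result: ..."; the function name smart_health is made up for this example.

```shell
#!/bin/sh
# Print the health verdict found in smartctl output read from stdin.
# Exits non-zero when no health line is present at all.
smart_health() {
    awk '
        /^SMART Health Status:/ ||
        /^SMART overall-health self-assessment test result:/ {
            verdict = $0
            sub(/^[^:]*:[ \t]*/, "", verdict)   # drop the label up to the colon
            print verdict
            found = 1
        }
        END { exit (found ? 0 : 1) }
    '
}

# Usage on a real system (as root):
#   smartctl -H -d scsi /dev/sg1 | smart_health
```

Anything other than OK (SAS/SCSI) or PASSED (SATA) is worth a closer look at the full report.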

Here is another output, this time for the SAS disk /dev/sg2:
# smartctl -d scsi --all /dev/sg2 -H
Sample outputs:

Fig.01: Checking disks behind an Adaptec RAID controller with smartctl (hardware RAID status from the Linux command line)

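Replace /dev/sg1 with your own disk number. If you have a RAID 10 array with 4 disks, then:

/dev/sg0 – RAID 10 controller (you will not get any SMART information for /dev/sg0).
/dev/sg1 – First disk in the RAID 10 array.
/dev/sg2 – Second disk in the RAID 10 array.
/dev/sg3 – Third disk in the RAID 10 array.
/dev/sg4 – Fourth disk in the RAID 10 array.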
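
With that layout, the health check can be repeated for every member disk in one loop. The following is a sketch under the same assumptions as the list above (disks at /dev/sg1 to /dev/sg4); the function name check_array_disks and the SMARTCTL override variable are hypothetical helpers, not part of smartmontools.

```shell
#!/bin/sh
# SMARTCTL can be overridden, for example with a stub, for a dry run.
SMARTCTL=${SMARTCTL:-smartctl}

# check_array_disks TYPE N...
# Runs "smartctl -H -d TYPE /dev/sgN" for each N and remembers any failure.
check_array_disks() {
    dtype=$1
    shift
    rc=0
    for n in "$@"; do
        echo "== /dev/sg$n =="
        # smartctl exits non-zero when it cannot read the disk or reports a problem
        "$SMARTCTL" -H -d "$dtype" "/dev/sg$n" || rc=1
    done
    return "$rc"
}

# Usage on a real system (as root), for four SAS disks:
#   check_array_disks scsi 1 2 3 4
```

Use sat instead of scsi as the first argument for SATA disks.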

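How do I run a hard disk check?

Type the following commands:
# smartctl -t short -d scsi /dev/sg2
# smartctl -t long -d scsi /dev/sg2
Where,

-t short: Run a short self-test.
-t long: Run a long (extended) self-test.
-d scsi: Specify scsi as the device type.
--all: Show all SMART information for the device.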
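
A long test runs in the background, so it helps to know how long to wait before reading the result. The sketch below pulls the advertised duration out of `smartctl --all` output. It assumes the line looks like "Long (extended) Self Test duration: 1367 seconds [22.8 minutes]" as in the earlier sample (a "Self-test" spelling is also accepted); the function name long_test_seconds is made up for this example.

```shell
#!/bin/sh
# Print the extended self-test duration in seconds, read from
# `smartctl --all` output on stdin. Prints nothing if the line is absent.
long_test_seconds() {
    sed -n 's/^Long (extended) Self[ -][Tt]est duration: \([0-9][0-9]*\) seconds.*/\1/p'
}

# Usage on a real system (as root):
#   secs=$(smartctl -d scsi --all /dev/sg2 | long_test_seconds)
#   smartctl -t long -d scsi /dev/sg2
#   sleep "$secs"
#   smartctl -l selftest -d scsi /dev/sg2   # show the self-test log
```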

How do I use Adaptec Storage Manager?

Another simple console command that checks only the basic status is as follows:
# /usr/StorMan/arcconf getconfig 1 | more
# /usr/StorMan/arcconf getconfig 1 | grep State
# /usr/StorMan/arcconf getconfig 1 | grep -B 3 State
Sample outputs:

----------------------------------------------------------------------
      Device #0
         Device is a Hard drive
         State                              : Online
--
         S.M.A.R.T.                         : No
      Device #1
         Device is a Hard drive
         State                              : Online
--
         S.M.A.R.T.                         : No
      Device #2
         Device is a Hard drive
         State                              : Online
--
         S.M.A.R.T.                         : No
      Device #3
         Device is a Hard drive
         State                              : Online
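
Instead of reading the list by eye, a script can report only the devices whose state is not Online. This is a minimal sketch that assumes the "Device #N" and "State : ..." layout shown above and treats every state other than Online as worth reporting; the function name non_online_devices is hypothetical.

```shell
#!/bin/sh
# Read arcconf getconfig output from stdin and print "<device> <state>"
# for every physical device whose State is not "Online".
non_online_devices() {
    awk '
        /Device #[0-9]+/ { dev = $2 }            # remember the current device, e.g. "#1"
        /^[ \t]*State[ \t]*:/ {
            state = $0
            sub(/^[^:]*:[ \t]*/, "", state)      # drop the label up to the colon
            sub(/[ \t\r]+$/, "", state)          # trim trailing whitespace
            if (state != "Online") print dev, state
        }
    '
}

# Usage on a real system (as root):
#   /usr/StorMan/arcconf getconfig 1 | non_online_devices
```

Empty output means every listed device reported Online.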

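Note that newer versions of arcconf are located in the /usr/Adaptec_Event_Monitor directory, so the full path must be as follows (the controller number, 1 in the examples in this article, comes right after getconfig):
# /usr/Adaptec_Event_Monitor/arcconf getconfig <Controller#> [AD | LD [LD#] | PD | MC | [AL]] [nologs]
Where,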

 Prints controller configuration information.


    Option  AD  : Adapter information only
            LD  : Logical device information only
            LD# : Optionally display information about the specified logical device
            PD  : Physical device information only
            MC  : Maxcache 3.0 information only
            AL  : All information (optional)
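
Because the binary lives in different directories depending on the version, a script can probe both locations before calling it. The sketch below uses only the two paths named in this article; the function name find_arcconf is made up for this example.

```shell
#!/bin/sh
# Print the first candidate path that exists and is executable.
# Returns non-zero when none of the candidates qualifies.
find_arcconf() {
    for candidate in "$@"; do
        if [ -x "$candidate" ]; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    return 1
}

# Usage on a real system (as root), physical device information only:
#   ARCCONF=$(find_arcconf /usr/Adaptec_Event_Monitor/arcconf /usr/StorMan/arcconf) \
#       && "$ARCCONF" getconfig 1 PD
```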

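How do I check the health of the Adaptec RAID array itself on Linux?

Simply use the following command:
# /usr/Adaptec_Event_Monitor/arcconf getconfig 1
OR (older version)
# /usr/StorMan/arcconf getconfig 1
Sample outputs:

Fig.02: Device #1 is online while device #2 has failed, i.e. the array is degraded.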
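
The array status can also be extracted for use in a cron job or monitoring check. The following is a sketch only: it assumes the logical device section of the arcconf output contains a line of the form "Status of logical device : ...", which is not shown in this article and may differ between arcconf versions, and the function name logical_device_status is hypothetical.

```shell
#!/bin/sh
# Print the status of each logical device from arcconf output on stdin.
# ASSUMPTION: the output contains lines such as
#   "Status of logical device : <status>" (matched case-insensitively).
logical_device_status() {
    awk '
        tolower($0) ~ /status of logical device[ \t]*:/ {
            status = $0
            sub(/^[^:]*:[ \t]*/, "", status)   # drop the label up to the colon
            sub(/[ \t\r]+$/, "", status)       # trim trailing whitespace
            print status
        }
    '
}

# Usage on a real system (as root), logical device information only:
#   /usr/Adaptec_Event_Monitor/arcconf getconfig 1 LD | logical_device_status
```

If the function prints nothing, check the raw output first, since the label may be worded differently on your controller.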

You have just learned how to use the smartctl command to check disks behind an Adaptec RAID controller.

This entry is 3 of 9 in the smartctl (smartd) tutorial series.