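Linux / UNIX: Smartctl Check Hard Disks Behind a 3Ware RAID Card
I have servers running CentOS Linux and FreeBSD with a 3Ware (9650SE-2LP) RAID card. I know how to check hard disks behind an Adaptec RAID controller, but how do I use the smartctl command to see ATA/SATA disks behind a 3ware SCSI RAID controller?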
You need to use the following device names:
a) IDE / ATA – /dev/hd[a-t]
b) SCSI / SATA – /dev/sd[a-z] or /dev/twe[0-9] or /dev/twa[0-9]
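As a rough illustration of the naming scheme above, the following sketch lists which 3ware controller nodes currently exist. The helper name list_3ware_nodes and its optional directory argument are assumptions made for this example. An empty result does not prove that no controller is installed, because smartctl may create the nodes on demand.

```shell
#!/bin/sh
# list_3ware_nodes (hypothetical helper): print every /dev/twa[0-9] and
# /dev/twe[0-9] node that exists. The directory can be overridden so the
# function can be tried without real hardware.
list_3ware_nodes() {
    dev_dir=${1:-/dev}
    for node in "$dev_dir"/twa[0-9] "$dev_dir"/twe[0-9]; do
        # An unmatched glob stays literal, so check that the path exists.
        [ -e "$node" ] && echo "$node"
    done
    return 0
}

list_3ware_nodes
```

Called without arguments it inspects /dev; pass another directory to test it against dummy files.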
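How to: See ATA/SATA disks behind a 3ware SCSI RAID controller
Use the following syntax:
smartctl -a -d 3ware,N /dev/tweY
smartctl -a -d 3ware,N /dev/twaY
In the 3ware,N argument, the integer N is the disk number (the 3ware "port") within the 3ware ATA RAID controller. The allowed values of N are 0 to 31 inclusive. You can use the following tw_cli command to find the 3ware ports and check the health of the RAID card: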
# tw_cli info
Sample outputs:
Ctl   Model        (V)Ports  Drives   Units   NotOpt  RRate   VRate  BBU
------------------------------------------------------------------------
c0    9650SE-2LP   2         2        1       0       1       1      -
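For scripted checks, the controller table can be reduced to one line per controller. The sketch below is only an illustration: controller_status is a hypothetical helper name, and it assumes the column layout shown above, where NotOpt is the sixth field and appears to count units that are not in an optimal state.

```shell
#!/bin/sh
# Sample "tw_cli info" output copied from the article. On a real host you
# would pipe the live command instead: tw_cli info | controller_status
sample='Ctl   Model        (V)Ports  Drives   Units   NotOpt  RRate   VRate  BBU
------------------------------------------------------------------------
c0    9650SE-2LP   2         2        1       0       1       1      -'

# controller_status (hypothetical helper): print the controller id, the
# model, and the NotOpt column for every row that starts with cN.
controller_status() {
    awk '$1 ~ /^c[0-9]+$/ { print $1, $2, "NotOpt=" $6 }'
}

printf '%s\n' "$sample" | controller_status
```

A non-zero NotOpt value would be a reason to look at the unit and port details for that controller.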
To get port information for the c0 controller, enter:
# tw_cli info c0
Sample outputs:
Unit  UnitType  Status         %RCmpl  %V/I/M  Stripe  Size(GB)  Cache  AVrfy
------------------------------------------------------------------------------
u0    RAID-1    OK             -       -       -       232.82    W      ON

VPort Status         Unit Size      Type  Phy Encl-Slot    Model
------------------------------------------------------------------------------
p0    OK             u0   233.81 GB SATA  0   -            WDC WD2503ABYX-01WE
p1    OK             u0   233.81 GB SATA  1   -            WDC WD2503ABYX-01WE
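The port numbers in the VPort column are the N values that smartctl expects. As a minimal sketch, assuming the pN row layout shown above, the two hypothetical helpers below (ports_from_tw_cli and smartctl_cmds) turn that listing into the matching smartctl command lines. They only print the commands and do not run them.

```shell
#!/bin/sh
# Port rows copied from the article's "tw_cli info c0" output.
sample='VPort Status         Unit Size      Type  Phy Encl-Slot    Model
------------------------------------------------------------------------------
p0    OK             u0   233.81 GB SATA  0   -            WDC WD2503ABYX-01WE
p1    OK             u0   233.81 GB SATA  1   -            WDC WD2503ABYX-01WE'

# ports_from_tw_cli (hypothetical helper): print the number of each pN row.
ports_from_tw_cli() {
    awk '$1 ~ /^p[0-9]+$/ { print substr($1, 2) }'
}

# smartctl_cmds (hypothetical helper): read port numbers on stdin and print
# one smartctl command per port for the controller node given as $1.
smartctl_cmds() {
    dev=$1
    while read -r port; do
        printf 'smartctl -a -d 3ware,%s %s\n' "$port" "$dev"
    done
}

printf '%s\n' "$sample" | ports_from_tw_cli | smartctl_cmds /dev/twa0
```

On a live system the input would come from tw_cli info c0, and the printed commands could be reviewed before being run as root.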
Type the following commands to view SMART information for the hard disks behind the 3Ware RAID card:
# smartctl -a -d 3ware,0 /dev/twa0
# smartctl -a -d 3ware,1 /dev/twa0
Sample outputs:
smartctl version 5.38 [i686-redhat-linux-gnu] Copyright (C) 2002-8 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

=== START OF INFORMATION SECTION ===
Device Model:     WDC WD2503ABYX-01WERA0
Serial Number:    WD-WMAYP1327617
Firmware Version: 01.01S01
User Capacity:    251,059,544,064 bytes
Device is:        Not in smartctl database [for details use: -P showall]
ATA Version is:   8
ATA Standard is:  Exact ATA specification draft version not indicated
Local Time is:    Sat Jun 25 13:41:26 2011 UTC
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x84) Offline data collection activity
                                        was suspended by an interrupting command from host.
                                        Auto Offline Data Collection: Enabled.
Self-test execution status:      (   0) The previous self-test routine completed
                                        without error or no self-test has ever
                                        been run.
Total time to complete Offline
data collection:                 (4080) seconds.
Offline data collection
capabilities:                    (0x7b) SMART execute Offline immediate.
                                        Auto Offline data collection on/off support.
                                        Suspend Offline collection upon new
                                        command.
                                        Offline surface scan supported.
                                        Self-test supported.
                                        Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine
recommended polling time:        (   2) minutes.
Extended self-test routine
recommended polling time:        (  46) minutes.
Conveyance self-test routine
recommended polling time:        (   5) minutes.
SCT capabilities:              (0x303f) SCT Status supported.
                                        SCT Feature Control supported.
                                        SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       1
  3 Spin_Up_Time            0x0027   100   253   021    Pre-fail  Always       -       0
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       7
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   100   100   000    Old_age   Always       -       37
 10 Spin_Retry_Count        0x0032   100   253   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   253   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       6
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       5
193 Load_Cycle_Count        0x0032   200   200   000    Old_age   Always       -       1
194 Temperature_Celsius     0x0022   118   110   000    Old_age   Always       -       25
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   100   253   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   100   253   000    Old_age   Offline      -       0

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline       Completed without error       00%        12         -

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
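A full report like this is long, so a short summary can be easier to scan. The sketch below is one possible way to do that, not part of smartctl itself: smart_summary is a hypothetical helper that prints the overall health verdict plus the raw values of attributes 5, 197 and 198. The choice of those three attributes is an assumption for the example, and the field positions assume the ten-column attribute table shown above.

```shell
#!/bin/sh
# A few lines copied from the article's smartctl report, used as test input.
sample='SMART overall-health self-assessment test result: PASSED
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   100   253   000    Old_age   Offline      -       0
    5        0        0  Not_testing'

# smart_summary (hypothetical helper): read "smartctl -a" output on stdin.
# The NF >= 10 guard skips the selective self-test rows, which also begin
# with a small number but have only four fields.
smart_summary() {
    awk '
    /SMART overall-health self-assessment test result:/ { print "health=" $NF }
    NF >= 10 && $1 ~ /^(5|197|198)$/                    { print $2 "=" $10 }
    '
}

printf '%s\n' "$sample" | smart_summary
```

On a real system the input would be something like smartctl -a -d 3ware,0 /dev/twa0 | smart_summary, run as root.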
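Note that if the special character device nodes /dev/twa? and /dev/twe? do not exist, or exist with incorrect major or minor numbers, smartctl will recreate them on the fly. Typically, /dev/twa0 refers to the first 9000-series controller, /dev/twa1 to the second 9000-series controller, and so on. Likewise, /dev/twe0 refers to the first 6/7/8000-series controller, /dev/twe1 to the second 6/7/8000-series controller, and so on.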
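The numbering rule above can be sketched as a small mapping from controller models to device nodes. This is an illustration under stated assumptions: nodes_for_controllers is a hypothetical helper, it decides the series from the first digit of the model name only, and it assumes the controllers are listed in the same order in which the system numbers them. The models other than 9650SE-2LP are made-up input for the example, and the first-digit rule may not hold for every card.

```shell
#!/bin/sh
# nodes_for_controllers (hypothetical helper): read "cN MODEL" pairs on
# stdin and print the device node implied by the numbering rule. Separate
# counters are kept per series, so the first 9000-series card is twa0 and
# the first 6/7/8000-series card is twe0, regardless of their cN ids.
nodes_for_controllers() {
    awk '
    { series = substr($2, 1, 1) }
    series == "9"      { printf "%s /dev/twa%d\n", $1, twa++; next }
    series ~ /^[678]$/ { printf "%s /dev/twe%d\n", $1, twe++; next }
    { printf "%s unknown\n", $1 }
    '
}

# Illustrative input only; c1 and c2 are not from the article.
printf '%s\n' 'c0 9650SE-2LP' 'c1 8006-2LP' 'c2 9550SX-4LP' | nodes_for_controllers
```

The resulting node can then be combined with a port number in the smartctl -d 3ware,N syntax.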
See also:
- How to find out if a Linux server SCSI / SATA hard disk is failing
- smartctl and tw_cli man pages.