# 服务器硬件数据收集

如果不想拆机的话，可以通过软件来收集到你的硬件信息

# 用系统内置 dmidecode 解析硬件设备

dmidecode 可以解析 CPU 型号、主板型号、内存相关型号等信息

dmidecode -t type

选项与参数：详细 type 使用 man dmidecode 查阅，这里列出比较常用的
	1：详细的系统数据，包含主板型号与硬件基础数据
	4：CPU 相关数据，包括倍频、外频、核心数等
	9：系统相关插槽格式，包括 PCI、PCI-E 等
	17：每一个内存插槽的规格，若该插槽有内存，则列出该内存的容量与型号

1
2
3
4
5
6
7

# 范例 1：显示整个系统硬件信息
[root@study ~]# dmidecode -t 1
# dmidecode 3.2
Getting SMBIOS data from sysfs.
SMBIOS 2.5 present.

Handle 0x0001, DMI type 1, 27 bytes
System Information
	Manufacturer: innotek GmbH
	Product Name: VirtualBox
	Version: 1.2
	Serial Number: 0
	UUID: e4a1acbe-ffac-4762-b2c9-ed13daf9a493
	Wake-up Type: Power Switch
	SKU Number: Not Specified
	Family: Virtual Machine

# 范例 2：内存相关数据
[root@study ~]# dmidecode -t 17
# dmidecode 3.2
Getting SMBIOS data from sysfs.
SMBIOS 2.5 present.
# 笔者这用的 VirtualBox 的虚拟机，不知道为啥获取不到内存的信息

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23

# 硬件资源的收集与分析

系统硬件是由操作系统核心所管理的，从低第 19 章的开机流程分析中了解到，内核在开机时就能够检测主机硬件并加载适当的模块来驱动硬件。而核心所检测到的各项硬件配置，会被记录在 /proc/ 与 /sys/ 中，包括 /proc/cpuinfo、/proc/paritions、/proc/interrupts。至于更多的 /proc 内容，可以前往第 16 章回顾

TIP

核心检测到的硬件可能并非完全正确，因为它仅是使用最适当的模块来驱动这个硬件，所以由概率会误判。

你可能想要以最新最正确的模块来驱动你的硬件，此时，重新编译核心是其中一条可以达到的方向。

除了直接查看 /proc 下的文件内容之外，Linux 提供了几个简单的指令来讲核心所检测到的硬件信息调用出来，常见的指令有：

gdisk：第 7 章中用过，gdisk -l 将分区表列出
dmesg：第 16 章中用过，观察核心运行过程中所显示的各项信息
vmstat：第 16 章中用过，可分析系统（CPU、RAM、IO）目前的状态
lspci：列出整个 PC 系统的 PCI 接口设备
lsusb：列出各个 USB 端口的状态，与链接的 USB 设备
iostat：与 vmstat 类似，可实时列出整个 CPU 与接口设备的 Input/Output 状态

# lspci

lspci [-vvn]

-v：显示更多的 PCI 接口设备的详细信息
-vv：比 -v 还要更详细的信息
-n：直接观察 PCI 的 ID 而不是厂商名称

1
2
3
4
5

# 范例 1： 查询 PCI 总线相关设备
[root@study ~]# lspci
00:00.0 Host bridge: Intel Corporation 440FX - 82441FX PMC [Natoma] (rev 02)
00:01.0 ISA bridge: Intel Corporation 82371SB PIIX3 ISA [Natoma/Triton II]
00:01.1 IDE interface: Intel Corporation 82371AB/EB/MB PIIX4 IDE (rev 01)
00:02.0 VGA compatible controller: VMware SVGA II Adapter
00:03.0 Ethernet controller: Intel Corporation 82540EM Gigabit Ethernet Controller (rev 02)
00:04.0 System peripheral: InnoTek Systemberatung GmbH VirtualBox Guest Service
00:05.0 Multimedia audio controller: Intel Corporation 82801AA AC 97 Audio Controller (rev 01)
00:06.0 USB controller: Apple Inc. KeyLargo/Intrepid USB
00:07.0 Bridge: Intel Corporation 82371AB/EB/MB PIIX4 ACPI (rev 08)
00:08.0 Ethernet controller: Intel Corporation 82540EM Gigabit Ethernet Controller (rev 02)
00:0d.0 SATA controller: Intel Corporation 82801HM/HEM (ICH8M/ICH8M-E) SATA Controller [AHCI mode] (rev 02)

# 如果想要了解更详细的信息
# 下面这个是什么，不清楚，笔者的虚拟机与作者的不一样
[root@study ~]# lspci -s 00:03.0 -vv
00:03.0 Ethernet controller: Intel Corporation 82540EM Gigabit Ethernet Controller (rev 02)
	Subsystem: Intel Corporation PRO/1000 MT Desktop Adapter
	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
	Status: Cap+ 66MHz+ UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Latency: 64 (63750ns min)
	Interrupt: pin A routed to IRQ 19
	Region 0: Memory at f4200000 (32-bit, non-prefetchable) [size=128K]
	Region 2: I/O ports at d020 [size=8]
	Capabilities: [dc] Power Management version 2
		Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
		Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
	Capabilities: [e4] PCI-X non-bridge device
		Command: DPERE- ERO+ RBC=512 OST=1
		Status: Dev=ff:1f.0 64bit- 133MHz- SCD- USC- DC=simple DMMRBC=2048 DMOST=1 DMCRS=8 RSCEM- 266MHz- 533MHz-
	Kernel driver in use: e1000
	Kernel modules: e1000

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34

-s 后面是每个设备总线、插槽与相关函数的功能，它是硬件检测所得到的数据，可以通过 /usr/share/hwdata/pci.ids 来了解这些数据串的含义

pci.ids 文件是 PCI 的标准 ID 与厂牌名称对应表，另外 lspci 指令的数据是从 /proc/bus/pci 目录下取出来的，由于硬件发展太过快速，你的 pci.ids 文件有可能落伍了，可通过如下方式在线更新

update-pciids

# lsusb

usb 设备数据

lsusb [-t]

-t：使用类似树状目录来显示各个 USB 端口的相关性

1
2
3

# 范例 1：列出当前主机上 USB 各端口状态
[root@study ~]# lsusb
Bus 001 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub
# 笔者这里没有连接设备到 USB 上，就是显示上面那个箱子
# 书上有连接的，数据大概如下
# Bus 001 Device 001: ID 1d6b:0001 Adomax Technology Co., LTd
# 那么设备 ID 就是 1d6b:0001，对应的厂商与产品为 Adomax

1
2
3
4
5
6
7

这里的 ID 与厂商型号对应表在 /usr/share/hwdata/pci.ids 中

# iostat

磁盘开机到现在，已经存取多少数据了？就可以通过 iostat 指令来查询（如果该软件未安装，可通过 yum install sysstat 安装）

iostat [-c|-d] [-k|-m] [-t] [间隔秒数] [检测次数]

-c：仅显示 CPU 的状态
-d：仅显示存储设备的状态，不可与 -c 一起使用
-k：默认显示的是 block，可以改成 k bytes 的大小来显示
-m：改成 MB 单位显示
-t：显示日期

1
2
3
4
5
6
7

# 范例 1：显示当前系统 CPU 与存储设备的状态
[root@study ~]# iostat
Linux 3.10.0-1062.el7.x86_64 (study.centos.mrcode) 	2020年04月03日 	_x86_64_	(1 CPU)

# CPU 信息
avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           1.07    0.03    0.22    0.29    0.00   98.38

# 磁盘信息
Device:            tps    kB_read/s    kB_wrtn/s    kB_read    kB_wrtn
sda               1.35        62.23         3.98     828738      52981
dm-0              1.17        59.80         3.82     796302      50933

# 含义如下
tps：平均每秒传送次数，与数据传输「次数」有关，非容量
kB_read/： 开机到现在，平均的读取单位
kB_wrtn/s：开机到现在，平均的写入单位
kB_read：  开机到现在，总共读出来的文件单位
kB_wrtn：  开机到现在，总共写入的文件单位

# 范例 2：仅针对 sda ，每 2 秒监测一次，总共监测 3 次
[root@study ~]# iostat -d 2 3 sda
Linux 3.10.0-1062.el7.x86_64 (study.centos.mrcode) 	2020年04月03日 	_x86_64_	(1 CPU)

Device:            tps    kB_read/s    kB_wrtn/s    kB_read    kB_wrtn
sda               1.31        60.39         3.88     828746      53191

Device:            tps    kB_read/s    kB_wrtn/s    kB_read    kB_wrtn
sda               0.00         0.00         0.00          0          0

Device:            tps    kB_read/s    kB_wrtn/s    kB_read    kB_wrtn
sda               0.00         0.00         0.00          0          0

# 需要注意的是：第一次是开机到现在的数据
# 第 2 次则是两次直接的系统传输值

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35

# 了解磁盘的健康状态

在服务器上，最重要的就是「数据安全」，数据是放在磁盘中的，那么对于磁盘的健康状况则是你需要关注的。

可以通过 smartd 指令来达成，SMART（Self-Monitoring，Analysis and Reporting Technology System）的缩写，主要用来监测目前常见的 ATA 与 SCSI 接口的磁盘。前提是，被监测的磁盘也必须要 支持 SMART 协议。

不过虚拟机磁盘不支持 smart 协议，无法进行测试。

比如笔者使用指令来监测虚拟机磁盘

[root@study ~]# smartctl -a /dev/sda
smartctl 7.0 2018-12-30 r4883 [x86_64-linux-3.10.0-1062.el7.x86_64] (local build)
Copyright (C) 2002-18, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Device Model:     VBOX HARDDISK
Serial Number:    VBa19abe2f-1d5f9384
Firmware Version: 1.0
User Capacity:    42,949,672,960 bytes [42.9 GB]
Sector Size:      512 bytes logical/physical
Device is:        Not in smartctl database [for details use: -P showall]
ATA Version is:   ATA/ATAPI-6 published, ANSI INCITS 361-2002
Local Time is:    Fri Apr  3 16:04:40 2020 CST
SMART support is: Unavailable - device lacks SMART capability.

A mandatory SMART command failed: exiting. To continue, add one or more '-T permissive' options.
# 这里就报错了，没有继续下去

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17

下面直接用书上的信息

# 1. 用 smartctl 来显示完整的 /dev/sda 的信息
[root@study ~]# smartctl -a /dev/sda
smartctl 6.2 2013-07-26 r3841 [x86_64-linux-3.10.0-229.el7.x86_64] (local build)
Copyright (C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org

# 首先来输出一下这部磁盘的整体信息状况！包括制造商、序号、格式、 SMART 支持度等等！
=== START OF INFORMATION SECTION ===
Device Model: QEMU HARDDISK
Serial Number: QM00002
Firmware Version: 0.12.1
User Capacity: 2,148,073,472 bytes [2.14 GB]
Sector Size: 512 bytes logical/physical
Device is: Not in smartctl database [for details use: -P showall]
ATA Version is: ATA/ATAPI-7, ATA/ATAPI-5 published, ANSI NCITS 340-2000
Local Time is: Wed Sep 2 18:10:38 2015 CST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

# 接下来则是一堆基础说明！ 鸟哥这里先略过这段资料喔！
General SMART Values:
Offline data collection status: (0x82) Offline data collection activity
was completed without error.
Auto Offline Data Collection: Enabled.
.....(中间省略)

# 再来则是有没有曾经发生过磁盘错乱的问题登录！
SMART Error Log Version: 1
No Errors Logged

# 当你下达过磁盘自我检测的过程，就会被记录在这里了！
SMART Self-test log structure revision number 1
Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error
# 1 Short offline Completed without error 00% 4660 -
# 2 Short offline Completed without error 00% 4660 -


# 2. 命令磁盘进行一次自我检测的动作，然后再次观察磁盘状态！
[root@study ~]# smartctl -t short /dev/sda
[root@study ~]# smartctl -a /dev/sda
.....(前面省略).....
# 底下会多出一个第三笔的测试信息！看一下 Status 的状态， 没有问题就是好消息！
SMART Self-test log structure revision number 1
Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error
# 1 Short offline Completed without error 00% 4660 -
# 2 Short offline Completed without error 00% 4660 -
# 3 Short offline Completed without error 00% 4660

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48

特别强调：磁盘自检，可能磁盘的 I/O 状态会比较频繁，因此不建议在系统忙碌时进行