5.4. Remote Control with IPMI
The Intelligent Platform Management Interface (IPMI) can be used to control remote machines via a simple set of commands. Install the ipmitool client via the package manager.
[root@master ~]# yum install ipmitool
The syntax for remote IPMI commands has the following scheme:
# -I lanplus stands for IPMI 2.0
# -I lan would be IPMI 1.5
[root@master ~]# ipmitool -I lanplus -U USERNAME -P PASSWORD -H HOST COMMAND [ARGUMENTS]
By default, ipmitool will run with administrator privileges.
Depending on your cluster you will need to use the following
credentials:
Cluster | Username | Password |
---|---|---|
Temple | mhpc | mhpc2024 |
MHPC Lab | ADMIN | ADMIN |
For Temple clusters, also add -L operator to your ipmitool call to change your privilege level to operator.
[root@master ~]# ipmitool -I lanplus -U mhpc -P mhpc2024 -L operator -H c01.mgmt power status
or
[root@master ~]# ipmitool -I lanplus -U ADMIN -P ADMIN -H c01.mgmt power status
It is not a good idea to leave traces of your passwords as raw text in your command history. Instead, we recommend that you create a file (e.g. /etc/ipmi_password) with restrictive file permissions (600) to store the password on disk, and pass it to ipmitool with the -f option so that it is loaded on demand.
[root@master ~]# ipmitool -I lanplus -U mhpc -f /etc/ipmi_password -L operator -H c01.mgmt power status
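One possible way to create such a password file is sketched below. The helper name write_ipmi_password is hypothetical and not part of ipmitool. The sketch assumes that the password is supplied on standard input, so it never appears as a command-line argument, and it sets a restrictive umask before writing so that the file is not readable by others even for a moment.

```shell
# Sketch: store an IPMI password in a file that only its owner can read.
# write_ipmi_password is a hypothetical helper name, not part of ipmitool.
write_ipmi_password() {
    local file="$1" password=""
    # Read the password from stdin so that it never appears in the
    # command history or in the process list.
    if [ -t 0 ]; then
        printf 'IPMI password: ' >&2
        stty -echo                     # do not echo what is typed
        IFS= read -r password
        stty echo
        printf '\n' >&2
    else
        IFS= read -r password
    fi
    [ -n "$password" ] || return 1
    (
        umask 077                      # files created below get mode 600
        printf '%s\n' "$password" > "$file"
    ) || return 1
    chmod 600 "$file"                  # also tightens a pre-existing file
}
```

After loading the function into your shell, write_ipmi_password /etc/ipmi_password prompts for the password without echoing it.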
To reduce the amount of typing, it is useful to create a small utility script. We usually call this /usr/bin/ipmiwrap.
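#!/bin/bash
# USAGE: ipmiwrap HOST COMMAND [ARGUMENTS]
# Avoid the name HOSTNAME, which bash already sets to the local host name.
host=$1
shift
# Quote "$@" so arguments containing spaces (e.g. "Power Supply") stay intact.
ipmitool -I lanplus -U mhpc -f /etc/ipmi_password -L operator -H "$host.mgmt" "$@"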
Using this script allows us to type much shorter IPMI commands:
ipmiwrap c01 power status
# USAGE: ipmiwrap HOST COMMAND [ARGUMENTS]
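With the wrapper in place, the same command can be sent to several nodes in a loop. The following is a minimal sketch. The function name ipmi_each and the node names c02 and c03 used in the example call are hypothetical, and the sketch assumes that ipmiwrap is available in your PATH.

```shell
# Sketch: run one IPMI command on several nodes, one after the other.
# ipmi_each is a hypothetical helper; it relies on the ipmiwrap script.
ipmi_each() {
    local nodes="$1" node
    shift
    for node in $nodes; do
        # Prefix every output line with the name of the node it came from.
        ipmiwrap "$node" "$@" 2>&1 | sed "s/^/$node: /"
    done
}
```

For example, ipmi_each "c01 c02 c03" power status prints one prefixed status line per node.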
5.4.1. Basic IPMI Commands
ipmitool offers a whole range of different commands that simplify remote management. You can learn more about the tool by using the help command.
[root@master ~]# ipmitool help
Commands:
raw Send a RAW IPMI request and print response
i2c Send an I2C Master Write-Read command and print response
spd Print SPD info from remote I2C device
lan Configure LAN Channels
chassis Get chassis status and set power state
power Shortcut to chassis power commands
event Send pre-defined events to MC
mc Management Controller status and global enables
sdr Print Sensor Data Repository entries and readings
sensor Print detailed sensor information
fru Print built-in FRU and scan SDR for FRU locators
gendev Read/Write Device associated with Generic Device locators sdr
sel Print System Event Log (SEL)
pef Configure Platform Event Filtering (PEF)
sol Configure and connect IPMIv2.0 Serial-over-LAN
tsol Configure and connect with Tyan IPMIv1.5 Serial-over-LAN
isol Configure IPMIv1.5 Serial-over-LAN
user Configure Management Controller users
channel Configure Management Controller channels
session Print session information
dcmi Data Center Management Interface
nm Node Manager Interface
sunoem OEM Commands for Sun servers
kontronoem OEM Commands for Kontron devices
picmg Run a PICMG/ATCA extended cmd
fwum Update IPMC using Kontron OEM Firmware Update Manager
firewall Configure Firmware Firewall
delloem OEM Commands for Dell systems
shell Launch interactive IPMI shell
exec Run list of commands from file
set Set runtime variable for shell and exec
hpm Update HPM components using PICMG HPM.1 file
ekanalyzer run FRU-Ekeying analyzer using FRU files
ime Update Intel Manageability Engine Firmware
vita Run a VITA 46.11 extended cmd
lan6 Configure IPv6 LAN Channels
5.4.1.1. Controlling power
You can query the current state of your system with the power status command.
[root@master ~]# ipmiwrap c01 power status
Chassis Power is off
To turn a machine on or off, use the power on and power off commands.
[root@master ~]# ipmiwrap c01 power on
[root@master ~]# ipmiwrap c01 power off
To reboot a machine you can either do a power reset, which does a warm boot (power is never turned off), or a power cycle, which performs a cold boot (power is turned off and on again).
# warm reboot
[root@master ~]# ipmiwrap c01 power reset
# cold reboot
[root@master ~]# ipmiwrap c01 power cycle
Finally, power soft initiates a soft shutdown via ACPI.
# tell OS to shutdown gracefully
[root@master ~]# ipmiwrap c01 power soft
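The power commands above can be combined into a shutdown that first asks the OS politely and only cuts the power if the node does not go down in time. This is a sketch under assumptions: graceful_off is a hypothetical helper name, and the default timeout (120 seconds) and polling interval (5 seconds) are arbitrary values that you may need to adapt.

```shell
# Sketch: request a soft shutdown, fall back to a hard power off.
# graceful_off is a hypothetical helper built on the ipmiwrap script.
graceful_off() {
    local node="$1" timeout="${2:-120}" interval="${3:-5}" waited=0
    ipmiwrap "$node" power soft || return 1
    while [ "$waited" -lt "$timeout" ]; do
        # "power status" reports "Chassis Power is off" once the node is down.
        if ipmiwrap "$node" power status | grep -q 'is off'; then
            return 0
        fi
        sleep "$interval"
        waited=$((waited + interval))
    done
    # The OS did not shut down within the timeout: cut the power.
    ipmiwrap "$node" power off
}
```

A call such as graceful_off c01 300 waits up to five minutes before forcing the power off.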
5.4.1.2. Getting sensor data
Your BMC can collect a range of useful information to determine the health of your system. An exhaustive list of all sensors can be printed out with the sensor command.
[root@master ~]# ipmiwrap c01 sensor
Temp | na | | na | na | na | na | 85.000 | 90.000 | na
Temp | na | | na | na | na | na | 85.000 | 90.000 | na
Temp | na | | na | na | na | na | na | na | na
Ambient Temp | na | | na | na | na | na | na | na | na
Temp | na | | na | na | na | na | na | na | na
Ambient Temp | na | | na | na | na | na | na | na | na
Ambient Temp | 21.000 | degrees C | ok | na | 3.000 | 8.000 | 42.000 | 47.000 | na
Planar Temp | na | | na | na | 3.000 | 8.000 | 92.000 | 97.000 | na
CMOS Battery | 0x0 | discrete | 0x0080| na | na | na | na | na | na
VCORE PG | 0x0 | discrete | 0x0180| na | na | na | na | na | na
VCORE PG | 0x0 | discrete | 0x0180| na | na | na | na | na | na
IOH THERMTRIP | na | discrete | na | na | na | na | na | na | na
1.5V PG | 0x0 | discrete | 0x0180| na | na | na | na | na | na
1.8V PG | 0x0 | discrete | 0x0180| na | na | na | na | na | na
3.3V PG | 0x0 | discrete | 0x0180| na | na | na | na | na | na
5V PG | 0x0 | discrete | 0x0180| na | na | na | na | na | na
0.75VTT PG | 0x0 | discrete | 0x0180| na | na | na | na | na | na
....
For a more targeted search you can use the sdr command and specify the type of information you are looking for.
[root@master ~]# ipmiwrap c01 sdr type list
Sensor Types:
Temperature (0x01) Voltage (0x02)
Current (0x03) Fan (0x04)
Physical Security (0x05) Platform Security (0x06)
...
[root@master ~]# ipmiwrap c01 sdr type Temperature
Temp | 01h | ns | 3.1 | Disabled
Temp | 02h | ns | 3.2 | Disabled
Temp | 05h | ns | 10.1 | Disabled
Ambient Temp | 07h | ns | 10.1 | Disabled
Temp | 06h | ns | 10.2 | Disabled
Ambient Temp | 08h | ns | 10.2 | Disabled
Ambient Temp | 0Eh | ok | 7.1 | 21 degrees C
Planar Temp | 0Fh | ns | 7.1 | Disabled
IOH THERMTRIP | 5Dh | ns | 7.1 | Disabled
CPU Temp Interf | 76h | ns | 7.1 | Disabled
Temp | 0Ah | ns | 8.1 | Disabled
Temp | 0Bh | ns | 8.1 | Disabled
Temp | 0Ch | ns | 8.1 | Disabled
[root@master ~]# ipmiwrap c01 sdr type "Power Supply"
Status | 64h | ok | 10.1 | Presence detected
Status | 65h | ns | 10.2 | Disabled
PS Redundancy | 74h | ns | 7.1 | No Reading
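Because the sdr output is a simple pipe-separated table, it is easy to post-process. The sketch below filters temperature readings above a limit. The function name hot_sensors and the default limit of 40 degrees C are assumptions for illustration, and the parsing assumes the five-column layout shown in the output above.

```shell
# Sketch: print temperature sensors whose reading exceeds a limit.
# hot_sensors is a hypothetical helper; it reads "sdr type Temperature"
# output (name | id | status | entity | reading) from stdin.
hot_sensors() {
    local limit="${1:-40}"
    awk -F'|' -v limit="$limit" '
        # Only lines with a numeric reading in degrees C are considered;
        # entries marked Disabled or No Reading are skipped.
        $5 ~ /degrees C/ {
            split($5, r, " ")          # r[1] holds the number
            if (r[1] + 0 > limit) {
                gsub(/^ +| +$/, "", $1)
                print $1 ": " r[1] " C"
            }
        }'
}
```

It is meant to be used at the end of a pipeline, for example ipmiwrap c01 sdr type Temperature | hot_sensors 40.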
5.4.1.3. Chassis commands
Identifying a specific system
Most BMCs come with status lights that indicate their state and location. You can make these lights blink to help identify a system using the chassis identify command.
# blink chassis LED for 120 seconds
[root@master ~]# ipmiwrap c01 chassis identify 120
Setting boot device for next boot
You can temporarily set the next boot device using the chassis bootdev command.
# boot via network boot
[root@master ~]# ipmiwrap c01 chassis bootdev pxe
# boot into BIOS setup
[root@master ~]# ipmiwrap c01 chassis bootdev bios
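A common use of chassis bootdev is to force a single network boot, for example to reinstall a node. The sketch below sets the boot device and then restarts the node. The name netboot_node is hypothetical. The sketch checks the power state first, because the ipmitool documentation recommends issuing power cycle only when the chassis power is on.

```shell
# Sketch: boot a node from the network on its next start.
# netboot_node is a hypothetical helper built on the ipmiwrap script.
netboot_node() {
    local node="$1"
    # The boot device override applies to the next boot only.
    ipmiwrap "$node" chassis bootdev pxe || return 1
    # A power cycle is only meaningful if the node is currently on.
    if ipmiwrap "$node" power status | grep -q 'is off'; then
        ipmiwrap "$node" power on
    else
        ipmiwrap "$node" power cycle
    fi
}
```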
5.4.1.4. Vendor specific extensions
Some vendors provide extra capabilities that can be accessed via IPMI. For example, on Dell systems:
[root@master ~]# ipmiwrap c01 delloem powermonitor
Power Tracking Statistics
Statistic : Cumulative Energy Consumption
Start Time : Thu Dec 16 14:09:09 2010
Finish Time : Wed Feb 6 14:19:09 2019
Reading : 859.4 kWh
Statistic : System Peak Power
Start Time : Wed Nov 3 22:21:33 2010
Peak Time : Wed Jul 15 08:36:22 2015
Peak Reading : 319 W
Statistic : System Peak Amperage
Start Time : Wed Nov 3 22:21:33 2010
Peak Time : Wed Jul 15 08:35:39 2015
Peak Reading : 1.5 A
5.4.1.5. Reviving a hanging BMC
Sometimes BMCs can get stuck. If you can still reach the BMC over the network but commands don’t work, you can try to cold reset the BMC itself with the following command.
[root@master ~]# ipmiwrap c01 bmc reset cold
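After a cold reset the BMC does not answer for a while. A small polling loop can tell you when it is back. This is only a sketch: bmc_reset_wait is a hypothetical helper name, the default of 30 attempts at 10-second intervals is an assumption, and it uses mc info (the mc command is listed in the help output above) purely as a cheap request that succeeds only once the BMC responds again.

```shell
# Sketch: cold reset a BMC and wait until it answers again.
# bmc_reset_wait is a hypothetical helper built on the ipmiwrap script.
bmc_reset_wait() {
    local node="$1" tries="${2:-30}" interval="${3:-10}" i=0
    ipmiwrap "$node" bmc reset cold || return 1
    while [ "$i" -lt "$tries" ]; do
        sleep "$interval"
        # "mc info" only succeeds once the BMC responds to requests.
        if ipmiwrap "$node" mc info >/dev/null 2>&1; then
            return 0
        fi
        i=$((i + 1))
    done
    # The BMC did not come back within tries * interval seconds.
    return 1
}
```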
5.4.1.6. System Event Log (SEL)
A BMC keeps track of events on your system, such as hardware failures. In many cases a critical error will result in a SEL entry, and your system will also indicate the failure with an amber warning light.
To learn more about the error you can query the current SEL with the sel list command.
[root@master ~]# ipmiwrap c01 sel list
1 | 08/19/2015 | 15:57:38 | Event Logging Disabled #0x72 | Log area reset/cleared | Asserted
2 | 08/15/2018 | 11:15:46 | OS Boot | Installation started | Asserted
3 | 08/15/2018 | 11:27:48 | OS Boot | Installation completed | Asserted
4 | 01/23/2019 | 21:25:05 | OS Boot | Installation started | Asserted
5 | 01/23/2019 | 21:36:02 | OS Boot | Installation completed | Asserted
6 | 01/29/2019 | 14:20:57 | OS Boot | Installation started | Asserted
7 | 01/29/2019 | 14:31:24 | OS Boot | Installation completed | Asserted
8 | 02/06/2019 | 11:18:14 | OS Boot | Installation started | Asserted
9 | 02/06/2019 | 11:29:10 | OS Boot | Installation completed | Asserted
Once you have determined the cause and fixed it, you will want to clear the SEL to remove the warning light.
[root@master ~]# ipmiwrap c01 sel clear
Clearing SEL. Please allow a few seconds to erase.
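Since sel clear erases the log for good, it can be worth saving a copy first. The following sketch does this. The helper name archive_sel and the file naming scheme are assumptions, and the log is only cleared if saving it succeeded.

```shell
# Sketch: save the SEL to a timestamped file, then clear it.
# archive_sel is a hypothetical helper built on the ipmiwrap script.
archive_sel() {
    local node="$1" dir="${2:-.}" file
    file="$dir/$node-sel-$(date +%Y%m%d-%H%M%S).log"
    # Keep a copy of the log; give up (and do not clear) if that fails.
    if ! ipmiwrap "$node" sel list > "$file"; then
        rm -f "$file"
        return 1
    fi
    # Send the "Clearing SEL" message to stderr so that stdout
    # carries nothing but the name of the saved file.
    ipmiwrap "$node" sel clear >&2 || return 1
    echo "$file"
}
```

For example, archive_sel c01 /root/sel-archive prints the path of the saved copy, where /root/sel-archive stands for any existing directory of your choice.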
5.4.1.7. Managing IPMI Users / Changing the IPMI password
To get the current list of configured users use the user list command.
[root@master ~]# ipmiwrap c01 user list
ID Name Callin Link Auth IPMI Msg Channel Priv Limit
1 true false false NO ACCESS
2 root true true true ADMINISTRATOR
3 true false false NO ACCESS
4 mhpc true true true OPERATOR
5 true false false NO ACCESS
6 true false false NO ACCESS
7 true false false NO ACCESS
8 true false false NO ACCESS
9 true false false NO ACCESS
10 true false false NO ACCESS
11 true false false NO ACCESS
12 true false false NO ACCESS
13 true false false NO ACCESS
14 true false false NO ACCESS
15 true false false NO ACCESS
16 true false false NO ACCESS
As you can see, by default, user 2 is set to be root with ADMINISTRATOR privileges. The mhpc user was set to OPERATOR and has user ID 4. To change the password of a user (assuming you have ADMINISTRATOR rights), use the following command:
# 4 is the USER ID of mhpc
[root@master ~]# ipmiwrap c01 user set password 4
5.4.2. Serial Console
IPMI supports a feature called Serial-Over-LAN (SOL). If enabled, you can connect to a serial port of your BMC, which turns your current terminal into a text console matching the screen of your remote system.
This allows you to configure BIOS settings and even log in to the system as if you were connected to it directly with a screen and keyboard.
To launch a SOL session use the sol activate command.
[root@master ~]# ipmiwrap c01 sol activate
Once in the SOL session, the following escape sequences are supported:
~. | terminate connection (and any multiplexed sessions) |
~B | send a BREAK to the remote system |
~C | open a command line |
~R | request rekey |
~V/v | decrease/increase verbosity (LogLevel) |
~^Z | suspend ssh |
~# | list forwarded connections |
~& | background ssh (when waiting for connections to terminate) |
~? | help |
~~ | send the escape character by typing it twice |
Typed at the beginning of a line, the sequence ~. is intercepted by the outermost SSH session, so it terminates the SSH connection to the bastion host and, with it, the SOL session. To avoid closing the SSH connection, you need to escape the sequence ~. with one additional ~ for each SSH session it has to pass through. Since you are connected from your workstation to the bastion and then to the master, you need to escape two SSH sessions. The complete sequence to terminate the SOL session without closing your SSH connections is therefore ~~~. (three tildes followed by a period).
If for some reason your session gets stuck in one terminal, you can deactivate it from another terminal as follows:
[root@master ~]# ipmiwrap c01 sol deactivate
Terminating SOL can leave the terminal in a weird state (e.g. reversed colors, an unreadable font, or a wrong size). The commands reset and clear can be used to clean it up and get back to sane terminal settings.
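These steps can be bundled into a small console helper. The sketch below is one possible arrangement, not a required procedure: sol_console is a hypothetical name, and deactivating a possibly stale session before connecting is an assumption about what is convenient on a shared cluster, since it would also disconnect another user's active console.

```shell
# Sketch: open a SOL console and restore the terminal afterwards.
# sol_console is a hypothetical helper built on the ipmiwrap script.
sol_console() {
    local node="$1"
    # Close a stale session, if any. An error is expected when no
    # session is active, so the output is discarded.
    ipmiwrap "$node" sol deactivate >/dev/null 2>&1
    ipmiwrap "$node" sol activate
    # Bring the terminal back to sane settings once the session ends.
    reset
}
```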
5.4.2.1. Keybindings
Key | Shortcut |
---|---|
F1 | ESC + 1 |
F2 | ESC + 2 |
F3 | ESC + 3 |
F4 | ESC + 4 |
F5 | ESC + 5 |
F6 | ESC + 6 |
F7 | ESC + 7 |
F8 | ESC + 8 |
F9 | ESC + 9 |
F10 | ESC + 0 |
ESC | ESC + ESC |