Photon OS includes commands to troubleshoot kernel problems and boot and login errors.
Kernel Problems and Boot and Login Errors
- 1: Kernel Overview
- 2: Boot Process Overview
- 3: Blank Screen on Reboot
- 4: Investigating Unexpected Behavior
- 5: Investigating the Guest Kernel
- 6: Kernel Log Replication with VProbes
- 7: Linux Kernel
1 - Kernel Overview
You can use the dmesg command to troubleshoot kernel errors. The dmesg command prints messages from the kernel ring buffer.
The following command, for example, presents kernel messages in a human-readable format:
dmesg --human --kernel
To examine kernel messages as you perform actions in another terminal, such as reproducing a problem, run the command with the --follow option, which waits for new messages and prints them as they occur:
dmesg --human --kernel --follow
The kernel ring buffer has a fixed size in memory. As a result, the kernel overwrites the oldest messages in the buffer from which dmesg reads. The systemd journal, however, saves the information from the buffer to a log file so that you can access older information.
To view it, run the following command:
journalctl -k
If required, you can check the modules that are loaded on your Photon OS machine by running the lsmod command. For example:
lsmod
Module Size Used by
xt_conntrack 16384 2
nft_compat 20480 2
nf_tables 204800 39 nft_compat
nfnetlink 20480 2 nft_compat,nf_tables
xt_LOG 16384 0
nf_log_syslog 20480 0
nf_conntrack 114688 1 xt_conntrack
nf_defrag_ipv6 20480 1 nf_conntrack
nf_defrag_ipv4 16384 1 nf_conntrack
af_packet 45056 2
vmwgfx 294912 1
psmouse 110592 0
drm_ttm_helper 16384 1 vmwgfx
ttm 53248 2 vmwgfx,drm_ttm_helper
vfat 24576 1
drm_kms_helper 118784 1 vmwgfx
fat 69632 1 vfat
syscopyarea 16384 1 drm_kms_helper
sysfillrect 16384 1 drm_kms_helper
sysimgblt 16384 1 drm_kms_helper
fb_sys_fops 16384 1 drm_kms_helper
evdev 20480 2
mousedev 20480 0
button 16384 0
sch_fq_codel 20480 2
drm 368640 5 vmwgfx,drm_kms_helper,drm_ttm_helper,ttm
fuse 114688 1
i2c_core 49152 2 drm_kms_helper,drm
dm_mod 131072 0
loop 28672 0
backlight 16384 1 drm
configfs 36864 1
dmi_sysfs 16384 0
hid_generic 16384 0
usbhid 28672 0
hid 114688 2 usbhid,hid_generic
xhci_pci 16384 0
xhci_hcd 167936 1 xhci_pci
uhci_hcd 40960 0
ehci_pci 16384 0
crc32c_intel 24576 2
ehci_hcd 69632 1 ehci_pci
usbcore 217088 6 xhci_hcd,ehci_pci,usbhid,ehci_hcd,xhci_pci,uhci_hcd
sr_mod 24576 0
cdrom 49152 1 sr_mod
usb_common 16384 4 xhci_hcd,usbcore,ehci_hcd,uhci_hcd
rdrand_rng 16384 0
rng_core 20480 1 rdrand_rng
efivarfs 20480 1
ipv6 450560 270
autofs4 36864 2
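The last column of the lsmod output lists the modules that depend on a given module, which helps when you need to know why a module is in use. As a minimal sketch, the helper below reads lsmod-style text on standard input and prints that column for one module. The name module_users is hypothetical and is not part of Photon OS.

```shell
# module_users NAME: read lsmod-style output on stdin and print the
# "Used by" module list for NAME, or "-" when no module depends on it.
# The helper name is an assumption made for this example.
module_users() {
    awk -v mod="$1" '$1 == mod { print ($4 == "" ? "-" : $4) }'
}
```

On a live system you would typically run it as lsmod | module_users ttm, which for the listing above prints vmwgfx,drm_ttm_helper.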
2 - Boot Process Overview
When a Photon OS machine boots, the BIOS initializes the hardware and uses a boot loader to start the kernel. After the kernel starts, systemd takes over and boots the rest of the operating system.
The BIOS checks the memory and initializes the keyboard, the screen, and other peripherals. When the BIOS finds the first hard disk, the boot loader, GNU GRUB 2.02, takes over. GNU GRUB starts from the master boot record (MBR) of the hard disk and loads the kernel together with an initial RAM disk (initrd), which serves as a temporary root file system in memory. The device manager, udev, provides initrd with the drivers it needs to access the device containing the root file system.
[Figure: the GNU GRUB edit menu in Photon OS, with its default commands to load the boot record and initialize the RAM disk.]
At this point, the Linux kernel in Photon OS, which is kernel version 4.4.8, takes control. Systemd kicks in, initializes services in parallel, mounts the rest of the file system, and checks the file system for errors.
3 - Blank Screen on Reboot
If the Photon OS kernel enters a state of panic during a reboot and all you see is a blank screen, note the name of the virtual machine running Photon OS and then power off the VM.
In the host, open the vmware.log file for the VM. When a kernel panics, the guest VM prints the entire kernel log to vmware.log in the host directory containing the VM. This log file contains the output of the dmesg command from the guest, and you can analyze it to help identify the cause of the boot problem.
Example
After searching for Guest: in the following abridged vmware.log, this line appears, identifying the root cause of the reboot problem:
2016-08-30T16:02:43.220-07:00| vcpu-0| I125: Guest:
<0>[1.125804] Kernel panic - not syncing:
VFS: Unable to mount root fs on unknown-block(0,0)
Further inspection finds the following lines:
2016-08-30T16:02:43.217-07:00| vcpu-0| I125: Guest:
<4>[ 1.125782] VFS: Cannot open root device "sdc1" or unknown-block(0,0): error -6
2016-08-30T16:02:43.217-07:00| vcpu-0| I125: Guest:
<4>[ 1.125783] Please append a correct "root=" boot option;
here are the available partitions:
2016-08-30T16:02:43.217-07:00| vcpu-0| I125: Guest:
<4>[ 1.125785] 0100 4096 ram0 (driver?)
...
0800 8388608 sda driver: sd
2016-08-30T16:02:43.220-07:00| vcpu-0| I125: Guest:
<4>[ 1.125802] 0801 8384512 sda1 611e2d9a-a3da-4ac7-9eb9-8d09cb151a93
2016-08-30T16:02:43.220-07:00| vcpu-0| I125: Guest:
<4>[ 1.125803] 0802 3055 sda2 8159e59c-b382-40b9-9070-3c5586f3c7d6
In this unlikely case, the GRUB configuration points to a root device named sdc1 instead of the correct root device, sda1. You can resolve the problem by restoring the GNU GRUB edit screen and the GRUB configuration file (/boot/grub/grub.cfg) to their original configurations.
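The search through vmware.log can be scripted. The sketch below assumes that grep is available where the log file is stored (for example on an ESXi or Linux host, or after copying the file off a Windows host). The helper names find_guest_panic and root_device_errors are hypothetical.

```shell
# find_guest_panic FILE: print every "Kernel panic" line in a vmware.log
# with one line of context before it and two after it. Output lines are
# prefixed with their line numbers in the log.
find_guest_panic() {
    grep -n -B 1 -A 2 'Kernel panic' -- "$1"
}

# root_device_errors FILE: print the VFS messages that name the root
# device the kernel tried to open.
root_device_errors() {
    grep -n 'Cannot open root device' -- "$1"
}
```

Running find_guest_panic vmware.log first and root_device_errors vmware.log second mirrors the manual inspection above: the panic line shows what failed, and the VFS line shows which root device was requested.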
4 - Investigating Unexpected Behavior
If you rebooted to address unexpected behavior before the reboot or if you encountered unexpected behavior during the reboot but have reached the shell, you must analyze what happened since the previous boot.
Run the following command to check the logs:
journalctl
Run the following command to view the log from the previous boot:
journalctl --boot=-1
View the log from the current boot:
journalctl -b
If required, examine the logs for the kernel:
journalctl -k
Check which kernel is in use:
uname -r
The kernel version of Photon OS in the full version is 6.1.10-8. The kernel version in the OVA version is 6.1.10-8.ph5-esx. With the ESX version of the kernel, some services might not start.
Run this command to check the overall status of services:
systemctl status
If a service is in red, check it:
systemctl status service-name
Start it if required:
systemctl start service-name
If looking at the journal and checking the status of services does not resolve your error, run the following systemd-analyze commands to examine the boot time and the speed with which services start.
systemd-analyze time
systemd-analyze blame
systemd-analyze critical-chain
Note: The output of these commands might be misleading because one service might just be waiting for another service to finish initializing.
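When several services have failed, it can be quicker to list only those units and read the journal for each one. The sketch below relies on the systemctl options --state, --no-legend, and --plain and the journalctl options -b, -u, and --no-pager. The helper names failed_unit_names and show_failed_logs are hypothetical.

```shell
# failed_unit_names: read "systemctl list-units --plain --no-legend"
# output on stdin and print the unit name from the first column of
# every non-empty line.
failed_unit_names() {
    awk 'NF { print $1 }'
}

# show_failed_logs: for every failed unit, print its journal messages
# from the current boot. Requires a running systemd.
show_failed_logs() {
    systemctl list-units --state=failed --no-legend --plain |
        failed_unit_names |
        while read -r unit; do
            echo "== $unit =="
            journalctl -b -u "$unit" --no-pager
        done
}
```

Call show_failed_logs as root on the machine you are troubleshooting. If it prints nothing, no unit is currently in the failed state.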
5 - Investigating the Guest Kernel
If a VM running Photon OS and an application or virtual appliance is behaving in a way that prevents you from logging in to the machine, you can troubleshoot by extracting the kernel logs from the guest’s memory and analyzing them with gdb.
This advanced troubleshooting method works when you are running Photon OS as the operating system for an application or appliance on VMware Workstation, Fusion, or ESXi. The procedure in this section assumes that the virtual machine running Photon OS is functioning normally.
The process to use this troubleshooting method varies by environment. The examples in this section assume that the troublesome Photon OS virtual machine is running in VMware Workstation 12 Pro on a Microsoft Windows 8 Enterprise host. The examples also use an additional, fully functional Photon OS virtual machine running in Workstation.
You can use other hosts, hypervisors, and operating systems, but you will have to adapt the example process below to them. Directory paths, file names, and other aspects might be different on other systems.
Prerequisites
Verify that you have the following resources:
- Root access to a Linux machine other than the one you are troubleshooting. It can be another Photon OS machine, Ubuntu, or another Linux variant.
- The vmss2core utility from VMware. It is installed by default in VMware Workstation and some other VMware products. If your system doesn’t already contain it, you can download it for free from https://labs.vmware.com/flings/vmss2core.
- A local copy of the Photon OS ISO of the exact same version and release number as the Photon OS machine that you are troubleshooting.
Procedure Overview
The process to apply this troubleshooting method is as follows:
- On a local computer, you open a file on the Photon OS ISO that contains Linux debugging information. Then you suspend the troublesome Photon OS VM and extract the kernel memory logs from the VMware hypervisor running Photon OS.
- Next, you use the vmss2core tool to convert the memory logs into core dump files. The vmss2core utility converts VMware checkpoint state files into formats that third-party debugging tools understand. It can handle both suspend (.vmss) and snapshot (.vmsn) checkpoint state files (hereafter referred to as a vmss file) as well as monolithic and non-monolithic (separate .vmem file) encapsulation of checkpoint state data. See Debugging Virtual Machines with the Checkpoint to Core Tool.
- Finally, you prepare to run the gdb tool by using the debug info file from the ISO to create a .gdbinit file, which you can then analyze with the gdb shell on your local Linux machine.
All three components must be in the same directory on a Linux machine.
Procedure
Obtain a local copy of the Photon OS ISO of the exact same version and release number as the Photon OS machine that you are troubleshooting and mount the ISO on a Linux machine (or open it on a Windows machine):
mount /mnt/cdrom
Locate the following file. (If you opened the Photon OS ISO on a Windows computer, copy the following file to the root folder of a Linux machine.)
/RPMS/x86_64/linux-debuginfo-4.4.8-6.ph1.x86_64.rpm
On a Linux machine, run the following rpm2cpio command to convert the RPM file to a cpio file and to extract the contents of the RPM to the current directory:
rpm2cpio /mnt/cdrom/RPMS/x86_64/linux-debuginfo-4.4.8-6.ph1.x86_64.rpm | cpio -idmv
From the extracted files, copy the following file to your current directory:
cp usr/lib/debug/lib/modules/4.4.8/vmlinux-4.4.8.debug .
Run the following commands to download the dmesg functions that will help extract the kernel log from the coredump:
wget https://www.kernel.org/doc/Documentation/kdump/gdbmacros.txt
wget https://github.com/vmware/photon/blob/master/tools/scripts/gdbmacros-for-linux.txt
Move the file as follows:
mv gdbmacros-for-linux.txt .gdbinit
Switch to your host machine so you can get the kernel memory files from the VM. Suspend the troublesome VM and locate the .vmss and .vmem files in the virtual machine’s directory on the host.
Example:
C:\Users\tester\Documents\Virtual Machines\VMware Photon 64-bit (7)>dir
 Volume in drive C is Windows
 Directory of C:\Users\tester\Documents\Virtual Machines\VMware Photon 64-bit (7)
09/20/2016 12:22 PM <DIR> .
09/20/2016 12:22 PM <DIR> ..
09/19/2016 03:39 PM 402,653,184 VMware Photon 64-bit (7)-f6b070cd.vmem
09/20/2016 12:11 PM 5,586,907 VMware Photon 64-bit (7)-f6b070cd.vmss
09/20/2016 12:11 PM 1,561,001,984 VMware Photon 64-bit (7)-s001.vmdk
...
09/20/2016 12:11 PM 300,430 vmware.log
...
Now that you have located the .vmss and .vmem files, convert them to one or more core dump files by using the vmss2core tool that comes with Workstation. Here is an example of how to run the command. Be careful with your pathing, escaping, file names, and so forth, all of which might be different from this example on your Windows machine.
C:\Users\shoenisch\Documents\Virtual Machines\VMware Photon 64-bit (7)>C:\"Program Files (x86)\VMware\VMware Workstation"\vmss2core.exe "VMware Photon 64-bit (7)-f6b070cd.vmss" "VMware Photon 64-bit (7)-f6b070cd.vmem"
The result of this command is one or more files with a .core extension plus a digit. Truncated example:
C:\Users\tester\Documents\Virtual Machines\VMware Photon 64-bit (7)>dir
 Directory of C:\Users\tester\Documents\Virtual Machines\VMware Photon 64-bit(7)
09/20/2016 12:22 PM 729,706,496 vmss.core0
Copy the .core file or files to your current directory on the Linux machine so that you can analyze them with gdb.
Run the following gdb command to enter the gdb shell attached to the memory core dump file. You might have to change the name of the vmss.core file in the example to match your .core file:
gdb vmlinux-4.4.8.debug vmss.core0
GNU gdb (GDB) 7.8.2
Copyright (C) 2014 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. ...
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at: <http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from vmlinux-4.4.8.debug...done.
warning: core file may not match specified executable file.
[New LWP 12345]
Core was generated by `GuestVM'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0 0xffffffff813df39a in insb (count=0, addr=0xffffc90000144000, port=<optimized out>)
at arch/x86/include/asm/io.h:316
316 arch/x86/include/asm/io.h: No such file or directory.
(gdb)
Result
In the results above, the (gdb) of the last line is the prompt of the gdb shell. You can now analyze the core dump by using commands like bt, to perform a backtrace, and dmesg, to view the Photon OS kernel log and see Photon OS kernel error messages.
6 - Kernel Log Replication with VProbes
Replicating the Photon OS kernel logs on the VMware ESXi host is an advanced but powerful method of troubleshooting a kernel problem.
- Replication Method
- Using VProbes Script with a Hard-Coded Address
- A Reusable VProbe Script Using the kallsyms File
Replication Method
This method is applicable when the virtual machine running Photon OS is hanging or inaccessible because, for instance, the hard disk has failed.
As a prerequisite, you must have preemptively enabled the VMware VProbes facility on the VM before an error rendered it inaccessible. You must also create a VProbes script on the ESXi host, but you can do that after the error.
The method is useful in analyzing kernel issues when testing an application or appliance that is running on Photon OS.
There are two similar ways in which you can replicate the Photon OS kernel logs on ESXi by using VProbes.
The first modifies the VProbes script so that it works only for the VM that you set. It uses a hard-coded address.
The second uses an abstraction instead of a hard-coded address so that the same VProbes script can be used for any VM on an ESXi host, provided that you have enabled VProbes on the VM and copied its kernel symbol table (kallsyms) to the ESXi host.
For more information on VMware VProbes, see VProbes: Deep Observability Into the ESXi Hypervisor and the VProbes Programming Reference.
Using VProbes Script with a Hard-Coded Address
Perform the following steps to set a VProbe for an individual VM:
Power off the VM so that you can turn on the VProbe facility.
Edit the .vmx configuration file for the VM. The file resides in the directory that contains the VM in the ESXi data store. Add the following line of code to the .vmx file and then power the VM on:
vprobe.enable = "TRUE"
When you edit the .vmx file to add the above line of code, you must first turn off the VM; otherwise, your changes will not persist.
Obtain the kernel log_store function address by connecting to the VM with SSH and running the following commands as root. (Photon OS uses the kptr_restrict setting to place restrictions on the kernel addresses exposed through /proc and other interfaces. This setting hides exposed kernel pointers to prevent attackers from exploiting kernel write vulnerabilities. When you are done using VProbes, you should return kptr_restrict to the original setting of 2 by rebooting.)
echo 0 > /proc/sys/kernel/kptr_restrict
grep log_store /proc/kallsyms
The output of the grep command will look similar to the following string. The first set of characters (without the t) is the log_store function address:
ffffffff810bb680 t log_store
Connect to the ESXi host with SSH so that you can create a VProbes script.
Below is the template for the script. log_store in the first line is a placeholder for the VM’s log_store function address:
GUEST:ENTER:log_store {
   string dst;
   getgueststr(dst, getguest(RSP+16) & 0xff, getguest(RSP+8));
   printf("%s\n", dst);
}
On the ESXi host, create a new file, add the template to it, and then change log_store to the function address that was the output from the grep command on the VM.
Add a 0x prefix to the function address. In this example, the modified template looks like this:
GUEST:ENTER:0xffffffff810bb680 {
   string dst;
   getgueststr(dst, getguest(RSP+16) & 0xff, getguest(RSP+8));
   printf("%s\n", dst);
}
Save your VProbes script as console.emt in the /tmp directory. (The file extension for VProbe scripts is .emt.)
While still connected to the ESXi host with SSH, run the following command to obtain the ID of the virtual machine that you want to troubleshoot:
vim-cmd vmsvc/getallvms
This command lists all the VMs running on the ESXi host. Find the VM you want to troubleshoot in the list and make a note of its ID.
Run the following command to print all the kernel messages from Photon OS in your SSH console; replace <VM ID> with the ID of your VM:
vprobe -m <VM ID> /tmp/console.emt
When you’re done, type Ctrl-C to stop the loop.
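The address lookup and the template substitution in the steps above can be sketched as one small shell helper. It assumes that the grep output has the three-column form shown earlier (address, type, symbol) and that awk is available where you run it. The name make_console_emt is hypothetical.

```shell
# make_console_emt: read /proc/kallsyms-style lines on stdin and print a
# VProbes script whose probe point is the 0x-prefixed address of the
# symbol named exactly log_store. Only the first match is used.
make_console_emt() {
    awk '!seen && $3 == "log_store" {
        printf "GUEST:ENTER:0x%s {\n", $1
        print  "   string dst;"
        print  "   getgueststr(dst, getguest(RSP+16) & 0xff, getguest(RSP+8));"
        print  "   printf(\"%s\\n\", dst);"
        print  "}"
        seen = 1
    }'
}
```

For example, grep log_store /proc/kallsyms | make_console_emt > console.emt on the VM produces a file that you can then copy to /tmp on the ESXi host. Matching the third column exactly skips other symbols whose names merely contain log_store. If the generated address is all zeros, the kptr_restrict setting is probably still hiding kernel pointers.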
A Reusable VProbe Script Using the kallsyms File
Perform the following steps to create one VProbe script and use it for all the VMs on your ESXi host.
Power off the VM and turn on the VProbe facility on each VM that you want to be able to analyze.
Add vprobe.enable = "TRUE" to the VM’s .vmx configuration file. See the instructions above.
Power on the VM, connect to it with SSH, and run the following command as root:
echo 0 > /proc/sys/kernel/kptr_restrict
Connect to the ESXi host with SSH to create the following VProbes script and save it as /tmp/console.emt:
GUEST:ENTER:log_store {
   string dst;
   getgueststr(dst, getguest(RSP+16) & 0xff, getguest(RSP+8));
   printf("%s\n", dst);
}
From the ESXi host, run the following command to copy the VM’s kallsyms file to the /tmp directory on the ESXi host:
scp root@<vm ip address>:/proc/kallsyms /tmp
While still connected to the ESXi host with SSH, run the following command to obtain the ID of the virtual machine that you want to troubleshoot:
vim-cmd vmsvc/getallvms
This command lists all the VMs running on the ESXi host. Find the VM you want to troubleshoot in the list and make a note of its ID.
Run the following command to print all the kernel messages from Photon OS in your SSH console. Replace <VM ID> with the ID of your VM. When you’re done, type Ctrl-C to stop the loop.
vprobe -m <VM ID> -k /tmp/kallsyms /tmp/console.emt
You can use a directory other than /tmp if you want.
7 - Linux Kernel
The Linux kernel is the main component of Photon OS and is the core interface between a computer’s hardware and its processes. It communicates between the two, managing resources as efficiently as possible.
Kernel Flavours and Versions
The following list contains the different Linux kernel flavours available:
- linux: A generic kernel designed to run everywhere and support everything.
- linux-esx: Optimized to run only on VMware hypervisors (ESXi, Workstation, Fusion). It has a minimal set of device drivers to support VMware virtual devices. uname -r displays the -esx suffix. For additional features, switch to the generic flavour.
- linux-secure: Security-hardened variant of the generic kernel. uname -r displays the -secure suffix.
- linux-rt: The Photon Real Time kernel. uname -r displays the -rt suffix.
- linux-aws: Optimized for the AWS hypervisor. uname -r displays the -aws suffix.
To see the version of kernel installed, run the following command:
# rpm -qa | grep -e "^linux\(\|-esx\|-secure\|-rt\|-aws\)-[[:digit:]]"
linux-4.9.111-1.ph2.x86_64
linux-esx-4.9.111-1.ph2.x86_64
To see the version of the kernel that is currently running, run the following command:
# uname -r
4.9.107-1.ph2-esx
From the output, you can see that the kernel that is currently running doesn’t match the installed kernel. This happens when the linux-* RPMs were updated but the system was not restarted. A restart is required.
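This comparison can be automated. The sketch below assumes the naming seen in the output above: packages are named linux[-flavour]-version-release.arch, and uname -r reports version-release[-flavour]. The helper names rpm_to_uname and kernel_is_installed are hypothetical.

```shell
# rpm_to_uname PKG: convert a kernel package name such as
# linux-esx-4.9.111-1.ph2.x86_64 into the string that uname -r would
# report for that kernel (4.9.111-1.ph2-esx). This relies on the naming
# assumption stated above.
rpm_to_uname() {
    printf '%s\n' "$1" |
        sed -E 's/^linux(-[a-z]+)?-([0-9].*)\.[^.]+$/\2\1/'
}

# kernel_is_installed RUNNING PKG...: succeed if the RUNNING kernel
# string corresponds to one of the listed kernel packages.
kernel_is_installed() {
    running=$1
    shift
    for pkg in "$@"; do
        [ "$(rpm_to_uname "$pkg")" = "$running" ] && return 0
    done
    return 1
}
```

With the values shown above, kernel_is_installed 4.9.107-1.ph2-esx linux-4.9.111-1.ph2.x86_64 linux-esx-4.9.111-1.ph2.x86_64 fails, which indicates that the running kernel no longer corresponds to an installed package and that a restart is pending.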
Configuration
To find the configurations of the installed Kernel, check the /boot directory by running the following command:
# ls /boot/config-*
config-4.9.111-1.ph2 config-4.9.111-1.ph2-esx
To get a copy of the kernel configuration (not all flavours support this feature), run the zcat /proc/config.gz command.
Boot Parameters and initrd
Several kernel flavors can be installed on the system, but only one is used during boot. The /boot/photon.cfg symlink points to the configuration of the kernel that is used for boot.
# ls -l /boot/photon.cfg
lrwxrwxrwx 1 root root 23 Jun 12 2018 /boot/photon.cfg -> linux-4.9.111-1.ph2.cfg
Its contents can be checked by running the following command:
# cat /boot/photon.cfg
# GRUB Environment Block
photon_cmdline=init=/lib/systemd/systemd ro loglevel=3 quiet no-vmw-sta
photon_linux=vmlinuz-4.9.111-1.ph2
photon_initrd=initrd.img-4.9.111-1.ph2
Where:
- photon_cmdline: Kernel parameters. This list is extended by values from the /boot/systemd.cfg file and by values that are hardcoded in the /boot/grub2/grub.cfg file (for example, root=).
- photon_linux: Kernel image to boot.
- photon_initrd: Initrd to use at boot.
The parameters of the currently running kernel can be found by reading the /proc/cmdline file:
# cat /proc/cmdline
BOOT_IMAGE=/boot/vmlinuz-4.9.107-1.ph2-esx root=PARTUUID=29194d05-4a6e-4e0c-b1f4-5020e5e8472c net.ifnames=0 init=/lib/systemd/systemd ro loglevel=3 quiet no-vmw-sta
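To check a single boot parameter, such as root= when you investigate a root device problem, you can split the command line into its space-separated entries. The following is a minimal sketch; the helper name cmdline_param is hypothetical, and the helper does not handle quoted values that contain spaces.

```shell
# cmdline_param NAME: read a kernel command line on stdin and print the
# value of the first NAME=value entry. Prints nothing if NAME is absent.
cmdline_param() {
    tr ' ' '\n' |
        awk -v key="$1" '!seen && index($0, key "=") == 1 {
            print substr($0, length(key) + 2)
            seen = 1
        }'
}
```

For example, cmdline_param root < /proc/cmdline prints the PARTUUID=... value from the output above, and cmdline_param BOOT_IMAGE < /proc/cmdline prints the path of the kernel image that was booted.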
Dmesg
To view the message buffer of the kernel, run the dmesg command.
Sysctl State
To view a list of all active units, run the systemctl list-units command.
Kernel Statistics
Kernel statistics are exposed through the following pseudo-file systems:
- procfs (mounted at /proc)
- sysfs (mounted at /sys)
- debugfs (typically mounted at /sys/kernel/debug)
Kernel Modules
To view the kernel log buffer, run the journalctl -k command.
To view a list of loaded kernel modules, run the lsmod command.
To view detailed information about all PCI buses and the devices connected to them, run the lspci command.