In this article, Benjamin Cane, author of the Red Hat Enterprise Linux Troubleshooting Guide, explains that before exploring troubleshooting commands, we should first cover the locations of useful information. “Useful information” is a somewhat ambiguous term; pretty much every file, directory, or command can provide useful information. What this article really covers are the places where it is possible to find information for almost any issue.

Log files

Log files are often the first place to start looking for troubleshooting information. Whenever a service or server is experiencing an issue, checking the log files for errors can often answer many questions quickly.

The default location

By default, RHEL and most Linux distributions keep their log files in /var/log/, which is part of the Filesystem Hierarchy Standard (FHS) maintained by the Linux Foundation (http://en.wikipedia.org/wiki/Filesystem_Hierarchy_Standard). However, while /var/log/ might be the default location, not all log files are located there.

While /var/log/httpd/ is the default location for Apache logs, this location can be changed with Apache’s configuration files. This is especially common when Apache was installed outside of the standard RHEL package.

Like Apache, most services allow for custom log locations. It is not uncommon to find custom directories or file systems outside of /var/log created specifically for log files.

Common log files

The following table is a short list of common log files and a description of what you can find within them.

Do keep in mind that this list is specific to Red Hat Enterprise Linux 7, and while other Linux distributions might follow similar conventions, they are not guaranteed.

Log file: Description

/var/log/messages: By default, this log file contains all syslog messages of INFO or higher priority, except mail, private authentication, and cron messages, which have their own log files.

/var/log/secure: This log file contains authentication-related messages such as:

  • SSH logins
  • User creations
  • Sudo violations and privilege escalation

/var/log/cron: This log file contains a history of crond executions as well as start and end times of cron.daily, cron.weekly, and other executions.

/var/log/maillog: This log file is the default log location of mail events. If using postfix, this is the default location for all postfix-related messages.

/var/log/httpd/: This log directory is the default location for Apache logs. While this is the default location, it is not a guaranteed location for all Apache logs.

/var/log/mysql.log: This log file is the default log file for mysqld. Much like the httpd logs, this is a default and can be changed easily.

/var/log/sa/: This directory contains the system activity data collected by the sa commands, which run every 10 minutes by default.

For many issues, one of the first log files to review is the /var/log/messages log. On RHEL systems, this log file receives all system logs of INFO priority or higher. In general, this means that any significant event sent to syslog would be captured in this log file.

The following is a sample of some of the log messages that can be found in /var/log/messages:

Dec 24 18:03:51 localhost systemd: Starting Network Manager Script Dispatcher Service...
Dec 24 18:03:51 localhost dbus-daemon: dbus[620]: [system] Successfully activated service 'org.freedesktop.nm_dispatcher'
Dec 24 18:03:51 localhost dbus[620]: [system] Successfully activated service 'org.freedesktop.nm_dispatcher'
Dec 24 18:03:51 localhost systemd: Started Network Manager Script Dispatcher Service.
Dec 24 18:06:06 localhost kernel: e1000: enp0s3 NIC Link is Down
Dec 24 18:06:06 localhost kernel: e1000: enp0s8 NIC Link is Down
Dec 24 18:06:06 localhost NetworkManager[750]: <info> (enp0s3): link disconnected (deferring action for 4 seconds)
Dec 24 18:06:06 localhost NetworkManager[750]: <info> (enp0s8): link disconnected (deferring action for 4 seconds)
Dec 24 18:06:10 localhost NetworkManager[750]: <info> (enp0s3): link disconnected (calling deferred action)
Dec 24 18:06:10 localhost NetworkManager[750]: <info> (enp0s8): link disconnected (calling deferred action)
Dec 24 18:06:12 localhost kernel: e1000: enp0s3 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX
Dec 24 18:06:12 localhost kernel: e1000: enp0s8 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX
Dec 24 18:06:12 localhost NetworkManager[750]: <info> (enp0s3): link connected
Dec 24 18:06:12 localhost NetworkManager[750]: <info> (enp0s8): link connected
Dec 24 18:06:39 localhost kernel: atkbd serio0: Spurious NAK on isa0060/serio0. Some program might be trying to access hardware directly.
Dec 24 18:07:10 localhost systemd: Starting Session 53 of user root.
Dec 24 18:07:10 localhost systemd: Started Session 53 of user root.
Dec 24 18:07:10 localhost systemd-logind: New session 53 of user root.

As we can see, there are more than a few log messages within this sample that could be useful while troubleshooting issues.
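When a log file is large, filtering it for likely problem indicators is a reasonable first pass. The following is a minimal sketch; the function name scan_log and the keyword list are assumptions chosen for illustration, not part of the original text:

```shell
# Print lines from a syslog-style file that mention common problem keywords.
# The keyword list is an assumption; adjust it for the issue at hand.
scan_log() {
    logfile=${1:-/var/log/messages}
    grep -i -E 'error|fail|down|denied' "$logfile"
}

# Usage (reading /var/log/messages normally requires root):
#   scan_log /var/log/messages | tail -n 20
```

Against the preceding sample, this would surface the two "NIC Link is Down" kernel messages while skipping the routine session and service start messages.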

Finding logs that are not in the default location

Log files are often not in /var/log/, either because someone changed the log location from the default or simply because the service in question defaults to another location.

In general, there are three ways to find log files not in /var/log/.

Checking syslog configuration

If you know a service is using syslog for its logging, the best place to find out which log file its messages are written to is the rsyslog configuration. The rsyslog service has two locations for configuration.

The first is the /etc/rsyslog.d directory, which is an include directory for custom rsyslog configurations. The second is the /etc/rsyslog.conf configuration file. This is the main configuration file for rsyslog and contains many of the default syslog configurations.

The following is a sample of the default contents of /etc/rsyslog.conf:

#### RULES ####

# Log all kernel messages to the console.
# Logging much else clutters up the screen.
#kern.*                             /dev/console

# Log anything (except mail) of level info or higher.
# Don't log private authentication messages!
*.info;mail.none;authpriv.none;cron.none /var/log/messages

# The authpriv file has restricted access.
authpriv.*                           /var/log/secure

# Log all the mail messages in one place.
mail.*                             -/var/log/maillog

# Log cron stuff
cron.*                               /var/log/cron

By reviewing the contents of this file, it is fairly easy to identify which log files contain the information required or, at the very least, the possible locations of syslog-managed log files.
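To summarize the active rules across both configuration locations, something like the following sketch can help. The function name list_syslog_rules is hypothetical, and the parsing only understands the classic "selector action" rule format shown above; other rsyslog syntax is ignored:

```shell
# List active (non-comment) rsyslog rules as "selector -> destination".
# Only the classic "selector action" syntax is handled; this is a sketch.
list_syslog_rules() {
    # With no arguments, read the main file plus any include files.
    [ "$#" -gt 0 ] || set -- /etc/rsyslog.conf /etc/rsyslog.d/*.conf
    awk '
        # Skip blank lines, comments, and legacy "$" directives.
        NF < 2 || $1 ~ /^[#$]/ { next }
        # A selector always contains a dot (facility.priority).
        $1 ~ /\./ {
            dest = $NF
            sub(/^-/, "", dest)   # a leading "-" only disables syncing after each write
            print $1 " -> " dest
        }
    ' "$@" 2>/dev/null
}

# Usage: list_syslog_rules
```

For the default file shown above, the output would include lines such as "cron.* -> /var/log/cron".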

Checking the application’s configuration

Not every application utilizes syslog; for those that don’t, one of the easiest ways to find the application’s log file is to read the application’s configuration files.

A quick and useful method for finding log file locations from configuration files is to use the grep command to search the file for the word log:

$ grep log /etc/samba/smb.conf
# files are rotated when they reach the size specified with "max log size".
# log files split per-machine:
log file = /var/log/samba/log.%m
# maximum size of 50KB per log file, then rotate:
max log size = 50

The grep command is a very useful command that can be used to search files or directories for specific strings or patterns. Its simplest form can be seen in the preceding snippet, where grep is used to search the /etc/samba/smb.conf file for any instance of the pattern “log”.

After reviewing the output of the preceding grep command, we can see that the configured log location for Samba is /var/log/samba/log.%m. It is important to note that %m, in this example, is replaced with a “machine name” when the file is created. %m is a variable within the Samba configuration file. Such variables are unique to each application, but using them to make configuration values dynamic is a common practice.

Other examples

The following are examples of using the grep command to search for the word “log” in the Apache and MySQL configuration files:

$ grep log /etc/httpd/conf/httpd.conf
# ErrorLog: The location of the error log file.
# logged here. If you *do* define an error logfile for a <VirtualHost>
# container, that host's errors will be logged there and not here.
ErrorLog "logs/error_log"

$ grep log /etc/my.cnf
# log_bin
log-error=/var/log/mysqld.log

In both instances, this method was able to identify the configuration parameter for the service’s log file. With the previous three examples, it is easy to see how effective searching through configuration files can be.
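As the preceding output shows, many of the matches are comments rather than active settings. A small sketch that filters those out follows; the function name active_log_settings is hypothetical, and it assumes that "#" or ";" starts a comment, which is common but not universal:

```shell
# Show only active (uncommented) settings that mention "log", with line numbers.
# Assumes "#" or ";" starts a comment; not every configuration format does this.
active_log_settings() {
    grep -n -i 'log' "$1" | grep -v -E '^[0-9]+:[[:space:]]*[#;]'
}

# Usage: active_log_settings /etc/httpd/conf/httpd.conf
```

The line numbers make it easier to jump to the setting in an editor afterwards.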

Using the find command

The find command is another useful method for finding log files. The find command is used to search a directory structure for specified files. A quick way of finding log files is to simply use the find command to search for any files that end in “.log”:

# find /opt/appxyz/ -type f -name "*.log"
/opt/appxyz/logs/daily/7-1-15/alert.log
/opt/appxyz/logs/daily/7-2-15/alert.log
/opt/appxyz/logs/daily/7-3-15/alert.log
/opt/appxyz/logs/daily/7-4-15/alert.log
/opt/appxyz/logs/daily/7-5-15/alert.log

The preceding is generally considered a last resort solution, and is mostly used when the previous methods do not produce results.

When executing the find command, it is considered a best practice to be very specific about which directory to search. When find is executed against very large directories, the performance of the server can be degraded.
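One way to keep such a search narrow is to limit it to a single filesystem and to recently modified files. The following is a sketch under those assumptions; the function name recent_logs and the default of one day are illustrative choices:

```shell
# Find recently modified *.log files under one directory without
# crossing into other mounted filesystems.
recent_logs() {
    dir=$1
    days=${2:-1}
    # -xdev: stay on the starting filesystem
    # -mtime -N: modified less than N days ago
    find "$dir" -xdev -type f -name '*.log' -mtime "-$days"
}

# Usage: recent_logs /opt/appxyz 2
```

Restricting by modification time also tends to point at the log files that are actively being written to.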

Configuration files

As discussed previously, configuration files for an application or service can be excellent sources of information. While configuration files won’t provide you with specific errors the way log files do, they can provide you with critical information (for example, enabled/disabled features, output directories, and log file locations).

Default system configuration directory

In general, system and service configuration files are located within the /etc/ directory on most Linux distributions. However, this does not mean that every configuration file is located within the /etc/ directory. In fact, it is not uncommon for applications to include a configuration directory within the application’s home directory.

So how do you know when to look in the /etc/ directory versus an application directory for configuration files? A general rule of thumb is that if the package is part of the RHEL distribution, it is safe to assume that the configuration is within the /etc/ directory. Anything else may or may not be present in the /etc/ directory. For these situations, you simply have to look for them.

Finding configuration files

In most scenarios, it is possible to find system configuration files within the /etc/ directory with a simple directory listing using the ls command:

$ ls -la /etc/ | grep my
-rw-r--r--. 1 root root     570 Nov 17 2014 my.cnf
drwxr-xr-x. 2 root root       64 Jan 9 2015 my.cnf.d

The preceding code snippet uses ls to perform a directory listing and pipes that output to grep in order to search it for the string “my”. We can see from the output that there is a my.cnf configuration file and a my.cnf.d configuration directory. The MySQL processes use these for their configuration. We were able to find these by assuming that anything related to MySQL would have the string “my” in its name.

Using the rpm command

If the configuration files were deployed as part of an RPM package, it is possible to use the rpm command to identify configuration files. To do this, simply execute the rpm command with the -q (query) flag and the -c (configfiles) flag, followed by the name of the package:

$ rpm -q -c httpd
/etc/httpd/conf.d/autoindex.conf
/etc/httpd/conf.d/userdir.conf
/etc/httpd/conf.d/welcome.conf
/etc/httpd/conf.modules.d/00-base.conf
/etc/httpd/conf.modules.d/00-dav.conf
/etc/httpd/conf.modules.d/00-lua.conf
/etc/httpd/conf.modules.d/00-mpm.conf
/etc/httpd/conf.modules.d/00-proxy.conf
/etc/httpd/conf.modules.d/00-systemd.conf
/etc/httpd/conf.modules.d/01-cgi.conf
/etc/httpd/conf/httpd.conf
/etc/httpd/conf/magic
/etc/logrotate.d/httpd
/etc/sysconfig/htcacheclean
/etc/sysconfig/httpd

The rpm command is used to manage RPM packages and is a very useful command when troubleshooting. We will cover this command further as we explore commands for troubleshooting.

Using the find command

Much like finding log files, it is possible to use the find command to find configuration files on a system. When searching for log files, the find command was used to search for all files where the name ends in “.log”. In the following example, the find command is used to search for all files where the name begins with “http”. This find command should return at least a few results, which will be configuration files related to the HTTPD (Apache) service:

# find /etc -type f -name "http*"
/etc/httpd/conf/httpd.conf
/etc/sysconfig/httpd
/etc/logrotate.d/httpd

The preceding example searches the /etc directory; however, this could also be used to search any application home directory for user configuration files. Similar to searching for log files, using the find command to search for configuration files is generally considered a last resort step and should not be the first method used.
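For an application installed outside of /etc, the same approach can be pointed at the application's own directory. The sketch below uses a hypothetical function name, find_app_configs, and an assumed list of common configuration file extensions; an application may well use different names:

```shell
# Search an application's directory for files with common configuration extensions.
# The extension list is an assumption; many applications use other names.
find_app_configs() {
    find "$1" -type f \( -name '*.conf' -o -name '*.cnf' -o -name '*.cfg' -o -name '*.ini' \)
}

# Usage: find_app_configs /opt/appxyz
```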

The proc filesystem

An extremely useful source of information is the proc filesystem. This is a special filesystem that is maintained by the Linux kernel. The proc filesystem can be used to find useful information about running processes, as well as other system information. For example, if we wanted to identify the filesystems supported by a system, we could simply read the /proc/filesystems file:

$ cat /proc/filesystems
nodev sysfs
nodev rootfs
nodev bdev
nodev proc
nodev cgroup
nodev cpuset
nodev tmpfs
nodev devtmpfs
nodev debugfs
nodev securityfs
nodev sockfs
nodev pipefs
nodev anon_inodefs
nodev configfs
nodev devpts
nodev ramfs
nodev hugetlbfs
nodev autofs
nodev pstore
nodev mqueue
nodev selinuxfs
xfs
nodev rpc_pipefs
nodev nfsd

This filesystem is extremely useful and contains quite a bit of information about a running system. The proc filesystem will be used throughout the troubleshooting steps, in various ways, for everything from investigating specific processes to diagnosing read-only filesystems.
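Because the files under /proc are plain text, they are easy to check from a script. The following is a minimal sketch; the function name fs_supported is hypothetical, and the optional second argument (an alternate file to read) is an addition made only so the function can be tried against sample data:

```shell
# Succeed if the named filesystem type is listed as supported by the kernel.
# The optional second argument is an alternate file to read instead of /proc/filesystems.
fs_supported() {
    awk -v fs="$1" '$NF == fs { found = 1 } END { exit !found }' "${2:-/proc/filesystems}"
}

# Usage: fs_supported xfs && echo "xfs is available"
```

The last field is compared because entries that do not require a block device are prefixed with "nodev".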

Troubleshooting commands

This section will cover frequently used troubleshooting commands that can be used to gather information from the system or a running service. While it is not feasible to cover every possible command, the commands used do cover fundamental troubleshooting steps for Linux systems.

Command-line basics

The troubleshooting steps used are primarily command-line based. While it is possible to perform many of these things from a graphical desktop environment, the more advanced items are command-line specific. As such, it is assumed that the reader has at least a basic understanding of Linux. To be more specific, we assume that the reader has logged into a server via SSH and is familiar with basic commands such as cd, cp, mv, rm, and ls.

For those who might not have much familiarity, I want to quickly cover some basic command-line usage that will be required.

Command flags

Many readers are probably familiar with the following command:

$ ls -la
total 588
drwx------. 5 vagrant vagrant   4096 Jul 4 21:26 .
drwxr-xr-x. 3 root   root       20 Jul 22 2014 ..
-rw-rw-r--. 1 vagrant vagrant 153104 Jun 10 17:03 app.c

Most should recognize that this is the ls command and that it is used to perform a directory listing. What might not be familiar is what exactly the -la part of the command is or does. To understand this better, let’s look at the ls command by itself:

$ ls
app.c application app.py bomber.py index.html lookbusy-1.4 lookbusy-1.4.tar.gz lotsofiles

This execution of the ls command looks very different from the previous one because it shows the default output of ls. The -la portion of the earlier command is what is commonly referred to as command flags or options. Command flags allow a user to change the default behavior of a command by providing it with specific options.

In fact, the -la flags are two separate options, -l and -a; they can even be specified separately:

$ ls -l -a
total 588
drwx------. 5 vagrant vagrant   4096 Jul 4 21:26 .
drwxr-xr-x. 3 root   root       20 Jul 22 2014 ..
-rw-rw-r--. 1 vagrant vagrant 153104 Jun 10 17:03 app.c

We can see from the preceding snippet that the output of ls -la is exactly the same as that of ls -l -a. For common commands, such as ls, it does not matter whether the flags are grouped or separated; they are parsed in the same way. The examples show flags both grouped and ungrouped. If flags are grouped or ungrouped for a specific reason, it will be called out; otherwise, the choice is made for visual appeal and memorization.

In addition to grouping and ungrouping, we will also show flags in their long format. In the previous examples, we showed the -a flag, which is known as a short flag. The same option can also be provided in the long format, --all:

$ ls -l --all
total 588
drwx------. 5 vagrant vagrant   4096 Jul 4 21:26 .
drwxr-xr-x. 3 root   root       20 Jul 22 2014 ..
-rw-rw-r--. 1 vagrant vagrant 153104 Jun 10 17:03 app.c

The -a and --all flags are the same option; it can simply be written in either short or long form.

One important thing to remember is that not every short flag has a long form and vice versa. Each command has its own syntax: some commands only support the short form, others only support the long form, but many support both. In most cases, the long and short flags will both be documented within the command’s man page.
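If there is ever doubt about whether two ways of writing flags are equivalent for a given command, comparing the outputs directly is a quick check. This is a small sketch with a hypothetical function name, flags_match; it only compares the grouped and separated short forms of the ls flags discussed above:

```shell
# Succeed if grouped and separated short flags produce the same listing.
flags_match() {
    dir=${1:-.}
    [ "$(ls -la "$dir")" = "$(ls -l -a "$dir")" ]
}

# Usage: flags_match /etc && echo "ls -la and ls -l -a produced the same listing"
```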

Piping command output

Another common command-line practice that will be used several times is piping output. Specifically, examples such as the following:

$ ls -l --all | grep app
-rw-rw-r--. 1 vagrant vagrant 153104 Jun 10 17:03 app.c
-rwxrwxr-x. 1 vagrant vagrant 29390 May 18 00:47 application
-rw-rw-r--. 1 vagrant vagrant   1198 Jun 10 17:03 app.py

In the preceding example, the output of the ls -l --all command is piped to the grep command. By placing |, the pipe character, between the two commands, the output of the first command is “piped” to the input of the second command. Here, the ls command is executed first; the grep command then searches its output for any instance of the pattern “app”.

Piping output to grep will be used quite often, as it is a simple way to trim the output down to a manageable size. Many times the examples will also contain multiple levels of piping:

$ ls -la | grep app | awk '{print $4,$9}'
vagrant app.c
vagrant application
vagrant app.py

In the preceding code the output of ls -la is piped to the input of grep; however, this time, the output of grep is also piped to the input of awk.

While many commands can be piped to, not every command supports this. In general, commands that accept user input from files or command-line also accept piped input. As with the flags, a command’s man page can be used to identify whether the command accepts piped input or not.
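Pipelines become especially useful when applied to log files. The following sketch chains several standard commands to count messages per program; the function name messages_by_program is hypothetical, and it assumes the traditional "Mon DD HH:MM:SS host program[pid]: text" line layout seen in the /var/log/messages sample earlier:

```shell
# Count messages per program in a syslog-style file, busiest first.
# Assumes the program name is the fifth whitespace-separated field.
messages_by_program() {
    awk '{ print $5 }' "${1:-/var/log/messages}" |
        sed -e 's/\[.*//' -e 's/:$//' |   # strip "[pid]" and the trailing colon
        sort | uniq -c | sort -rn         # count duplicates, largest count first
}

# Usage: messages_by_program /var/log/messages | head -n 5
```

Each stage does one small job, which is what makes this style easy to build up and debug one pipe at a time.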

Gathering general information

When managing the same servers for a long time, you start to remember key information about those servers, such as the amount of physical memory, the size and layout of their filesystems, and what processes should be running. However, when you are not familiar with the server in question, it is always a good idea to gather this type of information.

The commands in this section are commands that can be used to gather this type of general information.

w – show who is logged on and what they are doing

Early in my systems administration career, I had a mentor who used to tell me, “I always run w when I log into a server.” This simple tip has been very useful over and over again in my career. The w command is simple; when executed, it outputs information such as system uptime, load average, and who is logged in:

# w
04:07:37 up 14:26, 2 users, load average: 0.00, 0.01, 0.05
USER     TTY       LOGIN@   IDLE   JCPU   PCPU WHAT
root     tty1     Wed13   11:24m 0.13s 0.13s -bash
root     pts/0     20:47   1.00s 0.21s 0.19s -bash

This information can be extremely useful when working with unfamiliar systems. The output can be useful even when you are familiar with the system. With this command, you can see:

  • When this system was last rebooted:
    04:07:37 up 14:26
    This information can be extremely useful, whether the issue is an alert for a service such as Apache being down or a user calling in because they were locked out of the system. When such issues are caused by an unexpected reboot, the report often does not include that fact. By running the w command, it is easy to see the time elapsed since the last reboot.
  • The load average of the system:
    load average: 0.00, 0.01, 0.05
    The load average is a very important measurement of system health. To summarize, it is the average number of processes that are running or waiting to run over a period of time; on Linux, this also counts processes in uninterruptible sleep, such as those waiting on disk I/O. The three numbers in the output of w represent different periods: from left to right, the last 1 minute, 5 minutes, and 15 minutes.
  • Who is logged in and what they are running:
    USER     TTY       LOGIN@  IDLE   JCPU   PCPU WHAT
    root     tty1     Wed13   11:24m 0.13s 0.13s -bash
    The final piece of information that the w command provides is the users that are currently logged in and what command they are executing.

This is similar to the output of the who command, which shows the users logged in and when they logged in; w adds how long each user has been idle and what command their shell is running. The last item in that list is extremely important.

Oftentimes, when working with big teams, it is common for more than one person to respond to an issue or ticket. By running the w command immediately after login, you will see what other users are doing, preventing you from overriding any troubleshooting or corrective steps the other person has taken.
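The load average figures are easier to judge relative to the number of CPUs. The sketch below applies a common rule of thumb that is not from the original text: divide each value by the CPU count, and treat results that stay above 1.00 as a sign of more runnable work than CPUs. The function name load_per_cpu and its optional arguments are hypothetical:

```shell
# Print the 1-, 5- and 15-minute load averages divided by the CPU count.
# Arguments (both optional): a loadavg-format file and a CPU count.
load_per_cpu() {
    loadfile=${1:-/proc/loadavg}
    cpus=${2:-$(getconf _NPROCESSORS_ONLN)}
    awk -v n="$cpus" '{ printf "%.2f %.2f %.2f\n", $1 / n, $2 / n, $3 / n }' "$loadfile"
}

# Usage: load_per_cpu
```

The first three fields of /proc/loadavg are the same three numbers that w reports.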

rpm – RPM package manager

The rpm command is used to manage RPM Package Manager (RPM) packages. With this command, you can install and remove RPM packages, as well as search for packages that are already installed.

We saw earlier how the rpm command can be used to look for configuration files. The following are several additional ways we can use the rpm command to find critical information.

Listing all packages installed

Often when troubleshooting services, a critical step is identifying the version of the service and how it was installed. To list all RPM packages installed on a system, simply execute the rpm command with -q (query) and -a (all):

# rpm -q -a
kpatch-0.0-1.el7.noarch
virt-what-1.13-5.el7.x86_64
filesystem-3.2-18.el7.x86_64
gssproxy-0.3.0-9.el7.x86_64
hicolor-icon-theme-0.12-7.el7.noarch

The rpm command is a very versatile command with many flags. In the preceding example, the -q and -a flags are used. The -q flag tells the rpm command that the action being taken is a query; you can think of this as being put into a “search mode”. The -a or --all flag tells the rpm command to list all packages.

A useful addition to the preceding command is the --last flag, which causes the rpm command to list the packages by install time, with the latest first.
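When the question is "what changed recently?", combining --last with head keeps the output short. The wrapper below is only a sketch; the function name recent_packages and the default of ten entries are assumptions, and it simply reports when rpm is not installed:

```shell
# Show the most recently installed packages, newest first.
recent_packages() {
    count=${1:-10}
    if command -v rpm >/dev/null 2>&1; then
        rpm -q -a --last | head -n "$count"
    else
        echo "rpm is not available on this system" >&2
    fi
}

# Usage: recent_packages 5
```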

Listing all files deployed by a package

Another useful rpm function is to show all of the files deployed by a specific package:

# rpm -q --filesbypkg kpatch-0.0-1.el7.noarch
kpatch                   /usr/bin/kpatch
kpatch                   /usr/lib/systemd/system/kpatch.service

In the preceding example, we again use the -q flag to specify that we are running a query, along with the --filesbypkg flag. The --filesbypkg flag causes the rpm command to list all of the files deployed by the specified package.

This example can be very useful when trying to identify a service’s configuration file location.

Using package verification

In this third example, we are going to use an extremely useful feature of rpm, verify. The rpm command has the ability to verify whether or not the files deployed by a specified package have been altered from their original contents. To do this, we will use the -V (verify) flag:

# rpm -V httpd
S.5....T. c /etc/httpd/conf/httpd.conf

In the preceding example, we simply run the rpm command with the -V flag followed by a package name. Just as the -q flag is used for querying, the -V flag is used for verifying. With this command, we can see that only the /etc/httpd/conf/httpd.conf file was listed; this is because rpm only outputs files that have been altered.

In the first column of this output, we can see which verification checks the file failed. While this column is a bit cryptic at first, the rpm man page has a useful table (as shown in the following list) explaining what each character means:

  • S: This means that the file size differs
  • M: This means that the mode differs (includes permissions and file type)
  • 5: This means that the digest (formerly MD5 sum) differs
  • D: This means that the device major/minor number does not match
  • L: This means that the readLink(2) path does not match
  • U: This means that the user ownership differs
  • G: This means that the group ownership differs
  • T: This means that mTime differs
  • P: This means that caPabilities differs

Using this list, we can see that the file size, digest (MD5 sum), and mtime (modification time) of httpd.conf are not what was deployed by the httpd package. This means that it is highly likely that the httpd.conf file has been modified after installation.
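To avoid looking up the table each time, the flag string can be translated with a few lines of shell. The following sketch uses a hypothetical function name, decode_rpm_verify, and maps each character according to the list above:

```shell
# Translate an rpm -V flag string (for example "S.5....T.") into plain descriptions.
decode_rpm_verify() {
    flags=$1
    while [ -n "$flags" ]; do
        rest=${flags#?}            # everything after the first character
        char=${flags%"$rest"}      # the first character itself
        flags=$rest
        case $char in
            S) echo "S: file size differs" ;;
            M) echo "M: mode differs (permissions and file type)" ;;
            5) echo "5: digest (formerly MD5 sum) differs" ;;
            D) echo "D: device major/minor number mismatch" ;;
            L) echo "L: readLink(2) path mismatch" ;;
            U) echo "U: user ownership differs" ;;
            G) echo "G: group ownership differs" ;;
            T) echo "T: mTime differs" ;;
            P) echo "P: capabilities differ" ;;
            .) ;;                  # a dot means the check passed
            *) echo "$char: not covered by the list above" ;;
        esac
    done
}

# Usage: decode_rpm_verify 'S.5....T.'
```

For the httpd.conf example, this prints the size, digest, and mTime lines, matching the interpretation given above.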

While the rpm command might not seem like a troubleshooting command at first, the preceding examples show just how powerful of a troubleshooting tool it can be. With these examples, it is simple to identify important files and whether or not those files have been modified from the deployed version.

Summary

Overall we learned that log files, configuration files, and the /proc filesystem are key sources of information during troubleshooting. We also covered the basic use of many fundamental troubleshooting commands.

You might also have noticed that quite a few of these commands are used in day-to-day work for non-troubleshooting purposes. While these commands might not explain the issue themselves, they can help gather information about the issue, which leads to a quicker and more accurate resolution. Familiarity with these fundamental commands is critical to your success during troubleshooting.
