System Monitoring and Logging

Why Monitor Your System?
Proactive Monitoring
anticipating problems before they occur, such as tracking disk space or CPU usage trends

Reactive Monitoring
explaining previous events such as why a crash occurred

System Resources Monitoring: top, htop, vmstat, iostat
Load Average
load averages are displayed at the right end of top's first line
three numbers are displayed, covering the last 1, 5 and 15 minutes
load - number of processes running on or waiting for the CPU (on Linux this also counts processes blocked on disk I/O)
  • Load < Number of Cores - can handle more work
  • Load = Number of Cores - perfect utilization
  • Load > Number of Cores - system is overloaded
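
a minimal sketch of the rule above, assuming a Linux system that provides /proc/loadavg and nproc
load_status is a hypothetical helper name, not a standard command

```shell
#!/bin/bash
# load_status LOAD CORES: classify a load average against the core count.
# awk does the comparison because the shell cannot compare decimals.
load_status() {
    awk -v load="$1" -v cores="$2" 'BEGIN {
        if (load + 0 < cores + 0)       print "can handle more work"
        else if (load + 0 == cores + 0) print "perfect utilization"
        else                            print "overloaded"
    }'
}

# live check: the first field of /proc/loadavg is the 1-minute load average
if [ -r /proc/loadavg ] && command -v nproc >/dev/null; then
    read -r one_min _ < /proc/loadavg
    echo "load $one_min on $(nproc) cores: $(load_status "$one_min" "$(nproc)")"
fi
```

an exact match with the core count is rare in practice, so treat the middle case as a reference point rather than a target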

vmstat (Virtual Memory Statistics)
top gives a snapshot
vmstat provides a trend
vmstat 1
updates every second

columns to watch

  • r (run queue) - number of processes waiting for CPU
  • si / so (swap in / swap out) - data moving between swap space on disk and RAM
    if consistently non-zero, the system is short on RAM
  • us / sy / id / wa (CPU)
    • us - user time (apps)
    • sy - system time (kernel)
    • id - idle time
    • wa - I/O wait (CPU idle while waiting for the disk)
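
a small sketch for spotting swap activity in vmstat output, assuming the standard column order where si is field 7 and so is field 8
swap_samples is a hypothetical helper name

```shell
#!/bin/bash
# swap_samples: read vmstat output on stdin and print how many samples
# showed swap activity (si or so greater than zero).
# Header lines are skipped because their first field is not a number.
swap_samples() {
    awk '$1 ~ /^[0-9]+$/ && ($7 > 0 || $8 > 0) { n++ }
         END { print n + 0 }'
}
```

typical use
vmstat 1 5 | swap_samples
the first line vmstat prints is an average since boot, so a single hit there is less meaningful than hits in the later samples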

iostat (Input/Output Statistics)
focuses on disk devices
iostat -xz 1
if %util is near 100% the drive is a bottleneck
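
a sketch that lists devices at or above a %util threshold, assuming iostat -x prints a header line starting with "Device" that contains a %util column
busy_disks is a hypothetical helper name and the default of 90 is an arbitrary choice

```shell
#!/bin/bash
# busy_disks [THRESHOLD]: read `iostat -x` output on stdin and print each
# device whose %util is at or above THRESHOLD (default 90).
busy_disks() {
    local threshold=${1:-90}
    awk -v limit="$threshold" '
        # locate the %util column from the header instead of hardcoding it
        $1 ~ /^Device/ {
            for (i = 1; i <= NF; i++) if ($i == "%util") col = i
            next
        }
        # data rows: only lines long enough to contain that column
        col && NF >= col && $col + 0 >= limit { print $1, $col }
    '
}
```

typical use
iostat -xz 1 3 | busy_disks 90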

Memory Usage: free, Understanding RAM and Swap
performance drops when the system swaps between RAM and disk
the free command
free -h
output
       total   used    free    shared  buff/cache  available
Mem:   15Gi    4.0Gi   8.0Gi   1.0Gi   3.0Gi       9.0Gi
Swap:  2.0Gi   0B      2.0Gi
columns
  • total - total physical RAM
  • used - memory used by active processes
  • free - unused memory
  • buff/cache - Linux uses empty RAM to cache files from hard drive to speed up reads
  • available - actual amount of memory available for a new app (free memory plus cache that can be reclaimed)
    the number that matters
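
a sketch that turns the available column into a percentage of total RAM, assuming the six-column Mem: layout shown above
available_pct is a hypothetical helper name; free -m is used instead of free -h so the values are plain numbers

```shell
#!/bin/bash
# available_pct: read `free -m` output on stdin and print available memory
# as a whole-number percentage of total.
# On the Mem: line, field 2 is total and field 7 is available.
available_pct() {
    awk '/^Mem:/ { printf "%d\n", $7 * 100 / $2 }'
}
```

typical use
free -m | available_pct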

Swap Memory
swap is space on drive used as emergency RAM
if swap usage is high the server has likely run out of available RAM
server will be sluggish or unresponsive
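
a sketch for measuring how much swap is in use, assuming the SwapTotal and SwapFree lines of /proc/meminfo
swap_used_pct is a hypothetical helper name

```shell
#!/bin/bash
# swap_used_pct: read /proc/meminfo-style text on stdin and print the
# percentage of swap in use. Prints 0 when no swap is configured.
swap_used_pct() {
    awk '/^SwapTotal:/ { total = $2 }
         /^SwapFree:/  { unused = $2 }
         END {
             if (total > 0) printf "%d\n", (total - unused) * 100 / total
             else           print 0
         }'
}
```

typical use
swap_used_pct < /proc/meminfo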

Disk I/O Monitoring
want to know what app is writing to disk
iotop works like top but for disks
sudo iotop
lists processes sorted by how much they are reading/writing
if system feels slow but CPU usage is low, run iotop
might find a backup script or a database query thrashing the drive
Understanding System Logs: /var/log/
The Big Two
Linux keeps most of its logs in the /var/log directory
on Ubuntu/Debian systems the main system log is
/var/log/syslog
on Red Hat/CentOS systems the main system log is
/var/log/messages
Authentication Log
to view logins and login attempts
  • Ubuntu - /var/log/auth.log
  • Red Hat - /var/log/secure

Application Log
most major applications keep their own subdirectories
  • Apache - /var/log/apache2/
  • Nginx - /var/log/nginx/
  • MySQL - /var/log/mysql/
  • Mail - /var/log/mail.log

Key Log Files: syslog, auth.log, kern.log, dmesg
syslog (System Log)
a catch all
format
Date Time Hostname Process[PID]: Message
example
Oct 30 10:00:01 server cron[123]: (root) CMD (backup.sh)
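
a sketch that counts log lines per process, assuming every line follows the traditional format shown above (process name in field 5)
log_counts is a hypothetical helper name; systems configured for a different timestamp format would need a different field number

```shell
#!/bin/bash
# log_counts: read syslog lines on stdin and print "COUNT PROCESS",
# busiest process first.
log_counts() {
    awk '{
             name = $5
             sub(/\[[0-9]+\]:?$/, "", name)   # strip a trailing [PID]:
             sub(/:$/, "", name)              # strip a bare trailing colon
             count[name]++
         }
         END { for (n in count) print count[n], n }' | sort -rn
}
```

typical use
log_counts < /var/log/syslog | head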
auth.log (Security Log)
tracks sudo usage, SSH logins and user creation
example
Oct 30 10:05:00 server sshd[456]: Failed password for invalid user admin from 192.168.1.50
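
a sketch that counts failed SSH password attempts per source address, assuming sshd lines shaped like the example above, with the address following the word "from"
failed_logins is a hypothetical helper name

```shell
#!/bin/bash
# failed_logins: read auth log lines on stdin and print "COUNT ADDRESS"
# for every source of failed password attempts, worst offender first.
failed_logins() {
    awk '/Failed password/ {
             # the address is the field right after "from"
             for (i = 1; i < NF; i++)
                 if ($i == "from") { ip[$(i + 1)]++; break }
         }
         END { for (a in ip) print ip[a], a }' | sort -rn
}
```

typical use
sudo cat /var/log/auth.log | failed_logins | head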
kern.log (Kernel Log)
messages directly from the kernel
  • hardware errors
  • USB device events
  • filesystem errors
dmesg (Boot Log)
command prints the kernel's 'ring buffer'
shows what happened during the boot process (detecting hard drives, loading drivers) and kernel events since then
dmesg | less
or to show human-readable timestamps
dmesg -T
Real-time Log Monitoring: tail -f
tail -f is the primary tool for watching a log as it is written
covered in Reading File Contents: cat, less, more, head, tail
scenario - restart web server but it fails to start
  1. open one terminal window
  2. run
    tail -f /var/log/syslog
  3. open a second terminal window and restart the service
    sudo systemctl restart apache2
  4. error message will appear in first terminal window

Log Rotation and Management: logrotate
logs that grow without limit will eventually fill the disk
logrotate typically runs daily via cron (or a systemd timer on newer systems)
checks the logs
if a log is too big or too old the tool
  1. renames the current log (eg syslog becomes syslog.1)
  2. compresses the old one (eg syslog.2.gz)
  3. deletes the oldest one (eg syslog.5.gz)
  4. creates a new syslog
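
the steps above can be imitated in a few lines of shell
this is an illustration only, not how logrotate itself is implemented; rotate_log is a hypothetical helper and KEEP is assumed to be 2 or more

```shell
#!/bin/bash
# rotate_log FILE KEEP: rotate FILE, keeping archives up to FILE.KEEP.gz.
# Steps run oldest-first so that no archive is overwritten.
rotate_log() {
    local log=$1 keep=$2 i
    # step 3: delete the oldest archive
    rm -f "$log.$keep.gz"
    # shift the remaining compressed archives up by one number
    for (( i = keep - 1; i >= 2; i-- )); do
        if [ -f "$log.$i.gz" ]; then
            mv "$log.$i.gz" "$log.$((i + 1)).gz"
        fi
    done
    # step 2: compress the previous rotation (FILE.1 becomes FILE.2.gz)
    if [ -f "$log.1" ]; then
        gzip -c "$log.1" > "$log.2.gz" && rm -f "$log.1"
    fi
    # step 1: rename the current log
    if [ -f "$log" ]; then
        mv "$log" "$log.1"
    fi
    # step 4: create a new empty log
    : > "$log"
}
```

try it on a scratch file rather than a real log
rotate_log /tmp/demo.log 5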

Setting Up Simple Alerts and Notifications
set up simple alerts using shell scripts and cron

Example: The Disk Space Alerter
create a script check_disk.sh
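#!/bin/bash
# placeholder address; replace with your own
ALERT_EMAIL="admin@example.com"
THRESHOLD=90
# -P keeps each filesystem on one line; row 2, column 5 is Use% for /
USAGE=$(df -P / | awk 'NR == 2 { sub(/%/, "", $5); print $5 }')
if [ "$USAGE" -gt "$THRESHOLD" ]; then
    echo "Warning: Disk usage is at ${USAGE}%" | mail -s "Disk Alert" "$ALERT_EMAIL"
fi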
add script to crontab to run every hour
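
a sketch of the hourly crontab entry, assuming the script was saved as /usr/local/bin/check_disk.sh and made executable (the path is a hypothetical choice)

```shell
#!/bin/bash
# hypothetical install path; adjust to wherever check_disk.sh lives
SCRIPT=/usr/local/bin/check_disk.sh
# crontab fields: minute hour day-of-month month day-of-week command
# "0 * * * *" means minute 0 of every hour
CRON_LINE="0 * * * * $SCRIPT"
echo "$CRON_LINE"
```

paste the printed line into the editor opened by
crontab -e
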
Summary
covers
  • vital signs - top (CPU), free (RAM), iostat (disk)
  • diaries - /var/log/syslog and /var/log/auth.log
  • tools - vmstat for trends, tail -f for real-time debugging
  • hygiene - logrotate cleans log files

key points
  • load average - measures CPU demand
  • vmstat 1 - shows system performance trend (CPU, RAM, disk) every second
  • free -h - shows available RAM
  • swap - disk used as memory
  • /var/log/syslog - main Ubuntu system log
  • /var/log/auth.log - security and login log
  • tail -f <filename> - watch log file in real-time
  • dmesg - view kernel and hardware messages from boot