Performance Metrics for Monitoring Linux Servers

This page is a complete reference of all performance metrics collected by the Site24x7 Linux server monitoring agent. Once the Linux Full-Stack agent is installed, log in to Site24x7 and navigate to Server > Server Monitor > Servers, then click the Linux server monitor to view its metrics.

How to read this document

Each metric category has a table listing every supported metric. The Availability column tells you where and how the metric is accessible:

Availability label | What it means
Dashboard | Visible in the server monitor tabs in Site24x7.
Custom dashboard only | Not shown in the default server monitor tabs. Must be added to a custom dashboard to view.
Threshold or alerts only | Available for threshold alerting only.

The Threshold column indicates whether an alert threshold can be configured for the metric.

The Metric profile required column indicates whether the metric needs a Metric Profile. Metrics marked Yes are not collected by default; you must create and assign a Metric Profile to enable collection.

To enable these metrics, go to Admin > Configuration Profiles > Metric Profile > Add Metric Profile, select the desired metrics, and associate the profile with your server monitor.

CPU

Overall CPU utilization metrics for the server.

Metric name | Description | Availability | Threshold | Metric profile required
CPU Utilization (%) | Overall CPU usage percentage across all cores. | Dashboard | Yes | No
Idle Time | Percentage of time the CPU spent idle. | Dashboard | Yes | No
Wait Time | Percentage of time the CPU was waiting for I/O operations to complete. | Dashboard | Yes | No
User Space Time | Percentage of CPU time spent executing user-space processes. | Dashboard | Yes | No
System Time | Percentage of CPU time spent on kernel or system processes. | Dashboard | Yes | No
Steal Time | Percentage of CPU time stolen by the hypervisor for use on other virtual machines; relevant for virtualized environments. | Dashboard | Yes | No
Hardware Interrupts Time | Percentage of CPU time spent servicing hardware interrupts. | Dashboard | Yes | No
Software Interrupts Time | Percentage of CPU time spent servicing software interrupts. | Dashboard | Yes | No
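
The percentages above are time shares, which on Linux are commonly derived from the cumulative counters on the aggregate cpu line of /proc/stat. The following is a minimal sketch of that arithmetic, not the agent's actual implementation. The sample lines are hypothetical, and treating utilization as everything except idle and I/O wait is an assumption.

```python
# Field order of the aggregate "cpu" line in /proc/stat (first eight counters).
FIELDS = ("user", "nice", "system", "idle", "iowait", "irq", "softirq", "steal")


def parse_cpu_line(line):
    """Return the first eight cumulative counters of a /proc/stat 'cpu' line."""
    parts = line.split()
    return dict(zip(FIELDS, map(int, parts[1:1 + len(FIELDS)])))


def cpu_percentages(before, after):
    """Share of elapsed CPU time spent in each state between two samples."""
    delta = {k: after[k] - before[k] for k in FIELDS}
    total = sum(delta.values())
    if total <= 0:
        return {**{k: 0.0 for k in FIELDS}, "utilization": 0.0}
    pct = {k: 100.0 * v / total for k, v in delta.items()}
    # Assumption: utilization counts every state except idle and I/O wait.
    pct["utilization"] = 100.0 - pct["idle"] - pct["iowait"]
    return pct


# Two hypothetical samples taken a few seconds apart (not real agent output).
SAMPLE_1 = "cpu  1000 0 500 8000 300 50 50 100 0 0"
SAMPLE_2 = "cpu  1400 0 700 8200 400 60 60 180 0 0"
RESULT = cpu_percentages(parse_cpu_line(SAMPLE_1), parse_cpu_line(SAMPLE_2))
```

Because the counters only ever increase, a single reading says little; the difference between two readings is what yields a percentage for the interval.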

CPU (per core)

Granular CPU metrics broken down by individual core.

Metric name | Description | Availability | Threshold | Metric profile required
CPU Load | CPU load percentage for an individual core. | Dashboard | Yes | No
Per-Core User Time | Percentage of time each core spent on user-space processes. | Custom dashboard only | Yes | Yes
Per-Core System Time | Percentage of time each core spent on kernel or system processes. | Custom dashboard only | Yes | Yes
Per-Core Idle Time | Percentage of time each core was idle. | Custom dashboard only | Yes | Yes
Per-Core I/O Wait Time | Percentage of time each core was waiting for I/O operations. | Custom dashboard only | Yes | Yes
Per-Core Steal Time | Percentage of time stolen by the hypervisor on each core. | Custom dashboard only | Yes | Yes

Memory

Physical RAM and swap memory metrics collected by the agent.

Metric name | Description | Availability | Threshold | Metric profile required
Installed Memory (MB) | Total physical RAM installed on the server in MB. | Dashboard | Yes | No
Free Physical Memory | Free physical RAM in MB. | Dashboard | Yes | No
Used Physical Memory | Used physical RAM in MB. | Dashboard | Yes | No
Memory Utilization (%) | Percentage of physical RAM currently in use. | Dashboard | Yes | No
Physical Memory Available | Truly free physical RAM in MB. | Custom dashboard only | Yes | Yes
Swap Memory | Total swap space on the server in MB. | Dashboard | Yes | No
Free Swap Memory | Free swap space in MB. | Dashboard | Yes | No
Used Swap Memory | Used swap space in MB. | Dashboard | Yes | No
Swap Memory Utilization (%) | Percentage of swap space currently in use. | Dashboard | Yes | No
Memory Page Faults | Number of page faults occurring per second. A page fault happens when a process accesses memory that is not currently in its working set. | Dashboard | Yes | No
Memory Pages In | Number of pages read from disk per second to resolve hard page faults. | Dashboard | Yes | No
Memory Pages Out | Number of pages written to disk per second to free up physical memory. | Dashboard | Yes | No
Cache Bytes | Memory used for page cache in MB. | Dashboard | Yes | No
Committed Memory | Total virtual memory committed by all processes in MB. | Dashboard | Yes | No
Active Memory | Memory that has been used more recently and is not reclaimable unless necessary in MB. | Custom dashboard only | Yes | Yes
Active Anonymous Memory | Active memory used by anonymous mappings (process heap and stack) in MB. | Custom dashboard only | Yes | Yes
Active File-Cache Memory | Active memory used for file-backed pages (file cache) in MB. | Custom dashboard only | Yes | Yes
Anonymous Huge Pages Memory | Memory used by anonymous transparent huge pages in MB. | Custom dashboard only | Yes | Yes
Anonymous Pages Memory | Memory used by non-file-backed anonymous pages in MB. | Custom dashboard only | Yes | Yes
Commit Limit Memory | Maximum amount of memory that can be committed by the system in MB. | Custom dashboard only | Yes | Yes
Unevictable Memory | Memory locked in RAM that cannot be swapped out in MB. | Custom dashboard only | Yes | Yes
Slab Memory | Portion of slab memory that cannot be reclaimed under memory pressure in MB. | Custom dashboard only | Yes | Yes
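
Most of the values above have counterparts in /proc/meminfo, which reports sizes in kB. The sketch below shows one way to read that file into MB and derive a utilization figure. It is illustrative only: the sample text is made up, and computing utilization from MemTotal and MemAvailable is an assumption, not a statement of how the agent calculates Memory Utilization (%).

```python
def parse_meminfo(text):
    """Map each kB-valued /proc/meminfo key to its value in MB."""
    values = {}
    for line in text.splitlines():
        key, _, rest = line.partition(":")
        fields = rest.split()
        # Lines without a kB unit (for example HugePages_Total) are plain
        # counts, so they are skipped rather than converted.
        if len(fields) == 2 and fields[1] == "kB":
            values[key.strip()] = int(fields[0]) / 1024.0
    return values


def memory_utilization(mem):
    """Used percentage, assuming 'used' means MemTotal minus MemAvailable."""
    total = mem["MemTotal"]
    return 100.0 * (total - mem["MemAvailable"]) / total


# Hypothetical excerpt in /proc/meminfo format.
SAMPLE = """MemTotal:        8192000 kB
MemFree:         1024000 kB
MemAvailable:    4096000 kB
Cached:          2048000 kB
SwapTotal:       2097152 kB
SwapFree:        2097152 kB
HugePages_Total:       0
"""
MEM = parse_meminfo(SAMPLE)
UTILIZATION = memory_utilization(MEM)
```

On a live server, the same functions can be fed the contents of /proc/meminfo read with open("/proc/meminfo").read().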

Disk

Disk space and inode metrics per partition and across the server.

Metric name | Description | Availability | Threshold | Metric profile required
Name | The mount point path of the disk partition. | Dashboard | No | No
Device Name | The backing block device path (for example, /dev/sda1). | Dashboard | No | No
Partition | The name of the specific partition. | Dashboard | No | No
File System | The filesystem type of the partition (for example, ext4, xfs). | Dashboard | No | No
Disk Partition Capacity (in bytes) | Total capacity of the disk partition in MB. | N/A | Yes | No
Used (MB) | Used disk space on the partition in MB. | Dashboard | Yes | No
Free (MB) | Free disk space on the partition in MB. | Dashboard | Yes | No
Used (%) | Percentage of disk space used on the partition. | Dashboard | Yes | No
Free (%) | Percentage of disk space free on the partition. | Dashboard | Yes | No
Total Inodes (Linux) | Total number of inodes on the partition. | N/A | Yes | Yes
Inodes Used | Number of inodes currently in use on the partition. | Dashboard | Yes | Yes
Available Inodes (Linux) | Number of free inodes on the partition. | N/A | Yes | Yes
Utilized Inodes % (Linux) | Percentage of inodes used on the partition. | N/A | Yes | Yes
Free Inodes % | Percentage of inodes free on the partition. | Custom dashboard only | Yes | Yes
Overall Disk Size | Sum of total disk space across all partitions in MB. | Dashboard | Yes | No
Overall Disk Usage | Sum of used disk space across all partitions in MB. | Dashboard | Yes | No
Overall Disk Free Space | Sum of free disk space across all partitions in MB. | Dashboard | Yes | No
Overall Disk Utilization | Overall disk used percentage across all partitions. | Dashboard | Yes | No
Overall Disk Free Space (%) | Overall disk free percentage across all partitions. | Dashboard | Yes | No
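
Space and inode figures like those above can be derived from the block and inode counts a filesystem reports. The following sketch works on any object shaped like the result of Python's os.statvfs. The sample numbers are hypothetical, and the exact formulas the agent applies (for example, whether reserved blocks count as used) are not documented here, so treat the choices below as assumptions.

```python
from types import SimpleNamespace

MB = 1024 * 1024


def usage_from_statvfs(st):
    """Derive space and inode usage from an os.statvfs()-style result."""
    used_blocks = st.f_blocks - st.f_bfree
    used_inodes = st.f_files - st.f_ffree
    return {
        "total_mb": st.f_blocks * st.f_frsize / MB,
        "used_mb": used_blocks * st.f_frsize / MB,
        # f_bavail excludes blocks reserved for the superuser.
        "free_mb": st.f_bavail * st.f_frsize / MB,
        "used_pct": 100.0 * used_blocks / st.f_blocks if st.f_blocks else 0.0,
        "inodes_total": st.f_files,
        "inodes_used": used_inodes,
        # Some filesystems report zero inodes; avoid dividing by zero.
        "inodes_used_pct": 100.0 * used_inodes / st.f_files if st.f_files else 0.0,
    }


# Hypothetical filesystem: 1000 blocks of 4096 bytes, 500 inodes.
SAMPLE = SimpleNamespace(
    f_frsize=4096, f_blocks=1000, f_bfree=250, f_bavail=200,
    f_files=500, f_ffree=400,
)
USAGE = usage_from_statvfs(SAMPLE)
```

On a live server, passing os.statvfs("/") (after import os) returns the same fields for the root partition.
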
Note

The Linux monitoring agent uses the iostat utility to collect disk I/O metrics. Ensure iostat is installed on the server. If disk I/O data is missing, install iostat and restart the agent service.

Disk I/O

Read and write throughput, latency, and operation count metrics for disk I/O, collected using iostat. Aggregate metrics cover all disks; per-disk metrics are broken down by individual disk device.

Metric name | Description | Availability | Threshold | Metric profile required
Disk I/O | Total disk I/O across all disks in bytes per second (sum of reads and writes). | N/A | Yes | No
Disk Reads (Bytes/sec) | Aggregate disk read throughput across all disks in bytes per second. | Dashboard | Yes | No
Disk Writes (Bytes/sec) | Aggregate disk write throughput across all disks in bytes per second. | Dashboard | Yes | No
Read Latency | Aggregate read latency across all disks in milliseconds. | Dashboard | Yes | No
Write Latency | Aggregate write latency across all disks in milliseconds. | Dashboard | Yes | No
Disk Busy Percentage | Percentage of time the disk was busy serving I/O requests. | Dashboard | Yes | No
Disk Idle Percentage | Percentage of time the disk was idle. | Dashboard | Yes | No
Disk Read Operations/Sec | Number of read operations completed per second across all disks. | Dashboard | Yes | No
Disk Write Operations/Sec | Number of write operations completed per second across all disks. | Dashboard | Yes | No
Average Disk Queue Length | Average number of I/O requests waiting in the disk command queue across all disks. | Dashboard | Yes | No
Disk Reads (Bytes/sec) | Read throughput for each individual disk in bytes per second. | Dashboard | Yes | No
Disk Writes (Bytes/sec) | Write throughput for each individual disk in bytes per second. | Dashboard | Yes | No
Disk I/O | Total I/O (reads and writes) for each individual disk in bytes per second. | Dashboard | Yes | No
Disk Read Operations/Sec | Number of read operations (delta counter) for each disk. | Custom dashboard only | Yes | Yes
Disk Write Operations/Sec | Number of write operations (delta counter) for each disk. | Custom dashboard only | Yes | Yes
Read Wait Time | Time spent on read operations for each disk in milliseconds. | Custom dashboard only | Yes | Yes
Write Wait Time | Time spent on write operations for each disk in milliseconds. | Custom dashboard only | Yes | Yes
Per-Disk Total Transactions | Total number of I/O transactions (reads and writes) for each disk. | Custom dashboard only | Yes | Yes
Disk Busy Percentage | Percentage of time each individual disk was busy serving I/O requests. | Custom dashboard only | Yes | Yes
Average Disk Partition Latency | Total latency (read and write) per I/O operation for each disk in milliseconds. | Custom dashboard only | Yes | Yes
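
The agent collects these values through iostat, which in turn derives them from the kernel's cumulative per-device counters in /proc/diskstats. To make the throughput, latency, and busy-percentage definitions concrete, here is a sketch of that arithmetic over two samples. The sample lines and the 5-second interval are hypothetical, and this is not a reproduction of the agent's or iostat's code.

```python
SECTOR_BYTES = 512  # /proc/diskstats counts sectors in 512-byte units.


def parse_diskstats_line(line):
    """Pick the counters needed for rates from one /proc/diskstats line."""
    f = line.split()
    return {
        "name": f[2],
        "reads": int(f[3]), "sectors_read": int(f[5]), "ms_reading": int(f[6]),
        "writes": int(f[7]), "sectors_written": int(f[9]), "ms_writing": int(f[10]),
        "ms_io": int(f[12]),  # time the device had I/O in flight
    }


def disk_rates(before, after, interval_s):
    """Turn two cumulative samples into per-interval rates and latencies."""
    d = {k: after[k] - before[k] for k in before if k != "name"}
    return {
        "read_bytes_per_s": d["sectors_read"] * SECTOR_BYTES / interval_s,
        "write_bytes_per_s": d["sectors_written"] * SECTOR_BYTES / interval_s,
        "read_latency_ms": d["ms_reading"] / d["reads"] if d["reads"] else 0.0,
        "write_latency_ms": d["ms_writing"] / d["writes"] if d["writes"] else 0.0,
        "busy_pct": 100.0 * d["ms_io"] / (interval_s * 1000.0),
    }


# Hypothetical samples for one disk, taken 5 seconds apart.
S1 = "   8       0 sda 1000 10 80000 500 2000 20 160000 4000 0 3000 4500"
S2 = "   8       0 sda 1100 10 88000 700 2200 20 176000 4600 0 3500 5300"
RATES = disk_rates(parse_diskstats_line(S1), parse_diskstats_line(S2), 5)
```

In this sample the disk completed 100 reads in 200 ms of read time, so the read latency works out to 2 ms per operation.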

Network

Network interface metrics covering traffic throughput, packet statistics, and interface properties.

Metric name | Description | Availability | Threshold | Metric profile required
Network Interface Name | The name of the network interface (for example, eth0, ens3). | Dashboard | No | No
Link Status | Whether the interface link is up (1) or down (0). | Dashboard | No | No
MAC Address | The hardware MAC address of the interface. | Dashboard | No | No
IPv4 Address | The IPv4 address assigned to the interface. | Dashboard | No | No
IPv6 Address | The IPv6 address assigned to the interface. | Dashboard | No | No
Broadcast Address | The broadcast address of the interface. | Custom dashboard only | Yes | Yes
Interface Type | The type of the network interface (for example, ethernet, Wi-Fi, or loopback). | Custom dashboard only | Yes | Yes
Input Traffic | Inbound data rate for the interface in bytes per second. | Dashboard | Yes | No
Output Traffic | Outbound data rate for the interface in bytes per second. | Dashboard | Yes | No
Data Received (KBps) | Inbound data rate for the interface in KB per second. | Dashboard | Yes | No
Data Sent (KBps) | Outbound data rate for the interface in KB per second. | Dashboard | Yes | No
Traffic | Combined inbound and outbound data rate for the interface in KB per second. | Dashboard | Yes | No
Packets Received | Number of unicast packets received on the interface. | Dashboard | Yes | No
Packets Sent | Number of unicast packets sent on the interface. | Dashboard | Yes | No
Error Packets | Count of outbound packets that encountered errors (delta counter). | Dashboard | Yes | No
Discarded Packets | Count of outbound packets that were discarded (delta counter). | Dashboard | Yes | No
Network Error Packets | Percentage of error packets, calculated as (inbound plus outbound errors) over total packets. | Custom dashboard only | Yes | Yes
Discarded Packets % | Percentage of discarded packets, calculated as (inbound plus outbound discards) over total packets. | Custom dashboard only | Yes | Yes
Operational State | The raw operational state of the interface, read from /sys/class/net/<if>/operstate. | Custom dashboard only | Yes | Yes
Admin State | The administrative state of the interface. | Custom dashboard only | Yes | Yes
MTU | Maximum Transmission Unit (MTU) value configured on the interface. | Custom dashboard only | Yes | Yes
Speed (Mbps) | Maximum speed of the network interface in Mbps. | Dashboard | Yes | No
Outbound Traffic | Total outbound traffic across all network interfaces in KB per second. | N/A | Yes | No
Inbound Traffic | Total inbound traffic across all network interfaces in KB per second. | N/A | Yes | No
Network Throughput | Total network throughput across all interfaces in KB per second (sum of bytes sent and received). | Dashboard | Yes | No
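
The two percentage metrics in the table are defined as inbound plus outbound errors (or discards) over total packets. A small sketch of that formula follows. It assumes "total packets" means packets received plus packets sent over the same interval, which the table does not spell out, and the function name and counts are hypothetical.

```python
def packet_percentage(inbound, outbound, packets_received, packets_sent):
    """Percentage of error (or discarded) packets among all packets.

    Assumption: 'total packets' is received plus sent for the same interval.
    """
    total = packets_received + packets_sent
    if total == 0:
        return 0.0  # An idle interface has no packets to take a share of.
    return 100.0 * (inbound + outbound) / total


# Hypothetical interval: 3 inbound and 2 outbound errors in 1000 packets.
ERROR_PCT = packet_percentage(3, 2, packets_received=600, packets_sent=400)
```

The same function covers Discarded Packets % when it is given discard counts instead of error counts.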

A network interface is added for every unique MAC address. If multiple interfaces share a MAC address, only one will be added and the rest will be ignored.

System

Overall server health metrics including load average, uptime, process counts, and CPU-level interrupt and context switch rates.

Metric name | Description | Availability | Threshold | Metric profile required
1 Min Load Average | Average number of processes in the run queue over the last one minute. | Dashboard | Yes | No
5 Min Load Average | Average number of processes in the run queue over the last five minutes. | Dashboard | Yes | No
15 Min Load Average | Average number of processes in the run queue over the last 15 minutes. | Dashboard | Yes | No
Server Uptime | Total system uptime. | Dashboard | Yes | No
CPU Busy Time | Total CPU busy time in seconds since boot. | N/A | Yes | No
CPU Idle Time | Total CPU idle time in seconds since boot, normalized per core. | N/A | Yes | No
Total Process | Total number of processes currently on the system. | Dashboard | Yes | No
Active Process | Number of processes in the running state, sourced from /proc/stat. | Dashboard | Yes | No
Blocked Process | Number of processes blocked waiting for I/O, sourced from /proc/stat. | Dashboard | Yes | No
Number of Active Processes | Snapshot count of processes actively using CPU at the moment of collection. | N/A | Yes | No
Zombie Process Count | Number of zombie (defunct) processes on the system. | Custom dashboard only | Yes | Yes
Total Thread Count | Total number of threads across all running processes. | Custom dashboard only | Yes | Yes
Number of Open Ports | Number of open or listening sockets (ports) on the system. | N/A | Yes | No
Login Count | Number of users currently logged in to the server. | N/A | Yes | No
Context Switches | Average rate of context switches, i.e., transitions between threads, per second. | Dashboard | Yes | No
Processor Interrupts | Average number of hardware interrupts the processor receives per second. | Dashboard | Yes | No

Processes

Per-process metrics for processes configured for monitoring. Processes must be added via the Processes tab or the Discover Processes option in the server monitor.

Metric name | Description | Availability | Threshold | Metric profile required
Process Name | The name of the monitored process. | Dashboard | No | No
Args | The command line arguments used to start the process. | Dashboard | No | No
Execution Path | The full path to the process executable. | Dashboard | No | No
CPU Usage (%) | CPU usage percentage. | Dashboard | Yes | No
Memory Usage (%) | Average memory usage percentage. | Dashboard | Yes | No
Thread Count | Number of threads used by the process. | Dashboard | Yes | No
Handle Count | Number of open file handles or descriptors used by the process. | Dashboard | Yes | No
Instances | Number of running instances of the process. | Dashboard | Yes | No
Process ID | The process identifier. | Dashboard | No | No
Process Uptime | The length of time the process has been running operationally. | Dashboard | No | No
Resident Set Size (RSS) | The amount of physical memory currently used by the process. | N/A | Yes | No
Priority | The scheduling priority (nice value) of the process. | Dashboard | No | No
User | The username of the account that owns the process. | Dashboard | No | No

Services

Metrics for Linux systemd services configured for monitoring. Services are monitored based on their systemd unit state and resource usage. Learn more about service and process monitoring.

Note

Linux services monitoring is supported by Linux Full-Stack agents version 22.1.00 and above.

Metric name | Description | Availability | Threshold | Metric profile required
Service Name | The name of the systemd service unit. | Dashboard | No | No
Service Display Name | The human-readable description of the service as defined in the unit file. | Dashboard | No | No
Service Status | Indicates if the service is active. | N/A | Yes | No
Active State | The systemd ActiveState of the service (for example, active, inactive, or failed). | N/A | Yes | No
Sub-State | The systemd SubState of the service (for example, running, dead, or exited). | N/A | Yes | No
Unit File Path | The path to the service's unit file, including any drop-in paths. | Dashboard | No | No
Service Memory Utilization (%) | Memory used by the service as a percentage of total server RAM. | Dashboard | Yes | No
Thread Count | Number of threads or tasks currently running under the service. | Dashboard | Yes | No
Service Restart Count | Number of times the service has been restarted. | Dashboard | Yes | No
Service Uptime | How long the service has been running, in seconds. | Dashboard | Yes | No

Users

Resource consumption metrics per user on the server. Only users associated with active processes appear in the user list.

Note

User monitoring is available from Linux server monitoring agent version 21.0.00 and above.

Metric name | Description | Availability | Threshold | Metric profile required
Username | The login name of the user. | Dashboard | No | No
CPU Utilization (%) by User | CPU usage percentage for the user, normalized per core, which gives a comparable value regardless of the number of cores. | Dashboard | Yes | No
Memory Utilization (%) by User | Aggregate memory usage percentage across all processes owned by the user. | Dashboard | Yes | No
Physical Memory Utilization by User | Aggregate resident set size in KB across all processes owned by the user. | Dashboard | Yes | No

Connection stats

TCP connection state counts for the server.

Note

Linux connection statistics monitoring is supported by Linux Full-Stack agents version 22.2.00 and above.

Metric name | Description | Availability | Threshold | Metric profile required
ESTABLISHED Connection Count | Number of TCP connections currently in the ESTABLISHED state (active, open connections). | Custom dashboard only | Yes | Yes
FIN_WAIT1 Connection Count | Number of TCP connections in the FIN_WAIT1 state (connection is closing, waiting for the remote end to acknowledge). | Custom dashboard only | Yes | Yes
FIN_WAIT2 Connection Count | Number of TCP connections in the FIN_WAIT2 state (waiting for the remote end to send its FIN). | Custom dashboard only | Yes | Yes
TIME_WAIT Connection Count | Number of TCP connections in the TIME_WAIT state (waiting to ensure the remote end received the final acknowledgment before the connection is fully closed). | Custom dashboard only | Yes | Yes
LISTEN Connection Count | Number of sockets in the LISTEN state (actively listening for incoming connections). | Custom dashboard only | Yes | Yes
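
On Linux, per-state counts like these can be tallied from /proc/net/tcp, where the st column holds the connection state as a hexadecimal code. The sketch below counts only the five states listed above. The sample text is hypothetical, and whether the agent reads this file or uses another source is not stated in this document.

```python
# Kernel TCP state codes as they appear in the "st" column of /proc/net/tcp.
TCP_STATES = {
    "01": "ESTABLISHED",
    "04": "FIN_WAIT1",
    "05": "FIN_WAIT2",
    "06": "TIME_WAIT",
    "0A": "LISTEN",
}


def count_tcp_states(proc_net_tcp_text):
    """Count sockets per state from text in /proc/net/tcp format."""
    counts = {name: 0 for name in TCP_STATES.values()}
    for line in proc_net_tcp_text.splitlines()[1:]:  # first line is the header
        fields = line.split()
        if len(fields) > 3:
            name = TCP_STATES.get(fields[3].upper())
            if name is not None:
                counts[name] += 1
    return counts


# Hypothetical excerpt: one listener, two established, one TIME_WAIT.
SAMPLE = """  sl  local_address rem_address   st tx_queue rx_queue
   0: 0100007F:0035 00000000:0000 0A 00000000:00000000
   1: 0A00000A:0016 0A000002:D431 01 00000000:00000000
   2: 0A00000A:0016 0A000003:C350 01 00000000:00000000
   3: 0A00000A:A1B2 0A000004:01BB 06 00000000:00000000
"""
COUNTS = count_tcp_states(SAMPLE)
```

IPv6 sockets are listed separately in /proc/net/tcp6 in the same layout, so a full count would run the function over both files and add the results.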

Mail queue

Mail queue depth metric for servers running a mail transfer agent (MTA) such as Postfix or Sendmail.

Note

Linux Mail Queue monitoring is supported by Linux Full-Stack agents version 22.2.00 and above.

Metric name | Description | Availability | Threshold | Metric profile required
Mail Queue Length | Total number of messages across all mail queues (active, deferred, incoming, corrupt, and hold). | Custom dashboard only | Yes | Yes

PSI (pressure stall information)

Pressure stall information (PSI) metrics measure the proportion of time that tasks are stalled waiting for CPU, memory, or I/O resources. PSI is available on Linux kernels 4.20 and above. All PSI metrics require a Metric Profile to enable and are viewable only via a custom dashboard.

Each PSI metric comes in two forms:

  • Some pressure: At least one task was stalled—some work was delayed.
  • Full pressure: All tasks were stalled—no work was progressing.

Averages are reported over 10-second, 60-second, and 300-second windows. Total stall time is the cumulative stall duration in microseconds since boot.

Note

Linux PSI monitoring is supported by Linux Full-Stack agents version 22.2.00 and above.
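
The kernel exposes PSI through the files /proc/pressure/cpu, /proc/pressure/memory, and /proc/pressure/io, each holding a "some" line and, where applicable, a "full" line with the three averages and the cumulative total. As an illustration of how those fields map to the metrics in the following tables, here is a minimal parsing sketch. The sample text is hypothetical, and this is not the agent's own parser.

```python
def parse_pressure(text):
    """Parse text in /proc/pressure/* format into nested dictionaries."""
    result = {}
    for line in text.splitlines():
        parts = line.split()
        if not parts:
            continue
        kind, pairs = parts[0], parts[1:]  # kind is "some" or "full"
        fields = {}
        for pair in pairs:
            key, _, value = pair.partition("=")
            # avg10/avg60/avg300 are percentages; total is in microseconds.
            fields[key] = int(value) if key == "total" else float(value)
        result[kind] = fields
    return result


# Hypothetical contents of /proc/pressure/memory.
SAMPLE = """some avg10=1.50 avg60=0.75 avg300=0.20 total=123456
full avg10=0.50 avg60=0.25 avg300=0.05 total=65432
"""
PRESSURE = parse_pressure(SAMPLE)
```

Here PRESSURE["some"]["avg60"] corresponds to a "(Some) 60s Avg" metric and PRESSURE["some"]["total"] to a total stall time.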

CPU pressure

Metric name | Description | Availability | Threshold | Metric profile required
CPU Pressure (Some) 10s Avg | Percentage of time at least one task was stalled waiting for CPU, averaged over 10 seconds. | Custom dashboard only | Yes | Yes
CPU Pressure (Some) 60s Avg | Percentage of time at least one task was stalled waiting for CPU, averaged over 60 seconds. | Custom dashboard only | Yes | Yes
CPU Pressure (Some) 300s Avg | Percentage of time at least one task was stalled waiting for CPU, averaged over 300 seconds. | Custom dashboard only | Yes | Yes
Total CPU Stall Time | Cumulative total CPU stall time (some) in microseconds since boot. | Custom dashboard only | Yes | Yes

Memory pressure

Metric name | Description | Availability | Threshold | Metric profile required
Memory Pressure (Some) 60s Avg | Percentage of time at least one task was stalled waiting for memory, averaged over 60 seconds. | Custom dashboard only | Yes | Yes
Memory Pressure (Some) 300s Avg | Percentage of time at least one task was stalled waiting for memory, averaged over 300 seconds. | Custom dashboard only | Yes | Yes
Total Memory Stall Time | Cumulative total memory stall time (some) in microseconds since boot. | Custom dashboard only | Yes | Yes
Memory Pressure (Full) 60s Avg | Percentage of time all tasks were stalled waiting for memory, averaged over 60 seconds. | Custom dashboard only | Yes | Yes
Memory Pressure (Full) 300s Avg | Percentage of time all tasks were stalled waiting for memory, averaged over 300 seconds. | Custom dashboard only | Yes | Yes

I/O pressure

Metric name | Description | Availability | Threshold | Metric profile required
I/O Pressure (Some) 60s Avg | Percentage of time at least one task was stalled waiting for I/O, averaged over 60 seconds. | Custom dashboard only | Yes | Yes
I/O Pressure (Some) 300s Avg | Percentage of time at least one task was stalled waiting for I/O, averaged over 300 seconds. | Custom dashboard only | Yes | Yes
Total I/O Stall Time | Cumulative total I/O stall time (some) in microseconds since boot. | Custom dashboard only | Yes | Yes
I/O Pressure (Full) 60s Avg | Percentage of time all tasks were stalled waiting for I/O, averaged over 60 seconds. | Custom dashboard only | Yes | Yes
I/O Pressure (Full) 300s Avg | Percentage of time all tasks were stalled waiting for I/O, averaged over 300 seconds. | Custom dashboard only | Yes | Yes

Resource Checks

Monitor internal resources such as files, directories, URLs, ports, and syslogs on a Linux server. Click Create/Edit Resource Check Profile to create or edit resource checks. You can also go to the Admin tab in the Site24x7 web client and click Server Monitor > Resource Check Profile to add a resource for monitoring.

Learn more.

Syslogs

Syslog data is presented in a graphical format that details downtime, performance drops, and security infringements. The Syslogs graph also provides detailed metrics on logging program messages and process severity.

You can also check for specific keywords and their occurrences in the syslogs. Filter the logs by ID and source to get notified instantly when unexpected behavior occurs.

Tools

Manage actions and carry out tasks with ease, all in one place, using Server Tools. You can also access this page by going to Server > Server Monitor > Server Tools and selecting your Linux server from the drop-down.

I. Process Viewer

Get the complete list of all the active processes running on your Linux server with their CPU (%) usage, memory (%) usage, handle count, thread count, and instances. You can search for any particular process in the Search Bar at the top (highlighted in red in the screenshot below). You can add processes for monitoring by using the +Add option beside the process name (highlighted in blue in the screenshot below).

 

Add Custom Tab

Create your own tab and monitor the performance metrics you need.

Steps to add a customized view: 

  1. Click on the Add Custom Tab button.
  2. Provide a Display Name for identification purposes.
  3. Select the metrics that you wish to view and monitor under this view.
  4. Save your changes.
  5. Click More, then click the custom dashboard that you created.

Note

You can edit the display name or delete a custom view by going to Edit Custom View.

Root Cause Analysis (RCA)

Whenever downtime is detected, a Root Cause Analysis (RCA) report is generated and sent to the user based on the alerting contact and medium. The RCA for a Linux server monitor provides the actual reason behind the downtime, along with a traceroute map to help diagnose connectivity issues.

Performance Reports

Log in to Site24x7 and go to Reports > Server Monitor to access performance reports for Linux monitoring. In addition to the common reports available for all monitor types in Site24x7, server monitoring has some exclusive reports on disk usage, network adapter details, agent inventory, and top n reports for CPU, memory, and disk. Learn more.  

Server Inventory & Health Dashboards

Get a complete view of your entire server environment with our intuitive dashboards.

  • Inventory Dashboard - Displays a count of all your servers, applications, resource checks, plugins and more.
  • Health Dashboard - Know the current count and status of all the servers, plugins, and apps in your account.

Poll Now

The Poll Now option allows you to manually trigger the monitoring agent to collect data from the server immediately. This bypasses the regular polling schedule and ensures that the most recent metrics are available right away.

How to access Poll Now

  1. Click the hamburger icon (three horizontal lines) next to the Linux server monitor name.
  2. From the actions menu, click Poll Now.

The agent will initiate an instant poll and the latest metrics and status updates will be retrieved.

Licensing

Know what metrics you get for a single Linux server monitor. Learn more.
