Elasticsearch Monitoring

Monitor the performance of your Elasticsearch cluster with details on cluster status, nodes, shards, JVM stats, and more.

Install and configure the Elasticsearch plugin to monitor the open source, distributed document store and search engine. Elasticsearch is built on Apache Lucene, a full-text search library written in Java. Keep a pulse on the performance of your Elasticsearch environment so that you stay up to date with the internals of your working cluster.

This document details how to configure the Elasticsearch plugin and describes the metrics that provide in-depth visibility into the performance, availability, and usage stats of Elasticsearch clusters.

Performance Metrics

Active shards

The active_shards metric is an aggregate total of all active shards across all indices, including replica shards.

Initializing shards

The initializing_shards is the number of shards that are being freshly created.

Number of nodes/data nodes

The number of nodes and data nodes in the cluster is represented by the metrics number_of_nodes and number_of_data_nodes respectively. Data nodes hold data and perform data-related operations such as CRUD, search, and aggregations.

Relocating shards

The relocating_shards is the number of shards that are currently moving from one node to another node.

Active primary shards

The active_primary_shards metric indicates the number of active primary shards in your cluster. This is an aggregate total across all indices and does not include replica shards.

Unassigned shards

The unassigned_shards metric counts shards that exist in the cluster state but have not been allocated to any node in the cluster. A shard stays unassigned until the master node allocates it to a node, after which it initializes and then starts. Shards that remain unassigned for a long time can be a warning sign of an unstable cluster.
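
The node and shard counters described above have the same names as fields in the response of the Elasticsearch cluster health API (GET /_cluster/health). The following is a minimal sketch, not the plugin's actual code, of how those fields could be read. The helper names, the default host and port, and the unauthenticated plain HTTP call are assumptions made for illustration, and the sample payload uses made-up numbers so that the sketch runs without a live cluster.

```python
import json
import urllib.request

# Field names in the cluster health response that match the metrics above.
HEALTH_FIELDS = (
    "number_of_nodes",
    "number_of_data_nodes",
    "active_primary_shards",
    "active_shards",
    "relocating_shards",
    "initializing_shards",
    "unassigned_shards",
)


def extract_health_metrics(health):
    """Pick the node and shard counters out of a cluster health response."""
    return {name: health.get(name, 0) for name in HEALTH_FIELDS}


def fetch_cluster_health(host="localhost", port=9200, timeout=10):
    """Fetch /_cluster/health over plain HTTP without authentication (assumed)."""
    url = "http://{}:{}/_cluster/health".format(host, port)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Made-up payload shaped like a cluster health response.
sample_health = {
    "cluster_name": "demo",
    "status": "yellow",
    "number_of_nodes": 3,
    "number_of_data_nodes": 2,
    "active_primary_shards": 5,
    "active_shards": 8,
    "relocating_shards": 0,
    "initializing_shards": 0,
    "unassigned_shards": 2,
}

metrics = extract_health_metrics(sample_health)
# active_shards includes replicas, so the difference is the active replica count.
active_replicas = metrics["active_shards"] - metrics["active_primary_shards"]
print(metrics)
print("active replica shards:", active_replicas)
```

To try it against a real cluster, pass the result of fetch_cluster_health() to extract_health_metrics() instead of the sample payload.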

Cluster status

The status of the cluster is represented as a number: Red: 0, Green: 1, and Yellow: 2. A green status means that all primary and replica shards are allocated. A yellow status means that all primary shards are allocated but at least one replica shard is unallocated or missing. A red status means that one or more primary shards have not been assigned.
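
The numeric encoding above can be sketched as a small lookup. This is an illustration only: the function name status_to_code is hypothetical, and the input is assumed to be the status string ("green", "yellow", or "red") that Elasticsearch reports for cluster health.

```python
# Numeric codes as listed in this document: Red: 0, Green: 1, Yellow: 2.
STATUS_CODES = {"red": 0, "green": 1, "yellow": 2}


def status_to_code(status):
    """Translate a cluster status string into the numeric code used above."""
    key = status.strip().lower()
    if key not in STATUS_CODES:
        raise ValueError("unknown cluster status: {!r}".format(status))
    return STATUS_CODES[key]


for name in ("red", "green", "yellow"):
    print(name, "->", status_to_code(name))
```

Because green is 1 rather than the lowest or highest value, an alert threshold on this metric should compare against the exact codes instead of using a simple "greater than" rule.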

JVM metrics

Elasticsearch runs on the Java Virtual Machine (JVM), and one of the ways it uses the RAM on your nodes is through the JVM heap. The metric jvm_mem_pool_old_used_perc is the average, across nodes, of the percentage of the old generation memory pool that is in use. The metrics jvm_gc_old_coll_time and jvm_gc_old_coll_count give the old generation garbage collection (GC) time (in milliseconds) and count across all nodes since the last poll (5 minutes by default).
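
As a rough illustration of how such values could be derived, the sketch below works on a dictionary shaped like the JVM section of the Elasticsearch node stats API (GET /_nodes/stats/jvm). It is not the plugin's code. The function names are hypothetical, the sample numbers are made up, and the way the "since last poll" values are obtained (subtracting the totals saved at the previous poll from the current cumulative totals) is an assumption.

```python
def old_gen_used_percent(nodes_stats):
    """Average old generation pool usage across nodes, as a percentage."""
    percents = []
    for node in nodes_stats.get("nodes", {}).values():
        old = node["jvm"]["mem"]["pools"]["old"]
        if old.get("max_in_bytes", 0) > 0:
            percents.append(100.0 * old["used_in_bytes"] / old["max_in_bytes"])
    return sum(percents) / len(percents) if percents else 0.0


def old_gc_totals(nodes_stats):
    """Sum the cumulative old generation GC count and time over all nodes."""
    totals = {"count": 0, "time_ms": 0}
    for node in nodes_stats.get("nodes", {}).values():
        old = node["jvm"]["gc"]["collectors"]["old"]
        totals["count"] += old["collection_count"]
        totals["time_ms"] += old["collection_time_in_millis"]
    return totals


def delta_since_last_poll(current, previous):
    """Subtract the previous totals; if a counter went backwards, keep the current value."""
    delta = {}
    for key, value in current.items():
        before = previous.get(key, 0)
        delta[key] = value - before if value >= before else value
    return delta


# Made-up node stats for two nodes.
sample_stats = {
    "nodes": {
        "node-a": {"jvm": {
            "mem": {"pools": {"old": {"used_in_bytes": 500, "max_in_bytes": 1000}}},
            "gc": {"collectors": {"old": {"collection_count": 10,
                                          "collection_time_in_millis": 1200}}},
        }},
        "node-b": {"jvm": {
            "mem": {"pools": {"old": {"used_in_bytes": 250, "max_in_bytes": 1000}}},
            "gc": {"collectors": {"old": {"collection_count": 4,
                                          "collection_time_in_millis": 300}}},
        }},
    }
}

previous_totals = {"count": 11, "time_ms": 1000}  # totals saved at the last poll (assumed)
used_percent = old_gen_used_percent(sample_stats)
gc_delta = delta_since_last_poll(old_gc_totals(sample_stats), previous_totals)
print(used_percent, gc_delta)
```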

Memory and CPU usage

Because Elasticsearch depends on the machine it is installed on, it is critical to monitor CPU and memory usage. Monitoring CPU usage for each of your node types helps in studying the distribution of workload between the nodes. Metrics including free (mem_free), used (mem_used), shared (shared_mem), resident (resident_mem), and total virtual memory (virtual_mem) help you keep an eye on memory usage and understand how it loads and impacts the cluster.
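
To illustrate how CPU load can be compared between nodes, the sketch below reads a dictionary shaped like the operating system section of the Elasticsearch node stats API (GET /_nodes/stats/os). It is only a sketch under stated assumptions: the function names are hypothetical, the sample values are made up, and summing free and used memory over all nodes is one possible aggregation, not necessarily the one the plugin uses.

```python
def cpu_percent_by_node(nodes_stats):
    """Map each node name (or ID, if no name is present) to its OS CPU usage percentage."""
    result = {}
    for node_id, node in nodes_stats.get("nodes", {}).items():
        result[node.get("name", node_id)] = node["os"]["cpu"]["percent"]
    return result


def cpu_spread(cpu_by_node):
    """Gap between the busiest and the idlest node; a large gap suggests uneven load."""
    if not cpu_by_node:
        return 0
    values = list(cpu_by_node.values())
    return max(values) - min(values)


def memory_totals(nodes_stats):
    """Sum free and used OS memory (in bytes) over all nodes (assumed aggregation)."""
    totals = {"mem_free": 0, "mem_used": 0}
    for node in nodes_stats.get("nodes", {}).values():
        mem = node["os"]["mem"]
        totals["mem_free"] += mem["free_in_bytes"]
        totals["mem_used"] += mem["used_in_bytes"]
    return totals


# Made-up node stats for two nodes.
sample_os_stats = {
    "nodes": {
        "id-1": {"name": "node-1", "os": {
            "cpu": {"percent": 20},
            "mem": {"free_in_bytes": 2000000000, "used_in_bytes": 6000000000},
        }},
        "id-2": {"name": "node-2", "os": {
            "cpu": {"percent": 80},
            "mem": {"free_in_bytes": 1000000000, "used_in_bytes": 7000000000},
        }},
    }
}

cpu_usage = cpu_percent_by_node(sample_os_stats)
print(cpu_usage, "spread:", cpu_spread(cpu_usage))
print(memory_totals(sample_os_stats))
```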

Prerequisites

  • Ensure that Elasticsearch is installed on the server and is up and running.
  • While installing the Elasticsearch plugin, create an empty JSON file 'counter.json' under the 'elasticsearch' directory.
  • Our Linux server monitoring agent should be installed in the network or on the specific host where the Elasticsearch cluster is running.
  • While adding a plugin, the plugin name and its folder name should be identical.
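
The folder and file prerequisites above can be sketched as follows. This is an illustration, not an official setup script. The helper names are hypothetical, and the sketch assumes that "empty JSON file" means a zero-byte file, because the text does not say whether the file should instead contain an empty object such as {}. The demonstration runs in a temporary directory, not in the agent's real plugin directory.

```python
import os
import tempfile


def prepare_plugin_dir(plugins_root, plugin_name, needs_counter=False):
    """Create <plugins_root>/<plugin_name> and, if requested, an empty counter.json."""
    plugin_dir = os.path.join(plugins_root, plugin_name)
    os.makedirs(plugin_dir, exist_ok=True)
    if needs_counter:
        counter_path = os.path.join(plugin_dir, "counter.json")
        if not os.path.exists(counter_path):
            # Assumption: "empty" is taken literally as a zero-byte file.
            open(counter_path, "w").close()
    return plugin_dir


def names_match(plugin_dir, script_name):
    """Check that the folder name equals the plugin script name without its extension."""
    folder = os.path.basename(os.path.normpath(plugin_dir))
    return folder == os.path.splitext(script_name)[0]


# Demonstration in a temporary directory standing in for the plugin directory.
demo_root = tempfile.mkdtemp()
demo_dir = prepare_plugin_dir(demo_root, "elasticsearch", needs_counter=True)
print(sorted(os.listdir(demo_dir)))
print(names_match(demo_dir, "elasticsearch.py"))
print(names_match(demo_dir, "elasticsearchnodes.py"))
```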

Plugin Installation

  • Download and install the latest version of the Site24x7 Linux agent on the server where you plan to run the plugin. If it is installed successfully, you will see a Linux server monitor in the Site24x7 Control Panel. This confirms that the agent is able to communicate with our data center.
  • To run this plugin on Windows, refer to the steps in this article.
  • Depending on your requirement, download the elasticsearch plugins from our GitHub repository - elasticsearch.py, elasticsearchcluster.py or elasticsearchnodes.py.
  • Change the values of HOST, USERNAME, PORT, and PASSWORD to match your configuration. By default, proxy is not configured. You can also execute multiple configurations using a single plugin script. To do so, download the configuration file (for example, the elasticsearch.cfg file for the Elasticsearch plugin) from our GitHub repository and provide the configurations of your Elasticsearch clusters.
  • Create a folder with the name 'elasticsearch' or 'elasticsearchcluster' or 'elasticsearchnodes', under the Site24x7 Linux agent plugin directory '/opt/site24x7/monagent/plugins/' and place the respective plugin files inside their respective folders.
  • Only for the elasticsearch plugin, create an empty JSON file, 'counter.json' and place it under '/opt/site24x7/monagent/plugins/elasticsearch'.
The agent will automatically execute the plugin within five minutes and send performance data to the Site24x7 data center.
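
The idea of running several configurations from one configuration file, mentioned in the steps above, can be sketched with Python's configparser module. The section names, the lowercase keys (host, port, username, password), the credentials, and the use of plain HTTP with a Basic authentication header are all assumptions made for illustration. Check the elasticsearch.cfg file in the GitHub repository for the actual format.

```python
import base64
import configparser

# Hypothetical configuration with one section per cluster.
SAMPLE_CFG = """
[elasticsearch_local]
host = localhost
port = 9200
username = elastic
password = changeme

[elasticsearch_remote]
host = es.example.internal
port = 9201
username = monitor
password = secret
"""


def basic_auth_header(username, password):
    """Build the value of an HTTP Basic Authorization header."""
    token = "{}:{}".format(username, password).encode("utf-8")
    return "Basic " + base64.b64encode(token).decode("ascii")


def load_targets(cfg_text):
    """Turn every section of the configuration into one monitoring target."""
    parser = configparser.ConfigParser()
    parser.read_string(cfg_text)
    targets = []
    for section in parser.sections():
        entry = parser[section]
        host = entry.get("host", fallback="localhost")
        port = entry.getint("port", fallback=9200)
        targets.append({
            "name": section,
            "url": "http://{}:{}".format(host, port),  # plain HTTP is assumed
            "auth_header": basic_auth_header(
                entry.get("username", fallback=""),
                entry.get("password", fallback=""),
            ),
        })
    return targets


targets = load_targets(SAMPLE_CFG)
for target in targets:
    print(target["name"], target["url"])
```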
Tip

Manually execute the plugin script using the following command and verify its output:

python elasticsearch.py
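
One way to verify the output programmatically is sketched below. It assumes that the plugin script prints a single JSON object of metric names and values to standard output. The helper names are hypothetical, and the sample text is made up so that the sketch runs without the plugin installed.

```python
import json
import subprocess
import sys


def parse_plugin_output(text):
    """Parse the plugin's standard output and confirm it is a JSON object."""
    data = json.loads(text)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object of metric names and values")
    return data


def run_plugin(script_path, timeout=60):
    """Run the plugin with the current Python interpreter and parse what it prints."""
    completed = subprocess.run(
        [sys.executable, script_path],
        capture_output=True,
        text=True,
        timeout=timeout,
        check=True,
    )
    return parse_plugin_output(completed.stdout)


# Made-up output used in place of a real plugin run.
sample_output = '{"active_shards": 8, "unassigned_shards": 2, "status": 2}'
parsed = parse_plugin_output(sample_output)
print(sorted(parsed))
```

Calling run_plugin("elasticsearch.py") from the plugin folder would apply the same check to the real script.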

View Data in the Site24x7 Web Client

  1. Log in to Site24x7, go to Server > Plugin Integrations, and click the plugin monitor.
  2. You can then view the performance charts for the various metrics of your Elasticsearch cluster.

Plugin Contribution

Feel free to contribute to our existing plugin and share suggestions or feedback on our Community.