From Yellow to Green: How to Achieve a Healthy Elasticsearch Cluster

I often see Elasticsearch running in single-node mode and reporting a "Yellow" cluster health status. Today I want to explain why that happens and what we can do to prevent or fix it, even in single-node mode.

Elasticsearch is a highly scalable, open-source search and analytics engine built on top of Apache Lucene. It is designed for horizontal scalability and near real-time search and analytics capabilities. Elasticsearch is commonly used for various use cases, including log and event data analysis, full-text search, and business analytics. Its powerful querying capabilities and distributed nature make it a preferred choice for handling large datasets and complex search operations.

Cluster Health:

In Elasticsearch, data is distributed across multiple nodes in a cluster to ensure high availability, fault tolerance, and load balancing. Cluster health status indicates the overall well-being and operational status of an Elasticsearch cluster. The cluster health can be one of three states:

Green: All primary and replica shards are allocated. This is the optimal state, indicating that the cluster is fully functional with redundancy.

Yellow: All primary shards are allocated, but some or all replica shards are not. While the cluster is operational, it lacks redundancy, which means it is vulnerable to data loss if a node fails.

Red: One or more primary shards are unassigned, leading to potential data loss and unavailability of some data.

Maintaining a green cluster health status is crucial because it ensures that your data is not only available but also redundant, providing resilience against node failures and ensuring high availability of your services. A healthy cluster also optimizes performance and reliability, which are essential for search and analytics operations.

Now, suppose you have a Magento 2 installation. You reindex, but when you run the following command, the cluster status (yes, even on a single node) is always Yellow.

Check Current Cluster Health

curl -X GET "localhost:9200/_cluster/health?pretty"  

We can analyze indices and their shard counts using the following command:

curl -X GET "localhost:9200/_cat/indices/?pretty"  

To bring your Elasticsearch cluster from the current yellow state to a green state, you need to address the unassigned shards. In your case, they typically appear because the cluster has only one data node. By default each index is created with one replica, and Elasticsearch never allocates a replica to the same node as its primary, so there is no other node to allocate the replica shards to.
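To see exactly which shards are unassigned and why, the cat shards API and the cluster allocation explain API can help. This is a minimal sketch, assuming an unsecured node on localhost:9200 as in the commands above. The function names and the ES_URL variable are my own and not part of Elasticsearch:

```shell
# Assumption: Elasticsearch is reachable without authentication at this address.
ES_URL="${ES_URL:-localhost:9200}"

# List only unassigned shards: index, shard number, primary/replica flag, state, reason.
list_unassigned_shards() {
  curl -s -X GET "${ES_URL}/_cat/shards?h=index,shard,prirep,state,unassigned.reason" \
    | awk '$4 == "UNASSIGNED"'
}

# Ask Elasticsearch why an unassigned shard cannot be allocated.
# Without a request body it picks one unassigned shard; it returns an error if there are none.
explain_unassigned_shard() {
  curl -s -X GET "${ES_URL}/_cluster/allocation/explain?pretty"
}

# Usage (needs a running node):
#   list_unassigned_shards
#   explain_unassigned_shard
```

On a yellow single-node cluster, every line printed by list_unassigned_shards should show r (replica) in the third column.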

Steps to Bring the Cluster to a Green State

Idea 1:

  • Add More Data Nodes:
    The best way to achieve a green state is to add more data nodes to the cluster to properly allocate replicas. Here's a quick guide on how to add a node:

1) Install Elasticsearch on another server.
2) Configure the new node to join the existing cluster by setting the same cluster.name in the elasticsearch.yml file.
3) Set the node.name to a unique name.
4) Ensure network settings (network.host, discovery.seed_hosts, etc.) are correctly configured to allow the nodes to communicate.
5) Start Elasticsearch on the new server.
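
Steps 2 to 4 boil down to a handful of lines in elasticsearch.yml on the new server. The sketch below only prints those lines so you can review them before merging them into the file by hand. The function name and all example values (node name, IP addresses) are placeholders. It also assumes the existing node is reachable from the new one and is not pinned to single-node discovery (discovery.type: single-node), because a node in that mode will not form a cluster with other nodes.

```shell
# Print elasticsearch.yml settings for a node that should join an existing cluster.
# Arguments (the values in the usage line are hypothetical):
#   $1 cluster name (must match the existing cluster)
#   $2 unique node name
#   $3 address this node binds to
#   $4 address of the existing node
print_node_config() {
  cat <<EOF
cluster.name: $1
node.name: $2
network.host: $3
discovery.seed_hosts: ["$4"]
EOF
}

# Usage:
#   print_node_config magento node-2 10.0.0.12 10.0.0.11
```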

Idea 2 (stay on the single node):

  • Adjust Replica Settings: If adding more nodes is not feasible, you can set the number of replicas to 0 for indices with unassigned replicas. This will bring the cluster to a green state, but those indices will have no redundancy. On a single node that is no real loss, because the replicas were never allocated in the first place. Here's how you can do it:

# Set the number of replicas to 0 for a specific index
curl -X PUT "localhost:9200/<index_name>/_settings" -H 'Content-Type: application/json' -d'  
{
  "index": {
    "number_of_replicas": 0
  }
}'
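
If several indices are yellow, the same request accepts a wildcard instead of a single index name. This is a minimal sketch, assuming the unsecured localhost:9200 node used above. set_replicas_zero and ES_URL are names I made up for illustration, and the pattern in the usage line is only an example:

```shell
# Assumption: Elasticsearch is reachable without authentication at this address.
ES_URL="${ES_URL:-localhost:9200}"

# Set number_of_replicas to 0 on every existing index matching the given pattern.
#   $1 index name or wildcard pattern
set_replicas_zero() {
  curl -s -X PUT "${ES_URL}/$1/_settings" \
    -H 'Content-Type: application/json' \
    -d '{"index": {"number_of_replicas": 0}}'
}

# Usage (quote the pattern so the shell does not expand the asterisk):
#   set_replicas_zero 'xyz_production_*'
```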

Verify Cluster Health

After making these changes, verify the cluster health:

curl -X GET "localhost:9200/_cluster/health?pretty"  
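
For scripts, the cat health API can return just the status word, which is easier to test than the full JSON. This is a small sketch under the same assumptions (unsecured node on localhost:9200). The function names are hypothetical:

```shell
# Assumption: Elasticsearch is reachable without authentication at this address.
ES_URL="${ES_URL:-localhost:9200}"

# Print only the cluster status: green, yellow or red.
cluster_status() {
  curl -s -X GET "${ES_URL}/_cat/health?h=status" | tr -d '[:space:]'
}

# Succeed (exit status 0) only when the cluster is green.
is_green() {
  [ "$(cluster_status)" = "green" ]
}

# Usage:
#   if is_green; then echo "cluster is green"; else echo "cluster is $(cluster_status)"; fi
```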

Boom! It's in the green state:

{"acknowledged":true}
[root@server ~]# curl -X GET "localhost:9200/_cluster/health?pretty"
{
  "cluster_name": "magento",
  "status": "green",
  "timed_out": false,
  "number_of_nodes": 1,
  "number_of_data_nodes": 1,
  "active_primary_shards": 7,
  "active_shards": 7,
  "relocating_shards": 0,
  "initializing_shards": 0,
  "unassigned_shards": 0,
  "delayed_unassigned_shards": 0,
  "number_of_pending_tasks": 0,
  "number_of_in_flight_fetch": 0,
  "task_max_waiting_in_queue_millis": 0,
  "active_shards_percent_as_number": 100.0
}
[root@server ~]# curl -X GET "localhost:9200/_cat/indices/?pretty"
green open .geoip_databases                                          jMYDTc1NRZuQDZ9tGtyGSQ 1 0   34    34 31.3mb 31.3mb  
green open xyz_production_22052024_amasty_elastic_popup_data_1_v2 VoF-bkUDQEef-HeFzS-4og 1 0  196     0 91.4kb 91.4kb  
green open xyz_production_22052024_product_1_v2                   e7xmqI3XTgWprQB4EWuD8g 1 0 4352 64006 36.1mb 36.1mb  

By following these steps, you should be able to bring your Elasticsearch cluster to a green state, with all shards allocated and none left unassigned. Index names can change between reindex runs, though, and a newly created index gets the default of one replica again, so a one-time settings change will not cover it. I suggest extending the core functionality with a function that applies the replica setting each time indexing runs.
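
One alternative to patching the application is an index template, so that Elasticsearch itself creates matching indices with zero replicas. This is a different technique from the code extension suggested above, and the sketch rests on several assumptions: Elasticsearch 7.8 or later (composable templates), an unsecured node on localhost:9200, an application that does not set number_of_replicas itself when it creates indices, and no other template with a higher priority matching the same index names. The function name, the template name, the priority value and the xyz_production_* pattern are placeholders.

```shell
# Assumption: Elasticsearch is reachable without authentication at this address.
ES_URL="${ES_URL:-localhost:9200}"

# Create or update a composable index template that gives every NEW index
# matching the pattern zero replicas. Existing indices are not changed.
#   $1 template name (hypothetical)
#   $2 index pattern
put_zero_replica_template() {
  curl -s -X PUT "${ES_URL}/_index_template/$1" \
    -H 'Content-Type: application/json' \
    -d "{
  \"index_patterns\": [\"$2\"],
  \"priority\": 100,
  \"template\": {
    \"settings\": { \"number_of_replicas\": 0 }
  }
}"
}

# Usage (quote the pattern so the shell does not expand the asterisk):
#   put_zero_replica_template single-node-no-replicas 'xyz_production_*'
```

Test this on a non-production instance first, since a template changes how every future matching index is created.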

I hope this article helps. Good luck!