In the previous tutorial, we introduced you to Elasticsearch, Logstash, and Kibana.
NOTE: We assume that you are not entirely new to Elasticsearch. You should know what an index is and be somewhat familiar with the Elasticsearch REST API.
If you are not, check our tutorials on the topic to learn more.
What is an Elasticsearch Shard?
An Elasticsearch shard is a self-contained Lucene index that holds a subset of the data in a given Elasticsearch index.
When you create an index in Elasticsearch, you can specify the number of shards that you want the index to have. Elasticsearch will then distribute the index's data across the specified shards, allowing it to scale horizontally across many machines.
Each shard is a fully functional and independent index. This means we can query the data from the shard independently or manage the shard as a standalone unit.
However, keep in mind that all the shards of an index are treated as a single logical index. This means you can search the data in all the shards with a single request.
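To make the idea of distributing data across shards more concrete, here is a minimal sketch of hash-based routing. It is a simplification and not Elasticsearch's actual implementation: the function name pick_shard and the document IDs are hypothetical, and CRC32 is only a stand-in for the hash that Elasticsearch uses internally.

```python
import zlib


def pick_shard(routing_value: str, number_of_primary_shards: int) -> int:
    """Map a routing value (by default the document ID) to a shard number.

    Assumption: CRC32 stands in for Elasticsearch's internal hash function.
    """
    return zlib.crc32(routing_value.encode("utf-8")) % number_of_primary_shards


# Hypothetical document IDs, used only for illustration.
doc_ids = [f"post-{n}" for n in range(10)]

# Place each document on one of 3 primary shards.
placement = {doc_id: pick_shard(doc_id, 3) for doc_id in doc_ids}
for doc_id, shard in placement.items():
    print(f"{doc_id} -> shard {shard}")

# The same IDs may land elsewhere if the shard count were 4 instead of 3.
moved = sum(pick_shard(d, 3) != pick_shard(d, 4) for d in doc_ids)
print(f"documents that would change shard: {moved}")
```

Because the result depends on the number of primary shards, changing that number would send existing documents to different shards. This is one reason why the primary shard count is fixed when the index is created.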
Elasticsearch Create Shard
As mentioned, we specify the number of primary shards during index creation. Once the index exists, this number cannot be changed in place.
There are some ways to work around this limitation, as demonstrated in our tutorial on the topic.
Consider a request such as PUT /blog with a settings body that sets number_of_shards to 3 and number_of_replicas to 2.
The example request above will create an index called "blog" with 3 primary shards and 2 replicas. The number of replicas specifies how many copies of each primary shard should be created, for a total of 3 * 2 = 6 replica shards in addition to the 3 primaries (9 shard copies overall). The replicas are used to provide high availability and improve search performance by allowing searches to be executed on multiple copies of the data.
Note that this request creates the index with default values for all other index-level configuration, such as the mappings and analysis settings.
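As a rough illustration, the request can be put together in Python with only the standard library. This is a sketch under assumptions: the address http://localhost:9200 and the helper names are hypothetical, and the request is only built here, not sent.

```python
import json
import urllib.request


def build_create_index_request(base_url, index, primaries, replicas):
    """Build (but do not send) a PUT request that creates an index."""
    body = {
        "settings": {
            "number_of_shards": primaries,     # primary shards, fixed at creation
            "number_of_replicas": replicas,    # copies of each primary shard
        }
    }
    return urllib.request.Request(
        url=f"{base_url}/{index}",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


def total_shard_copies(primaries, replicas):
    """Count primaries and replicas together."""
    return primaries * (1 + replicas)


# Assumption: a cluster reachable at http://localhost:9200 without authentication.
request = build_create_index_request("http://localhost:9200", "blog", 3, 2)
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
print("shard copies in total:", total_shard_copies(3, 2))

# To actually send it, one could call: urllib.request.urlopen(request)
```

Counting primaries and replicas together, 3 primary shards with 2 replicas each come to 9 shard copies.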
Elasticsearch List Shards
To list shards in Elasticsearch, we can use the cat API, which allows us to query various features of an Elasticsearch cluster.
To show the shards of a given index, we can use a request of the form GET /_cat/shards/<target>.
Here, <target> represents the name of the target index, data stream, or alias.
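The request path can also be assembled programmatically. The sketch below builds the URL for the cat shards API, optionally restricting the columns with the h parameter and asking for JSON with format=json. The helper name cat_shards_url and the address http://localhost:9200 are assumptions made for illustration.

```python
from urllib.parse import quote, urlencode


def cat_shards_url(base_url, target=None, columns=None, as_json=False):
    """Build a URL for the cat shards API.

    target  - index, data stream, or alias; None lists every shard in the cluster
    columns - optional list of column names for the h parameter
    as_json - request JSON instead of the plain-text table
    """
    path = "/_cat/shards"
    if target:
        path += "/" + quote(target, safe=",*")
    params = {}
    if columns:
        params["h"] = ",".join(columns)
    if as_json:
        params["format"] = "json"
    query = "?" + urlencode(params, safe=",") if params else ""
    return base_url.rstrip("/") + path + query


# Assumption: a cluster reachable at http://localhost:9200.
print(cat_shards_url("http://localhost:9200", "blog"))
print(cat_shards_url("http://localhost:9200",
                     columns=["index", "shard", "prirep", "state"],
                     as_json=True))
```

Requesting JSON tends to be easier to process in scripts than the whitespace-separated text table.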
To show all the shards in the cluster, run GET /_cat/shards without a target. The output resembles the following:
blog 0 p STARTED 622b docker-cluster 1.7gb 10.3.2.8 docker-node-2
blog 1 p STARTED 622b docker-cluster 1.7gb 10.3.2.8 docker-node-2
blog 2 p STARTED 622b docker-cluster 1.7gb 10.3.2.8 docker-node-2
blog 0 r UNASSIGNED
blog 1 r UNASSIGNED
blog 2 r UNASSIGNED
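Each line in the response represents one shard copy. Unassigned shards show only the first four fields. Depending on the Elasticsearch version and the columns requested, the fields include: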
- index name
- shard number
- p for primary or r for replica
- shard state (e.g., STARTED, UNASSIGNED)
- id of the shard
- node name
- store size
- node IP address
This information can be helpful for troubleshooting and monitoring the health of your Elasticsearch cluster.
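For monitoring, the plain-text output can be parsed with a few lines of Python. The sketch below reads only the first four columns (index, shard number, primary or replica, and state), since these appear on every line of the sample output above, and keeps any remaining columns unparsed. It assumes the output has no header row. The function name parse_cat_shards is hypothetical.

```python
from collections import Counter

# Sample output copied from the listing above.
SAMPLE = """\
blog 0 p STARTED 622b docker-cluster 1.7gb 10.3.2.8 docker-node-2
blog 1 p STARTED 622b docker-cluster 1.7gb 10.3.2.8 docker-node-2
blog 2 p STARTED 622b docker-cluster 1.7gb 10.3.2.8 docker-node-2
blog 0 r UNASSIGNED
blog 1 r UNASSIGNED
blog 2 r UNASSIGNED
"""


def parse_cat_shards(text):
    """Parse cat shards text output (no header row) into a list of dicts."""
    shards = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) < 4:
            continue  # skip blank or malformed lines
        index, shard, prirep, state = parts[:4]
        shards.append({
            "index": index,
            "shard": int(shard),
            "kind": "primary" if prirep == "p" else "replica",
            "state": state,
            "extra": parts[4:],  # remaining columns, left unparsed
        })
    return shards


shards = parse_cat_shards(SAMPLE)
unassigned = [s for s in shards if s["state"] == "UNASSIGNED"]

print(Counter((s["kind"], s["state"]) for s in shards))
for s in unassigned:
    print(f"unassigned: {s['index']} shard {s['shard']} ({s['kind']})")
```

In the sample, every replica is UNASSIGNED. This is what one would expect when only a single data node is available, since Elasticsearch does not place a replica on the same node as its primary.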
In this short tutorial, you learned what an Elasticsearch shard is, how to set the number of shards when creating an index, and how to use the Elasticsearch cat API to get information about the shards of a given index.
We hope you enjoyed this tutorial.