Elasticsearch is a distributed, free and open search and analytics engine for all types of data, including textual, numerical, geospatial, structured, and unstructured. It is built on top of the Apache Lucene project, so it inherits Lucene's general indexing and search architecture. On top of that, Elasticsearch adds its own tooling, such as a REST API and clustering, which makes it practical to run search at scale.

Let's dive in.

What's an Elasticsearch Index?

An Elasticsearch index is a collection of documents that are related to each other. Elasticsearch stores data as JSON documents. Each document correlates a set of keys (names of fields or properties) with their corresponding values (strings, numbers, Booleans, dates, arrays of values, geolocations, or other types of data).
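To make the key/value description above concrete, here is a minimal sketch of what one such document could look like. The field names and values are invented for illustration and do not come from a real index; the snippet only builds the JSON body and does not talk to a cluster.

```python
import json

# Hypothetical document for an index of earthquake records.
# Every field name and value below is made up for illustration.
document = {
    "place": "10km NE of Example Town",        # string
    "magnitude": 4.6,                          # number
    "reviewed": True,                          # Boolean
    "time": "2022-08-20T14:03:00Z",            # date, written as an ISO 8601 string
    "tags": ["aftershock", "shallow"],         # array of values
    "location": {"lat": 38.3, "lon": 142.4},   # geolocation as a lat/lon pair
}

# Serialize the document to the JSON text that would be sent to Elasticsearch.
body = json.dumps(document, indent=2)
print(body)
```

Each key in the dictionary becomes a field of the document, and the value types map onto the kinds of data listed above.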

The following comparison maps Elasticsearch concepts to their rough relational database equivalents.

  • MySQL => Databases => Tables => Rows/Columns
  • Elasticsearch => Indices => Types => Documents with Properties

An Elasticsearch cluster can contain multiple indices (databases), which in turn contain multiple types (tables). These types hold multiple documents (rows), and each document has properties (columns). Note that mapping types were deprecated in Elasticsearch 6.x and have been removed in recent versions, so in current releases an index holds documents directly and the analogy is only approximate.

Elasticsearch cat Indices API

Elasticsearch exposes most of its functionality through RESTful APIs. One of them, the cat indices API, returns high-level information about the indices in an Elasticsearch cluster.

Request Syntax

The request syntax is as shown, where <target> is an optional index name to limit the request to:

GET /_cat/indices/<target>
GET /_cat/indices

If the Elasticsearch security features are enabled, you must have the monitor or manage privilege on the target cluster.

Upon successful execution, the request returns information about each matching index, including:

  • Shard count
  • Document count
  • Deleted document count
  • Primary store size
  • Total store size of all shards, including shard replicas

Query Parameters

The request accepts the following query parameters:

  1. bytes - (Optional) Unit used to display byte values.
  2. format - (Optional, string) Short version of the HTTP accept header. Valid values include json and yaml, among others.
  3. h - (Optional, string) Comma-separated list of column names to display.
  4. health - (Optional, string) Health status used to limit returned indices. Valid values are green, yellow, and red. By default, the response includes indices of any health status.
  5. help - (Optional, Boolean) If true, the response includes help information. Defaults to false.
  6. include_unloaded_segments - (Optional, Boolean) If true, the response includes information from segments that are not loaded into memory. Defaults to false.
  7. master_timeout - (Optional) Period to wait for a connection to the master node. If no response is received before the timeout expires, the request fails and returns an error. Defaults to 30s.
  8. pri (primary shards) - (Optional, Boolean) If true, the response only includes information from primary shards. Defaults to false.
  9. s - (Optional, string) Comma-separated list of column names or column aliases used to sort the response.
  10. time - (Optional) Unit used to display time values.
  11. v - (Optional, Boolean) If true, the response includes column headings. Defaults to false.
  12. expand_wildcards - (Optional, string) Type of index that wildcard patterns can match. If the request can target data streams, this argument determines whether wildcard expressions match hidden data streams. Supports comma-separated values, such as open,hidden. Valid values are:
      • all - Match any data stream or index, including hidden ones.
      • open - Match open, non-hidden indices. Also matches any non-hidden data stream.
      • closed - Match closed, non-hidden indices. Also matches any non-hidden data stream. Data streams cannot be closed.
      • hidden - Match hidden data streams and hidden indices. Must be combined with open, closed, or both.
      • none - Wildcard patterns are not accepted.
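The parameters above are passed as ordinary URL query parameters. As a rough sketch, the helper below assembles a cat indices URL from a base address, an optional target, and a set of parameters. The function name cat_indices_url is hypothetical, and the base address http://localhost:9200 is the same one assumed by the cURL examples in this article. The snippet only builds the URL and does not send a request.

```python
from urllib.parse import urlencode


def cat_indices_url(base_url, target=None, **params):
    """Build a _cat/indices URL (hypothetical helper, for illustration only)."""
    path = "/_cat/indices"
    if target:
        path += "/" + target

    # Send Python Booleans as lowercase true/false, as in the examples.
    query = {
        key: (str(value).lower() if isinstance(value, bool) else value)
        for key, value in params.items()
    }

    url = base_url.rstrip("/") + path
    if query:
        # Keep commas readable, since h and s take comma-separated lists.
        url += "?" + urlencode(query, safe=",")
    return url


url = cat_indices_url(
    "http://localhost:9200",
    v=True,
    health="yellow",
    h="index,docs.count",
    s="docs.count",
)
print(url)
# http://localhost:9200/_cat/indices?v=true&health=yellow&h=index,docs.count&s=docs.count
```

Combining h and s this way keeps the response small and sorted, which is usually easier to read than the full default table.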

Example

The following example returns all the indices in the cluster:

GET _cat/indices?v=true

The cURL version is as shown:

curl -XGET "http://localhost:9200/_cat/indices?v=true" -H "kbn-xsrf: reporting"

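The resulting output is a table with one row per index and columns such as health, status, index, uuid, pri, rep, docs.count, docs.deleted, store.size, and pri.store.size.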

If you only want the index name, you can select the index column as shown:

GET _cat/indices?v=true&h=index

The cURL version:

curl -XGET "http://localhost:9200/_cat/indices?v=true&h=index" -H "kbn-xsrf: reporting"

Output:

index
earthquake
.ds-logs-enterprise_search.audit-default-2022.08.20-000001
.ds-logs-crawler-default-2022.08.20-000001
.ds-logs-enterprise_search.api-default-2022.08.20-000001
.ds-logs-app_search.analytics-default-2022.08.20-000001
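If the index names are going to be consumed by a script rather than read by a person, the format parameter can request JSON instead of text (for example, h=index&format=json). The sketch below assumes the JSON body is an array of objects keyed by column name. The function name index_names is hypothetical, and skipping dot-prefixed names is only a naming convention used here for illustration, not something the API does for you. The sample body is hand-written from the output above instead of being fetched from a cluster.

```python
import json


def index_names(json_body, include_hidden=False):
    """Extract index names from a _cat/indices?format=json response body."""
    rows = json.loads(json_body)
    names = [row["index"] for row in rows]
    if not include_hidden:
        # Assumption: names starting with a dot are hidden or system indices.
        names = [name for name in names if not name.startswith(".")]
    return names


# Hand-written sample mirroring two rows of the output above.
sample = json.dumps([
    {"index": "earthquake"},
    {"index": ".ds-logs-crawler-default-2022.08.20-000001"},
])

print(index_names(sample))
print(index_names(sample, include_hidden=True))
```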

Example 2

To get information for a specific index, we can run the query:

GET _cat/indices/.kibana?v=true

The cURL version:

curl -XGET "http://localhost:9200/_cat/indices/.kibana?v=true" -H "kbn-xsrf: reporting"

The query above should return information about the index behind .kibana, as shown:

health status index             uuid                   pri rep docs.count docs.deleted store.size pri.store.size
green  open   .kibana_8.3.3_001 np8zzpriSiSZm3xz2b2N9w   1   1        628           53      6.8mb          3.4mb
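The text table is meant for people, but it can also be split into fields when a quick script is all that is needed. The following is a minimal sketch that assumes the header row is present (v=true) and that no cell is empty or contains spaces; the function name parse_cat_table is hypothetical. For anything more robust, requesting format=json is likely the safer route.

```python
def parse_cat_table(text):
    """Turn whitespace-aligned cat output into a list of dicts keyed by column."""
    lines = [line for line in text.splitlines() if line.strip()]
    header = lines[0].split()
    # Assumption: every row has a value in every column, so splitting on
    # whitespace lines up with the header. Empty cells would break this.
    return [dict(zip(header, line.split())) for line in lines[1:]]


# The sample is the output shown above, copied verbatim.
sample = """\
health status index             uuid                   pri rep docs.count docs.deleted store.size pri.store.size
green  open   .kibana_8.3.3_001 np8zzpriSiSZm3xz2b2N9w   1   1        628           53      6.8mb          3.4mb
"""

rows = parse_cat_table(sample)
print(rows[0]["index"], rows[0]["docs.count"], rows[0]["pri.store.size"])
```

All values come back as strings, so numeric columns such as docs.count need an explicit conversion before doing arithmetic on them.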

Conclusion

In this article, you learned how to use the Elasticsearch cat indices API to retrieve information about the indices in an Elasticsearch cluster.

We hope you found this helpful. Feel free to leave a comment below or contact us.

See you in the next one!!
