Elasticsearch is a NoSQL search and analytics engine that is easy to get started using for log analytics, text search, real-time analytics and more. That said, under the hood Elasticsearch is a complex, distributed system with many levers to pull to achieve optimal performance.
In this blog, we walk through solutions to common Elasticsearch performance challenges at scale including slow indexing, search speed, shard and index sizing, and multi-tenancy. Many solutions originate from interviews and discussions with engineering leaders and architects who have hands-on experience operating the system at scale.
When dealing with workloads that have a high write throughput, you may need to tune Elasticsearch to increase the indexing performance. We provide several best practices for having adequate resources on-hand for indexing so that the operation does not impact search performance in your application:
Increase the refresh interval: Elasticsearch makes new data available for searching by refreshing the index. Refreshes are set to automatically occur every second when an index has received a query in the last 30 seconds. You can increase the refresh interval to reserve more resources for indexing.
Use the Bulk API: When ingesting large-scale data, the indexing time using the Update API has been known to take weeks. In these scenarios, you can speed up the indexing of data in a more resource-efficient way using the Bulk API. Even with the Bulk API, you do want to be aware of the number of documents indexed and the overall size of the bulk request to ensure it does not hinder cluster performance. Elastic recommends benchmarking the bulk size and as a general rule of thumb is 5-15 MB/bulk request.
Increase index buffer size: You can increase the memory limit for outstanding indexing requests to above the default value of 10% of the heap. This may be advised for indexing-heavy workloads but can impact other operations that are memory intensive.
Disable replication: You can set replication to zero to speed up indexing but this is not advised if Elasticsearch is the system of record for your workload.
Limit in-place upserts and data mutations: Inserts, updates and deletes require entire documents to be reindexed. If you are streaming CDC or transactional data into Elasticsearch, you might want to consider storing less data because then there’s less data to reindex.
Simplify the data structure: Keep in mind that using data structures like nested objects will increase writes and indexes. By simplifying the number of fields and the complexity of the data model, you can speed up indexing.
When your queries are taking too long to execute it may mean but you need to simplify your data model or remove query complexity. Here are a few areas to consider:
Create a composite index: Merge the values of two low cardinality fields together to create a high cardinality field that can be easily searched and retrieved. For example, you could merge a field with zipcode and month, if these are two fields that you are commonly filtering on for your query.
Enable custom routing of documents: Elasticsearch broadcasts a query to all the shards to return a result. With custom routing, you can determine which shard your data resides on to speed up query execution. That said, you do want to be on the lookout for hotspots when adopting custom routing.
Use the keyword field type for structured searches: When you want to filter based on content, such as an ID or zipcode, it is recommended to use the keyword field type rather than the integer type or other numeric field types for faster retrieval.
Move away from parent-child and nested objects: Parent-child relationships are a good workaround for the lack of join support in Elasticsearch and have helped to speed up ingestion and limit reindexing. Eventually, organizations do hit memory limits with this approach. When that occurs, you’ll be able to speed up query performance by doing data denormalization.
Many scaling challenges with Elasticsearch boil down to the sharding and indexing strategy. There’s no one size fits all strategy on how many shards you should have or how large your shards should be. The best way to determine the strategy is to run tests and benchmarks on uniform, production workloads. Here’s some additional advice to consider:
Use the Force Merge API: Use the force merge API to reduce the number of segments in each shard. Segment merges happen automatically in the background and remove any deleted documents. Using a force merge can manually remove old documents and speed up performance. This can be resource-intensive and so should not happen during peak usage.
Beware of load imbalance: Elasticsearch does not have a good way of understanding resource utilization by shard and taking that into account when determining shard placement. As a result, it’s possible to have hot shards. To avoid this situation, you may want to consider having more shards than data notes and smaller shards than data nodes.
Use time-based indexes: Time-based indexes can reduce the number of indexes and shards in your cluster based on retention. Elasticsearch also offers a rollover index API so that you can rollover to a new index based on age or document size to free up resources.
The most common strategies for multi-tenancy are to have one index per customer or tenant or to use custom routing. Here's how you can weigh the strategies for your workload:
Index per customer or tenant: Configuring separate indexes by customer works well for companies that have a smaller user base, hundreds to a few thousand customers, and when customers do not share data. It's also helpful to have an index per customer if each customer has their own schema and needs greater flexibility.
Custom routing: Custom routing enables you to specify the shard on which a document resides, for example customer ID or tenant ID, to specify the routing when indexing a document. When querying based on a specific customer, the query will go directly to the shard containing the customer data for faster response times. Custom routing is a good approach when you have a consistent schema across your customers and you have lots of customers, which is common when you offer a freemium model.
Elasticsearch is designed for log analytics and text search use cases. Many organizations that use Elasticsearch for real-time analytics at scale will have to make tradeoffs to maintain performance or cost efficiency, including limiting query complexity and the data ingest latency. When you start to limit usage patterns, your refresh interval exceeds your SLA or you add more datasets that need to be joined together, it may make sense to look for alternatives to Elasticsearch.
Rockset is one of the alternatives and is purpose-built for real-time streaming data ingestion and low latency queries at scale. Learn how to migrate off Elasticsearch and explore the architectural differences between the two systems.