Elasticsearch good practices

When I was working at Nviso, I was in charge of engineering the Elasticsearch clusters. It’s a fabulous technology and open-source project with very strong features, scalability and performance.

However, you’ll only get the full benefit if you follow the best practices from the beginning of your project. If you don’t take some precautions, you can quickly get into trouble or even end up in an unrecoverable state. I invite you to read these articles I wrote for Nviso before diving into a cluster deployment.

Reduce the number of shards

In my past experience, I have come across many Elasticsearch clusters in a very bad state. For most of them, it turned out that the sharding configuration fit neither the data nor the purpose. At the time I wrote the article, modifying a sharding configuration was difficult but still doable, and I took on the challenge! I invite you to read the outcome and the summary of my work:

Optimizing Elasticsearch for security log collection – part 1: reducing the number of shards
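To give a feel for how quickly shard counts grow with time-based indices, here is a minimal Python sketch. It is not taken from the article: the function names are hypothetical and the target of about 30 GB per shard is only an assumed rule of thumb that you should adapt to your own data and hardware.

```python
import math


def primary_shards_needed(index_size_gb: float, target_shard_size_gb: float = 30.0) -> int:
    """Estimate the primary shards needed to keep each shard near a target size.

    The 30 GB default is an assumption for illustration, not an official limit.
    """
    if index_size_gb < 0 or target_shard_size_gb <= 0:
        raise ValueError("sizes must be positive")
    # An index always has at least one primary shard.
    return max(1, math.ceil(index_size_gb / target_shard_size_gb))


def total_shards(indices: int, primaries_per_index: int, replicas: int = 1) -> int:
    """Total shards (primaries plus replica copies) for a set of identical indices."""
    return indices * primaries_per_index * (1 + replicas)
```

For example, one year of daily indices with 5 primaries and 1 replica gives total_shards(365, 5, 1), that is 3650 shards, whereas a single primary per index brings the same year down to 730. If each daily index only holds a couple of gigabytes, primary_shards_needed(2) suggests that one primary is already enough.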

Index Lifecycle Management

Once the cluster had recovered from its unstable state, I needed to make sure it would not fall back into an undesired situation. At the time, Elastic had just released a brand new feature called “Index Lifecycle Management”, which aims at keeping your shards under control. This feature is particularly interesting for time series data, which is eventually deleted according to a defined life-cycle:

Optimizing Elasticsearch – Part 2: Index Lifecycle Management
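As a rough idea of what such a life-cycle looks like, the following Python sketch builds the JSON body of a simple policy with a hot phase that rolls over the index and a delete phase that removes it later. The thresholds (30 days, 50 GB, 90 days) and the function name are assumptions for illustration and do not come from the article. The resulting body is the kind of document you would send to the `PUT _ilm/policy/<policy-name>` endpoint.

```python
import json


def build_ilm_policy(max_age: str = "30d", max_size: str = "50gb", delete_after: str = "90d") -> dict:
    """Build an ILM policy body: roll over in the hot phase, delete after a retention period.

    All default values are illustrative assumptions, not recommendations.
    """
    return {
        "policy": {
            "phases": {
                "hot": {
                    "actions": {
                        # Roll over to a new index when either condition is met.
                        "rollover": {"max_age": max_age, "max_size": max_size}
                    }
                },
                "delete": {
                    # Age is counted from rollover once the index has rolled over.
                    "min_age": delete_after,
                    "actions": {"delete": {}},
                },
            }
        }
    }


if __name__ == "__main__":
    print(json.dumps(build_ilm_policy(), indent=2))
```

Rolling over on size or age instead of creating one index per day keeps shards at a reasonable size whatever the ingestion rate, which is what prevents the shard count from drifting upwards again.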

Elasticsearch is evolving very fast: it started as a search engine for developers built on top of Lucene, and it is now a complete SIEM with an endpoint solution. The content of these blog posts might no longer be accurate. Still, it was a good lesson for me on how to configure Elasticsearch correctly to avoid ending up in difficult situations.