Re: 2024R2.0.1 most annoying bugs.
Posted: Mon Aug 11, 2025 11:03 am
Hello again, Karel Novák.
A couple of notes on your situation: increasing the maximum number of shards per instance, while feasible, is not recommended. This is largely because of where the heap size is capped. We default the heap size to half of the available RAM, with a limit of 32GB, and there is a performance-related reason for that limit: at 32GB or below, the Java Virtual Machine that runs OpenSearch can use 32-bit numbers for object addresses instead of 64-bit numbers. A 32GB address requires 35 bits to store, but everything in a 64-bit world aligns on 8-byte boundaries, so the lower 3 bits of that 35-bit number are always zero and don't need to be stored. The JVM calls this "compressed oops", and you can read more about it here:
https://wiki.openjdk.org/display/HotSpot/CompressedOops
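The arithmetic behind that 32GB threshold can be sketched in a few lines (a rough illustration of the addressing math, not of JVM internals):

```python
# With compressed oops, the JVM stores an object reference as a 32-bit
# offset. Because objects are aligned on 8-byte boundaries, the low
# 3 bits of every real address are always zero, so the JVM can shift
# the 32-bit offset left by 3 bits, covering 35 bits of address space.
COMPRESSED_REF_BITS = 32
ALIGNMENT_BYTES = 8  # 2**3, hence the 3 "free" low bits

max_addressable = (2 ** COMPRESSED_REF_BITS) * ALIGNMENT_BYTES
print(max_addressable == 32 * 1024 ** 3)  # -> True: exactly 32 GiB
```

Go past 32GB and the offsets no longer fit in 32 bits, so every reference doubles to 64 bits and a chunk of your extra heap is eaten by the larger pointers.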
We can't promise what will happen if you change that; you're free to experiment, but you may get more value from adding more instances to your cluster to accommodate the higher number of shards (as you note, the default is 1000 per instance, and that default is based on the 32GB maximum heap size recommendation). Your mileage may vary, and if you end up calling support, this is what they will tell you.
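For reference, if you do experiment, the OpenSearch setting behind that per-instance limit is `cluster.max_shards_per_node`. A sketch of the REST call to raise it (verify the setting name and syntax against the OpenSearch docs for your version, and keep the caveats above in mind):

```
PUT _cluster/settings
{
  "persistent": {
    "cluster.max_shards_per_node": 1500
  }
}
```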
That said, these recommendations were put in place well over a decade ago, for performance reasons on the hardware of the day. Hardware has improved greatly since then, so while the recommendations still stand, those improvements may have made them obsolete, or at least reduced the cost of larger heap sizes, and the recommendations simply haven't caught up. At the moment I don't have access to the hardware needed to actually experiment with larger heap sizes and their consequences (I'd like to hear more about what you find).
I suspect that, like most users, you don't need live access to all of that data all the time. In that case, you can go into Maintenance and Snapshots and configure Nagios Log Server to close indexes older than some number of days; closed indexes no longer count against your open-shard limit, freeing up capacity to index further data.
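The retention logic Log Server applies there amounts to a date comparison on index names. A minimal sketch of the idea, assuming the common `logstash-YYYY.MM.DD` daily-index naming convention (an assumption; check your actual index names under Maintenance):

```python
from datetime import date, timedelta

def indexes_to_close(index_names, today, keep_days):
    """Return the indexes whose date suffix is older than keep_days."""
    cutoff = today - timedelta(days=keep_days)
    stale = []
    for name in index_names:
        try:
            # Parse the trailing YYYY.MM.DD portion of the index name.
            y, m, d = name.rsplit("-", 1)[1].split(".")
            if date(int(y), int(m), int(d)) < cutoff:
                stale.append(name)
        except (IndexError, ValueError):
            continue  # skip names without a parseable date suffix
    return stale

names = ["logstash-2025.08.01", "logstash-2025.08.10", "kibana-int"]
print(indexes_to_close(names, date(2025, 8, 11), 7))
# -> ['logstash-2025.08.01']
```

In practice you would let Log Server's own maintenance job do this rather than scripting it yourself; the sketch just shows why closing by age frees shards a day at a time.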
Since I mentioned adding more instances: I'm giving a talk at the Nagios World Conference at the end of September on scaling your Log Server cluster. There is work in progress on assigning different roles to cluster instances, and my talk will cover those roles, how to change them within Nagios Log Server, and how to size them for their use. The overarching issue to track in the Changelog is NLS#576.