Hello again Karel Novák.
A couple of notes on your situation: Increasing the maximum number of shards per instance, while feasible, is not recommended. This is largely because of the limit on heap size. We default the heap size to half of the available RAM, capped at 32GB. There is a performance-related reason for this cap: keeping the heap at or below roughly 32GB allows the Java Virtual Machine that runs OpenSearch to use 32-bit numbers for object addresses instead of 64-bit numbers. Addressing 32GB requires 35 bits, but objects in a 64-bit JVM are aligned on 8-byte boundaries, so the lower 3 bits of that 35-bit address are always zero and do not need to be stored. The JVM calls this "Compressed oops", and you can read more about it here:
https://wiki.openjdk.org/display/HotSpot/CompressedOops
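The arithmetic behind that 32GB figure can be checked in a few lines. This is only an illustrative sketch in Python, assuming the default 8-byte object alignment; the function names are made up for this example and are not part of the JVM or OpenSearch. The real JVM also deals with a heap base address, and the practical threshold tends to sit slightly below 32GB, so treat this as a model of the idea rather than of the implementation.

```python
# Illustrative model of compressed object pointers (hypothetical helper names).
OBJECT_ALIGNMENT_BYTES = 8    # assumption: default JVM object alignment
COMPRESSED_POINTER_BITS = 32  # width of a compressed reference


def alignment_shift(alignment: int = OBJECT_ALIGNMENT_BYTES) -> int:
    """Number of low-order bits that are always zero for aligned addresses."""
    return alignment.bit_length() - 1


def max_addressable_heap_bytes(alignment: int = OBJECT_ALIGNMENT_BYTES) -> int:
    """Largest heap a 32-bit compressed reference can cover."""
    return (1 << COMPRESSED_POINTER_BITS) * alignment


def compress(address: int) -> int:
    """Drop the always-zero low bits so the address fits in 32 bits."""
    if address % OBJECT_ALIGNMENT_BYTES != 0:
        raise ValueError("address is not aligned on an 8-byte boundary")
    return address >> alignment_shift()


def decompress(reference: int) -> int:
    """Restore the full address by shifting the low bits back in."""
    return reference << alignment_shift()


if __name__ == "__main__":
    limit = max_addressable_heap_bytes()
    print("shift bits:", alignment_shift())
    print("heap limit in GiB:", limit // 1024**3)
    print("bits for highest address:", (limit - 1).bit_length())
```

The point of the sketch is only that 32 bits plus 3 implied zero bits gives 35 bits of reach, which is where the 32GB figure comes from.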
We can't promise you what will happen if you change that; you're free to experiment with it, but you may find more value in adding more instances to your cluster to accommodate the higher number of shards (as you note, the default is 1000 per instance, and that default is based on the 32GB maximum heap size recommendation). Your mileage may vary, and if you end up calling support, this is what they will tell you.
That said, these recommendations were put in place well over a decade ago, for performance reasons on the hardware of the day. Obviously the hardware has greatly improved in the time since then, so while the recommendations are still in place, the performance improvements may have made them obsolete, or at least reduced the impact of having larger heap sizes, and the recommendations just haven't caught up with that. At the moment I don't have access to the hardware resources to actually experiment with larger heap sizes and the consequences of them (I'd like to hear more about what you find).
I suspect that if you're like most users, you don't need live access to all of that data all the time. In that case you can go into Maintenance and Snapshots, and configure Nagios Log Server to close indexes older than x number of days, which will reduce the number of open shards and allow you to index further data.
Since I mentioned adding more instances, I'd also like to mention that I'm giving a talk at the Nagios World Conference at the end of September on scaling your Log Server cluster. There's work in progress on assigning different roles to cluster instances, and my talk will be about these roles, how to change them within Nagios Log Server, and how to size them for their use. The overarching issue to track in the Changelog is NLS#576.
2024R2.0.1 most annoying bugs.
- jmichaelson
- Posts: 375
- Joined: Wed Aug 23, 2023 1:02 pm
Re: 2024R2.0.1 most annoying bugs.
Please let us know if you have any other questions or concerns.
-Jason
-
[email protected]
- Posts: 15
- Joined: Wed May 07, 2025 7:53 am
Re: 2024R2.0.1 most annoying bugs.
Hello Jason.
First of all, I am not a Java specialist, so I used the OpenSearch pages and forums as a reference manual.
On these pages I found that 1000 shards is the default. Version 2.14, which is used by Nagios, can use a maximum of 4000 shards per instance.
Yes, they say you can change this, but you must be careful about memory usage. In general you can count on 16GB per 1000 shards.
This number depends on shard size, number of reads, etc.
So I used simple logic. Our current server is a dedicated "big beast" with many cores and a lot of memory. On our old "smaller" server with the old version of NLS we have 1100 shards open without any problems, and that NLS does not limit heap memory to 32GB.
I set the limit to 3000 shards in the OpenSearch DB and raised the memory to 256GB. That should be enough for the worst case plus some overhead, and we are below the limit.
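The sizing logic above can be written down as a small sketch. It assumes memory scales linearly at 16GB per 1000 shards, which is only the rule of thumb quoted above and depends on shard size and read load. The helper names are hypothetical; `cluster.max_shards_per_node` is the OpenSearch cluster setting that holds the per-node shard limit, and the dictionary is the kind of body one would send to the `_cluster/settings` endpoint.

```python
import json

GB_PER_1000_SHARDS = 16  # rule of thumb quoted in this post, not a guarantee


def estimated_memory_gb(shards: int, gb_per_1000: float = GB_PER_1000_SHARDS) -> float:
    """Linear estimate of memory needed for a given number of open shards."""
    return shards * gb_per_1000 / 1000


def shard_limit_settings(limit: int) -> dict:
    """Request body for raising the per-node shard limit (persistent setting)."""
    return {"persistent": {"cluster.max_shards_per_node": limit}}


if __name__ == "__main__":
    for shards in (1100, 3000):
        print(shards, "shards ->", estimated_memory_gb(shards), "GB (estimate)")
    # Body that would be sent with PUT to the _cluster/settings endpoint.
    print(json.dumps(shard_limit_settings(3000), indent=2))
```

By this estimate a 3000-shard limit needs on the order of 48GB, so 256GB leaves a wide margin for the worst case.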
Result:
1. We have 1100 shards open.
2. The server uses almost all of its 1TB of memory.
3. CPU usage is below 10%. (With antivirus installed.)
4. We can keep 90 days open. (We must, for audits.)
5. We do not have any performance issues. (The web portal is faster now.)
Yes, I understand. Our deployment is a little unusual, and if we have performance problems we will need to lower the shard and memory limits. This is clear from the OpenSearch documentation. Our hardware is at the maximum that OpenSearch can utilize, so if we need more power we must add another node.
In basic numbers, we have 100k logs per 15 minutes and store 10GB per day from 140 clients. These numbers will grow in the future.
I can't work out whether our system is small, big, or something in between.
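For anyone comparing their own deployment, the numbers above convert into rates like this. It is a rough sketch that assumes a steady ingest rate and decimal gigabytes (10^9 bytes); the function names are hypothetical and exist only for this example.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def logs_per_second(logs: int, window_minutes: int) -> float:
    """Average ingest rate, assuming logs arrive evenly over the window."""
    return logs / (window_minutes * 60)


def logs_per_day(logs: int, window_minutes: int) -> int:
    """Extrapolate one window's log count to a full day."""
    return logs * (24 * 60 // window_minutes)


def retained_gb(gb_per_day: float, days: int) -> float:
    """Storage held open for a retention period at a constant daily volume."""
    return gb_per_day * days


def average_bytes_per_log(gb_per_day: float, daily_logs: int) -> float:
    """Average stored size per log entry, using decimal gigabytes."""
    return gb_per_day * 10**9 / daily_logs


if __name__ == "__main__":
    daily = logs_per_day(100_000, 15)
    print("logs per second:", round(logs_per_second(100_000, 15), 1))
    print("logs per day:", daily)
    print("GB open for 90 days:", retained_gb(10, 90))
    print("average bytes per log:", round(average_bytes_per_log(10, daily)))
```

The 90-day figure only counts the stated daily volume; it ignores replicas and any future growth.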
BR
MITSUBISHI ELECTRIC AUTOMOTIVE CZECH s.r.o.
Karel Novák Infrastructure Architect & Systems Engineer, Information Technologies department
-
[email protected]
- Posts: 15
- Joined: Wed May 07, 2025 7:53 am
Re: 2024R2.0.1 most annoying bugs.
Hi everyone.
I found a "workaround" for bug 7, in case anyone has the same issue.
"7. When Non-admin user open public dashboard, after few seconds page refresh to blank. Tested on two accounts.
After change rights to administrator, page works properly."
First of all, the issue mainly affects AD accounts, and it consists of two bugs.
1. If you migrate accounts from Nagios LS 1.3.X, the accounts are corrupted and you cannot change the default dashboard at all.
If your default dashboard does not exist in the new LS or has been renamed, you get an error message and are redirected to the home page.
This cannot be fixed; you must delete the user account and create it again.
2. If you import an account from the old LS or from AD, you may have this problem.
Any dashboard will refresh to blank after a few seconds. It can be fixed by opening the broken user account for editing and saving it again.
This works with or without making any changes.
BR
Karel
MITSUBISHI ELECTRIC AUTOMOTIVE CZECH s.r.o.
Karel Novák Infrastructure Architect & Systems Engineer, Information Technologies department
-
[email protected]
- Posts: 15
- Joined: Wed May 07, 2025 7:53 am
Re: 2024R2.0.1 most annoying bugs.
18. The licence page shows the whole licence key, not only the last digits as usual...
Last edited by [email protected] on Wed Sep 24, 2025 4:39 am, edited 2 times in total.
MITSUBISHI ELECTRIC AUTOMOTIVE CZECH s.r.o.
Karel Novák Infrastructure Architect & Systems Engineer, Information Technologies department
-
[email protected]
- Posts: 15
- Joined: Wed May 07, 2025 7:53 am
Re: 2024R2.0.1 most annoying bugs.
9. Some password text fields show the text in clear (Add domain user, for example). Update:
The proxy user and password are still visible.
MITSUBISHI ELECTRIC AUTOMOTIVE CZECH s.r.o.
Karel Novák Infrastructure Architect & Systems Engineer, Information Technologies department
-
[email protected]
- Posts: 15
- Joined: Wed May 07, 2025 7:53 am
Re: 2024R2.0.1 most annoying bugs.
19. Panel Type "Terms" with Chart Type "Table".
If you click on a row, the filter is not created.
With other Chart Types, the filter is created successfully when you click on a bar or section.
MITSUBISHI ELECTRIC AUTOMOTIVE CZECH s.r.o.
Karel Novák Infrastructure Architect & Systems Engineer, Information Technologies department