The problem is finally found, and reproducible. An rrd file with a malformed name causes it, as indicated in this thread.
Nagiosgraph generates rrd databases, using the pattern $yourservicedescription___$perfdatalabel to name them. A service named PING, for example, returns perfdata with the label "rta" and a value, and "pl" and a value. Consequently, two files named PING___rta.rrd and PING___pl.rrd are created.
For reasons I haven't worked out yet, one of my checks returned the value NET-SNMP-EXTEND-MIB::nsExtendOutput1Line."exq", and consequently a file named <$myservicedescription>___NET-SNMP-EXTEND-MIB%3A%3AnsExtendOutput1Line.%22exq%22.rrd was created. It's the name of the file, not its contents, that causes nagiosgraph.js to drop out without any message (apart from no longer working properly and displaying the infamous "JavaScript is disabled..." bar).
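To make the naming scheme concrete, here is a minimal Python sketch that reproduces the file names mentioned above. It is not nagiosgraph's own code (nagiosgraph is not written in Python); the helper name rrd_filename is hypothetical, and it assumes the masking is ordinary percent-encoding, which is what the observed %3A and %22 suggest.

```python
from urllib.parse import quote


def rrd_filename(service_description: str, perfdata_label: str) -> str:
    """Hypothetical helper: build a name of the form <service>___<label>.rrd.

    Assumption: special characters are masked by plain percent-encoding.
    nagiosgraph's real escaping rules may differ in detail.
    """
    # safe="" tells quote() to escape every reserved character, including ':' and '"'.
    service = quote(service_description, safe="")
    label = quote(perfdata_label, safe="")
    return f"{service}___{label}.rrd"


print(rrd_filename("PING", "rta"))
# PING___rta.rrd
print(rrd_filename("myservice", 'NET-SNMP-EXTEND-MIB::nsExtendOutput1Line."exq"'))
# myservice___NET-SNMP-EXTEND-MIB%3A%3AnsExtendOutput1Line.%22exq%22.rrd
```

The second call shows how an unlabeled check result ends up as a file name containing %22.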
Removing the check and restarting nagios does not help - as long as the file is present, the js component malfunctions. Deleting the file fixes it, but the file will be re-created with the next run of the check.
By adding a label statement to the check command, like -l anylabel, I'm getting a file named <$myservicedescription>___anylabel.rrd. It has the same content, but a different name. The problem does not show up then.
Generally, nagiosgraph does a good job in replacing/masking non-standard characters (blanks, slashes, backslashes, non-7bit-ascii symbols). I figure it's the ", masked by %22, that knocks the js component out.
The problem can be circumvented by always labeling check results to make sure that no weird symbols/characters end up in the rrd filenames.
As stated before, I have never seen any error messages or suspicious log entries - nagiosgraph.js just silently malfunctions. I am not aware of any documentation (nagiosgraph in general isn't very well documented and seems to be dormant, the latest version dating back to mid-2015) that explains file naming.
So - use sensible labels, don't go with the default.
I stumbled over the solution (I had almost given up on it) while setting up a new machine and copying the checkfiles one after the other from the old to the new machine. Even in a medium-sized environment, your rrd files quickly add up to a thousand and more, probably containing loads of %20s in their names. Good luck finding that when you don't even know where to begin.
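If you need to hunt for such a file in a large rrd tree, a small script can narrow it down. The following is only a sketch under assumptions: the function name suspicious_rrd_files is made up, %20 is treated as harmless because files with blanks in their names did not cause trouble here, and every other percent-escape is merely flagged for manual inspection (I only suspect %22, I have not proven it).

```python
import re
from pathlib import Path

# Assumption: %20 (a masked blank) is harmless; anything else gets flagged.
HARMLESS = {"%20"}
ESCAPE = re.compile(r"%[0-9A-Fa-f]{2}")


def suspicious_rrd_files(rrd_root):
    """Yield (path, escapes) for .rrd files whose names contain
    percent-escapes other than the ones listed in HARMLESS."""
    for path in sorted(Path(rrd_root).rglob("*.rrd")):
        escapes = {e.upper() for e in ESCAPE.findall(path.name)} - HARMLESS
        if escapes:
            yield path, sorted(escapes)


# Usage (the directory is a placeholder for wherever your rrd files live):
# for path, escapes in suspicious_rrd_files("path/to/rrd"):
#     print(path, " ".join(escapes))
```

Moving the flagged files out of the tree one at a time and reloading the page should show which one is the culprit.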
And another thing: the mere presence of the file causes the problem, even when you have no window open that would display/call it, and no active check that adds to it. I take it that the whole rrd tree is constantly parsed - it will be interesting to watch the load as the databases grow in size and number. Again - no documentation on how this thing actually works, unless you're a JavaScript programmer...
Anyway - thanks a lot for everybody's time and patience, especially in light of the fact that this isn't even an actual nagios problem.