
How to add these performance monitors

Information here covers memory & CPU loading, hard disk temperatures, and Cable Modem monitoring.  I have also used MRTG for monitoring signal levels and error rates on the EUMETCast satellite data service, and for monitoring timekeeping using NTP.  If you are using Windows Vista or Windows-7 you may first need to install and enable the Windows SNMP component.
  

Memory and CPU load

Once you have installed SNMP Informant on each PC you wish to monitor, MRTG can read the data directly, as each value has its own SNMP object ID (OID); the script fragments are shown below. To keep the configuration file clean, I used include statements in the mrtg.cfg file, such as:

  Include: narvik-monitor.inc

As PC Narvik has two CPUs, two instances, 48 and 49, are listed in the [Target] line in the sample below.

Contents of narvik-monitor.inc

 
#---------------------------------------------------------------
# PC Narvik - Memory
#---------------------------------------------------------------

Target[Narvik-mem]: 1.3.6.1.4.1.9600.1.1.2.4.0&1.3.6.1.4.1.9600.1.1.2.1.0:public@narvik
MaxBytes[Narvik-mem]: 4000000000
Options[Narvik-mem]: integer, gauge, nopercent, growright, unknaszero
YLegend[Narvik-mem]: Memory
ShortLegend[Narvik-mem]: B
LegendI[Narvik-mem]: Used  
LegendO[Narvik-mem]: Avail  
Legend1[Narvik-mem]: Memory committed
Legend2[Narvik-mem]: Memory available
Title[Narvik-mem]: Narvik Memory
PageTop[Narvik-mem]: <H2>PC Narvik - Memory</H2>

#---------------------------------------------------------------
# PC Narvik - CPU load, dual-core CPU
#---------------------------------------------------------------

Target[Narvik-CPU]: 1.3.6.1.4.1.9600.1.1.5.1.5.1.48&1.3.6.1.4.1.9600.1.1.5.1.5.1.49:public@narvik
MaxBytes[Narvik-CPU]: 100
YLegend[Narvik-CPU]: CPU %
ShortLegend[Narvik-CPU]: %
LegendI[Narvik-CPU]: CPU 1
LegendO[Narvik-CPU]: CPU 2
Legend1[Narvik-CPU]: CPU 1 usage
Legend2[Narvik-CPU]: CPU 2 usage
Options[Narvik-CPU]: integer, gauge, nopercent, growright, unknaszero
Title[Narvik-CPU]: Narvik CPU
PageTop[Narvik-CPU]: <H2>PC Narvik - CPU load</H2>
# If PC Narvik were a single-core CPU, use two instances of object 48, as MRTG requires that 
# you have two variables returned.  You may also want to prevent display of the second output
# line by adding the "no-output" option (noo) to the Options line:
Target[Narvik-CPU]: 1.3.6.1.4.1.9600.1.1.5.1.5.1.48&1.3.6.1.4.1.9600.1.1.5.1.5.1.48:public@narvik
Options[Narvik-CPU]: integer, gauge, nopercent, growright, noo
# I found that on a lower-spec PC (Bacchus), returning the CPU twice caused an artificially
# high value to be returned for the second call (presumably the CPU busy processing the first
# request?!), so I actually changed to using the SNMP value: Maximum Number of Process Contexts
# i.e.  .1.3.6.1.2.1.25.1.7.0 (check this on your system using GetIF), which returns integer 0.
Target[Bacchus-CPU]: 1.3.6.1.4.1.9600.1.1.5.1.5.1.48&1.3.6.1.2.1.25.1.7.0:public@192.168.0.4
 

As this is my first attempt, any suggestions for improvements are welcome.  The only thing noticeably different is using OIDs in the [Target] line, as described here, and I used the GetIF program and the MIBs from SNMP Informant to work out what to monitor.  There are a lot more parameters available from the free SNMP Informant.  I added the unknaszero option so that when the PC is offline, the zero CPU and memory usage are clearly visible.
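The dual-core and single-core Target lines in the listing above follow one pattern, which can be sketched in a few lines of Python. This is only an illustration and not part of the original setup: the function name `cpu_target` is hypothetical, and the OID prefix is simply copied from the Target lines shown above.

```python
# OID prefix copied from the Narvik Target lines; the final number is the CPU instance.
INFORMANT_CPU_OID = "1.3.6.1.4.1.9600.1.1.5.1.5.1"


def cpu_target(host, instances, community="public"):
    """Build the value of an MRTG Target line from one or two CPU instances.

    With a single instance the same OID is used twice, because MRTG
    expects two values per target.
    """
    if not 1 <= len(instances) <= 2:
        raise ValueError("expected one or two CPU instance numbers")
    first = instances[0]
    second = instances[-1]  # equals first when only one instance is given
    return (f"{INFORMANT_CPU_OID}.{first}&"
            f"{INFORMANT_CPU_OID}.{second}:{community}@{host}")


print("Target[Narvik-CPU]: " + cpu_target("narvik", [48, 49]))
```

Calling `cpu_target("narvik", [48])` gives the single-core form, to be combined with the noo option as described in the listing.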

All running under Windows, including Vista! Here's some current data:

[Graphs: Memory (used, free); CPU 1 & 2 usage]

 

Disk space usage

Since Windows 2000, Windows has included basic performance monitoring counters, including a set for measuring disk usage.  For each disk, at least three basic values are available: disk size, disk used, and the allocation unit size.  Because the units are given separately, getting the disk used in bytes means multiplying the disk-used value by the allocation-unit size.  Fortunately, MRTG allows a target to be specified as A * B, so that's not a problem.  More of an issue is that the number of disks varies: as disks come and go on your system (even plugging in a USB memory stick, for example), the index of a particular drive in the table of disks may change, at least if you have a RAMdisk or extra partition with a high drive letter such as T: or Z:.  I haven't discovered a way of fixing this so far, but that may just be my ignorance of using SNMP!

How to determine the drive index

You need to "walk the MIB" for the PC in question.  To do this, download a program such as GetIF version 2.3.1 which allows this - direct download link.  Having installed GetIF, open the program, enter the PC's name or IP address into the Host Name box, and press the Start button.

Now move to the MBrowser tab, enter the string:

  .iso.org.dod.internet.mgmt.mib-2.host.hrStorage

in the first box (to replace .iso), and press the Start button.  You should see the list box window at the bottom populate with a set of values, 92 values in the example below:

You can now scroll down the list of values to find the index for the name of the monitored volume (drive T: in this case).  Click on the entry in the list, and its description and value will appear.
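If the walk results are available as text, the same lookup could be scripted. The following is a minimal sketch under stated assumptions: the rows are (OID, description) pairs obtained from some walk of the storage description column, the descriptions begin with the drive letter, and both the sample rows and the function name `find_storage_index` are invented for illustration.

```python
def find_storage_index(walk_rows, drive_letter):
    """Return the table index of the first row whose description starts
    with the given drive letter, or None if no row matches."""
    prefix = drive_letter.rstrip(":\\").upper() + ":"
    for oid, description in walk_rows:
        if description.upper().startswith(prefix):
            # The index is the last component of the OID.
            return int(oid.rsplit(".", 1)[1])
    return None


# Made-up sample rows; real descriptions will differ from system to system.
rows = [
    (".1.3.6.1.2.1.25.2.3.1.3.1", "C:\\ Label:System"),
    (".1.3.6.1.2.1.25.2.3.1.3.10", "T:\\ Label:Temp"),
]
print(find_storage_index(rows, "T:"))
```

Re-running such a lookup after adding or removing a drive would show whether the index has moved.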

where the OID (object ID) is given in the last line as ".1.3.6.1.2.1.25.2.3.1.3.10", so the index is "10", and you can now scroll further down to find the corresponding allocation units (".1.3.6.1.2.1.25.2.3.1.4.10") and storage used (".1.3.6.1.2.1.25.2.3.1.6.10").  
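The related OIDs differ only in the column number that precedes the index, so once the index is known the others can be written down mechanically. Here is a small sketch of that; the function name `storage_oids` and the dictionary keys are hypothetical, while the column numbers 3, 4 and 6 are the ones quoted above.

```python
# Common prefix of the three OIDs quoted in the text.
HR_STORAGE_ENTRY = ".1.3.6.1.2.1.25.2.3.1"

# Column numbers taken from the OIDs above.
COLUMNS = {"description": 3, "allocation_units": 4, "used": 6}


def storage_oids(index):
    """Return the description, allocation-unit and used OIDs for one table index."""
    return {name: f"{HR_STORAGE_ENTRY}.{column}.{index}"
            for name, column in COLUMNS.items()}


for name, oid in storage_oids(10).items():
    print(f"{name:17s} {oid}")
```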

  

  

So the storage used in bytes is the product of the numbers returned from these two values (4096 * 33933 = 138989568 bytes, 132.55MB).
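The multiplication is easy to check by hand or in a couple of lines of Python. This sketch only reproduces the arithmetic with the two example numbers above; the function name is illustrative.

```python
def storage_used_bytes(allocation_unit_bytes, units_used):
    """Bytes in use = size of one allocation unit * number of units used."""
    return allocation_unit_bytes * units_used


used = storage_used_bytes(4096, 33933)
print(used)
```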

How to use the values in SNMP monitoring

As MRTG requires two values to be returned by the Target string (at least I think it does....), you need to add a second OID so that two values are returned.  I simply used a value which returns an integer on my system.  I don't know whether you could just put a zero.  Please tell me if you know better!  So the Target is:

  OID1 & OID2 * OID1 & OID2.

Please note that to stop the Web page becoming excessively wide, I have shown the Target line below as:

  <A> *
  <B>

whereas it should be written on a single line since MRTG doesn't allow continuation lines (as far as I know):

  <A> * <B>
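One way to avoid an accidental line break is to assemble the Target from its parts. The sketch below is an assumption-laden illustration rather than the method used here: `disk_used_target` is a hypothetical helper, and the default filler OID is the integer-returning value used elsewhere on this page.

```python
STORAGE_ENTRY = ".1.3.6.1.2.1.25.2.3.1"


def disk_used_target(host, index, filler_oid="1.3.6.1.2.1.25.1.7.0",
                     community="public"):
    """Return '<A> * <B>' on one line: storage used (column 6) multiplied
    by allocation units (column 4), each paired with a filler OID."""
    used = f"{filler_oid}&{STORAGE_ENTRY}.6.{index}:{community}@{host}"
    units = f"{filler_oid}&{STORAGE_ENTRY}.4.{index}:{community}@{host}"
    return f"{used} * {units}"


print("Target[Gemini-fsy]: " + disk_used_target("gemini", 10))
```

Because the result is a single string, it can be pasted into the .inc file without any continuation line.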

Contents of gemini-disk-used.inc

Target[Gemini-fsy]: 1.3.6.1.2.1.25.1.7.0&.1.3.6.1.2.1.25.2.3.1.6.10:public@gemini * 
    1.3.6.1.2.1.25.1.7.0&.1.3.6.1.2.1.25.2.3.1.4.10:public@gemini

MaxBytes[Gemini-fsy]: 40000000000
Options[Gemini-fsy]: integer, gauge, nopercent, growright, unknaszero, noi
# Unscaled[Gemini-fsy]: dwmy
YLegend[Gemini-fsy]: Temp disk used
ShortLegend[Gemini-fsy]: B
LegendO[Gemini-fsy]: Size  &nbsp;
Legend2[Gemini-fsy]: Temp disk used
Title[Gemini-fsy]: Gemini - 40GB temp disk
PageTop[Gemini-fsy]: <H2>PC Gemini - Temp disk used</H2>

Sample Results of disk space monitoring


   

Monitoring disk temperature

If you are fortunate enough to have a PC which is supported by the Mother Board Monitor program, you can just use that and add the appropriate SNMP objects as described above.  My PCs did not support MBM, so I wrote a small program which accesses the S.M.A.R.T. data provided by some hard disks and the BIOS.  Not all PCs do this, and not all PCs make all of the data accessible.  To test your PC, download the DiskTemp.exe program, and run it from the command-line.  You should see four lines like:

C:\>DiskTemp.exe
30
33
0
0

C:\>

So as the program returns the disk temperatures, you can plot them in MRTG like this, using the ability of MRTG to read the output of a command-line program.
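When MRTG runs an external command as a Target, it reads four lines: the first value, the second value, an uptime string, and a target name. The zeros in the sample output above presumably fill the last two. The Python sketch below shows that layout only; it is not the source of DiskTemp.exe, and the function name is hypothetical.

```python
def format_mrtg_output(value1, value2, uptime="0", name="0"):
    """Return the four lines MRTG reads from an external command:
    first value, second value, uptime string, target name."""
    return f"{value1}\n{value2}\n{uptime}\n{name}\n"


# Two example temperatures, matching the sample output above.
print(format_mrtg_output(30, 33), end="")
```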

Contents of narvik-disk-temp.inc

#---------------------------------------------------------------
# PC Narvik - disk temperatures
#---------------------------------------------------------------

Target[narvik_disk_temp]: `DiskTemp`
MaxBytes[narvik_disk_temp]: 100
MaxBytes2[narvik_disk_temp]: 100
Title[narvik_disk_temp]: Disk temperatures for PC Narvik
Options[narvik_disk_temp]: integer, gauge, nopercent, growright, unknaszero
YLegend[narvik_disk_temp]: Temperature °C
ShortLegend[narvik_disk_temp]: °C
Legend1[narvik_disk_temp]: Disk 0 temperature in °C
Legend2[narvik_disk_temp]: Disk 1 temperature in °C
LegendI[narvik_disk_temp]: Disk 0:
LegendO[narvik_disk_temp]: Disk 1:
PageTop[narvik_disk_temp]: <H1>PC Narvik -- Disk Temperatures</H1>

Here's some current data:
[Graph: PC Narvik disk temperature; 750GB Samsung HD753LJ and 1TB Samsung HD103SI]

Here's another example, showing what happened when I replaced a 750GB 7200rpm standard disk with a 1TB "eco" disk spinning at just 5400rpm, and with a slower seek speed.  While the green line is more or less constant allowing for the daily temperature changes, the blue line showing the second disk on PC Narvik has dropped significantly from being a few degrees above the 750GB disk to being a degree or three below.  A lower working temperature should produce greater reliability, and it's a few watts less power consumption.  Performance of the PC appears to be unaffected.
The horizontal lines from before 0800 to after 1100 were the nonexistent values while the PC was powered down for the disk clone.  After seeing those misleading values, I added unknaszero to the options shown above.

Vista and Windows-7

I found that the code I used had to be run in Administrator mode under Windows Vista and Windows-7, which meant that I could not start MRTG automatically at startup.  I decided that the best way to work round this problem was to write a separate program (DiskTemperatureReporter) to read the disk temperatures, and then deposit a file in the \MRTG\bin\ directory so that MRTG could read the data with the configuration:

Target[stamsund_disk_temp]: `type disk-temps.dat`

This enabled me to capture what was possibly the hottest May day ever in Edinburgh - 2010 May 23 - and the next couple of days for comparison!  Here's the screen-shot.  The DiskTemperatureReporter program is unpublished, but available on an as-is basis by e-mail request.  It needs to be started by hand when the user logs into the PC.
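For readers who would rather write their own reporter, here is a rough Python sketch of the file-writing half only. It is not the DiskTemperatureReporter program: `write_temps_file` is a hypothetical name, reading the temperatures is left out entirely, and the four-line layout is assumed to match the DiskTemp output shown earlier.

```python
import os
import tempfile


def write_temps_file(path, temp0, temp1):
    """Write two temperatures plus two filler lines to path.

    The data goes to a temporary file in the same directory first and is
    then renamed into place, so a reader never sees a half-written file.
    """
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w") as handle:
        handle.write(f"{temp0}\n{temp1}\n0\n0\n")
    os.replace(tmp_path, path)  # overwrites any previous file
```

A reporter would call this every few minutes with freshly read values, for example `write_temps_file("disk-temps.dat", 41, 44)` from within the \MRTG\bin\ directory.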

 

Cable Modem Signal levels

I found that the Motorola Cable Modem I have happens to report its signal levels via SNMP, provided you know the right object ID (OID).  I'm not sure where I found this data, but you might want to search Google with "snmp oid docsIfDownChannelPower" or look here:

http://www.oidview.com/mibs/0/DOCS-IF-MIB.html

http://support.ipmonitor.com/mibs/DOCS-IF-MIB/item.aspx?id=docsIfDownChannelPower

The only thing of note is that I put the IP address for the cable modem into my Hosts file as "cm-hfc", since my ISP only provides a dynamic IP.

Contents of: cable-modem.inc

#---------------------------------------------------------------
# Cable modem RF signal levels
#---------------------------------------------------------------

Target[CM-levels]: 1.3.6.1.2.1.10.127.1.1.1.1.6.3&1.3.6.1.2.1.10.127.1.2.2.1.3.2:public@cm-hfc / 10
AbsMax[CM-levels]: 70
MaxBytes[CM-levels]: 70
Title[CM-levels]: Motorola SB5101E Cable Modem - RF Signal Levels
Options[CM-levels]: integer, gauge, nopercent, growright, unknaszero
ShortLegend[CM-levels]: dBmV
YLegend[CM-levels]: dBmV
LegendO[CM-levels]: Transmit level&nbsp;
LegendI[CM-levels]: Received level&nbsp;
Legend2[CM-levels]: Transmit (upstream) level: +30..+56dBmV is OK
Legend1[CM-levels]: Received (downstream) level: -10..+10dBmV is OK
PageTop[CM-levels]: <H2>Motorola SB5101E Cable Modem - RF Signal Levels</H2>

#---------------------------------------------------------------
# Cable modem SNR
#---------------------------------------------------------------

Target[CM-SNR]: 1.3.6.1.2.1.10.127.1.1.4.1.5.3&1.3.6.1.2.1.10.127.1.1.4.1.5.3:public@cm-hfc / 10
AbsMax[CM-SNR]: 70
MaxBytes[CM-SNR]: 70
Title[CM-SNR]: Motorola SB5101E Cable Modem - SNR & bandwidth
Options[CM-SNR]: integer, gauge, nopercent, growright, unknaszero, noo
ShortLegend[CM-SNR]: dB
YLegend[CM-SNR]: dB
LegendI[CM-SNR]: RX SNR&nbsp;
Legend1[CM-SNR]: Received SNR (dB)
PageTop[CM-SNR]: <H2>Motorola SB5101E Cable Modem - SNR</H2>

#---------------------------------------------------------------
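The "/ 10" at the end of each Target line above divides the raw SNMP reading by ten to plot it in dBmV or dB. The sketch below does the same conversion in Python and checks a level against the "OK" ranges quoted in the legends. The raw readings are invented examples, and the function names are hypothetical.

```python
# "OK" ranges copied from the Legend1/Legend2 lines above, in dBmV.
DOWNSTREAM_OK = (-10.0, 10.0)
UPSTREAM_OK = (30.0, 56.0)


def tenths_to_level(raw):
    """Convert a raw reading in tenths of a unit to whole units (the '/ 10')."""
    return raw / 10


def in_range(value, limits):
    """True when value lies within the inclusive (low, high) limits."""
    low, high = limits
    return low <= value <= high


# Invented example readings, in tenths of a dBmV.
received = tenths_to_level(-23)
transmit = tenths_to_level(452)
print(received, in_range(received, DOWNSTREAM_OK))
print(transmit, in_range(transmit, UPSTREAM_OK))
```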

  

Monitoring network I/O

I haven't said anything about monitoring network I/O as this is already covered in the MRTG documentation.

I did happen to capture this screenshot, showing just how useful having monitoring on your PCs can be.  In this case, I had changed the firewall on a PC, and suddenly the network input to all devices on the network shot up.  Checking with Wireshark on a laptop PC, I found that the software was sending out 254 ARP packets every 120 seconds, and each packet was the maximum wire size (1514 bytes).  Other ARP packets were either 42 or 60 bytes!  I've reported this to the developers.

 


Acknowledgements: the SNMP work was triggered by an e-mail exchange with Lonni J Friedman who asked how I got MRTG working under Vista (answer: Run As Administrator, having added SNMP and allowed it through the firewall), but who had the performance monitoring working under Windows XP!  Steve Catto first introduced me to MRTG - thanks Steve!

Update: I just found this PDF document which covers monitoring a Windows system with MRTG.

 
Copyright © David Taylor, Edinburgh Last modified: 2010 Aug 02 at 11:49:49