SUN zfs storage 7320 monitoring using net-snmp and mrtg
This article covers monitoring a SUN zfs storage 7320 with net-snmp and mrtg. Although the monitored system here is a sun zfs storage 7320, the main idea applies to monitoring many other systems, including but not limited to cpu usage, network bandwidth, disk and temperature on cisco switches, other linux systems and even windows systems.
The net-snmp extending agent functionality is not supported on the sun zfs storage 7320, which runs solaris 11 express, so I’m going to monitor it from a linux snmp client by writing the monitoring scripts on that linux client rather than on the zfs system itself.
Here are the steps:
Part 1 – set up snmp client and mrtg on a linux host
yum -y install gcc-* gd-* libpng-* zlib-* httpd
yum -y install net-snmp* net-snmp-libs lm_sensors lm_sensors-devel
yum -y install mrtg
cp /etc/snmp/snmpd.conf{,.bak}
echo "rocommunity public" > /etc/snmp/snmpd.conf
mkdir -p /etc/mrtg
chkconfig --level 2345 snmpd on
service snmpd start
Ensure snmpd is listening:
netstat -tunlp |grep snmp
tcp 0 0 127.0.0.1:199 0.0.0.0:* LISTEN 26427/snmpd
udp 0 0 0.0.0.0:161 0.0.0.0:* 26427/snmpd
Now run a test to make sure snmpd on the linux client is working as expected:
[root@test-centos mrtg]# snmpwalk -v2c -c public localhost interface
IF-MIB::ifNumber.0 = INTEGER: 4
IF-MIB::ifIndex.1 = INTEGER: 1
IF-MIB::ifIndex.2 = INTEGER: 2
IF-MIB::ifIndex.3 = INTEGER: 3
IF-MIB::ifIndex.4 = INTEGER: 4
IF-MIB::ifDescr.1 = STRING: lo
IF-MIB::ifDescr.2 = STRING: eth0
IF-MIB::ifDescr.3 = STRING: eth1
……
……
Now it’s time to configure mrtg on linux:
[root@test-centos mrtg]# cat /etc/httpd/conf.d/mrtg.conf
Alias /mrtg /var/www/mrtg
<Location /mrtg>
Order deny,allow
Allow from all
</Location>
Now restart httpd:
[root@test-centos ~]# apachectl restart
Part 2 – configure snmp on SUN zfs storage 7320 web UI
Log on to the SUN zfs storage 7320 web UI, navigate to “Configuration” -> “SNMP”, and configure it as follows:
After this, you’ll need to enable or restart the SNMP service on zfs.
Now do a snmpwalk from the linux snmp client to the SUN zfs storage:
[root@test-centos ~]# snmpwalk -v2c -c public test-zfs-host interface
IF-MIB::ifNumber.0 = INTEGER: 6
IF-MIB::ifIndex.1 = INTEGER: 1
IF-MIB::ifIndex.4 = INTEGER: 4
IF-MIB::ifIndex.5 = INTEGER: 5
IF-MIB::ifIndex.6 = INTEGER: 6
IF-MIB::ifIndex.7 = INTEGER: 7
IF-MIB::ifIndex.8 = INTEGER: 8
IF-MIB::ifDescr.1 = STRING: lo0
……
……
Part 3 – extending snmp mibs/oids on snmp client (the linux host)
As stated at the beginning of this article, the net-snmp extending agent functionality is not supported on the sun zfs storage 7320, so the monitoring scripts run on the linux snmp client and query the zfs system over ssh.
To keep ssh from asking for a password when connecting from the linux host, you need to add the linux host’s public key to the sun zfs storage:
On snmp Linux client:
[root@test-centos ~]# ssh-keygen -t rsa #if you already have pubkey, skip this step
[root@test-centos ~]# cat /root/.ssh/id_rsa.pub |awk '{print $2}'
AAAAB3NzaC1yc2EAAAABIwAAAQEAxYd97A/V5RwdkfzbkmYBqF189pTLOlbYt0dZzO395dfU0Sp/Ykrk+sOJO0bJZEtytuTcCz/bVutWB7vLzeQPxIToRUQnZX7ZoMsjyaFk3LhtAgFhYIycOw2FQL8Qvb5yMBASB2/KthsqaiNqOP/2Vy5e0aCFFIV5DlKQTp/3eceSMq8kTx+e801lZow++yT70rp3p+5WtriN/NKYI0B3cpSQY/36D/TcOF9v5IaqQokp/mLRoc1MLOhN0sy0ipCdT+0bbkZ4Lh8bEeQO48UGKEOnYrYto33tay4mZk8HPWFK4w/TQGxBLthiuPQ4oZzG3gVpQUS4GRwI9zZoGtgELQ==
On Sun zfs storage:
Click “Configuration”, then add the snmp client’s public key (ensure RSA is selected):
Open an ssh connection from the linux snmp client to the sun zfs storage host; it should no longer ask for a password. (Be sure to do this once by hand: if you have never connected from that linux client to the sun zfs storage system, its host key is not yet in the client’s known_hosts and ssh will prompt you to accept it.)
Now we come to the most important part. Assume we want to monitor space usage on the SUN zfs storage system. To do this, you’ll need to do the following on the linux snmp client:
[root@test-centos mrtg]# cat /etc/snmp/snmpd.conf
rocommunity public
extend .1.3.6.1.4.1.2021.31 zfs-test-zfs-host-total /bin/bash /var/tmp/mrtg/zfs-test-zfs-host-total.sh
extend .1.3.6.1.4.1.2021.32 zfs-test-zfs-host-used /bin/bash /var/tmp/mrtg/zfs-test-zfs-host-used.sh
[root@test-centos ~]# cat /var/tmp/mrtg/zfs-test-zfs-host-used.sh
#!/bin/bash
_used=`ssh test-zfs-host "status ls"|grep Used|awk '{print gensub(/T/,"","g",$2)}'`
echo $_used;
[root@test-centos ~]# cat /var/tmp/mrtg/zfs-test-zfs-host-total.sh
_used=`ssh test-zfs-host "status ls"|grep Used|awk '{print gensub(/T/,"","g",$2)}'` #I trimmed 'T', you may need to modify this to suit your environment
_avail=`ssh test-zfs-host "status ls"|grep Avail|awk '{print gensub(/T/,"","g",$2)}'` #I trimmed 'T', you may need to modify this to suit your environment
_all=`echo $_used + $_avail|bc`
echo $_all;
[root@test-centos ~]# chmod +x /var/tmp/mrtg/zfs-test-zfs-host-used.sh
[root@test-centos ~]# chmod +x /var/tmp/mrtg/zfs-test-zfs-host-total.sh
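The two scripts above run ssh three times in total to read what is really one status listing. As a rough sketch only, the parsing could be done in a single awk pass over one listing. The function name total_from_status and the sample input (a label followed by a value such as 9.31T) are assumptions for illustration, so check what “status ls” really prints on your appliance before relying on it.

```shell
# Hypothetical helper: read "status ls"-style text on stdin and print
# used + available space with the trailing 'T' trimmed.
# Assumes lines like "Used 9.31T" and "Avail 7.01T" (the format is a guess).
total_from_status() {
    awk '
        /Used/  { gsub(/T/, "", $2); used  = $2 }
        /Avail/ { gsub(/T/, "", $2); avail = $2 }
        END     { printf "%.2f\n", used + avail }
    '
}

# On the real host this would be fed from the appliance:
#   ssh test-zfs-host "status ls" | total_from_status
# Here it is fed made-up sample text so it can be tried without the appliance.
printf 'Used 9.31T\nAvail 7.01T\n' | total_from_status
```

With the sample text this prints 16.32. Plain gsub is used instead of gensub, so it should also work with awk implementations other than gawk.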
Now restart snmpd on the linux snmp client and then test the newly added OIDs:
[root@test-centos ~]# service snmpd restart
[root@test-centos ~]# snmpwalk -v2c -c public localhost .1.3.6.1.4.1.2021.32
UCD-SNMP-MIB::ucdavis.32.1.0 = INTEGER: 1
UCD-SNMP-MIB::ucdavis.32.2.1.2.19.122.102.115.45.115.108.99.101.49.48.115.110.48.49.45.117.115.101.100 = STRING: "/bin/bash"
UCD-SNMP-MIB::ucdavis.32.2.1.3.19.122.102.115.45.115.108.99.101.49.48.115.110.48.49.45.117.115.101.100 = STRING: "/var/tmp/mrtg/zfs-test-zfs-host-used.sh"
UCD-SNMP-MIB::ucdavis.32.2.1.4.19.122.102.115.45.115.108.99.101.49.48.115.110.48.49.45.117.115.101.100 = ""
UCD-SNMP-MIB::ucdavis.32.2.1.5.19.122.102.115.45.115.108.99.101.49.48.115.110.48.49.45.117.115.101.100 = INTEGER: 5
UCD-SNMP-MIB::ucdavis.32.2.1.6.19.122.102.115.45.115.108.99.101.49.48.115.110.48.49.45.117.115.101.100 = INTEGER: 1
UCD-SNMP-MIB::ucdavis.32.2.1.7.19.122.102.115.45.115.108.99.101.49.48.115.110.48.49.45.117.115.101.100 = INTEGER: 1
UCD-SNMP-MIB::ucdavis.32.2.1.20.19.122.102.115.45.115.108.99.101.49.48.115.110.48.49.45.117.115.101.100 = INTEGER: 4
UCD-SNMP-MIB::ucdavis.32.2.1.21.19.122.102.115.45.115.108.99.101.49.48.115.110.48.49.45.117.115.101.100 = INTEGER: 1
UCD-SNMP-MIB::ucdavis.32.3.1.1.19.122.102.115.45.115.108.99.101.49.48.115.110.48.49.45.117.115.101.100 = STRING: "9.31"
UCD-SNMP-MIB::ucdavis.32.3.1.2.19.122.102.115.45.115.108.99.101.49.48.115.110.48.49.45.117.115.101.100 = STRING: "9.31"
UCD-SNMP-MIB::ucdavis.32.3.1.3.19.122.102.115.45.115.108.99.101.49.48.115.110.48.49.45.117.115.101.100 = INTEGER: 1
UCD-SNMP-MIB::ucdavis.32.3.1.4.19.122.102.115.45.115.108.99.101.49.48.115.110.48.49.45.117.115.101.100 = INTEGER: 0
UCD-SNMP-MIB::ucdavis.32.4.1.2.19.122.102.115.45.115.108.99.101.49.48.115.110.48.49.45.117.115.101.100.1 = STRING: "9.31"
[root@test-centos ~]# snmpwalk -v2c -c public localhost .1.3.6.1.4.1.2021.31
UCD-SNMP-MIB::ucdavis.31.1.0 = INTEGER: 1
UCD-SNMP-MIB::ucdavis.31.2.1.2.20.122.102.115.45.115.108.99.101.49.48.115.110.48.49.45.116.111.116.97.108 = STRING: "/bin/bash"
UCD-SNMP-MIB::ucdavis.31.2.1.3.20.122.102.115.45.115.108.99.101.49.48.115.110.48.49.45.116.111.116.97.108 = STRING: "/var/tmp/mrtg/zfs-test-zfs-host-total.sh"
UCD-SNMP-MIB::ucdavis.31.2.1.4.20.122.102.115.45.115.108.99.101.49.48.115.110.48.49.45.116.111.116.97.108 = ""
UCD-SNMP-MIB::ucdavis.31.2.1.5.20.122.102.115.45.115.108.99.101.49.48.115.110.48.49.45.116.111.116.97.108 = INTEGER: 5
UCD-SNMP-MIB::ucdavis.31.2.1.6.20.122.102.115.45.115.108.99.101.49.48.115.110.48.49.45.116.111.116.97.108 = INTEGER: 1
UCD-SNMP-MIB::ucdavis.31.2.1.7.20.122.102.115.45.115.108.99.101.49.48.115.110.48.49.45.116.111.116.97.108 = INTEGER: 1
UCD-SNMP-MIB::ucdavis.31.2.1.20.20.122.102.115.45.115.108.99.101.49.48.115.110.48.49.45.116.111.116.97.108 = INTEGER: 4
UCD-SNMP-MIB::ucdavis.31.2.1.21.20.122.102.115.45.115.108.99.101.49.48.115.110.48.49.45.116.111.116.97.108 = INTEGER: 1
UCD-SNMP-MIB::ucdavis.31.3.1.1.20.122.102.115.45.115.108.99.101.49.48.115.110.48.49.45.116.111.116.97.108 = STRING: "16.32"
UCD-SNMP-MIB::ucdavis.31.3.1.2.20.122.102.115.45.115.108.99.101.49.48.115.110.48.49.45.116.111.116.97.108 = STRING: "16.32"
UCD-SNMP-MIB::ucdavis.31.3.1.3.20.122.102.115.45.115.108.99.101.49.48.115.110.48.49.45.116.111.116.97.108 = INTEGER: 1
UCD-SNMP-MIB::ucdavis.31.3.1.4.20.122.102.115.45.115.108.99.101.49.48.115.110.48.49.45.116.111.116.97.108 = INTEGER: 0
UCD-SNMP-MIB::ucdavis.31.4.1.2.20.122.102.115.45.115.108.99.101.49.48.115.110.48.49.45.116.111.116.97.108.1 = STRING: "16.32"
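From the output, we can see that UCD-SNMP-MIB::ucdavis.32.3.1.1.19.122.102.115.45.115.108.99.101.49.48.115.110.48.49.45.117.115.101.100 (used space) and UCD-SNMP-MIB::ucdavis.31.3.1.1.20.122.102.115.45.115.108.99.101.49.48.115.110.48.49.45.116.111.116.97.108 (total space) are the two OIDs we want. The long numeric suffix is the extend name encoded as its length followed by the ASCII code of each character, so it changes with the name you choose; copy the OIDs from your own snmpwalk output.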
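The long tail of those OIDs is not random: net-snmp indexes the extend output by the extend name, written as the length of the name followed by the ASCII code of each character. Here is a small sketch that computes that tail, so you can predict the OID for any name you pick. The function name is made up for this example.

```shell
# Hypothetical helper: encode an extend name the way it appears at the end
# of the OIDs above: the name length, then one ASCII code per character.
extend_name_to_oid_suffix() {
    suffix=${#1}
    # od prints each byte of the name as an unsigned decimal number
    for code in $(printf '%s' "$1" | od -An -v -tu1); do
        suffix="$suffix.$code"
    done
    printf '%s\n' "$suffix"
}

extend_name_to_oid_suffix "zfs-test-zfs-host-used"
```

Appending the result to .1.3.6.1.4.1.2021.32.3.1.1. should give the OID of the first output line for that name; compare it with your own snmpwalk output before pasting it into a mrtg config.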
Part 4 – do the mrtg drawing
Now that we have the OIDs we want, the mrtg drawing is easy. On the linux snmp host, do the following steps:
[root@test-centos ~]# cat /etc/mrtg/test-zfs-host.cfg
#LoadMIBs: /usr/share/snmp/mibs/UCD-SNMP-MIB.txt,/usr/share/snmp/mibs/TCP-MIB.txt
workdir: /var/www/mrtg/
Title[zfs_space_test-zfs-host]: Percentage used space on zfs
PageTop[zfs_space_test-zfs-host]: <h1>Percentage used space on zfs</h1>
Target[zfs_space_test-zfs-host]: .1.3.6.1.4.1.2021.32.3.1.1.19.122.102.115.45.115.108.99.101.49.48.115.110.48.49.45.117.115.101.100&.1.3.6.1.4.1.2021.31.3.1.1.20.122.102.115.45.115.108.99.101.49.48.115.110.48.49.45.116.111.116.97.108:public@localhost
Options[zfs_space_test-zfs-host]: growright,gauge,transparent,nopercent
Unscaled[zfs_space_test-zfs-host]: ymwd
MaxBytes[zfs_space_test-zfs-host]: 100
YLegend[zfs_space_test-zfs-host]: UsedSpace %
ShortLegend[zfs_space_test-zfs-host]: T
LegendI[zfs_space_test-zfs-host]: Used
LegendO[zfs_space_test-zfs-host]: Total
Legend1[zfs_space_test-zfs-host]: Percentage used space on zfs
Legend2[zfs_space_test-zfs-host]: Percentage all space on zfs
PS:
You need to replace “UCD-SNMP-MIB::ucdavis” with .1.3.6.1.4.1.2021 in the Target line, or you’ll get error messages like the following:
[root@test-centos ~]# env LANG=C TZ=Asia/Shanghai /usr/bin/mrtg /etc/mrtg/test-zfs-host.cfg
Argument "v4only" isn't numeric in int at /usr/bin/../lib64/mrtg2/SNMP_Session.pm line 183.
backoff (v4only) must be a number >= 1.0 at /usr/bin/../lib64/mrtg2/SNMP_util.pm line 465
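A quick way to get numeric OIDs in the first place is to pass -On to snmpwalk, which prints OIDs as numbers instead of names. For output you have already saved, swapping the prefix is enough. The sketch below only knows the one prefix used in this article, and the function name is hypothetical.

```shell
# Hypothetical helper: swap the symbolic UCD-SNMP-MIB::ucdavis prefix
# for its numeric form so the OID can be pasted into a mrtg Target line.
to_numeric_oid() {
    case $1 in
        UCD-SNMP-MIB::ucdavis*)
            printf '%s\n' ".1.3.6.1.4.1.2021${1#UCD-SNMP-MIB::ucdavis}" ;;
        *)
            # Anything else is passed through untouched
            printf '%s\n' "$1" ;;
    esac
}

to_numeric_oid "UCD-SNMP-MIB::ucdavis.32.1.0"
# Asking snmpwalk for numeric output directly avoids the conversion:
#   snmpwalk -v2c -c public -On localhost .1.3.6.1.4.1.2021.32
```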
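Let’s continue. Run mrtg three times; the Rateup warnings on the first two runs are normal, because the log files do not exist yet: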
[root@test-centos ~]# env LANG=C TZ=Asia/Shanghai /usr/bin/mrtg /etc/mrtg/test-zfs-host.cfg
24-12-2012 01:46:41, Rateup WARNING: /usr/bin/rateup could not read the primary log file for zfs_space_test-zfs-host
24-12-2012 01:46:41, Rateup WARNING: /usr/bin/rateup The backup log file for zfs_space_test-zfs-host was invalid as well
24-12-2012 01:46:41, Rateup WARNING: /usr/bin/rateup Can't remove zfs_space_test-zfs-host.old updating log file
24-12-2012 01:46:41, Rateup WARNING: /usr/bin/rateup Can't rename zfs_space_test-zfs-host.log to zfs_space_test-zfs-host.old updating log file
[root@test-centos ~]# env LANG=C TZ=Asia/Shanghai /usr/bin/mrtg /etc/mrtg/test-zfs-host.cfg
24-12-2012 01:46:46, Rateup WARNING: /usr/bin/rateup Can't remove zfs_space_test-zfs-host.old updating log file
[root@test-centos ~]# env LANG=C TZ=Asia/Shanghai /usr/bin/mrtg /etc/mrtg/test-zfs-host.cfg
[root@test-centos ~]# indexmaker --output=/var/www/mrtg/index.html /etc/mrtg/test-zfs-host.cfg
Now add a cron job:
0-59/5 * * * * env LANG=C TZ=Asia/Shanghai /usr/bin/mrtg /etc/mrtg/test-zfs-host.cfg
Visit http://<your linux box’s ip address>/mrtg/ to see the graphs (you’ll have to wait 5 minutes for the initial result, be patient!)
Part 5 – troubleshooting
- If you get error messages like the following:
[root@test-centos ~]# env LANG=C TZ=Asia/Shanghai /usr/bin/mrtg /etc/mrtg/test-zfs-host.cfg
Argument "v4only" isn't numeric in int at /usr/bin/../lib64/mrtg2/SNMP_Session.pm line 183.
backoff (v4only) must be a number >= 1.0 at /usr/bin/../lib64/mrtg2/SNMP_util.pm line 465
That’s because you’re using the symbolic OID/MIB name rather than the numeric OID. Change the name to numbers, i.e. change UCD-SNMP-MIB::ucdavis to .1.3.6.1.4.1.2021, and then re-check.
- If you get error messages like the following:
[root@test-centos ~]# env LANG=C TZ=Asia/Shanghai /usr/bin/mrtg /etc/mrtg/test-zfs-host.cfg
cannot encode Object ID 34.4.1.2.28.122.102.115.45.115.108.99.101.49.48.115.110.48.49.45.110.105.99.45.98.97.110.100.119.105.100.116.104.3: first subid too big in Object ID 34.4.1.2.28.122.102.115.45.115.108.99.101.49.48.115.110.48.49.45.110.105.99.45.98.97.110.100.119.105.100.116.104.3 at /usr/bin/mrtg line 2035
Tuesday, 25 December 2012 at 0:34: ERROR: Target[zfs-test-zfs-host-io-average-latency-igb2][_IN_] ' $target->[3]{$mode} ' did not eval into defined data
Tuesday, 25 December 2012 at 0:34: ERROR: Target[zfs-test-zfs-host-io-average-latency-igb2][_OUT_] ' $target->[3]{$mode} ' did not eval into defined data
Then carefully check the scripts you wrote for the OIDs, and run snmpwalk several times to ensure the values are correct (if your script’s output varies between runs, this may cause problems). Also note that the Object ID in the message starts with 34 rather than .1.3.6.1.4.1.2021.34; the first sub-identifier of an OID can only be 0, 1 or 2, so check that the Target line uses the full numeric OID.
- If you get error messages like the following:
[root@test-centos mrtg]# env LANG=C TZ=Asia/Shanghai /usr/bin/mrtg /etc/mrtg/test-zfs-host.cfg
SNMP Error:
Received SNMP response with error code
error status: noSuchName
index 2 (OID: 1.3.6.1.4.1.2021.44)
SNMPv1_Session (remote host: "localhost" [127.0.0.1].161)
community: "public"
request ID: 1786815468
PDU bufsize: 8000 bytes
timeout: 2s
retries: 5
backoff: 1)
at /usr/bin/../lib64/mrtg2/SNMP_util.pm line 490
SNMPGET Problem for .1.3.6.1.4.1.2021.44 .1.3.6.1.4.1.2021.44 sysUptime sysName on public@localhost::::::v4only
at /usr/bin/mrtg line 2035
SNMP Error:
Received SNMP response with error code
error status: noSuchName
index 2 (OID: 1.3.6.1.4.1.2021.45)
SNMPv1_Session (remote host: "localhost" [127.0.0.1].161)
community: "public"
request ID: 1786815469
PDU bufsize: 8000 bytes
timeout: 2s
retries: 5
backoff: 1)
at /usr/bin/../lib64/mrtg2/SNMP_util.pm line 490
SNMPGET Problem for .1.3.6.1.4.1.2021.45 .1.3.6.1.4.1.2021.45 sysUptime sysName on public@localhost::::::v4only
at /usr/bin/mrtg line 2035
Monday, 24 December 2012 at 14:41: ERROR: Target[zfs_space_test-zfs-host][_IN_] '( $target->[0]{$mode} ) * 10000 / ( $target->[1]{$mode} )' (warn): Use of uninitialized value in division (/) at (eval 16) line 1.
Monday, 24 December 2012 at 14:41: ERROR: Target[zfs_space_test-zfs-host][_OUT_] '( $target->[0]{$mode} ) * 10000 / ( $target->[1]{$mode} )' (warn): Use of uninitialized value in division (/) at (eval 17) line 1.
Then one possible culprit is that newer net-snmp versions no longer support “exec”; use “extend” instead and retry.
Note that you can also use “snmpd -f -Le” to check error messages related to snmpd.
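Several of the problems above come down to a script printing something mrtg cannot use, so it may help to check the value before echoing it. This is only a sketch: sanitize_value is a made-up name, and falling back to 0 is my assumption. You may prefer to print nothing, or to keep the last good value.

```shell
# Hypothetical helper: print the argument only if it is a plain
# non-negative number such as 9 or 9.31; otherwise print 0.
sanitize_value() {
    case $1 in
        '' | *[!0-9.]* | *.*.* | .* | *.)
            # Empty, contains other characters, or has misplaced dots
            printf '0\n' ;;
        *)
            printf '%s\n' "$1" ;;
    esac
}

sanitize_value "9.31"
sanitize_value "9.31T"
```

In the extend scripts this would wrap the final echo, for example sanitize_value "$_used" in place of echo $_used.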
PS:
Here are more links where you can read about net-snmp/mrtg:
- net-snmp FAQ http://www.net-snmp.org/FAQ.html
- mrtg configuration options http://oss.oetiker.ch/mrtg/doc/mrtg-reference.en.html
- SUN zfs storage SNMP – http://docs.oracle.com/cd/E22471_01/html/820-4167/configuration__services__snmp.html
- net-snmp extending agent functionality – http://linux.die.net/man/5/snmpd.conf (search for ‘extending agent functionality’ on this page)