http://blogs.dootdoot.com/mike

Last week we set up a Cacti monitoring server and basic SNMP on our OpenIndiana test machine. Some basic system statistics are provided out of the box, such as interface traffic monitoring, which is nice but doesn't offer much helpful information about the ZFS storage pools we want to monitor.

I began my search looking for a decent monitoring solution for ZFS via SNMP and, sadly, found that there was very little out there. The best resource I found was actually a French site: http://www.hypervisor.fr/?p=3828

The post outlines a few simple steps for getting started with a very basic storage usage graph for any zpools in the system using the zpool list command. I modified it slightly, putting the commands in scripts so I wouldn't have to restart snmpd every time I made a change. From here on out, I store my scripts in /opt/utils/.

/opt/utils/zpools_name.sh:

#!/usr/gnu/bin/sh
zpool list -H -o name

/opt/utils/zpools_capacity.sh:

#!/usr/gnu/bin/sh
zpool list -H -o capacity | sed -e 's/%//g'

The first script simply returns the name of each individual zpool (one per line). The second script returns the amount of space used in each individual zpool as a percentage (one per line), with sed stripping the % sign from the result.
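As a quick illustration of that sed step, here's what it does to some sample capacity output (the percentages below are made up for illustration, standing in for real zpool list output):

```shell
# Sample `zpool list -H -o capacity` output, piped through the same sed filter.
# The numbers are made-up examples, not real pool data.
printf '42%%\n87%%\n' | sed -e 's/%//g'
```

This prints plain integers (42 and 87, one per line), which is what Cacti expects for graphing.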

So far, pretty easy. Now we add the two commands to our /etc/net-snmp/snmp/snmpd.conf file:

########################################################################################
# SNMP : zpool capacity
########################################################################################

extend .1.3.6.1.4.1.2021.87 zpool_name /usr/gnu/bin/sh /opt/utils/zpools_name.sh
extend .1.3.6.1.4.1.2021.87 zpool_capacity /usr/gnu/bin/sh /opt/utils/zpools_capacity.sh

The base OID .1.3.6.1.4.1.2021 is basically an arbitrary OID that's registered to UC Davis; as far as I know only the sub-ID .9 is used, for some disk checks, so using .87 and appending the extend name (encoded as its length followed by the ASCII code of each character) guarantees us a fairly unique OID.
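To see where those long numeric suffixes come from, here is a throwaway sketch of the encoding net-snmp uses to index extend entries: the name's length, then the decimal ASCII value of each character. (This helper is just for illustration; it isn't part of the setup.)

```shell
# Encode an extend name the way net-snmp indexes it under the base OID:
# string length first, then the decimal ASCII value of each character.
name="zpool_name"
suffix=".${#name}"
for c in $(printf '%s' "$name" | od -An -tu1); do
        suffix="$suffix.$c"
done
echo "$suffix"
```

For zpool_name this yields .10.122.112.111.111.108.95.110.97.109.101, which is exactly the suffix that appears in the Cacti XML below.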

At this point we need to copy/create a zpools_capacity.xml file in the resource/snmp_queries/ folder within Cacti:

<interface>
        <name>Get ZFS zpool capacity</name>
        <index_order_type>numeric</index_order_type>
        <oid_index>.1.3.6.1.4.1.2021.87.3</oid_index>
        <oid_num_indexes>.1.3.6.1.4.1.2021.87.3</oid_num_indexes>

        <fields>
                <ZpoolName>
                        <name>Name</name>
                        <method>walk</method>
                        <source>value</source>
                        <direction>input</direction>
                        <oid>.1.3.6.1.4.1.2021.87.4.1.2.10.122.112.111.111.108.95.110.97.109.101</oid>
                </ZpoolName>
                <ZpoolCapacity>
                        <name>Capacity</name>
                        <method>walk</method>
                        <source>value</source>
                        <direction>output</direction>
                        <oid>.1.3.6.1.4.1.2021.87.4.1.2.14.122.112.111.111.108.95.99.97.112.97.99.105.116.121</oid>
                </ZpoolCapacity>
        </fields>
</interface>

At this point you can build your own Data Query and Graph Template, though the post actually provides both as well. We went ahead and used the provided templates to give it a shot.

This worked fairly well as a base test and gives us easy monitoring of capacity usage (though on its own it's still not all that useful).

Comments

2 Responses to “ZFS Monitoring and Cacti – Part 2”

  1. NiTRo on July 3rd, 2012 12:19 pm

    I’ll soon update the post with an enhanced graph and other metrics (busy time, vfs iops, ARC usage L2ARC stats…). Stay tuned :)

  2. mike on July 17th, 2012 10:51 pm

    Those would be some good stats.

    I started some stat collections based off of zpool iostat. You’ll have to check out my later ZFS Monitoring and Cacti posts (Part 4 and Part 5).
