Friday, July 19, 2013

DISK I/O - how to monitor with Zabbix

In the spirit of Zabbix and good monitoring I have decided to share a little something I have found and modified.

In this article I will explain and show how to monitor disk I/O.

It should give you a rough idea of what your disks are doing and when that could become a problem.

So here goes:

First off, I will explain what "/proc/diskstats" prints about hard disk activity.

Here is an example:

cat /proc/diskstats
8    0 sda 490 2002 17576 5984 66 17 220 3495 0 9435 9479

I will give you a breakdown, coupled with the explanation from the kernel's iostats.txt file, of what you are looking at:

From left to right:
1 - major number - identifies the driver that handles the device
2 - minor number - identifies the individual device or partition under that driver
3 - device name
4 - reads completed successfully
5 - reads merged
6 - sectors read
7 - time spent reading (ms)
8 - writes completed
9 - writes merged
10 - sectors written
11 - time spent writing (ms)
12 - I/Os currently in progress
13 - time spent doing I/Os (ms)
14 - weighted time spent doing I/Os (ms)
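
To make the numbering above concrete, here is a minimal shell sketch that labels each field of the example line. The helper name describe_diskstats and the short labels are invented for this illustration; they do not come from the kernel or from Zabbix. Newer kernels append further fields after the 14th, which the sketch simply ignores.

```shell
#!/bin/sh
# describe_diskstats is a hypothetical helper: it prints the position,
# a short label, and the value for each of the 14 classic fields.
describe_diskstats() {
    # $1: one line copied from /proc/diskstats
    printf '%s\n' "$1" | awk '{
        n = split("major minor device reads_completed reads_merged sectors_read ms_reading writes_completed writes_merged sectors_written ms_writing ios_in_progress ms_doing_io weighted_ms_doing_io", name, " ")
        for (i = 1; i <= n && i <= NF; i++)
            printf "%2d %-22s %s\n", i, name[i], $i
    }'
}

# The example line from above.
sample='   8    0 sda 490 2002 17576 5984 66 17 220 3495 0 9435 9479'
describe_diskstats "$sample"
```

The position printed in the first column is the same number the awk field references in the user parameters rely on.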

So now you ask yourself: what are all these numbers for, and what could they possibly do for me? That, my friend, is a simple one. In most enterprise-level data centers today, it is not RAM or CPU that kills a server environment. It's disk I/O. Waiting to write, waiting to read... Waiting... Waiting... Waiting...

To begin our monitoring, add the following user parameters to the zabbix_agentd.conf of the client/monitored server. The agent replaces $1 with the device name given in the item key, and $$ stands for a literal $, so awk still receives its own field reference.

# The quoted 'STOP' keeps the shell from expanding $1 and $$ while appending.
cat << 'STOP' >> /usr/local/etc/zabbix_agentd.conf
#
#
#
### DISK I/O###
UserParameter=custom.vfs.dev.read.ops[*],cat /proc/diskstats | egrep $1 | head -1 | awk '{print $$4}'
UserParameter=custom.vfs.dev.read.ms[*],cat /proc/diskstats | egrep $1 | head -1 | awk '{print $$7}'
UserParameter=custom.vfs.dev.write.ops[*],cat /proc/diskstats | egrep $1 | head -1 | awk '{print $$8}'
UserParameter=custom.vfs.dev.write.ms[*],cat /proc/diskstats | egrep $1 | head -1 | awk '{print $$11}'
UserParameter=custom.vfs.dev.io.active[*],cat /proc/diskstats | egrep $1 | head -1 | awk '{print $$12}'
UserParameter=custom.vfs.dev.io.ms[*],cat /proc/diskstats | egrep $1 | head -1 | awk '{print $$13}'
UserParameter=custom.vfs.dev.read.sectors[*],cat /proc/diskstats | egrep $1 | head -1 | awk '{print $$6}'
UserParameter=custom.vfs.dev.write.sectors[*],cat /proc/diskstats | egrep $1 | head -1 | awk '{print $$10}'
### DISK I/O###
STOP
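
As a rough illustration of what ends up running for custom.vfs.dev.read.ops[sda], the sketch below mimics the command after the agent has put "sda" in place of $1 and reduced $$4 to $4. The function names, the sample file, and the sda1 row are made up for the example, and grep -E is used as the equivalent of egrep. The second function is only an assumed alternative that compares the device column exactly.

```shell
#!/bin/sh
# Same pipeline as the user parameter, with the substitutions already done.
# The optional second argument stands in for /proc/diskstats.
read_ops() {
    cat "${2:-/proc/diskstats}" | grep -E "$1" | head -1 | awk '{print $4}'
}

# Assumed variant: match the device name in column 3 exactly instead of
# pattern-matching the whole line.
read_ops_exact() {
    awk -v dev="$1" '$3 == dev {print $4}' "${2:-/proc/diskstats}"
}

# Sample data: the sda row is the example from this post, sda1 is invented.
sample=$(mktemp)
cat > "$sample" << 'EOF'
   8    0 sda 490 2002 17576 5984 66 17 220 3495 0 9435 9479
   8    1 sda1 120 800 4200 1500 10 3 60 700 0 2100 2200
EOF

read_ops sda "$sample"         # prints 490, the first matching row wins
read_ops_exact sda1 "$sample"  # prints 120
```

Because the pattern can match anywhere in the line, a short name such as sda also matches the sda1 row, and head -1 decides which row is reported.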


Coupled with our new user parameters, all we need is to specify, in the item keys of the template, which disks we wish to monitor.

Download the template from the link at the bottom of the post (Server Check IO), save it as a *.xml file, import it into your Zabbix server and watch the monitoring grow :)

Once you have completed this, restart zabbix_agentd on each server where you added the new user parameters.

If you have any questions... Feel free to drop me a note :)

On a side note: once you have imported the template, you can easily extend it to any other drives on the system. Simply clone an item in Zabbix and change the device name in its key to the corresponding drive. For example, custom.vfs.dev.read.ms[sda] becomes custom.vfs.dev.read.ms[hda] if you have IDE drives.
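
If you are unsure which device names to use when cloning, a small sketch like the following can print one ready-made key per matching device. keys_for is a hypothetical helper, the device-name patterns are only examples, and the numbers on the sda1 and sdb rows are invented sample data.

```shell
#!/bin/sh
# keys_for is a hypothetical helper.
# $1: user parameter name, $2: awk regex selecting device names,
# $3: optional stand-in for /proc/diskstats.
keys_for() {
    awk -v key="$1" -v pat="$2" '$3 ~ pat { print key "[" $3 "]" }' "${3:-/proc/diskstats}"
}

# Sample data: only the sda row comes from this post.
sample=$(mktemp)
cat > "$sample" << 'EOF'
   8    0 sda 490 2002 17576 5984 66 17 220 3495 0 9435 9479
   8    1 sda1 120 800 4200 1500 10 3 60 700 0 2100 2200
   8   16 sdb 300 1000 9000 2500 40 9 150 1800 0 4000 4100
EOF

# Whole disks only, so sda1 is skipped.
keys_for custom.vfs.dev.read.ms '^sd[a-z]+$' "$sample"
```

Each printed line is the key for one cloned item, with exactly one device name between the brackets.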

PS: Some credit goes to articles I have read on the net... and also to the beauty of Linux: "THE" man pages.

PPS: Attached link to template - Zabbix IOSTAT Template

40 comments:

  1. Trying to import the XML template into Zabbix 2.0.6 without success

    Replies
    1. What is the error you're getting when you're trying to import?

      I will try to upload a complete XML for import. It should make things easier :)

    2. I have uploaded the XML template. Please let me know if you have trouble with it.

  2. Hello

    here is the error that I have got

    Undefined index: macros [include/classes/import/CConfigurationImport.php:210]
    Invalid argument supplied for foreach() [include/classes/import/CConfigurationImport.php:210]
    Undefined index: key_ [include/classes/import/formatters/C20ImportFormatter.php:148]
    Undefined index: key_ [include/classes/import/formatters/C20ImportFormatter.php:148]
    Undefined index: key_ [include/classes/import/formatters/C20ImportFormatter.php:148]
    Undefined index: key_ [include/classes/import/formatters/C20ImportFormatter.php:148]
    Undefined index: key_ [include/classes/import/formatters/C20ImportFormatter.php:148]
    Undefined index: key_ [include/classes/import/formatters/C20ImportFormatter.php:148]
    Undefined index: key_ [include/classes/import/formatters/C20ImportFormatter.php:148]
    Undefined index: key_ [include/classes/import/formatters/C20ImportFormatter.php:148]
    Undefined index: key_ [include/classes/import/CConfigurationImport.php:250]
    Undefined index: key_ [include/classes/import/CConfigurationImport.php:250]
    Undefined index: key_ [include/classes/import/formatters/C20ImportFormatter.php:178]
    Undefined index: key_ [include/classes/import/CConfigurationImport.php:264]
    Undefined index: key_ [include/classes/import/CConfigurationImport.php:264]
    Created: Application "IOStat" on "Service Check IOstat".
    Undefined index: key_ [include/classes/import/CConfigurationImport.php:577]
    Incorrect arguments passed to function.

    Hope you can upload that template as soon as possible

    thank you

    Replies
    1. I have attached the XML template to the post. Check at the bottom. I am not entirely sure if it exported correctly though O.o

      Just let me know :)

  3. Tks, worked fine for me!
    (Zabbix Server 2.0.7 x64 with agent 2.0.6 x64 - all Amazon AWS EC2)

    Replies
    1. Can you please share your partition name ? as i have /dev/nvme0n1p1 and how to monitor this partition

    2. @Ravi, the partition naming only matters when you are setting up the view pane on Zabbix server.

  4. What does this mean?

    "all we need is to specify a list of disks in our template name that we wish to monitor."

    Where do you specify the disks?

    Replies
    1. Hi Brandon,

      What it means is:

      Once you have imported the template into your Zabbix server, head over to the newly imported template and select Items. In the items list, each key takes as its parameter the name of a disk on the server you're monitoring.

      For example, the agent config has this user parameter: "UserParameter=custom.vfs.dev.io.ms[*],cat /proc/diskstats | egrep $1 | head -1 | awk '{print $$13}'"

      Which you can then use as follows:
      custom.vfs.dev.io.ms[sda]
      and/or, depending on how many disks you have:
      custom.vfs.dev.io.ms[sdb]

      Let me know if that explains it.

  5. One thing to note: Make sure your items are set to Delta (speed per second) instead of "As Is" for the "Stored As" parameter.
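
Most of the values in /proc/diskstats are running totals, so the raw number says little by itself; with "Delta (speed per second)" Zabbix stores the change per second between two polls. Below is a minimal sketch of that arithmetic. The helper names and the two sample readings are made up, and the byte conversion assumes the 512-byte sector unit that /proc/diskstats uses.

```shell
#!/bin/sh
# per_second is a hypothetical helper: (later - earlier) / elapsed seconds,
# using integer arithmetic.
per_second() {
    # $1: earlier counter value, $2: later counter value, $3: seconds between them
    echo $(( ($2 - $1) / $3 ))
}

# Convert a sector count to bytes, assuming 512-byte sectors.
sectors_to_bytes() {
    echo $(( $1 * 512 ))
}

# Made-up readings of "sectors read", taken 30 seconds apart.
rate=$(per_second 17576 23576 30)
echo "$rate sectors/s"                       # prints "200 sectors/s"
echo "$(sectors_to_bytes "$rate") bytes/s"   # prints "102400 bytes/s"
```

The one value that is not a running total is "I/Os currently in progress" (custom.vfs.dev.io.active), so that item is probably better left "As Is".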

  6. Hi, I would like to say that everything is working just super, but I have one question, what I should modify if I want to monitor for example only all "dm-*" disks? In your template I can see that there is only one disk monitoring - sda. So is it possible to do it with Your script and template?

    Replies
    1. Yes you most certainly can. Simply add the additional parameters you want. Only question is though... I would be interested to see what returns are to be gotten from LVM devices, or rather your motivation for the question :)

  7. I tried to add in information in different way but without success...:(
    I tried for example:
    custom.vfs.dev.write.ms[*]
    custom.vfs.dev.write.ms[]
    custom.vfs.dev.write.ms[dm-*]

    but all after some time show "Not supported"... :(

    Replies
    1. You don't have to edit that. Simply go to the template once you have imported it. Change the item to the following: custom.vfs.dev.io.ms[dm-0]

      You can add whatever else you have in your LVM logical device list, such as: dm-1

      Let me know if that answers your question.

    2. But how to make it add automatically all dm-nn?

    3. It already does. You just need to statically supply it in the template. The user parameter: "custom.vfs.dev.write.ops[*]" is calculated for every variable you supply in the template items list.

      I understand it would just be awesome to create a list of all disks and have it dynamically pull the stats, but there is still some measure of manual labour required. If you're struggling to get the list of dm devices, just do this on a CLI:

      awk '$3 ~ /^dm-[0-9]+$/ {print $3}' /proc/diskstats

      That should print out a list of DM devices you can specify in the zabbix item list.

    4. But what changes I need to add in template to monitor all "dm-nn" automatically. Because at this moment in template I have key:
      custom.vfs.dev.write.ops[sda]

      I tried to change it to custom.vfs.dev.write.ops[] or custom.vfs.dev.write.ops[*] but without success I have zabbix not supported (Zabbix 2.0.8).

      I also tried to create Discovery rules with key custom.vfs.dev.write.ops[{#DEVNAME}] but also without success... :(

    5. Ok. Here is what you have to do.

      In the template, copy the item you just mentioned, "custom.vfs.dev.write.ops[sda]".

      Then modify it so that it looks like this: custom.vfs.dev.write.ops[dm-0]

      The same can be done for every other item in the template and it will display the write ops for DM-0.

      Let me know if that clears things up for you.

    6. Thank You, but it is a big manual job, is there any possibility that it create automatically. I mean that he automatically find and create item for all "dm-nn" example:
      custom.vfs.dev.write.ops[dm-0]
      custom.vfs.dev.write.ops[dm-01]
      custom.vfs.dev.write.ops[dm-02]
      .
      .
      .etc

      So I don't have manually add all "dm-nn". And if system have only 5 "dm-nn" so it automatically create 5 items, if 10 then automatically 10 item and so on.

    7. Ok, well, I suppose it won't be too hard to create automatic discovery rules for disk types. I will just need a functional list out of /proc/partitions to make the general layout of the template.

      I will update the post once I have created one.

      Keep an eye out for it.

    8. Super, Thank You!
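
For anyone who wants to experiment with the discovery idea from this thread, here is a rough sketch of the agent side. It assumes the low-level discovery JSON layout with a "data" array and the {#DEVNAME} macro tried in the comment above. discover_disks is a hypothetical helper, the dm-0 and dm-1 rows are invented sample data, and the discovery rule and item prototypes would still have to be set up on the Zabbix server.

```shell
#!/bin/sh
# discover_disks is a hypothetical helper. It prints discovery JSON with one
# {#DEVNAME} entry per device whose name matches the awk regex in $1.
# The optional $2 stands in for /proc/diskstats.
discover_disks() {
    awk -v pat="$1" '
        BEGIN    { printf "{\"data\":[" }
        $3 ~ pat { printf "%s{\"{#DEVNAME}\":\"%s\"}", sep, $3; sep = "," }
        END      { print "]}" }
    ' "${2:-/proc/diskstats}"
}

# Sample data: only the sda row comes from this post.
sample=$(mktemp)
cat > "$sample" << 'EOF'
   8    0 sda 490 2002 17576 5984 66 17 220 3495 0 9435 9479
 253    0 dm-0 10 0 80 5 4 0 32 7 0 12 12
 253    1 dm-1 20 0 160 9 6 0 48 11 0 20 20
EOF

discover_disks '^dm-[0-9]+$' "$sample"
# prints {"data":[{"{#DEVNAME}":"dm-0"},{"{#DEVNAME}":"dm-1"}]}
```

An item prototype with a key such as custom.vfs.dev.write.ops[{#DEVNAME}] would then be created once per discovered device.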

  8. hi man,

    for this works in my zabbix server, i can only include this lines in userparameter, because i put this, and agent service not start

    cat << STOP >> /usr/local/etc/zabbix_agentd.conf
    #
    #
    #
    ### DISK I/O###
    UserParameter=custom.vfs.dev.read.ops[*],cat /proc/diskstats | egrep $1 | head -1 | awk '{print $$4}'
    UserParameter=custom.vfs.dev.read.ms[*],cat /proc/diskstats | egrep $1 | head -1 | awk '{print $$7}'
    UserParameter=custom.vfs.dev.write.ops[*],cat /proc/diskstats | egrep $1 | head -1 | awk '{print $$8}'
    UserParameter=custom.vfs.dev.write.ms[*],cat /proc/diskstats | egrep $1 | head -1 | awk '{print $$11}'
    UserParameter=custom.vfs.dev.io.active[*],cat /proc/diskstats | egrep $1 | head -1 | awk '{print $$12}'
    UserParameter=custom.vfs.dev.io.ms[*],cat /proc/diskstats | egrep $1 | head -1 y| awk '{print $$13}'
    UserParameter=custom.vfs.dev.read.sectors[*],cat /proc/diskstats | egrep $1 | head -1 | awk '{print $$6}'
    UserParameter=custom.vfs.dev.write.sectors[*],cat /proc/diskstats | egrep $1 | head -1 | awk '{print $$10}'
    ### DISK I/O###
    STOP

    Replies
    1. In the zabbix_agentd config there is a section that indicates where the log is written to. When you start the agent and it breaks the log should have some information. Post that to me and I will assist.

  9. Hi, thanks for this great information!

    I took the liberty of using this post as a base for a new custom template with low-level discovery support so you don't have to manually add/remove disk devices.

    I wrote a blogpost on what steps I took to create the template and instructions on how to use it here: http://www.denniskanbier.nl/blog/monitoring/monitoring-disk-io-using-zabbix/

    I hope it might help some people.

  10. I edited file zabbix_agentd.conf, put there UserParameter as you said to do, but in zabbix graphic shows [no data]. The client is RHEL 5

    Replies
    1. Hi Nilufar,

      Did you add the Zabbix template to the server? I imagine you did since you are expecting to see data somewhere. Please confirm.

      Also check in your Zabbix client log if there are any reported errors and send them to me :)

  11. this one only monitor xvda or whatever i put there , however how can i monitor two disks in the same template and have two separate graphs?

    Replies
    1. You can monitor any number of disks, the trick is to just keep adding them to your zabbix monitoring system and yes you get a graph for each and every disk you add.

  12. Please remove the "y" after the head -1 command in line 11. If anybody is copy/pasting this will certainly be an issue :-)

    Replies
    1. Lol this post is rather old. But yes I updated it. I made a mistake when I wrote out the lines. Also checked my own scripts that are synced to my servers and the "y" isn't there. None the less thank you :)

  13. This comment has been removed by the author.

  14. Might be OLD but works just fine in Zabbix 3.0.5. Many thanks for your efforts 3 years ago. Too bad this isn't available in stock out of the box Zabbix.

    Replies
    1. It's been a while since I had time to even blog about the new things I have done, but thank you for posting comments on this page; it's always a good reason to remind me to start again.

  15. Even after all this years, keep working in my Zabbix 3.4.4, very thank you.

  16. Hi Renaldo,

    Thanks for the great post.

    I have 8 disk in my server which i need to monitor, right now i am only able to monitor sda. how to monitor other disks.

    I read all the comment mention in post and made something like below for all the disk but it is not working.

    Disk:sda:IO:ms custom.vfs.dev.io.ms[sda,sdb,sdc,sdd]
    Disk:sda:IO currently executing custom.vfs.dev.io.active[sda,sdb,sdc,sdd]
    Disk:sda:Read:Bytes/sec custom.vfs.dev.read.sectors[sda,sdb,sdc,sdd]
    Disk:sda:Read:ops per Second custom.vfs.dev.read.ops[sda,sdb,sdc,sdd]
    Disk:sda:Write:Bytes/sec custom.vfs.dev.write.sectors[sda,sdb,sdc,sdd]
    Disk:sda:Write:ops per Second custom.vfs.dev.write.ops[sda,sdb,sdc,sdd]
    Read Speed custom.vfs.dev.read.ms[sda,sdb,sdc,sdd]
    Write Speed custom.vfs.dev.write.ms[sda,sdb,sdc,sdd]

    Kindly help me how to add template for all the disks.

    Thanks you.

    Replies
    1. Hi, the user parameters do not need to be edited. At the time of writing this I was using a single VM with a single disk, which is why I did not go into detail on how to expand it for extra drives. To keep it simple, all you need to do is copy the existing items on your Zabbix server and change the key of each copy to sdX, one device name per key. The user parameters above already work for any drive.
