debugging IO calls in Solaris 11 with kstat

Solaris 11 update 1 was released just a few days ago, and I have been curious to find out about the new features. One of the things mentioned in the What’s-New guide that caught my attention was “File System Statistics for Oracle Solaris Zones”. Until now, there was no way to tell which zone was responsible for how much IO. So if you found high IO utilization on your devices using iostat, zpool iostat or anything similar, it was no trivial task to find the responsible zone or process. DTrace scripts like iotop or iosnoop help a bit, but in the case of ZFS they do not get you very far because they do not report the user process that is actually responsible for the IO. This is where the new kstat statistics for each filesystem type in each zone come in handy.

Let me walk you through a simple example: I set up a box with Solaris 11.1 and installed two zones on it. In one of the zones I am running ‘dd if=/dev/urandom of=random.data’ to generate a bit of IO load. Now I would like to find this process from the global zone. With iostat or zpool iostat I can easily see that something is going on: about 30 write IOs per second at a bandwidth of roughly 15 MB/s. But neither tool shows which process or zone is responsible.
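For completeness, this is roughly how I kicked off the load inside the zone (‘zone2’ is just a placeholder, not the real zone name):

root@ptx11:~# zlogin zone2
root@zone2:~# dd if=/dev/urandom of=random.data

And here is what that looks like from the global zone: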

portrix@ptx11:~$ iostat -xtc 10 4
                 extended device statistics                    tty         cpu
device    r/s    w/s   kr/s   kw/s wait actv  svc_t  %w  %b  tin tout  us sy wt id
sd0       1.0    2.2    8.5  203.3  0.0  0.0    7.0   0   1    0  187   0  0  0 100
sd1       0.0    0.0    0.0    0.0  0.0  0.0    0.0   0   0 
                 extended device statistics                    tty         cpu
device    r/s    w/s   kr/s   kw/s wait actv  svc_t  %w  %b  tin tout  us sy wt id
sd0       0.0   36.6    0.0 15258.8  0.0  0.3    7.0   0   3    0   31   0  4  0 95
sd1       0.0    0.0    0.0    0.0  0.0  0.0    0.0   0   0 
                 extended device statistics                    tty         cpu
device    r/s    w/s   kr/s   kw/s wait actv  svc_t  %w  %b  tin tout  us sy wt id
sd0       0.0   33.1    0.0 15214.8  0.0  0.3    7.7   0   3    0   31   0  4  0 96
sd1       0.0    0.0    0.0    0.0  0.0  0.0    0.0   0   0 
                 extended device statistics                    tty         cpu
device    r/s    w/s   kr/s   kw/s wait actv  svc_t  %w  %b  tin tout  us sy wt id
sd0       0.0   29.1    0.0 15198.9  0.0  0.3    8.7   0   3    0   31   0  4  0 96
sd1       0.0    0.0    0.0    0.0  0.0  0.0    0.0   0   0 

portrix@ptx11:~$ zpool iostat 10 4
               capacity     operations    bandwidth
pool        alloc   free   read  write   read  write
----------  -----  -----  -----  -----  -----  -----
rpool       46.3G   498G      0      2  8.51K   214K
rpool       46.5G   498G      0     30      0  14.8M
rpool       46.6G   497G      0     31      0  14.8M
rpool       46.7G   497G      0     35      0  14.9M

The DTrace Toolkit has some neat scripts to analyze this in more detail, and they work great for UFS. Unfortunately, due to the nature of ZFS, the IO calls to the devices are not attributed to the originating process. So all we can see there is that the IO is issued through ZFS.

root@ptx11:~/DTraceToolkit-0.99# ./iotop

2012 Oct 27 10:11:06,  load: 1.20,  disk_r:      0 KB,  disk_w:  74999 KB

  UID    PID   PPID CMD              DEVICE  MAJ MIN D            BYTES
    0      5      0 zpool-rpool      sd0     206   1 W         77061632

root@ptx11:~/DTraceToolkit-0.99# ./iosnoop 
  UID   PID D    BLOCK   SIZE       COMM PATHNAME
    0     5 W 105897846 131072 zpool-rpool 
    0     5 W 105896822 131072 zpool-rpool 
    0     5 W 105897078 131072 zpool-rpool 
    0     5 W 105897334 131072 zpool-rpool 
    0     5 W 105897590 131072 zpool-rpool 
    0     5 W 105898102 1048576 zpool-rpool 
    0     5 W 105900150 1048576 zpool-rpool 
    0     5 W 105902198 1048576 zpool-rpool 
    0     5 W 105904246 1048576 zpool-rpool 
    0     5 W 105906294 1048576 zpool-rpool 
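A quick io provider one-liner of my own (not part of the toolkit, just a sketch) confirms the picture: aggregating the issued bytes by execname only ever attributes the writes to kernel threads such as zpool-rpool, never to the dd process running inside the zone.

root@ptx11:~# dtrace -n 'io:::start { @[execname] = sum(args[0]->b_bcount); }'

(Let it run for a bit, then hit Ctrl-C to print the aggregation.)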

New kstat statistics to the rescue! There are now statistics for each filesystem type in every zone: the unix module has separate instances for each zone, identified by the zone id. So just query the vopstats_zfs statistics and you will get the utilization (or, in my case, simply the number of writes) per zone:

root@ptx11:~# kstat -m unix -n vopstats_zfs -s nwrite ; sleep 10 ; kstat -m unix -n vopstats_zfs -s nwrite
module: unix                            instance: 0     
name:   vopstats_zfs                    class:    misc
        nwrite                          696291

module: unix                            instance: 3     
name:   vopstats_zfs                    class:    misc
        nwrite                          35248

module: unix                            instance: 4     
name:   vopstats_zfs                    class:    misc
        nwrite                          41282391

module: unix                            instance: 0     
name:   vopstats_zfs                    class:    misc
        nwrite                          696291

module: unix                            instance: 3     
name:   vopstats_zfs                    class:    misc
        nwrite                          35248

module: unix                            instance: 4     
name:   vopstats_zfs                    class:    misc
        nwrite                          41584810

Instance 4 is the only one whose nwrite counter changed between the two snapshots (by roughly 300,000 writes in ten seconds), so clearly the zone with id 4 is generating those writes. I am sure this will be very helpful on our production systems where we run multiple zones that are all doing IO simultaneously, because it finally lets us analyze the IO requirements of individual zones on a shared filesystem.
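If you want to watch this continuously instead of eyeballing two snapshots, a tiny script can print the per-zone deltas. This is just a rough sketch of my own (interval and output format picked arbitrarily); it relies on kstat -p, whose parseable output is of the form module:instance:name:statistic, so the zone id can be cut out of the second field:

#!/bin/sh
# sample the per-zone ZFS nwrite counters twice and print the delta per zone
INTERVAL=10
BEFORE=$(kstat -p -m unix -n vopstats_zfs -s nwrite)
sleep $INTERVAL
AFTER=$(kstat -p -m unix -n vopstats_zfs -s nwrite)
echo "$AFTER" | while read stat new; do
        # stat looks like unix:4:vopstats_zfs:nwrite -- field 2 is the zone id
        zoneid=$(echo "$stat" | cut -d: -f2)
        old=$(echo "$BEFORE" | grep "^unix:$zoneid:" | awk '{print $2}')
        echo "zone $zoneid: $((new - old)) writes in ${INTERVAL}s"
done

And since the instance number is simply the zone id, zoneadm list -v maps it straight to a zone name.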
