Solaris 11 update 1 was released just a few days ago and I have been curious to find out about new features. One of the things mentioned in the What’s-New guide that caught my attention were “File System Statistics for Oracle Solaris Zones”. Until now, there was no way to tell which zone was responsible for how much IO. So if you found high IO utilization for your devices using iostat or zpool iostat or anything similar, it was no trivial task to find the responsible zone or process. DTrace scripts like iotop or iosnoop helped a bit but in the case of ZFS they this does not get you too far because they do not report the user process that is really responsible for the IO. This is where these new kstat statistics for each filesystem type in each zone come in handy.
Let me walk you through a simple example: I set up a box with solaris 11.1 and installed two zones on it. In one of the zones I am running ‘dd if=/dev/urandom of=random.data’ to generate a bit of IO load. Now I would like to investigate and find this process from the global zone. I can easily see that something is going on with either iostat or zpool iostat from the global zone and we see about 30 write IOs per second utilizing a bandwidth of about 15MB/s. But we do not get to see which process or zone is responsible.
portrix@ptx11:~$ iostat -xtc 10 4
extended device statistics tty cpu
device r/s w/s kr/s kw/s wait actv svc_t %w %b tin tout us sy wt id
sd0 1.0 2.2 8.5 203.3 0.0 0.0 7.0 0 1 0 187 0 0 0 100
sd1 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0
extended device statistics tty cpu
device r/s w/s kr/s kw/s wait actv svc_t %w %b tin tout us sy wt id
sd0 0.0 36.6 0.0 15258.8 0.0 0.3 7.0 0 3 0 31 0 4 0 95
sd1 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0
extended device statistics tty cpu
device r/s w/s kr/s kw/s wait actv svc_t %w %b tin tout us sy wt id
sd0 0.0 33.1 0.0 15214.8 0.0 0.3 7.7 0 3 0 31 0 4 0 96
sd1 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0
extended device statistics tty cpu
device r/s w/s kr/s kw/s wait actv svc_t %w %b tin tout us sy wt id
sd0 0.0 29.1 0.0 15198.9 0.0 0.3 8.7 0 3 0 31 0 4 0 96
sd1 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0
portrix@ptx11:~$ zpool iostat 10 4
capacity operations bandwidth
pool alloc free read write read write
---------- ----- ----- ----- ----- ----- -----
rpool 46.3G 498G 0 2 8.51K 214K
rpool 46.5G 498G 0 30 0 14.8M
rpool 46.6G 497G 0 31 0 14.8M
rpool 46.7G 497G 0 35 0 14.9M
The DTrace Toolkit has some neat stuff to analyze this in more detail and this works great for UFS. But unfortunately due to the nature of ZFS, io calls to devices are not recorded per process. So all we can see there is that this IO is made through ZFS.
root@ptx11:~/DTraceToolkit-0.99# ./iotop
2012 Oct 27 10:11:06, load: 1.20, disk_r: 0 KB, disk_w: 74999 KB
UID PID PPID CMD DEVICE MAJ MIN D BYTES
0 5 0 zpool-rpool sd0 206 1 W 77061632
root@ptx11:~/DTraceToolkit-0.99# ./iosnoop
UID PID D BLOCK SIZE COMM PATHNAME
0 5 W 105897846 131072 zpool-rpool
0 5 W 105896822 131072 zpool-rpool
0 5 W 105897078 131072 zpool-rpool
0 5 W 105897334 131072 zpool-rpool
0 5 W 105897590 131072 zpool-rpool
0 5 W 105898102 1048576 zpool-rpool
0 5 W 105900150 1048576 zpool-rpool
0 5 W 105902198 1048576 zpool-rpool
0 5 W 105904246 1048576 zpool-rpool
0 5 W 105906294 1048576 zpool-rpool
New kstat to the rescue! There are now modules for each filesystem type for all zones. The unix module has seperate instances for each zone identified through the zone id. So just query the zfs fs statistics and you will get utilization (or simply the number of writes in my case) per zone:
root@ptx11:~# kstat -m unix -n vopstats_zfs -s nwrite ; sleep 10 ; kstat -m unix -n vopstats_zfs -s nwrite
module: unix instance: 0
name: vopstats_zfs class: misc
nwrite 696291
module: unix instance: 3
name: vopstats_zfs class: misc
nwrite 35248
module: unix instance: 4
name: vopstats_zfs class: misc
nwrite 41282391
module: unix instance: 0
name: vopstats_zfs class: misc
nwrite 696291
module: unix instance: 3
name: vopstats_zfs class: misc
nwrite 35248
module: unix instance: 4
name: vopstats_zfs class: misc
nwrite 41584810
And clearly it is the zone with id 4 that is generating those writes. I am sure this will be very helpful on our production systems where we run multiple zones that are all doing IO simultaneously. This enables us to analyze the IO requirements of zones on a shared filesystem.
Pingback: Solaris 11.1 update (new features) | portrix systems