ZFS migration using incremental send/receive

We are currently migrating our internal systems from an older 2510 iSCSI Array to a brand-new 7120 Unified Storage Box. We moved some filesystems like home directories from ZFS (over iSCSI) to NFS on the new box, and performance with NFSv4 is a blast. Some of the other zpools were simply migrated to a new LUN shared via iSCSI from the new array. Fortunately, ZFS makes these kinds of migrations very, very easy and possible with just a tiny bit of downtime, even for large volumes with a terabyte of data (or more) spread across a number of ZFS filesystems.
The idea is quite simple and all parts are well documented all over the internet; I just could not find a single place that had all the steps together.

So in one of the cases, I needed to migrate this whole zpool, with all its filesystems, to another volume:

root@hermes:~# zfs list
NAME                      USED  AVAIL  REFER  MOUNTPOINT
zp03                      163G   109G    23K  /zp03
zp03/PRServer             744M   109G   744M  /opt/PRServer
zp03/download              18K   109G    18K  /zp03/download
zp03/export              4.19G   109G  4.19G  /zp03/export
zp03/oracle               864M   109G   864M  /opt/oracle
zp03/pca                  278K   109G   278K  /zp03/pca
zp03/zones                125G   109G    33K  /zp03/zones
zp03/zones/ad-bugs       5.77G   109G  5.77G  /zp03/zones/ad-bugs
zp03/zones/bacula        9.26G   109G  9.26G  /zp03/zones/bacula
zp03/zones/dimstat       5.32G   109G  5.32G  /zp03/zones/dimstat
zp03/zones/glpi          6.18G   109G  6.18G  /zp03/zones/glpi
zp03/zones/gp-wiki       21.1G   109G  21.1G  /zp03/zones/gp-wiki
zp03/zones/lfi-ios       5.70G   109G  5.70G  /zp03/zones/lfi-ios
zp03/zones/ora11g        5.50G   109G  5.50G  /zp03/zones/ora11g
zp03/zones/pgsql         40.8G   109G  40.8G  /zp03/zones/pgsql
zp03/zones/sdasp         4.32G   109G  4.32G  /zp03/zones/sdasp
zp03/zones/solr          7.89G   109G  7.89G  /zp03/zones/solr
zp03/zones/tomcat        7.24G   109G  7.24G  /zp03/zones/tomcat
zp03/zones/zabbix        5.81G   109G  5.81G  /zp03/zones/zabbix

So I created a new volume on my new array, exported it to this server via iSCSI and created a new pool on it:

zpool create zpnew c4t600144F08A2AC12000004F59EDA70044d0
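
If you want to double-check the new pool before copying anything over, the usual status commands apply (nothing here is specific to this migration):

zpool status zpnew
zpool list zpnew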

First, take a recursive snapshot of the whole source pool and its filesystems. Then transfer the data to the new destination. This may take a while.

zfs snapshot -r zp03@01
zfs send -R zp03@01 | zfs receive -Fdvu zpnew
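
You can check that everything arrived on the destination. Since the receive was done with -u, the filesystems are not mounted yet, but they should all show up:

zfs list -r zpnew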

Set the destination to read-only. I don’t quite understand why this is important, but if you omit this step you may get errors on the incremental receives even if you never touched the destination.
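
In this example that would be the following command (presumably anything that modifies the destination datasets after the first receive, even a stray metadata update, will make a later incremental receive refuse to run):

zfs set readonly=on zpnew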

Now, take a second set of recursive snapshots.

zfs snapshot -r zp03@02

And do an incremental send/receive. Notice how fast this is since we are just transferring the blocks that have changed since we made those first snapshots.

zfs send -R -i zp03@01 zp03@02 | zfs receive -dvu zpnew
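
To confirm that the increments arrived, you can list the snapshots on the destination; both @01 and @02 should show up for every filesystem:

zfs list -t snapshot -r zpnew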

Now you can repeat these last two steps a few times until you feel comfortable with it and are ready for the real migration. Stop all access to the source filesystems (for example by shutting down all zones that run off the pool and setting it read-only), perform one last incremental send/receive, and then rename source and destination. Renaming a zpool is done by exporting it and importing it under a new name.
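
For the zones in this example, stopping them could look something like this. This is just a sketch; I am assuming the zone names match the dataset names in the listing above, so check zoneadm list first:

zoneadm list -v
zoneadm -z gp-wiki halt
zoneadm -z pgsql halt

and so on for the remaining zones (or shut them down cleanly from inside first).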

zfs set readonly=on zp03
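# one last recursive snapshot and incremental send/receive
# (the @03 snapshot name is just an example, continuing the numbering above)
zfs snapshot -r zp03@03
zfs send -R -i zp03@02 zp03@03 | zfs receive -dvu zpnew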
zpool export zp03
zpool export zpnew
zpool import zp03 zpold
zpool import zpnew zp03
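
Since the destination pool was set to read-only earlier, the freshly renamed zp03 will most likely still carry that setting, so remember to switch it back before starting the zones again:

zfs set readonly=off zp03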

And you are done.

Edit 2013-05-16: Some parts of the example code used another zpool, zp04. I modified those examples to work with zp03.

4 thoughts on “ZFS migration using incremental send/receive”

  1. I think you introduced zp04 without explaining where it is coming from. A typo maybe?

  2. Thanks for mentioning that. zp04 was another pool that I migrated on the same box but I guess it is confusing to have it in the example so I modified it.

  3. Thanks, I found this very helpful.

    Even though it has been a long time, I hit a wall and haven’t seen this issue and its fix documented anywhere. I was getting random failures which yielded “…broken pipe..” over my ssh. At the beginning of the message, I finally noticed that the problem was that the destination had snapshots.

    As you likely know, automatic snapshots were added to ZFS on Linux as an external module, and it looks like they are now part of the main build. (Please pardon my terminology, as I do this for my home servers, not professionally.) It turned out that at times the newly created pool started making auto-snapshots, and those would cause the receive to fail. It may be that sending the top-level dataset with all its children in one command brought this out. I have seen people with these errors say it worked better to send datasets by themselves, which might prevent the problem at the cost of extra steps.

    I share this with you because adding this information would be helpful to many others, judging by the many posts and frustration I’ve seen. Where would you suggest putting this, to help others know to shut off any auto-snapshots while doing the initial send of large datasets, which seems to bring this out?

    cheers
