ZFS pool with SSD ZIL (log) device shared with NFS – Performance problems!

Some time ago I bought a STEC ZeusIOPS SSD with 18 GB capacity. The disk came out of a Sun ZFS Storage 7420 system, but it is a 3.5″ drive, and without a server that supports 3.5″ SAS disk drives I couldn’t test it. Today I was finally able to test the drive in a Fujitsu Primergy RX300 S5 server. I installed five 500 GB SATA drives and my STEC ZeusIOPS SSD. The first disk holds an OpenIndiana installation, the rpool; the remaining four SATA drives are grouped into a ZFS RAIDZ2 pool. I exported a ZFS dataset over NFS and 1 GbE to a VMware ESX host and ran several benchmarks from an Ubuntu Linux virtual machine.
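
For reference, the pool and the NFS export were set up roughly like the sketch below. The pool name "tank", the dataset "tank/nfs" and the disk device names are only placeholders, not the names I actually used; for an ESX NFS datastore the export typically also needs to grant root access to the ESX host (the root= option of sharenfs):

# zpool create tank raidz2 c2t1d0 c2t2d0 c2t3d0 c2t4d0
# zfs create tank/nfs
# zfs set sharenfs=on tank/nfs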

The results without the SSD were 75-80 MBytes/s write (850 ms latency), between 40 and 65 MBytes/s rewrite and 120 MBytes/s read. I did several runs with bonnie++ and iozone and always got similar values. While the benchmarks were running I watched the I/O with "zpool iostat". The write and rewrite numbers matched the results above. Very little data had to be read from disk because the ARC was large enough, which is why the iostat values during the read tests stayed below 10 MBytes/s.
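
The benchmark invocations looked roughly like this; the mount point, file sizes and record size are only examples, and the pool name "tank" is the placeholder from above:

$ bonnie++ -d /mnt/benchmark -s 8g
$ iozone -i 0 -i 1 -s 4g -r 128k -f /mnt/benchmark/iozone.tmp

In parallel I watched the pool on the OpenIndiana host at a one second interval:

# zpool iostat -v tank 1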

Then I added the STEC SSD as a log device to the ZFS pool and reran all the tests. I couldn’t believe the values! My benchmarks finished with only 45-50 MBytes/s write and 35-45 MBytes/s rewrite; read performance didn’t change, of course. The write latency exceeded 10000 ms! Something went wrong, but I didn’t know what. I did the runs again and watched the zpool iostat output in parallel. It constantly reported values above 100 MBytes/s, sometimes even above 170 MBytes/s, but always more than 100 MBytes/s, and that is already about the maximum rate of a single 1 GbE connection! The benchmark output, however, was very different and didn’t even reach the results of the runs without the SSD. I was confused. I effectively bypassed the log device by setting the logbias property to throughput, and both the benchmark and the iostat results went back to 75-80 MBytes/s write. I re-enabled it with logbias=latency and again got benchmark results of at most 50 MBytes/s write and huge latencies, while the iostat output was always above 100 MBytes/s!
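
For completeness, these are roughly the commands involved in attaching the SSD and switching the logbias property back and forth; the SSD device name and the pool/dataset names are again placeholders:

# zpool add tank log c3t0d0
# zfs set logbias=throughput tank/nfs
# zfs set logbias=latency tank/nfs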

Something is wrong, but I don’t know what. 🙁 Do you have any idea?

6 thoughts on "ZFS pool with SSD ZIL (log) device shared with NFS – Performance problems!"

  1. I’m not sure why the LOG device is hurting you, but I suspect you’re getting IOPS inflation due to misaligned I/O.

    What file system is Ubuntu using? (ext3?)

    What is the ZFS record size on the filer?

    Does the Ubuntu system have a partition table on the vdisk? What sector does it start on?

    I would recommend changing the ZFS recordsize to match the Ubuntu file system block size (probably 4K). Then create a new VMDK on the share and add it to the Ubuntu system. When you write a partition table to this new VMDK, ensure that it aligns to a 4K offset, i.e. starts on sector 8, 16, 32 or 64 (see the sketch after the comments).

  2. Interesting problem, indeed.
    @Alexan: Misalignment might indeed be the cause. But shouldn’t the configuration WITH the Logzilla then deliver more throughput, not less?

    @tschokko: Have you checked the throughput of the ZeusIOPS standalone?

    wolfgang

  3. I had really bad performance with the following STEC SSD until I used the settings below.

    # vi /etc/driver/drv/sd.conf
    sd-config-list="STEC Z16IZF2E-100UCU ","throttle-max:32, disksort:false, cache-nonvolatile:true";

    Andreas

  4. Hi!
    Do you have more information about the ZIL on ZFS with an NFS VMware datastore?

    Did you resolve your performance problem, and if yes, how?

    Thank you!
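
Following up on the recordsize and alignment advice from the first comment, this is roughly what it would look like on my setup. The dataset name is the placeholder from above and /dev/sdb stands for the newly added VMDK inside the Ubuntu VM:

# zfs set recordsize=4k tank/nfs

$ sudo fdisk -u /dev/sdb
$ sudo mkfs.ext3 /dev/sdb1

With fdisk the first partition would be created so that it starts on an aligned sector such as 64, and only then would the ext3 file system be put on it.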

Comments are closed.