Slow copying between NFS/CIFS directories on same server

Please bear with me, I know it's a lot to read. This problem may be applicable to others, so it would be great to have an answer. I had to give away the bounty because it was going to expire.

When I copy to or from my NFS server (Debian) from a client (Ubuntu), it maxes out the gigabit connection. But when I copy between two directories on the same server, the speed bounces around from under 30MB/sec to over 100MB/sec. Most of the time it's around 50MB/sec.
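
For reference, these are plain file copies between the two NFS mounts on the client; the exact command doesn't matter much, but it looks roughly like this (the file name is just an example):

rsync --progress /Media/bigfile.mkv /other/bigfile.mkv
# or simply
cp /Media/bigfile.mkv /other/bigfile.mkv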

When the same copy is performed directly on the NFS server (local disks), I get 100-150MB/sec, sometimes more. A file copy between this NFS export and a CIFS share exported from the same directory on the same server is just as slow, and a copy between two directories over CIFS on the same server is slow too. iperf shows bidirectional speed of 941Mb/940Mb between the client and server.
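
The iperf numbers are from a plain TCP test between client and server; the flags below illustrate the kind of run, they aren't copied from my shell history:

# on the server
iperf3 -s
# on the client, one run in each direction
iperf3 -c igor
iperf3 -c igor -R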

I made sure NFS is using async on the server. I also disabled sync on the ZFS dataset and tried removing the ZFS cache and log devices.
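
Roughly the commands involved, in case it matters (the device names are the ones from zpool status below):

# allow async writes on the dataset under test
zfs set sync=disabled pool2/Media

# confirm the export itself is async (see /etc/exports below)
exportfs -v | grep Media

# temporarily pull the SSD log and cache devices for testing
zpool remove pool2 ata-KINGSTON_SV300S37A120G_50026B7751153A9F-part1
zpool remove pool2 ata-KINGSTON_SV300S37A120G_50026B7751153A9F-part2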

I've tested on a very fast ZFS striped mirror of 4x2TB disks, with an SSD for log and cache devices.

NFS server specs:

Debian 8.2 core, 4GHz AMD-FX
32GB RAM
ZFS RAID 10, SSD cache/log
17GB ARC
4x2TB WD Red drives
Intel 82574L NIC

Test client:

Ubuntu 15.04, Core 2 Quad 2.4GHz
8GB RAM
SSD
Intel 82574L NIC

This is how things are currently set up. /pool2/Media is the share I've been testing with.

/etc/fstab on client:

UUID=575701cc-53b1-450c-9981-e1adeaa283f0 /               ext4        errors=remount-ro,discard,noatime,user_xattr 0       1
UUID=16e505ad-ab7d-4c92-b414-c6a90078c400 none            swap    sw              0       0 
/dev/fd0        /media/floppy0  auto    rw,user,noauto,exec,utf8 0       0
tmpfs    /tmp    tmpfs   mode=1777       0       0


igor:/pool2/other     /other        nfs         soft,bg,nfsvers=4,intr,rsize=65536,wsize=65536,timeo=50,nolock
igor:/pool2/Media       /Media          nfs     soft,bg,nfsvers=4,intr,rsize=65536,wsize=65536,timeo=50,nolock,noac
igor:/pool2/home        /nfshome        nfs     soft,bg,nfsvers=4,intr,rsize=65536,wsize=65536,timeo=50,nolock

/etc/exports on server (igor):

#LAN
/pool2/home 192.168.1.0/24(rw,sync,no_subtree_check,no_root_squash)
/pool2/other 192.168.1.0/24(rw,sync,no_subtree_check,no_root_squash)
/pool2/Media 192.168.1.0/24(rw,async,no_subtree_check,no_root_squash)
/test 192.168.1.0/24(rw,async,no_subtree_check,no_root_squash)

#OpenVPN
/pool2/home 10.0.1.0/24(rw,sync,no_subtree_check,no_root_squash)
/pool2/other 10.0.1.0/24(rw,sync,no_subtree_check,no_root_squash)
/pool2/Media 10.0.1.0/24(rw,sync,no_subtree_check,no_root_squash)

zpool status:

  pool: pool2
 state: ONLINE
  scan: scrub repaired 0 in 6h10m with 0 errors on Sat Oct  3 08:10:26 2015
config:

        NAME                                                 STATE     READ WRITE CKSUM
        pool2                                                ONLINE       0     0     0
          mirror-0                                           ONLINE       0     0     0
            ata-WDC_WD20EFRX-68AX9N0_WD-WMC300004469         ONLINE       0     0     0
            ata-WDC_WD20EFRX-68EUZN0_WD-WCC4MLK57MVX         ONLINE       0     0     0
          mirror-1                                           ONLINE       0     0     0
            ata-WDC_WD20EFRX-68AX9N0_WD-WCC1T0429536         ONLINE       0     0     0
            ata-WDC_WD20EFRX-68EUZN0_WD-WCC4M0VYKFCE         ONLINE       0     0     0
        logs
          ata-KINGSTON_SV300S37A120G_50026B7751153A9F-part1  ONLINE       0     0     0
        cache
          ata-KINGSTON_SV300S37A120G_50026B7751153A9F-part2  ONLINE       0     0     0

errors: No known data errors

  pool: pool3
 state: ONLINE
  scan: scrub repaired 0 in 3h13m with 0 errors on Sat Oct  3 05:13:33 2015
config:

        NAME                                        STATE     READ WRITE CKSUM
        pool3                                       ONLINE       0     0     0
          ata-WDC_WD40EFRX-68WT0N0_WD-WCC4E5PSCNYV  ONLINE       0     0     0

errors: No known data errors

/pool2 bonnie++ on server:

Version  1.97       ------Sequential Output------ --Sequential Input- --Random-
Concurrency   1     -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
igor            63G   100  99 187367  44 97357  24   325  99 274882  27 367.1  27
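
The bonnie++ invocation was nothing special; something along these lines, with the file size left at its default of twice RAM:

bonnie++ -d /pool2 -u root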

Bonding

I tried bonding. With a direct connection and balance-rr bonding, I get 220MB/sec read and 117MB/sec write, but only 40-50MB/sec for the copy.
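
For what it's worth, the bond was ordinary ifenslave-style configuration; a minimal sketch (interface names and addresses are illustrative, not my actual config):

# /etc/network/interfaces
auto bond0
iface bond0 inet static
    address 192.168.2.1
    netmask 255.255.255.0
    bond-mode balance-rr
    bond-miimon 100
    bond-slaves eth0 eth1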

iperf with bonding

[ ID] Interval           Transfer     Bandwidth       Retr
[  4]   0.00-10.00  sec  1.10 GBytes   942 Mbits/sec  707             sender
[  4]   0.00-10.00  sec  1.10 GBytes   941 Mbits/sec                  receiver
[  6]   0.00-10.00  sec  1.06 GBytes   909 Mbits/sec  672             sender
[  6]   0.00-10.00  sec  1.06 GBytes   908 Mbits/sec                  receiver
[SUM]   0.00-10.00  sec  2.15 GBytes  1.85 Gbits/sec  1379             sender
[SUM]   0.00-10.00  sec  2.15 GBytes  1.85 Gbits/sec                  receiver

Bonnie++ with bonding over NFS

Version  1.97       ------Sequential Output------ --Sequential Input- --Random-
Concurrency   1     -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
haze            16G  1442  99 192941  16 89157  15  3375  96 179716  13  6082  77

With the SSD cache/log devices removed and copying over NFS, iostat -x shows this:

Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
sdb               0.80     0.00   67.60  214.00  8561.60 23689.60   229.06     1.36    4.80   14.77    1.64   1.90  53.60
sdd               0.80     0.00   54.60  214.20  7016.00 23689.60   228.46     1.37    5.14   17.41    2.01   2.15  57.76
sdc               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
sde               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
sda               1.60     0.00  133.00  385.20 17011.20 45104.00   239.73     2.24    4.31   12.29    1.56   1.57  81.60
sdf               0.40     0.00  121.40  385.40 15387.20 45104.00   238.72     2.36    4.63   14.29    1.58   1.62  82.16
sdg               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
md0               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
sdh               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00

TMPFS

I exported a tmpfs over NFS and did a file copy. The speed was 108MB/sec over NFS; done locally on the server, it is 410MB/sec.
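
Setting that test up was straightforward; roughly the following (the size is illustrative, and /test is the export already listed in /etc/exports above):

mkdir -p /test
mount -t tmpfs -o size=8G tmpfs /test
exportfs -ra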

zvol mounted over NFS

The speed bounces around from under 50MB/sec to over 180MB/sec, but averages out to about 100MB/sec. This is about what I'm looking for. This zvol is on the same pool (pool2) I've been testing on. This really makes me think it is more of a ZFS dataset/caching type issue.
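
Roughly how the zvol test was set up (volume name, size and filesystem are illustrative):

zfs create -V 50G pool2/testvol
mkfs.ext4 /dev/zvol/pool2/testvol
mkdir -p /testvol
mount /dev/zvol/pool2/testvol /testvol
# then exported and mounted on the client the same way as the other shares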

Raw disk read test

Using this command

dd if=/dev/disk/by-id/ata-WDC_WD20EFRX-68AX9N0_WD-WMC300004469 of=/dev/null bs=1M count=2000

I get 146-148MB/sec for all 4 disks.
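
To test all four in one go, something like this loop works, skipping the partition symlinks:

for d in /dev/disk/by-id/ata-WDC_WD20EFRX-*; do
    case "$d" in *-part*) continue ;; esac
    echo "$d"
    dd if="$d" of=/dev/null bs=1M count=2000
done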

Slow, uneven disk usage in pool

Thanks to a very helpful person on the ZFS mailing list, I know what to do to get more even usage of the disks. They explained:

The reason ZFS prefers mirror-1 seems to be that it was added after mirror-0 had already been filled quite a bit; ZFS is now trying to rebalance the fill level.

In case you want to get rid of that and have some time: iteratively zfs send the datasets of the pool to new datasets on itself, then destroy the source; repeat until the pool is rebalanced.
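
A minimal sketch of that procedure for one dataset (the snapshot and dataset names are just examples; repeat per dataset):

zfs snapshot pool2/Media@rebalance
zfs send pool2/Media@rebalance | zfs receive pool2/Media_new
# after verifying the copy:
zfs destroy -r pool2/Media
zfs rename pool2/Media_new pool2/Media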

I've fixed this, and the data is now level across all disks. This has resulted in a 75MB/sec copy speed over NFS and 118MB/sec locally.

The question

My questions; if you can answer any one of them, I will accept your answer:

  1. How can my problem be solved? (slow copy over NFS, but not local)
  2. If you can't answer #1, can you try this on your comparable NFS server with ZFS on Linux and tell me the results so I have something to compare it to?
  3. If you can't answer #1 or #2, can you try the same testing on a similar but non-ZFS server over NFS?
