HPE Storage Users Group
https://3parug.net/

Array wide latency during big (1 MB+) writes
https://3parug.net/viewtopic.php?f=18&t=3394
Page 1 of 2

Author:  qbas81 [ Mon May 04, 2020 7:34 pm ]
Post subject:  Array wide latency during big (1 MB+) writes

Hi Fellow Admins!

I am using a managed service that runs VMware vSphere on a 3PAR 9450.

It runs quite well, except when a couple of Windows VMs start a batch job that writes just a couple hundred, but big, I/Os (1 MB+): the whole array struggles and we see high latency spikes across all datastores.

As a temporary workaround I applied IOPS limits on these VMs (I managed to find a value high enough not to impact the app but low enough to keep the array happy).

Has anyone had a similar problem?
What would be a better solution?
I am thinking about asking the provider to use the Disk.DiskMaxIOSize parameter ( https://kb.vmware.com/s/article/1003469 ) to limit the max I/O size on the host, but I cannot find clear info on what the value should be. I found old HP EVA documentation suggesting 128 KB and Pure Storage suggesting 4 MB, but nothing 3PAR specific.

Has anyone changed this parameter from the default 32 MB in their environment?

Thanks in advance, have a great day :)
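
To make the effect of the candidate values concrete, here is a rough Python sketch of how one large write would be cut up under a given Disk.DiskMaxIOSize. It assumes the host simply splits any request larger than the limit into limit-sized pieces plus a remainder; alignment and guest or adapter behaviour are ignored, and `split_io` is a made-up helper name, not a VMware API.

```python
def split_io(io_size_kb: int, max_io_size_kb: int) -> list[int]:
    """Sizes (KB) of the requests left after cutting one I/O at the limit.

    Assumption: a plain split into limit-sized pieces plus one remainder.
    """
    if io_size_kb <= 0 or max_io_size_kb <= 0:
        raise ValueError("sizes must be positive")
    full, remainder = divmod(io_size_kb, max_io_size_kb)
    return [max_io_size_kb] * full + ([remainder] if remainder else [])


if __name__ == "__main__":
    # Candidate limits from the post: 128 KB, 4 MB, and the 32767 KB default.
    for limit_kb in (128, 4096, 32767):
        parts = split_io(1024, limit_kb)
        print(f"limit {limit_kb:>5} KB -> {len(parts)} request(s) for a 1 MB write")
```

Under these assumptions only a limit below the write size changes anything: a 1 MB write stays a single request at 4 MB or at the default, and becomes eight requests at 128 KB.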

Author:  Richard Siemers [ Mon May 04, 2020 11:19 pm ]
Post subject:  Re: Array wide latency during big (1 MB+) writes

Start by isolating which component is saturating. How is the array configured? Is it FC or iSCSI, and are the drives SSD, FC or NL? How many ESX hosts are there, and how are they connected? Is multipathing configured correctly? Do you have access to the whole array, or did the service provider carve out a virtual domain for you with its own QoS limits imposed?

Run some reports and compare latency on the PDs vs. the back-end controllers vs. the front-end controllers.

Author:  apol [ Mon May 04, 2020 11:57 pm ]
Post subject:  Re: Array wide latency during big (1 MB+) writes

You should look at replication as well. I think there's something like "all mirroring is done with 16k I/Os, so one 1 MB write I/O triggers x mirroring I/Os" going on.
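
If that recollection is right (it is not confirmed anywhere in this thread), the "x" is simple arithmetic. A minimal Python sketch, where the 16 KB unit is taken from the remark above as an assumption and `mirror_io_count` is a hypothetical helper:

```python
def mirror_io_count(write_kb: int, unit_kb: int = 16) -> int:
    """Number of unit-sized mirroring I/Os needed to cover one host write.

    Assumption: replication traffic is sent in fixed 16 KB units, as
    recalled in the post above; this is not taken from HPE documentation.
    """
    if write_kb <= 0 or unit_kb <= 0:
        raise ValueError("sizes must be positive")
    # Ceiling division without floating point.
    return -(-write_kb // unit_kb)


if __name__ == "__main__":
    for write_kb in (64, 512, 1024):
        print(f"{write_kb} KB write -> {mirror_io_count(write_kb)} mirroring I/Os")
```

With that assumed unit, a single 1 MB write would fan out into 64 mirroring I/Os.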

Author:  qbas81 [ Tue May 05, 2020 12:11 am ]
Post subject:  Re: Array wide latency during big (1 MB+) writes

Hi, and thanks for the useful questions!

From what we know, it appears to be the controller that is saturated.

We have very similar symptoms in a couple of different environments, ranging from tens to 200+ hosts per array.
Actually, not all arrays are 9450s (all SSD); there are also hybrid ones (FC and SSD), and all are affected in a very similar way. Connectivity is FC, and host connectivity seems to be configured according to guidelines.
In each case the array is not shared with other customers; however, the provider runs some infrastructure workloads needed to run the service (relatively low load compared to ours).

Generally everything is fine most of the time: hundreds or even thousands of VMs run without major problems and with decent latency. When big writes happen, latency jumps to tens of ms, or even hundreds in some cases.

Here is a thread about similar issues that I found in the meantime:
https://www.reddit.com/r/vmware/comment ... h_problem/

Author:  qbas81 [ Tue May 05, 2020 12:18 am ]
Post subject:  Re: Array wide latency during big (1 MB+) writes

apol wrote:
You should look at replication as well. I think there's something like "all mirroring is done with 16k I/Os, so one 1 MB write I/O triggers x mirroring I/Os" going on.


Good suggestion; however, the VMs that generate the big I/O are not on replicated volumes.

But we actually see the impact of this traffic at both sites that use replicated volumes: a bigger impact at the site where the big I/O is happening, and a smaller but detectable one at the other.

Author:  qbas81 [ Tue May 05, 2020 12:20 am ]
Post subject:  Re: Array wide latency during big (1 MB+) writes

qbas81 wrote:
Here is a thread about similar issues that I found in the meantime:
https://www.reddit.com/r/vmware/comment ... h_problem/


Someone there suggested changing Disk.DiskMaxIOSize from 32767 KB to 32 KB, but this does not come from documentation or a vendor (and it seems an extreme change to me).

We are planning to do some testing with this parameter, but it would save us time if someone already has experience with it :)

Author:  oby [ Tue May 05, 2020 7:27 am ]
Post subject:  Re: Array wide latency during big (1 MB+) writes

You should check whether the front-end ports are saturated:

statport -host -rw -ni -d 5 -idlep

If they are, you need to use additional ports.

Author:  qbas81 [ Tue May 05, 2020 8:10 pm ]
Post subject:  Re: Array wide latency during big (1 MB+) writes

oby wrote:
You should check whether the front-end ports are saturated.


From what I have been told and from some reports I have seen, the issue is with CPU load rather than ports (4 ports are used, with round robin switching paths after every 1 I/O).
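
As an illustration of why that path policy makes an uneven port load unlikely, here is a small Python sketch of round robin over four paths with a switch after every I/O. The path names and the `assign_paths` helper are hypothetical; this models the idea only, not the actual ESXi path selection code.

```python
from collections import Counter
from itertools import cycle


def assign_paths(num_ios: int, paths: list[str], ios_per_switch: int = 1) -> list[str]:
    """Path used for each I/O when rotating to the next path every N I/Os."""
    if not paths:
        raise ValueError("at least one path is required")
    if ios_per_switch < 1:
        raise ValueError("ios_per_switch must be at least 1")
    path_iter = cycle(paths)
    current = next(path_iter)
    assignment = []
    for i in range(num_ios):
        # Move to the next path once the current one has served N I/Os.
        if i and i % ios_per_switch == 0:
            current = next(path_iter)
        assignment.append(current)
    return assignment


if __name__ == "__main__":
    paths = ["path0", "path1", "path2", "path3"]  # hypothetical names
    print(Counter(assign_paths(200, paths, ios_per_switch=1)))
    # A much larger interval (1000, picked only for contrast) keeps a
    # short burst of 200 I/Os on a single path.
    print(Counter(assign_paths(200, paths, ios_per_switch=1000)))
```

With a switch after every I/O, a burst of a couple hundred writes spreads evenly over the four paths, which fits the view that the ports are not the bottleneck.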

Author:  qbas81 [ Wed May 06, 2020 11:58 pm ]
Post subject:  Re: Array wide latency during big (1 MB+) writes

We got a recommendation from HPE support: they suggested setting the max I/O size on the hosts to 256 KB.

Author:  qbas81 [ Tue May 19, 2020 10:18 pm ]
Post subject:  Re: Array wide latency during big (512 KB+) writes

An update, in case anyone is interested.

We found that we actually get mostly 512 KB I/Os on the array, because VMware PVSCSI adapters use 512 KB as their maximum:
https://blogs.vmware.com/vsphere/2014/0 ... e-ios.html

Anyway, enforcing a 256 KB max I/O size on the host side seems to help a bit with latency, but the latency increase is still visible under load.
We have not seen any material negative effects of this limit on the workloads.
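
A rough way to reason about the stacked limits (guest write size, PVSCSI maximum, host max I/O size) is that the smallest one decides what the array sees. The Python sketch below assumes each layer splits cleanly at its own cap; the helper names are made up, and the request count is only an approximation when the sizes do not divide evenly.

```python
def effective_io_kb(guest_io_kb: int, caps_kb: list[int]) -> int:
    """Largest request size (KB) that survives every cap on the way down."""
    if guest_io_kb <= 0 or any(cap <= 0 for cap in caps_kb):
        raise ValueError("sizes must be positive")
    return min([guest_io_kb, *caps_kb])


def requests_per_guest_io(guest_io_kb: int, caps_kb: list[int]) -> int:
    """Approximate number of array-side requests per guest write."""
    size = effective_io_kb(guest_io_kb, caps_kb)
    return -(-guest_io_kb // size)  # ceiling division


if __name__ == "__main__":
    # 1 MB guest write, 512 KB PVSCSI maximum, default 32767 KB host limit.
    print(effective_io_kb(1024, [512, 32767]), requests_per_guest_io(1024, [512, 32767]))
    # Same write with the host limit lowered to 256 KB.
    print(effective_io_kb(1024, [512, 256]), requests_per_guest_io(1024, [512, 256]))
```

Under these assumptions a 1 MB guest write reaches the array as two 512 KB requests with the default host limit, and as four 256 KB requests once the limit is set to 256 KB.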

All times are UTC - 5 hours