HPE Storage Users Group

A Storage Administrator Community




 Post subject: Array wide latency during big (1 MB+) writes
PostPosted: Mon May 04, 2020 7:34 pm 

Joined: Tue Apr 01, 2014 8:27 pm
Posts: 21
Hi Fellow Admins!

I am using a managed service that runs VMware vSphere on a 3PAR 9450.

It runs quite well except when a couple of Windows VMs start a batch job that writes only a couple hundred I/Os, but big ones (1 MB+). Then the whole array struggles and we see high latency spikes across all datastores.

As a temporary workaround I applied IOPS limits to these VMs (I managed to find a value high enough not to impact the app but low enough to keep the array happy).

Has anyone had a similar problem?
What would be a better solution?
I am thinking about asking the provider to use the Disk.DiskMaxIOSize parameter ( https://kb.vmware.com/s/article/1003469 ) to limit the max I/O size on the host, but I cannot find clear info on what the value should be. I found old HP EVA documentation suggesting 128 KB and Pure Storage suggesting 4 MB, but nothing 3PAR specific.

Has anyone changed this parameter from the default 32 MB in their environment?

Thanks in advance, have a great day :)


 Post subject: Re: Array wide latency during big (1 MB+) writes
PostPosted: Mon May 04, 2020 11:19 pm 
Site Admin

Joined: Tue Aug 18, 2009 10:35 pm
Posts: 1328
Location: Dallas, Texas
Start by isolating which component is saturating. How is the array configured? Is it FC or iSCSI? Are the drives SSD, FC or NL? How many ESX hosts are there, and how are they connected? Is multipathing configured correctly? Do you have access to the whole array, or did the service provider carve out a virtual domain for you that has its own QoS limits imposed?

Run some reports and compare latency on the PDs vs. the backend controllers vs. the front-end controllers.

_________________
Richard Siemers
The views and opinions expressed are my own and do not necessarily reflect those of my employer.


 Post subject: Re: Array wide latency during big (1 MB+) writes
PostPosted: Mon May 04, 2020 11:57 pm 

Joined: Wed May 07, 2014 1:51 am
Posts: 267
You should look at replication as well, I think there's something like "all mirroring is done with 16k-IOPS, so one 1MB write IO toggles x mirroring IO" going on.

_________________
When all else fails, read the instructions.


 Post subject: Re: Array wide latency during big (1 MB+) writes
PostPosted: Tue May 05, 2020 12:11 am 

Joined: Tue Apr 01, 2014 8:27 pm
Posts: 21
Hi, and thanks for the useful questions!

From what we know, it appears to be the controller that is saturated.

We have very similar symptoms in a couple of different environments, ranging from tens of hosts to 200+ hosts per array.
Actually, not all arrays are 9450s (all SSD); there are also hybrid ones (FC and SSD), and all are affected in a very similar way. Connectivity is FC, and host connectivity seems to be configured according to guidelines.
In each case the array is not shared with other customers, however the provider runs some infrastructure workloads needed to run the service (relatively low load compared to ours).

Generally everything is fine most of the time: hundreds or even thousands of VMs run without major problems and with decent latency. When big writes are happening, latency jumps to tens of ms, or even hundreds in some cases.

Here is a thread about similar issues that I found in the meantime:
https://www.reddit.com/r/vmware/comment ... h_problem/


 Post subject: Re: Array wide latency during big (1 MB+) writes
PostPosted: Tue May 05, 2020 12:18 am 

Joined: Tue Apr 01, 2014 8:27 pm
Posts: 21
apol wrote:
You should look at replication as well, I think there's something like "all mirroring is done with 16k-IOPS, so one 1MB write IO toggles x mirroring IO" going on.


Good suggestion, however the VMs which generate the big I/O are not on replicated volumes.

But we actually see the impact of this traffic on both sites that use replicated volumes - a bigger impact at the site where the big I/O is happening, and a smaller but detectable one at the other.


 Post subject: Re: Array wide latency during big (1 MB+) writes
PostPosted: Tue May 05, 2020 12:20 am 

Joined: Tue Apr 01, 2014 8:27 pm
Posts: 21
qbas81 wrote:
Here is a thread about similar issues that I found in the meantime:
https://www.reddit.com/r/vmware/comment ... h_problem/


Someone there suggested changing Disk.DiskMaxIOSize from 32767 KB to 32 KB, but this is not from documentation or a vendor (and seems like an extreme change to me).

We are planning to do some testing with this parameter, but it would save us time if someone already has experience with it :)
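
To put the candidate values in perspective, here is a minimal Python sketch of the fan-out, assuming the host splits any guest I/O larger than Disk.DiskMaxIOSize into chunks of at most that size (which is how the linked VMware KB describes the setting). The function name `split_count` and the 1 MB guest I/O size are illustrative assumptions, not something measured on the array.

```python
import math


def split_count(io_size_kb: int, max_io_size_kb: int) -> int:
    """Number of commands the host would issue for one guest I/O.

    Simplified model: an I/O larger than the limit is cut into
    chunks of at most max_io_size_kb each.
    """
    if io_size_kb <= 0 or max_io_size_kb <= 0:
        raise ValueError("sizes must be positive")
    return math.ceil(io_size_kb / max_io_size_kb)


GUEST_IO_KB = 1024  # assumed size of the big writes (1 MB)

# Disk.DiskMaxIOSize values (in KB) mentioned in this thread so far:
# the default, the Pure Storage figure, the old EVA figure, and the Reddit suggestion.
for limit_kb in (32767, 4096, 128, 32):
    n = split_count(GUEST_IO_KB, limit_kb)
    print(f"DiskMaxIOSize={limit_kb:>5} KB -> {n:>2} command(s) per 1 MB write")
```

Under this model a 1 MB write goes to the array as a single command with the default or with 4 MB, as 8 commands with 128 KB, and as 32 commands with 32 KB, which is why the 32 KB suggestion looks extreme: it trades fewer large transfers for many more small ones.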


 Post subject: Re: Array wide latency during big (1 MB+) writes
PostPosted: Tue May 05, 2020 7:27 am 

Joined: Tue Jan 15, 2013 10:33 am
Posts: 41
You should check whether the frontend ports are saturated.

statport -host -rw -ni -d 5 -idlep

If they are, you need to use additional ports.


 Post subject: Re: Array wide latency during big (1 MB+) writes
PostPosted: Tue May 05, 2020 8:10 pm 

Joined: Tue Apr 01, 2014 8:27 pm
Posts: 21
oby wrote:
You should check whether the frontend ports are saturated.


From what I have been told and from some reports I have seen, the issue is with CPU load rather than ports (4 ports are used, with round robin and path switching after every 1 I/O).
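
As a rough illustration of what "path switching after 1 I/O" means for port load, here is a small Python sketch of an idealized round-robin model. It is an assumption-laden simplification (the function `round_robin_paths` is hypothetical and ignores queue depth, path state and I/O size); it only shows how a burst of I/Os would be distributed over 4 paths for a given switching interval.

```python
from collections import Counter


def round_robin_paths(num_ios: int, num_paths: int, ios_per_switch: int = 1) -> Counter:
    """Count how many I/Os land on each path in a simple round-robin model.

    The active path advances to the next one after every
    ios_per_switch I/Os and wraps around after the last path.
    """
    if num_paths <= 0 or ios_per_switch <= 0:
        raise ValueError("num_paths and ios_per_switch must be positive")
    counts = Counter()
    for i in range(num_ios):
        path = (i // ios_per_switch) % num_paths
        counts[path] += 1
    return counts


# A burst of 400 I/Os over 4 paths, switching after every single I/O.
print(round_robin_paths(num_ios=400, num_paths=4, ios_per_switch=1))

# The same burst when the path only changes after every 1000 I/Os.
print(round_robin_paths(num_ios=400, num_paths=4, ios_per_switch=1000))
```

With switching after every I/O the burst is spread evenly, 100 I/Os per path, while with a 1000-I/O interval the whole burst stays on one path. In this simplified picture, the 1-I/O setting described above should keep the four ports evenly loaded, which fits with the bottleneck being somewhere other than the ports.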


 Post subject: Re: Array wide latency during big (1 MB+) writes
PostPosted: Wed May 06, 2020 11:58 pm 

Joined: Tue Apr 01, 2014 8:27 pm
Posts: 21
We got a recommendation from HPE support: they suggested setting the max I/O size on the hosts to 256 KB.


 Post subject: Re: Array wide latency during big (512 KB+) writes
PostPosted: Tue May 19, 2020 10:18 pm 

Joined: Tue Apr 01, 2014 8:27 pm
Posts: 21
An update, if anyone is interested.

We found that we actually mostly get 512 KB I/O on the array, because VMware PVSCSI adapters use 512 KB as the max:
https://blogs.vmware.com/vsphere/2014/0 ... e-ios.html

Anyway, enforcing maxiosize=256 KB on the host side seems to help a bit with latency, but the latency impact is still visible during load.
We have not seen any material negative effects of this limit on workloads.

