HPE Storage Users Group

A Storage Administrator Community




 Post subject: which vv or host contributed to high service times?
PostPosted: Thu Oct 23, 2014 3:46 pm 

Joined: Thu May 08, 2014 4:43 pm
Posts: 62
F400, most of these are on VVs from an FC CPG. I believe our backup LUN for DB2 is on a single NL VV. DB backup runs from 19:30 to about 23:00.

We had an event early this morning. Going back through System Reporter (SR) around the time of the event, IOPS were below 1000 at 00:04, 00:09, and on through 00:34, with very high service times, which then fell off.

For my first report, I selected the DB host and all associated VVs. For the second report, I selected six application ( no DB ) hosts and their associated VVs. Both reports show high service times in the same time period.

VLUN Perf
The DB host/VVs showed a READ service time of 18ms and a WRITE service time of 31ms ( 21.8 total ) with about 501 IOPs, queue length of 6.

The APP hosts/VVs showed READ service time of 10.5ms and a WRITE service time of 25.6ms ( 25.5 total ) with about 60 IOPs, queue length of 1.

I checked some SQL VVs just to see how they were behaving, looks similar, low IOP ( 443 ), high service time, 34.5ms.

Checking Ports, link transfers ( please explain ) at 00:09 was 92980 transfers/s

Ports
Port Performance for all ports at 00:09 was 90593 Total IOP/s
Bandwidth at 00:09 for all ports was 1.25 million KBytes/s Total
all ports at 00:09 Service Time total 5.6ms
queue length was 216 at 00:09

Is there a way to see whether a particular host or set of VVs was causing this? It could have been my DB host, but its IOPS were quite low during the same period. Perhaps a 3PAR admin job?


 Post subject: Re: which vv or host contributed to high service times?
PostPosted: Sat Oct 25, 2014 1:27 pm 

Joined: Fri Jun 27, 2014 2:01 am
Posts: 390
If you have the external System Reporter, take a look at the Hi Res VLUN Perf graphs.
Otherwise, try the IMC.


 Post subject: Re: which vv or host contributed to high service times?
PostPosted: Tue Oct 28, 2014 4:17 pm 

Joined: Thu May 08, 2014 4:43 pm
Posts: 62
I believe I ran the hi-res performance report and just couldn't pin it down to a specific LUN... I guess I could if I did them one at a time?

IMC, that is the realtime tool in the Management Console, correct?

I have thought about staying up late and just watching it. I know there are some statvv-type commands I could run... has anyone had success running these on a scheduled basis from a desktop or a host?
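A minimal sketch of one way to run a CLI command repeatedly over ssh from a host, in Python. The user name, array address, the `statvv -iter 1` arguments, and the 60-second interval are all assumptions for illustration; check the options against the CLI reference for your InForm OS version before relying on them.

```python
import subprocess
import time
from typing import Callable, List


def build_ssh_command(user: str, array: str, cli_command: str) -> List[str]:
    """Build the argv for running one CLI command on the array over ssh."""
    return ["ssh", f"{user}@{array}", cli_command]


def run_subprocess(argv: List[str]) -> str:
    """Run the command and return its standard output as text."""
    return subprocess.run(argv, capture_output=True, text=True, check=True).stdout


def collect_samples(run: Callable[[List[str]], str], argv: List[str],
                    samples: int, interval_s: float,
                    sleep: Callable[[float], None] = time.sleep) -> List[str]:
    """Run the command `samples` times, pausing `interval_s` seconds between runs.

    Each captured output is prefixed with a local timestamp line.
    """
    outputs = []
    for i in range(samples):
        stamp = time.strftime("%Y-%m-%d %H:%M:%S")
        outputs.append(f"# {stamp}\n{run(argv)}")
        if i < samples - 1:
            sleep(interval_s)
    return outputs


if __name__ == "__main__":
    # Hypothetical user, host, and command arguments (assumptions, not from the thread).
    argv = build_ssh_command("3paradm", "f400.example.com", "statvv -iter 1")
    for block in collect_samples(run_subprocess, argv, samples=30, interval_s=60):
        print(block)
```

Starting this from cron or Task Scheduler shortly before the problem window and redirecting the output to a file would leave a record to review the next morning.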

We have found Solarwinds beneficial in this particular case in pinpointing a possible culprit.


 Post subject: Re: which vv or host contributed to high service times?
PostPosted: Tue Oct 28, 2014 11:42 pm 
Site Admin

Joined: Tue Aug 18, 2009 10:35 pm
Posts: 1328
Location: Dallas, Texas
Try using Hi-res to view port perf, for DISK ports, compared by N:S:P... see if any particular loops spike.

Do the same for front end ports, compared by N:S:P... looking for hosts that are NOT using round robin correctly is pretty difficult.

If you find a spike on the back end, you can track it down to a shelf, and probably to a PD, with a PD perf report limited to the drives on the spiked loop. You should be able to ssh to the InServ and pull detailed logs to see if there were LESB errors, bad chunklets being swapped, etc.

If you find a spike on the front end, you can build a list of hosts zoned to that FE port, and work your way down... VLUN perf, limited to 1 host, compared by N:S:P. The lines should be close to on top of each other if round robin is set up properly.
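To put a number on "close to on top of each other", here is a rough Python sketch that takes one host's IOPS per front end port and reports how uneven the paths are. The 20% tolerance and the sample figures are made up for illustration; only the port names come from this thread.

```python
from typing import Dict


def path_imbalance(iops_by_port: Dict[str, float]) -> float:
    """Return (max - min) / mean of per-path IOPS; 0.0 means perfectly even."""
    values = list(iops_by_port.values())
    mean = sum(values) / len(values)
    if mean == 0:
        return 0.0
    return (max(values) - min(values)) / mean


def looks_round_robin(iops_by_port: Dict[str, float], tolerance: float = 0.2) -> bool:
    """True when the spread across paths is within the (assumed) tolerance."""
    return path_imbalance(iops_by_port) <= tolerance


# Hypothetical per-port IOPS for two hosts, as read off a VLUN perf report.
balanced_host = {"2:1:1": 250, "2:1:2": 245, "3:1:1": 255, "3:1:2": 250}
skewed_host = {"2:1:1": 900, "2:1:2": 50, "3:1:1": 25, "3:1:2": 25}

if __name__ == "__main__":
    for name, ports in (("balanced", balanced_host), ("skewed", skewed_host)):
        print(name, round(path_imbalance(ports), 2), looks_round_robin(ports))
```

A host that sends nearly everything down one path, like the second sample, is the kind that is worth checking for a multipath policy other than round robin.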

_________________
Richard Siemers
The views and opinions expressed are my own and do not necessarily reflect those of my employer.


 Post subject: Re: which vv or host contributed to high service times?
PostPosted: Wed Oct 29, 2014 11:15 am 

Joined: Thu May 08, 2014 4:43 pm
Posts: 62
Richard Siemers wrote:
Try using Hi-res to view port perf, for DISK ports, compared by N:S:P... see if any particular loops spike.


Very interesting. Looking at 00:21 yesterday morning, yes, indeed: 3:2:3 and 2:2:3 are at 1874 and 1901 IOPS respectively, whereas the remaining six ports are each under 1000 IOPS ( 970, 960, 947, 964, 947, 950 ).

Bandwidth on 3:2:3 and 2:2:3 is 92,000 and 95,000; the remaining six are around 38,000 each.

Service time on 3:2:3 and 2:2:3 is 36ms and 45ms respectively; the others average 5ms each.

Average Busy on 3:2:3 and 2:2:3 is nearly 100% ( 96% and 97% ); the other six are in the low 80% range.

A glance at the colors and I see RED 3:2:3 and GREEN 2:2:3 consistently above the rest. Is this a rebalance issue?
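The same comparison can be done without eyeballing the graph. This is a small Python sketch that flags any port whose value is more than 1.5 times the median across all ports. The 1.5 factor is an arbitrary assumption, and the names `other-1` to `other-6` are placeholders because the remaining six port IDs are not listed here.

```python
from statistics import median
from typing import Dict, List


def flag_hot_ports(metric_by_port: Dict[str, float], factor: float = 1.5) -> List[str]:
    """Return the ports whose metric exceeds `factor` times the median, sorted by name."""
    baseline = median(metric_by_port.values())
    return sorted(p for p, v in metric_by_port.items() if v > factor * baseline)


# IOPS per disk port at 00:21, from the figures quoted above.
iops = {
    "3:2:3": 1874, "2:2:3": 1901,
    "other-1": 970, "other-2": 960, "other-3": 947,
    "other-4": 964, "other-5": 947, "other-6": 950,
}

# Service time in ms; the six other ports are entered at their stated average of 5ms.
service_time_ms = {
    "3:2:3": 36, "2:2:3": 45,
    "other-1": 5, "other-2": 5, "other-3": 5,
    "other-4": 5, "other-5": 5, "other-6": 5,
}

if __name__ == "__main__":
    print("hot by IOPS:", flag_hot_ports(iops))
    print("hot by service time:", flag_hot_ports(service_time_ms))
```

With these numbers both checks single out 2:2:3 and 3:2:3, which matches what the colors on the graph show.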

Edit: adding to this,
cage4, loop A 3:2:3, loop B, 2:2:3 have both FC and NL disk
cage10, loop A 3:2:3, loop B, 2:2:3 have both FC and NL disk

Of the 12 cages shown, our NL disk is only in those two cages, 4 and 10. Related???

Question: would a TUNE on the CPG where the database lives (the one that seems to be most affected, currently RAID 1), tuning it to a RAID 5 CPG, help?

Richard Siemers wrote:
Do the same for front end ports, compared by N:S:P... looking for hosts that are NOT using round robin correctly is pretty difficult.


Is that the host port type, versus the disk port type in the previous report?
Port Types : host ; Port Rates : --All Port Rates-- ; Ports (n:s:p) : --All Ports-- ; Compare : n:s:p
Select Peak : total_iops

Those all look fairly balanced...only seeing 2:1:1, 2:1:2, 3:1:1, and 3:1:2



 Post subject: Re: which vv or host contributed to high service times?
PostPosted: Wed Oct 29, 2014 12:10 pm 

Joined: Thu May 08, 2014 4:43 pm
Posts: 62
Does showportlesb show cumulative information? Not exactly sure how to interpret it. I see large numbers for LossSync and InvWord on all controllers. I see occasional double digits on LinkFail for some PDs on some controllers, but mostly 0 or 1.

Loop <2:2:3> Time since last save: 144:01:37
ID ALPA LinkFail LossSync LossSig PrimSeq InvWord InvCRC
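If the counters are cumulative since the last save (the "Time since last save" header suggests so, but that is an assumption), the absolute numbers matter less than how much they grow across the problem window. Below is a Python sketch that diffs two captured outputs. The column names are the ones in the header above; the row layout, device IDs, ALPA values, and counts in the sample are hypothetical.

```python
from typing import Dict

COLUMNS = ["ID", "ALPA", "LinkFail", "LossSync", "LossSig", "PrimSeq", "InvWord", "InvCRC"]
COUNTERS = COLUMNS[2:]


def parse_rows(text: str) -> Dict[str, Dict[str, int]]:
    """Map device ID -> {counter: value} for every line that looks like a data row."""
    rows = {}
    for line in text.splitlines():
        fields = line.split()
        # A data row has one field per column and numeric counters; skip everything else.
        if len(fields) != len(COLUMNS) or not all(f.isdigit() for f in fields[2:]):
            continue
        rows[fields[0]] = {name: int(v) for name, v in zip(COUNTERS, fields[2:])}
    return rows


def counter_deltas(before: Dict[str, Dict[str, int]],
                   after: Dict[str, Dict[str, int]]) -> Dict[str, Dict[str, int]]:
    """Return only the counters that changed between two snapshots, per device."""
    deltas = {}
    for dev, counts in after.items():
        old = before.get(dev, {})
        changed = {k: v - old.get(k, 0) for k, v in counts.items() if v != old.get(k, 0)}
        if changed:
            deltas[dev] = changed
    return deltas


HEADER = "ID ALPA LinkFail LossSync LossSig PrimSeq InvWord InvCRC"
# Hypothetical snapshots taken before and after the problem window.
snapshot_before = HEADER + "\npd40 0xe8 1 120 0 0 3400 0\npd41 0xe4 0 118 0 0 3390 0"
snapshot_after = HEADER + "\npd40 0xe8 1 120 0 0 3400 0\npd41 0xe4 2 130 0 0 3395 0"

if __name__ == "__main__":
    print(counter_deltas(parse_rows(snapshot_before), parse_rows(snapshot_after)))
```

Devices whose counters do not move between the two captures drop out of the result, so large but static LossSync or InvWord values would not show up.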


 Post subject: Re: which vv or host contributed to high service times?
PostPosted: Wed Oct 29, 2014 10:56 pm 
Site Admin

Joined: Tue Aug 18, 2009 10:35 pm
Posts: 1328
Location: Dallas, Texas
mohuddle wrote:
Edit: adding to this,
cage4, loop A 3:2:3, loop B, 2:2:3 have both FC and NL disk
cage10, loop A 3:2:3, loop B, 2:2:3 have both FC and NL disk

Of the 12 cages shown, our NL disk is only in those two cages, 4 and 10. Related???

Question: would a TUNE on the CPG where the database lives (the one that seems to be most affected, currently RAID 1), tuning it to a RAID 5 CPG, help?



Yes, I would speculate that you have a hardware imbalance, especially if cage 4 and cage 10 are in a single daisy chain (which it sounds like they are from the above, both being on 3:2:3 and 2:2:3) and contain ALL of your system's NL drives, as stated.

You said F400, 8 total backend ports, and 12 cages... from all that info, I assume you have 4 nodes, 2 loops each... which means you should have 3 cages per loop, yet we only see 2 cages above on those loops.

Ideally, you should have equal counts of NL and FC spindles on each loop.

Basically, and oversimplified: when you run backups, which I assume write to your NL drives, all 8 loops of FC disks are being utilized for read IO, and 2 of those 8 have the extra duty of doing all of the writes to the NL drives. Does that sound like an accurate summary?

If all the above is true, then tune will not help you. You will need to work with HP to rebalance the NL disks across all of your loops, which can be slow and difficult.

_________________
Richard Siemers
The views and opinions expressed are my own and do not necessarily reflect those of my employer.


 Post subject: Re: which vv or host contributed to high service times?
PostPosted: Wed Oct 29, 2014 11:01 pm 
Site Admin

Joined: Tue Aug 18, 2009 10:35 pm
Posts: 1328
Location: Dallas, Texas
To further drill into your response time issue, I think you can go 1 step further, PD perf, limited to 3:2:3 and 2:2:3 and compare by DISK TYPE (FC vs NL).

Pretty sure it will be the NL causing the alarm.
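If you export the PD numbers instead of graphing them, the FC vs NL comparison is just a group-by. Here is a minimal Python sketch; the sample values are invented, and it takes a plain unweighted mean per disk type, which may differ from how the report itself aggregates.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def mean_service_time_by_type(samples: Iterable[Tuple[str, float]]) -> Dict[str, float]:
    """Average service time (ms) per disk type from (disk_type, service_time_ms) pairs."""
    totals = defaultdict(lambda: [0.0, 0])  # disk_type -> [sum of ms, sample count]
    for disk_type, svct_ms in samples:
        totals[disk_type][0] += svct_ms
        totals[disk_type][1] += 1
    return {t: s / n for t, (s, n) in totals.items()}


# Hypothetical PD samples from the drives behind 3:2:3 and 2:2:3.
pd_samples = [("FC", 6.0), ("FC", 8.0), ("NL", 40.0), ("NL", 50.0)]

if __name__ == "__main__":
    print(mean_service_time_by_type(pd_samples))
```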

_________________
Richard Siemers
The views and opinions expressed are my own and do not necessarily reflect those of my employer.


 Post subject: Re: which vv or host contributed to high service times?
PostPosted: Thu Oct 30, 2014 10:17 am 

Joined: Thu May 08, 2014 4:43 pm
Posts: 62
Edit: We only have TWO CONTROLLER NODES

Linking img of cage setup. Does this confirm what you suspected in previous email?

Thanks a ton for your help, it is much appreciated.

Image


Last edited by mohuddle on Thu Oct 30, 2014 2:38 pm, edited 2 times in total.

 Post subject: Re: which vv or host contributed to high service times?
PostPosted: Thu Oct 30, 2014 11:03 am 

Joined: Thu May 08, 2014 4:43 pm
Posts: 62
I'm considering a look at QoS on the F400 as a short-term patch.

Ultimately, the idea is to use a temp Peer Motion license to move some of these hosts over to our new 7400, and/or map new LUNs from the 7400 and move the data.

