Exported volumes performance graph

Yavor · Post by **Yavor** » Fri Jun 17, 2022 8:14 am

Hello guys,
I am wondering what can cause excessive service time while the IOPS and bandwidth are stable.
Does anyone have an explanation for how that is possible, and what I can check to diagnose or avoid it?

MammaGutt · Post by **MammaGutt** » Fri Jun 17, 2022 8:48 am

Could you shed some more light on this?

Is this graph for the entire system?
What system is it and what drives?
How is CPU on the system?

My first guess would be that this is statvlun and you have host/fabric issues. Or that this is frontend performance and you have something totally different going on in the backend.

Yavor · Post by **Yavor** » Mon Jun 20, 2022 4:45 am

Hello MammaGutt,

This is a graph of all the exported VVs, so you could say it is a front-end view.
The system is an 8400 and the disks are 15TB SSDs.
Strangely, CPU utilization was at 87% between the 16th and 20th of May (the period of the high service time), for some reason.
Can you be more specific about what you mean by host/fabric issues?
As there is no visible cause for the high service time, I am wondering how we can drill down to the root cause. If we had some VV or VVset being hit with too many IOPS or too large block sizes ...
but everything seems relatively level and only the service time spikes.
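
One way to reason about flat IOPS combined with a spiking service time is Little's law: the average number of I/Os in flight equals the I/O rate multiplied by the average service time. If IOPS stays level while service time rises, more I/Os must be queued somewhere between host and disk. A minimal sketch in Python, where the function name and all figures are hypothetical and not taken from this system's graphs:

```python
def outstanding_ios(iops: float, service_time_ms: float) -> float:
    """Little's law: mean I/Os in flight = I/O rate * mean service time."""
    return iops * (service_time_ms / 1000.0)


# Hypothetical figures for illustration only (not measured on this array).
normal = outstanding_ios(iops=50_000, service_time_ms=1.0)
spike = outstanding_ios(iops=50_000, service_time_ms=8.0)

print(f"normal period: ~{normal:.0f} I/Os in flight")
print(f"spike period:  ~{spike:.0f} I/Os in flight")
```

The same throughput with a higher service time means a deeper queue, so the question becomes where that queue sits: host, fabric, node CPU, or back end.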

Post by **Richard Siemers** » Mon Jun 20, 2022 11:30 am

What services are you using, such as dedupe, compression, snapshots, or replication? Were you compacting a CPG by chance? Zero in on which services were causing the CPU spike.

MammaGutt · Post by **MammaGutt** » Mon Jun 20, 2022 12:15 pm

Yavor wrote: Hello MammaGutt,

This is a graph of all the exported VVs, so you could say it is a front-end view.
The system is an 8400 and the disks are 15TB SSDs.
Strangely, CPU utilization was at 87% between the 16th and 20th of May (the period of the high service time), for some reason.
Can you be more specific about what you mean by host/fabric issues?
As there is no visible cause for the high service time, I am wondering how we can drill down to the root cause. If we had some VV or VVset being hit with too many IOPS or too large block sizes ...
but everything seems relatively level and only the service time spikes.

By host I mean CPU through the roof or a hardware issue.
By fabric I mean congested ISLs or hardware issues.

The 3PAR 8400 isn't a very powerful controller node. If you have a lot of the 15TB SSDs, you very quickly run out of horsepower.... Considering that the price difference between an 8400 and an 8440/8450 (significantly more CPU, significantly more cache) is somewhere in the area of maybe two of those SSDs (or at least a low single-digit percentage), I just don't get it.

As Richard asks, are you using dedupe or compression? If yes, I'm pretty sure your node CPU and backend IOPS look totally different from your frontend and match your peaks.

What 3PAR OS version?

Yavor · Post by **Yavor** » Fri Jun 24, 2022 9:32 am

Hello again,
We are running 3.3.1 MU5, patched up to P156.
Replication, compression, snapshots and deduplication are not used.
The fabrics did not have congestion.
The hosts (~250) are mainly ESXi, so there is no excessive CPU utilization.
We did not run compaction manually, so unless some automatic compaction was running, that is not the cause.
Which report would be most representative of the load on the back end?

HPE Storage Users Group