HPE Storage Users Group

A Storage Administrator Community




 Post subject: Re: is a 14+2 config on an 8440 with 32 drives valid?
PostPosted: Sat Mar 24, 2018 5:26 pm 

Joined: Fri Jan 20, 2017 9:39 am
Posts: 58
I think I'm more confused than when I started :) I'm close to understanding, but something between what you're saying and what I took from discussions with others isn't matching up.

Below are some scenarios based on this discussion. I've tried to explain what's going on as best I can, but there are a couple of scenarios I don't know the answer to. (I've also tried to capture the arithmetic in a quick sketch after the list.)

In all cases I'm most interested in what happens when there's a drive failure. Let's assume all my volumes are thin/dedupe, so if something needs to grow I need to be able to allocate space. If there's a controller failure I assume there's no issue other than a performance hit.


  1. 2 nodes, 32 drives, 14+2 > no express layout, because each node owns 16 drives. 14+2 requires 16 drives, which doesn't trigger express layout (per the 3PAR docs, a set size needs >50% of the drives available on a node for express layout to be used). If a drive fails I'm not sure what happens in this case. Can CPGs/VVs/LDs continue to grow without issue? I can see a node not having enough drives here, so I think I'm in big trouble.


  2. 2 nodes, 16 drives, 14+2 > each node owns 8 drives, so to create a 14+2 the node will need to share with its partner, hence the 3PAR will do an express layout. If I lose a drive I'm in big trouble, because I need 16 drives for a full stripe but only have 15 in the system. I won't be able to grow any CPGs/LDs/VVs in this situation, and the array will set everything offline the next time something needs to grow, to avoid corrupting my data.


  3. 2 nodes, 24 drives, 14+2 > each node owns 12 drives. 14+2 is >50% of the drives available on the node, so express layout is used. Because a 16-drive set doesn't divide evenly into the drives available, I'll have chunklets overlapping on some disks, which isn't ideal for performance. Will I have a problem if I lose a drive?

    The optimal set size in this scenario would be 4+2 (or 5+1), as those divide evenly into 12, and there's no danger in terms of growth if a drive were to fail using these smaller set sizes.

  4. 4 nodes, 32 drives, 14+2 > each node owns 8 drives, so express layout is needed. Now this is where I'm getting confused. Is the node pairing 0-1 and 2-3, or 0-2 and 1-3? (i.e. is the partner node in the same chassis or the other chassis?)

    To make the 14+2, each node takes 8 drives it controls and 8 drives its partner controls.

    If a drive fails and things need to grow, will I be OK or will I have a problem? From your reply it seems like things will be OK, but performance might be degraded, since only 1 of 2 nodes can service the I/O if an LD/CPG needs to grow?

  5. 4 nodes, 32 drives, 6+2 (or 7+1) > no express layout, as each node owns the exact number of drives needed to build the RAID set. If there's a disk failure will I have a problem? In my mind I see the node not having enough drives to meet the RAID size, but I think the fact that there's a partner node might come into play here, so I'm not sure what the answer is.
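
To keep my own arithmetic straight, here's a rough sketch of how I currently understand the drive-count rules. The logic is my own assumption pieced together from this thread (a full stripe of set-size chunklets has to land on distinct drives within one node pair), not anything official:

Code:
# Sketch of the drive-count arithmetic as I understand it -- my
# assumptions from this thread, not official 3PAR behaviour.

def check(nodes, drives, data, parity):
    set_size = data + parity             # e.g. 14+2 -> 16 chunklets per stripe
    per_pair = drives // (nodes // 2)    # drives cabled to each node pair
    fits = per_pair >= set_size                # can a full stripe be built at all?
    after_fail = (per_pair - 1) >= set_size    # ...and after losing one drive?
    print(f"{nodes} nodes / {drives} drives / {data}+{parity}: "
          f"{per_pair} drives per pair, fits={fits}, "
          f"still fits after 1 failure={after_fail}")

check(2, 32, 14, 2)   # scenario 1
check(2, 16, 14, 2)   # scenario 2
check(2, 24, 14, 2)   # scenario 3
check(4, 32, 14, 2)   # scenario 4
check(4, 32, 6, 2)    # scenario 5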


A couple of other questions that came to mind while trying to understand all this:

If things weren't thin, would all this be a non-issue since the space would have been allocated already?

What's the point of having all the spare chunklets if the loss of a disk can take down the array? I would have thought the spare chunklets would take over for the failed disk, and the RAID stripe would be rebuilt using the spare space set aside (since the 3PAR sets aside enough spare space to cover an entire disk).

Thanks


 Post subject: Re: is a 14+2 config on an 8440 with 32 drives valid?
PostPosted: Sat Mar 24, 2018 7:49 pm 

Joined: Wed Nov 19, 2014 5:14 am
Posts: 505
OK, so I won't go through every scenario, but the important thing to remember is the minimum disks per node pair, not whether it's express layout or not. When you create the CPG you set out a stripe size that all growth associated with that CPG must honour. In the case of 14+2 that means 14 data and 2 parity must, as a minimum, sit on separate physical drives.

Prior to express layout, each node in a system would take ownership of half the disks: node 0 the odd-numbered disks, node 1 the even-numbered disks. Each node would then build LDs across its assigned disks to support volumes assigned to a CPG. In normal operation the LDs would be owned by that node. Also keep in mind drives are cabled to node pairs, so they can be failed over, or new LDs can be created on drives belonging to the other node.

As such, a 2-node system with 32 disks would assign 16 disks per node. It would then assign chunklets to LDs based on the CPG stripe size, which ideally would divide evenly into 16 drives, so 6+2 and 14+2 would both be valid. However, what this meant from a disk perspective is that you required at least 2x the CPG stripe size, since each node built LDs on different disks, meaning you needed at least 32 drives (16 per node).

Express layout changed this by allowing both nodes to build LDs on the same disks concurrently. So in the above scenario those same stripe sizes are equally valid with half the disks or fewer, i.e. 8 or 16 per system respectively, which is important given the cost of flash media. This has a few benefits: a smaller starting point (half the drives), and more importantly smaller upgrades and a higher queue depth per disk, since both controllers are active on every disk.

The problem occurs, however, when the stripe size you specify on the CPG spans all available disks in a single node pair, e.g. I have 2 nodes, 16 disks total, and I want to use 14+2. In that scenario I need an absolute minimum of 16 disks available to ensure I can write 14 data and 2 parity all to independent disks. So if I only have 16 disks and one fails, I'm down to 15, which means I can no longer write 14 data and 2 parity to independent disks on that node pair.

However, if I have more disks than the required stripe size it's not a problem, so using 16 as a baseline we'd be fine with 18 or greater, since a disk failure would not drop me below 16 drives.

The alternative is to use a smaller stripe size, to ensure the system still has enough disks to spread both data and parity to independent disks even after a failure, e.g. 10+2 or smaller would ensure I always had enough disks even after a failure.

As explained above for 14+2, if you have more than 1x the stripe size of physical disks, e.g. 18, 20, 22, 24, 26, 28, 30, 32 etc. per node pair, then you will always have more than the required minimum number of disks to service the stripe size of the CPG.

All of this happens per node pair, so 4 controllers would double the physical disk requirements. But it's important to note it's only applicable if the required stripe size is equal to the number of physical disks per node pair. If you have more disks then you're good, so this is typically only an issue on systems with minimal disks, or very wide stripe sizes combined with few disks.

Assuming you don't want to lose chunklets, or you're going for cage HA, then:

2 nodes, 16 disks: 6+2 is preferred.
2 nodes, 32 disks: 14+2 or 6+2 is OK.
4 nodes, 32 disks: 6+2 would be preferred.
4 nodes, 48 disks: 10+2 or 6+2 is OK.
4 nodes, 64 disks: 14+2 or 6+2 is OK.

There are lots of combinations depending on node count, cage or mag level, etc., but just ensure each node pair has a larger number of disks than the required stripe size.
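
To put that rule in concrete terms, here's a quick sketch of the minimum drives per node pair for common RAID 6 set sizes. The round-up-to-an-even-drive-count step is my assumption (drives are generally added in balanced pairs), not an official formula:

Code:
# Minimum drives per node pair to survive one failure at a given set
# size -- the even-count round-up is my assumption, not official.

def min_drives_per_pair(data, parity):
    set_size = data + parity
    need = set_size + 1        # one drive of headroom so a failure still leaves a full stripe
    return need + (need % 2)   # round up to an even drive count

for data, parity in [(4, 2), (6, 2), (8, 2), (10, 2), (14, 2)]:
    print(f"{data}+{parity}: at least {min_drives_per_pair(data, parity)} drives per node pair")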

Just as an aside, if I'm sizing these systems I typically stick to 4+2 for very small systems and 6+2 or 8+2 for anything larger. 10+2 and 14+2, although having less parity overhead, can make upgrades very expensive, especially where multiple nodes and cage HA are requirements.


 Post subject: Re: is a 14+2 config on an 8440 with 32 drives valid?
PostPosted: Sun Mar 25, 2018 3:45 am 

Joined: Mon Sep 21, 2015 2:11 pm
Posts: 1570
Location: Europe
Just a comment to confuse things even more. Keep in mind that when the 3PAR is allocating space, all it thinks about is chunklets. So in a 16-SSD, 2-node system with 14+2, all is good. If an SSD fails you will only have 15 drives, and it will not be able to do HA cage or HA mag (disk), so it should do HA chunklet. That isn't a good place to be, as two chunklets in the same stripe will be on the same SSD, but you are still n+2, so you will still survive a second disk failure. This is basically the same for all allocations on the 3PAR: if it can't do HA cage it will try the next best thing (HA mag) and throw an event saying it wasn't able to grow as you configured, but it grew with degraded CPG specs rather than coming to a stop. Same with capacity: if SSD is full and you have FC (or, worst case, NL) it will use that rather than stopping.

I know this (the fallback to HA chunklet) didn't work flawlessly in the first releases with express layout, but from what I understand it should be OK now....
With that in mind I'm not a big fan of a single express layout stripe, but I see why many people do it from a cost perspective.
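
Roughly, the fallback I'm describing behaves like this. Just an illustration of the ordering; the names and structure are mine, not actual 3PAR internals:

Code:
# Illustration of the allocation fallback described above -- names
# and structure are mine, not actual 3PAR internals.

HA_ORDER = ["cage", "mag", "chunklet"]    # most to least resilient placement

def grow_ld(requested_ha, achievable):
    """Degrade to the next HA level rather than stopping, and raise
    an event when growth violates the configured CPG settings."""
    for ha in HA_ORDER[HA_ORDER.index(requested_ha):]:
        if achievable(ha):
            if ha != requested_ha:
                print(f"event: grew with HA {ha} instead of configured HA {requested_ha}")
            return ha
    raise RuntimeError("no valid placement left")

# 16-SSD node pair with one failed drive: 14+2 can no longer be placed
# on 16 distinct drives (HA mag), so growth falls back to HA chunklet.
grow_ld("mag", lambda ha: ha == "chunklet")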



 Post subject: Re: is a 14+2 config on an 8440 with 32 drives valid?
PostPosted: Sun Mar 25, 2018 2:49 pm 

Joined: Fri Jan 20, 2017 9:39 am
Posts: 58
So, if I understand correctly what JohnMH and MammaGutt are saying...

If I have a 4-node system with 32 drives and do 14+2 with mag availability (mag availability is all SSMC will let me do), then if a drive fails I will only have 15 drives on that node pair, so I can't write each block to a different disk and it would violate my availability settings. Nothing would happen to my data and I could keep growing without issue, but some disks will hold 2 parts of that RAID stripe instead of just one.

Once I replace the drive, does the 3PAR put things back the way they should be, with only one piece of the RAID stripe per disk, or will I be left with some disks keeping the two pieces that were written while the drive was missing?


 Post subject: Re: is a 14+2 config on an 8440 with 32 drives valid?
PostPosted: Mon Mar 26, 2018 2:55 am 

Joined: Wed Nov 19, 2014 5:14 am
Posts: 505
With 4 nodes it's probably less of an issue, as the other node pair can take up the slack until the disk is replaced. On a 2-node system you don't want to get yourself into this situation, so if possible avoid it by using a smaller stripe size.

I'd assume, but have never tried it, that the spared-out data will be moved back when the drive is replaced, but new data (assuming it's significant) may require a tunesys or similar operation to re-balance the data.


 Post subject: Re: is a 14+2 config on an 8440 with 32 drives valid?
PostPosted: Mon Mar 26, 2018 3:54 am 

Joined: Mon Sep 21, 2015 2:11 pm
Posts: 1570
Location: Europe
Kyle wrote:
Once I replace the drive, does the 3PAR put things back the way they should be, with only one piece of the RAID stripe per disk, or will I be left with some disks keeping the two pieces that were written while the drive was missing?


With a 4-node system, I'm not sure if it will prioritize putting all new writes behind the node pair that still has 16 drives (since there it is able to follow the CPG settings) or if it will balance them between both node pairs.

If it does balance it, I would not assume that it will fix the uneven chunklets by itself. Servicemag will move all chunklets off the failed SSD and onto spare chunklets on other SSDs. When it grows with degraded parameters, those chunklets are not put into spare chunklets, so they shouldn't be automatically moved back after SSD replacement. So here tunesys would be your friend. "showld -d" will show the availability of the LDs, and if it isn't mag or cage you would need to tune (either just the affected LDs or the entire system, depending on how many commands you want to run).


