2x 3Par7400 3.1.3MU1 Remote Copy Group Switchover issue

phoglind
Posts: 18
Joined: Thu Jun 12, 2014 3:01 pm

Post by phoglind »

Hi,

I'm new to 3PAR concepts (though I have been running a series of Enterprise Virtual Arrays and am used to zoning configurations in SAN switches).

We are building a new proof-of-concept site on 2x 3PAR 7400 arrays (updated to 3.1.3 MU1 today) and 4x VMware vSphere 5.1 nodes in a Metro Cluster config stretched between two datacenters.

After setting up VVs in a Remote Copy group, following best practice for zoning and paths, everything seemed OK at first.

Looking at the datastore in VMware, I have half the paths Active/Active to the primary volume and the other half as Standby to the secondary volume.

Now things get weird: when I run "setrcopygroup switchover [groupname]", the secondary volume gets promoted to primary and the ex-primary is demoted to secondary. Everything still looks good in IMC.

But when I check my four VMware hosts, the datastores are gone on all four!
Eventually they appear again, but with active paths to the ex-primary volumes, and then they disappear again.

If I unexport the ex-primary volume (now the secondary volume), all paths fail over to the recently promoted volume and all datastores mount properly again. (At least the zoning config can now be ruled out of further troubleshooting.)

Has anybody out here seen the same thing happen before?

Kind Regards,
Peter
phoglind
Posts: 18
Joined: Thu Jun 12, 2014 3:01 pm

Re: 2x 3Par7400 3.1.3MU1 Remote Copy Group Switchover issue

Post by phoglind »

Looks like the path management policy is OFF by default when creating new RCGs in 3.1.3.
My understanding is that it was ON by default in 3.1.2.

# setrcopygroup pol [path_management | no_path_management] <group>
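
A minimal sketch of applying this to several groups at once, assuming you already have each group's policy list written down (for example, copied by hand from the array). The group names and the dictionary layout are hypothetical; only the "setrcopygroup pol path_management <group>" syntax comes from the line above. The script only prints commands and never talks to the array.

```python
# Hypothetical inventory: group name -> policies currently set on it.
GROUPS = {
    "rcg_vmware_01": ["auto_recover"],
    "rcg_vmware_02": ["auto_recover", "path_management"],
}


def path_management_commands(groups):
    """Return one CLI command per group that lacks path_management."""
    commands = []
    for name in sorted(groups):
        # Exact list membership, so "no_path_management" does not count.
        if "path_management" not in groups[name]:
            commands.append(f"setrcopygroup pol path_management {name}")
    return commands


if __name__ == "__main__":
    for command in path_management_commands(GROUPS):
        print(command)
```

Printing the commands rather than running them keeps a human in the loop, which seems sensible for a change that affects host paths.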

I wonder if this also applies to Windows Server 2012 R2?
It would be nice to stretch such a cluster as well.

//Peter
bajorgensen
Posts: 142
Joined: Wed May 07, 2014 10:29 am

Re: 2x 3Par7400 3.1.3MU1 Remote Copy Group Switchover issue

Post by bajorgensen »

They changed the path management policy in 3.1.3.

setrcopygroup

Really, really silly change by HP. Our first 3.1.3 upgrade did not go well...
phoglind
Posts: 18
Joined: Thu Jun 12, 2014 3:01 pm

Re: 2x 3Par7400 3.1.3MU1 Remote Copy Group Switchover issue

Post by phoglind »

Agreed, it's annoying that you need to set this manually.

Anyway, it's good to get under the hood of the system right away and learn all the ifs and buts.

I will run a new test drive tomorrow; hopefully everything will go smoothly and I can demonstrate things for my team.

Does anybody know if 3.1.3 officially supports a stretched Microsoft Cluster Service in the same way as VMware Metro Storage Cluster?

//Peter
bajorgensen
Posts: 142
Joined: Wed May 07, 2014 10:29 am

Re: 2x 3Par7400 3.1.3MU1 Remote Copy Group Switchover issue

Post by bajorgensen »

You need Cluster Extension for Windows to do that.

http://h20565.www2.hp.com/portal/site/h ... .492883150
Davidkn
Posts: 237
Joined: Mon May 26, 2014 7:15 am

Re: 2x 3Par7400 3.1.3MU1 Remote Copy Group Switchover issue

Post by Davidkn »

Is this something you have to change on the CLI, or is there a check box in the GUI we now need to tick?

Do you have the Peer Persistence license? Are you replicating synchronously using RCFC?

Also make sure that when you present the volumes from the primary and DR arrays, the source and DR volumes are presented using the same LUN number.

Have you thought about using the quorum witness on a third site to give you automated failover?
phoglind
Posts: 18
Joined: Thu Jun 12, 2014 3:01 pm

Re: 2x 3Par7400 3.1.3MU1 Remote Copy Group Switchover issue

Post by phoglind »

About Cluster Extension: I was hoping they would support MSCS without any extra licenses beyond the Replication Suite and Peer Persistence in 3PAR.

The path management policy seems to have changed in 3.1.3, and you need to set it through the CLI on all VMware RC groups.

And yes, we have the Peer Persistence license and will replicate synchronously.
All primary/secondary volumes also need to have the same WWN and be exported with the same LUN ID; otherwise you will end up with multiple devices when scanning the HBAs.
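
A rough way to check this before a switchover, sketched under the assumption that you have noted the WWN and LUN ID each array reports for every volume pair. The record layout, volume names, and WWN values below are made up for illustration.

```python
from typing import NamedTuple


class Export(NamedTuple):
    wwn: str  # volume WWN as reported by the array
    lun: int  # LUN ID used when exporting to the hosts


# Hypothetical data: volume name -> (export on primary array, export on secondary array).
PAIRS = {
    "vv_ds01": (Export("60002AC0000000000000000A", 10),
                Export("60002AC0000000000000000A", 10)),
    "vv_ds02": (Export("60002AC0000000000000000B", 11),
                Export("60002AC0000000000000000B", 12)),
}


def find_mismatches(pairs):
    """Return a description of every pair whose WWN or LUN ID differs."""
    problems = []
    for name in sorted(pairs):
        primary, secondary = pairs[name]
        if primary.wwn.lower() != secondary.wwn.lower():
            problems.append(f"{name}: WWN differs")
        if primary.lun != secondary.lun:
            problems.append(
                f"{name}: LUN ID differs ({primary.lun} vs {secondary.lun})"
            )
    return problems


if __name__ == "__main__":
    for problem in find_mismatches(PAIRS):
        print(problem)
```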

We have implemented Quorum Witness on a third site and will add automatic failover protection for the necessary volumes.
apol
Posts: 267
Joined: Wed May 07, 2014 1:51 am

Re: 2x 3Par7400 3.1.3MU1 Remote Copy Group Switchover issue

Post by apol »

AFAIK the path management policy was newly introduced with 3.1.3. The 3.1.2 microcode always enabled ALUA path management.

That policy and its default state "off" are especially painful if you update from 3.1.2 and have only manual transparent failover implemented: the 3.1.3 installer enables path management only for those remote copy groups where the policy auto_failover is enabled. Watching your ESX clusters crash is no fun at all... The solution is to activate auto_failover for your remote copy groups, even if there is no quorum witness, and deactivate it again after you have successfully upgraded to 3.1.3.
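
The workaround can be sketched as two lists of commands, one to run before the upgrade and one after. This is only an illustration: the group names are hypothetical, and the auto_failover / no_auto_failover keywords are assumed to follow the same "setrcopygroup pol" pattern as path_management / no_path_management quoted earlier in the thread, so check them against the CLI help for your release.

```python
def upgrade_plan(groups):
    """Build the pre- and post-upgrade command lists.

    groups: group name -> True if auto_failover is already enabled.
    Groups that already have it enabled are left alone in both phases.
    """
    pre, post = [], []
    for name in sorted(groups):
        if not groups[name]:
            # Assumed policy keywords; verify on the array before use.
            pre.append(f"setrcopygroup pol auto_failover {name}")
            post.append(f"setrcopygroup pol no_auto_failover {name}")
    return pre, post


if __name__ == "__main__":
    # Hypothetical groups: one manual-failover group, one already automatic.
    before, after = upgrade_plan({"rcg_manual": False, "rcg_auto": True})
    print("Before the upgrade:")
    for command in before:
        print("  " + command)
    print("After the upgrade:")
    for command in after:
        print("  " + command)
```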
When all else fails, read the instructions.
apol
Posts: 267
Joined: Wed May 07, 2014 1:51 am

Re: 2x 3Par7400 3.1.3MU1 Remote Copy Group Switchover issue

Post by apol »

Also make sure that when you present the volumes from the primary and dr array, that the source and dr volumes are presenting using the same lun number.


Did anybody try without ensuring the same LUN numbers? Only one of the PDFs about Peer Persistence includes this requirement, and the HP guys I asked weren't aware of it. Exporting primaries and secondaries with different LUN numbers works, and ESX sees half of the paths as active and the others as standby. But to be honest, I corrected the LUN IDs and didn't dare to switch over with different ones...
phoglind
Posts: 18
Joined: Thu Jun 12, 2014 3:01 pm

Re: 2x 3Par7400 3.1.3MU1 Remote Copy Group Switchover issue

Post by phoglind »

I've just tested adding new volumes to a new remote copy group with only the "auto_recover" option set.

Enabling this new RCG in Peer Persistence as an "automatic_failover" group does not add the path_management policy to the RCG by default. You still need to set that option manually through the CLI.

I'll bet that many upgrades will fail on this.

Why not implement a checkbox for "path_management" in the advanced settings when creating new remote copy groups?