3par Support?

hdtvguy
Posts: 576
Joined: Sun Jul 29, 2012 9:30 am

Re: 3par Support?

Post by hdtvguy »

Richard Siemers wrote:So 3PAR has a concept of consistency groups, and does take consistent group snaps of volumes (at the source) for periodic replication... your issue with it is that if the periodic replication update fails for whatever reason, it won't roll back the remote group to the last good copy. I misunderstood your original post, I thought you meant they had no consistency feature at all.

To your point, if the customer broke away from using the built-in replication scheduler and wrote their own scripts, could they not mitigate that issue by taking a snapshot of the remote replica before starting the group periodic update? Then delete the snap after successful replication?


That is what we are doing. We are basically taking snapshots of all base volumes on the destination and presenting those to hosts. We map RC Groups to VVSets and then have a script that watches the date/time of the replicated base volumes; when all of the volumes in a VVSet/RC Group pair are newer than the exported snapshots AND none of them are replicating, we run updatevv against the VVSet. There is a lot of error checking I am adding to catch scenarios with high-frequency replication intervals. These scripts are a huge resource drain to write and manage, and they are not foolproof.
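To give a flavor of the gating logic, here is a stripped-down sketch, assuming SSH access to the destination array's CLI. The array address, user, group and snapshot names are all hypothetical placeholders, and the timestamp comparison is stubbed out because showvv/showrcopy output formats vary by 3PAR OS release, so verify everything against your own arrays before relying on it:

import subprocess

ARRAY = "3par-dr.example.com"  # hypothetical destination array address
USER = "3paradm"               # hypothetical CLI user

def cli(*args):
    # Run one InForm CLI command on the array over SSH and return its stdout.
    result = subprocess.run(["ssh", f"{USER}@{ARRAY}", *args],
                            capture_output=True, text=True, check=True)
    return result.stdout

def group_is_idle(group):
    # showrcopy groups reports a per-volume SyncStatus (Synced/Syncing/...).
    # Treat the group as busy if any member is mid-resync.
    return "Syncing" not in cli("showrcopy", "groups", group)

def replicas_newer_than_exports(group):
    # Stub: the real script compares the replicated base volumes' timestamps
    # against the exported snapshots' creation times (e.g. parsed from showvv
    # output). Column layout differs between 3PAR OS releases, so the parsing
    # is deliberately left out of this sketch.
    return False  # replace with a real timestamp comparison

def refresh_exported_snaps(snapshots):
    # updatevv refreshes existing virtual copies from their base volumes,
    # re-pointing the host-visible snapshots at the newly arrived data.
    # -f suppresses the confirmation prompt; check the flag on your CLI level.
    cli("updatevv", "-f", *snapshots)

def maybe_refresh(group, snapshots):
    if group_is_idle(group) and replicas_newer_than_exports(group):
        refresh_exported_snaps(snapshots)

maybe_refresh("rcgroup01", ["ro.vol1", "ro.vol2"])  # hypothetical names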

Not to mention that with almost 200 RC Groups, the CLI load is heavy.

In the end, 3PAR replication has not lived up to our expectations, and in hindsight we should have done more thorough due diligence before selecting our storage technology.
Cleanur
Posts: 254
Joined: Wed Aug 07, 2013 3:22 pm

Re: 3par Support?

Post by Cleanur »

It sounds like you may be trying to build a better mousetrap. My understanding is that the volumes that fail to complete a resync will be promoted (reverted to their snapshots). This means that following the volume promotion the RC group will not appear consistent. However, the group will become consistent once the user completes either of the following actions:


1. Assuming this is a DR failover scenario: execute a ‘failover’ operation on the secondary array, at which time ALL the volumes in the consistency group are promoted (reverted to their previous snapshots and made consistent).
2. Assuming this is a transitory failure: restart the consistency group and allow the resynchronization operation to complete successfully.

The reasons for this behavior are:

1. Promotes take time to complete across large consistency groups with big deltas, and until all the promotes have completed the group cannot be restarted. So unless you are initiating failover, i.e. this isn't just a link outage, there is little point in going through the promotion (reversion) of all volumes to the previous snapshot.
2. If the system were to automatically promote all the volumes before failover was initiated by the user, it would have to needlessly resync all of the previously successfully synced volumes once the group is restarted, wasting both bandwidth and time.

This is really describing an edge case, and it could be documented better, but what it means is this: if a sync fails because the primary array goes offline or the links go down partway through the sync, the remote volumes remain read-only and you won't have a consistent view. That is, until you initiate failover, at which point they will ALL be promoted and made read/writeable at the remote site.
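In CLI terms, the two recovery paths look roughly like the sketch below. "dr-group" is a hypothetical group name, the commands run against the secondary array, and the exact syntax should be checked against the Remote Copy guide for your 3PAR OS level:

import subprocess

def cli(command):
    # Hypothetical helper: run one InForm CLI command on the secondary array.
    subprocess.run(["ssh", "3paradm@3par-dr.example.com"] + command.split(),
                   check=True)

disaster = False  # pick the path that matches the failure you are handling

if disaster:
    # Real DR event: promote ALL group members to their last consistent
    # snapshots and make them read/writeable on the secondary array.
    cli("setrcopygroup failover dr-group")
else:
    # Transitory link outage: restart the group and let the interrupted
    # resync run to completion instead of rolling everything back.
    cli("startrcopygroup dr-group")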

I have seen similar complaints in the past, but the real reason this was a problem wasn't to do with DR, which works as designed (above). It was picked up because the DR solution was being used as an offsite backup solution. That is a valid use case; you just need to take this behavior into account. If you are using it for backup with remote snaps, you need to be aware that your last consistent RPO image will be from the previous completed sync.
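If backup is the use case that matters, the mitigation Richard suggested earlier in the thread maps to something like this sketch: guard each remote replica with a read-only snapshot around the sync, so a transfer that dies partway can't cost you the last good image. Volume names are hypothetical, and the createsv/removevv options should be verified for your CLI release:

import subprocess

def cli(command):
    # Hypothetical helper: run one InForm CLI command on the secondary array.
    subprocess.run(["ssh", "3paradm@3par-dr.example.com"] + command.split(),
                   check=True)

# Before the periodic sync starts: snapshot the remote base volume so the
# last consistent image survives even if the sync dies mid-transfer.
cli("createsv -ro guard.vol1 vol1")

# ... the periodic group sync runs and is confirmed complete ...

# After a successful sync the guard is no longer needed; drop it.
cli("removevv -f guard.vol1")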

Just checked, and this is now documented correctly in the current Remote Copy guide (August 2013).