Remote Copy (3par v400) HP P10000
Hello,
We have two 3PAR V400 systems, located in two data centers (approx. 4 km, fibre measurement).
We established a bi-directional Remote Copy link (replication) and are trying to set up a synchronous replication volume.
The problem we are facing is that sync replication, in our case, turns out to be a real performance penalty (~75%). At queue depth 1, the replicated disk gives approx. 330 IOPS, while with replication stopped the unreplicated disk can sustain 1200 IOPS.
checkrclink reports:
Latency: 0.521 ms
Lost pings: 0 %
Throughput: > 2687589 Kbits/second
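As a side note, these numbers can be cross-checked with a bit of arithmetic: at queue depth 1, IOPS and per-IO latency are simply reciprocals. A rough sketch, using only the figures from this post:

```python
# At queue depth 1, exactly one IO is in flight at a time, so
# per-IO latency (ms) = 1000 / IOPS. Compare the latency implied
# by the benchmark with the raw link latency from checkrclink.

unreplicated_iops = 1200
replicated_iops = 330
link_latency_ms = 0.521  # ping latency reported by checkrclink

local_io_ms = 1000.0 / unreplicated_iops      # ~0.83 ms per IO
replicated_io_ms = 1000.0 / replicated_iops   # ~3.03 ms per IO

# Extra time that synchronous replication adds to each write:
overhead_ms = replicated_io_ms - local_io_ms  # ~2.2 ms
print(local_io_ms, replicated_io_ms, overhead_ms)
```

So each replicated write carries roughly 2.2 ms of extra latency, several times the 0.521 ms link latency the tool reports, which is what makes the penalty look suspicious.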
Can you please share your experiences with me, because I can hardly believe that this is generally accepted behaviour.
Thank you
ilv
- Richard Siemers
- Site Admin
- Posts: 1333
- Joined: Tue Aug 18, 2009 10:35 pm
- Location: Dallas, Texas
Re: Remote Copy (3par v400) HP P10000
Have you tested the same with async to compare yet?
You mentioned a 4 km fibre measurement; does that imply you are using 4G FC ports and not IP to replicate? What is the link between the two sites?
Richard Siemers
The views and opinions expressed are my own and do not necessarily reflect those of my employer.
Re: Remote Copy (3par v400) HP P10000
Hi Richard,
No, we didn't test it with async, because I could not make it work (I simply tried to change from sync to periodic). That gave another error message, which should be the subject of another post.
Yes, I should have specified: it's a single-mode 8 Gbps link, with two Brocade 5100 switches, one at each location.
Thank you for any ideas on this.
- Richard Siemers
- Site Admin
- Posts: 1333
- Joined: Tue Aug 18, 2009 10:35 pm
- Location: Dallas, Texas
Re: Remote Copy (3par v400) HP P10000
You mentioned a queue depth of 1; I am assuming you are running a benchmark utility and limiting it to 1? I believe those results (330 IOPS vs. 1200) are to be expected/normal.
Sync-mode writes will not be acknowledged to your host by the storage until they are first acknowledged by the remote storage, thus incurring the full round-trip latency per IO. A queue depth of 1 is like two people tossing a single baseball back and forth: much time is consumed by the ball traveling in the air, and with only one ball, both players are idle while it travels. The farther the distance, the longer the air time. Change the queue depth to 32 and now you can have up to 32 balls in the air at once.
Might I suggest a different test? You indicated 1200 IOPS on a local test. Rerun the benchmark on the replicated LUN, increasing the queue depth gradually until you hit that 1200 number or it plateaus. That should give you a rough idea of how many IOs you can pump out before the first one returns, as dictated by your latency.
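The ball-tossing analogy can be put into numbers. A minimal sketch, assuming IOPS scale linearly with queue depth until some other limit kicks in, and using the ~3 ms per-IO latency implied by 330 IOPS at QD=1:

```python
# Back-of-the-envelope model of sync-replication IOPS vs. queue depth,
# via Little's law: achievable IOPS ~= outstanding IOs / per-IO latency.
# The latency is inferred from the benchmark numbers in this thread,
# not measured directly.

def max_iops(queue_depth, per_io_latency_s):
    """Upper bound on IOPS with queue_depth IOs in flight,
    each taking per_io_latency_s seconds to complete."""
    return queue_depth / per_io_latency_s

replicated_latency = 1 / 330.0  # ~3 ms per IO, implied by 330 IOPS at QD=1

for qd in (1, 2, 4, 8, 16, 32):
    print(f"QD={qd:2d}  ~{max_iops(qd, replicated_latency):.0f} IOPS")
```

By this model a queue depth of about 4 already exceeds the 1200 IOPS seen locally; in practice the curve will plateau once the array, link, or buffer credits saturate.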
I'm no Brocade expert, but what I have briefly read tonight is that their extended SAN links have increased buffer credits to keep the data flowing; however, this will be negated by a queue depth of 1.
Cisco MDS has a feature specifically to address this issue with latency and remote writes.
http://www.cisco.com/en/US/prod/collate ... 4fd2b.html
FC-WA minimizes storage latency and improves the number of application transactions per second over long distances. It increases the distance of replication or reduces effective latency to improve performance during synchronous replication.
The improved performance results from a coordinated effort between the Storage Services Module local to the initiator and the Storage Services Module local to the target. The initiator Storage Services Module, bearing the host-connected intelligent port (HI-port), allows the initiator to send the data to be written well before the write command has been processed by the remote target and a SCSI Transfer Ready message has had time to travel back to start the data transfer in the traditional way. The exchange of information between the HI-port and the disk-connected intelligent port (DI-port) allows the transfer to begin earlier than in a traditional transfer. The procedure makes use of a set of buffers for temporarily storing the data as near to the DI-port as possible. The information between the HI-port and DI-port is piggybacked on the SCSI command and the SCSI Transfer Ready command, so there are no additional FC-WA-specific frames traveling on the SAN. Data integrity is maintained by the fact that the original message stating the correct execution of the write operation on the disk side (SCSI Status Good) is transferred from the disk to the host.
I hope this helps.
Richard Siemers
The views and opinions expressed are my own and do not necessarily reflect those of my employer.
Re: Remote Copy (3par v400) HP P10000
Richard,
Thank you for trying to analyze this; you are mostly right. Increasing the QD helps a lot, obviously. The ISL on the switches is configured for long distance, i.e. more buffer credits.
BUT I want to stick to two basic facts I noticed:
- in a similar set-up on an EVA8100, I can get almost 1200 IOPS for a replicated disk (QD=1), so the replication performance penalty is very low (~5%)
- we see a constant latency of 3 ms between the 3PAR machines, while SAN troubleshooting tells us it should be much lower
I would like to hear others' experiences with sync Remote Copy, to see whether this is normal or not. Ultimately, 3 ms response time for a disk is not bad...
Thank you.
Re: Remote Copy (3par v400) HP P10000
Hi,
Did you ever get any further with this issue? We have a similar setup. We see a 62% performance penalty, and I find that way too much. Can that really be right?
Re: Remote Copy (3par v400) HP P10000
The replication problem is solved by setting "Interrupt coalescence = Disabled" for the RCFC ports.
There is both a GUI and a CLI option for this.
I don't know the side effects of this, but so far everything is performing well.
- Posts: 8
- Joined: Sun Sep 23, 2012 5:54 am
Re: Remote Copy (3par v400) HP P10000
Thank you very much for sharing the info and the solution.
We faced the same performance issue with RCFC replication between two V400s, and the problem was solved after disabling the interrupt coalescence option on the 3PAR.
Best Regards,
Ziad Melhem
- Posts: 3
- Joined: Fri Apr 04, 2014 1:08 am
Re: Remote Copy (3par v400) HP P10000
Hello All,
I need your suggestions and recommendations on a similar P10000 V400 Remote Copy over IP issue. During setup, the two V400 units were at the primary data center. We were able to configure all RCIP parameters and started synchronization between the two units. When all the LUNs were fully synchronized, we moved the backup unit to the DR site. On power-up, I noticed that the RCIP LUNs were active but the backup unit was no longer able to synchronize with the primary V400. After several tries and no luck, I decided to rebuild the RCIP configuration from scratch, but at the final stage of the RCIP wizard I kept getting an error message that the "Remote target was not ready".
I checked to ensure all RCIP ports were up and Remote Copy was running on both V400 units, but the error still appeared at the end of the RCIP setup. Interestingly, the RCIP wizard creates only the link, but no RCIP group or RCIP LUNs. I am really unsure why this is happening or what is responsible. I was thinking it could be a bandwidth issue, but I expected the RCIP wizard would at least complete the configuration.
We took the backup unit to the primary site to carry out the initial data replication, since we do not have much bandwidth between the two sites. The two sites are connected over a 3 Mbps fibre-optic link.
Any ideas?
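For context on why seeding both units at the primary site makes sense: at 3 Mbps, an initial sync over the WAN is impractical. A rough sketch (the 1 TB volume size below is a made-up example, not from the post):

```python
# Rough estimate of initial-sync time over a slow WAN link.
# The volume size is hypothetical; the 3 Mbps link is from the post.

link_mbps = 3       # link speed in megabits per second
volume_gb = 1000    # hypothetical amount of data to seed, in GB

bits = volume_gb * 1e9 * 8          # data volume in bits
seconds = bits / (link_mbps * 1e6)  # transfer time at line rate
days = seconds / 86400
print(round(days, 1))               # roughly a month for 1 TB at 3 Mbps
```

Even at full line rate with no protocol overhead, seeding 1 TB takes on the order of a month over such a link, so local seeding is really the only option.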
- Posts: 142
- Joined: Wed May 07, 2014 10:29 am
Re: Remote Copy (3par v400) HP P10000
3PAR has an issue with sync replication. I've seen a 30-70% performance drop depending on the load profile.
In 3.1.3 MU1 they have improved sequential write performance, but I have not gotten around to testing it yet:
"Reduced mutex contention in remote copy to fix RCFC sequential write performance issue."