SQL Guidance

Is there any official documentation or white paper on planning SQL VVs? It would be good to have something so that I can a) confirm what I think and b) demonstrate how it should be set up for people who are dubious.
The current bone of contention I need to clarify is whether there's any point in separating TempDB out onto its own VVs. I say no, because you aren't actually separating spindles, but I've read so much conflicting information about queues and caches that I wondered whether there are any actual facts and figures from testing workloads on different CPG/VV setups.
One thing I'm currently doing is splitting logs and databases onto separate VVs that come from different CPGs, one on even disks and one on odd disks, as suggested in the Exchange best practice guide, but I'm not sure this is even worth it as I'm halving the spindle count for each VV.
Seems there are lots of questions and misinformation out there but very little formal advice from HP/3PAR.
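In the meantime, the closest I've come to hard numbers is pulling the per-file I/O stats out of SQL Server itself, so at least the TempDB argument is based on our actual workload rather than folklore. A rough sketch of the query is below; it's just the standard file-stats DMVs, nothing 3PAR-specific, and the counters reset whenever the instance restarts.

-- Cumulative per-file I/O since the last SQL Server restart.
-- Handy for seeing whether TempDB really does enough I/O (and suffers
-- enough stall time) to justify its own VV/CPG.
SELECT
    DB_NAME(vfs.database_id)                AS database_name,
    mf.type_desc                            AS file_type,   -- ROWS or LOG
    mf.physical_name,
    vfs.num_of_reads,
    vfs.num_of_writes,
    vfs.num_of_bytes_read    / 1048576      AS mb_read,
    vfs.num_of_bytes_written / 1048576      AS mb_written,
    vfs.io_stall_read_ms,
    vfs.io_stall_write_ms
FROM sys.dm_io_virtual_file_stats(NULL, NULL) AS vfs
JOIN sys.master_files AS mf
    ON mf.database_id = vfs.database_id
   AND mf.file_id     = vfs.file_id
ORDER BY vfs.io_stall_read_ms + vfs.io_stall_write_ms DESC;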
Re: SQL Guidance
We still separate it out for a few reasons other than spindles. First is DR: we don't need to replicate the TempDB LUN as often (or ever) as the Data LUN. Second, since our SQL systems are VMs we use RDMs for the Data, Log and TempDB drives, which lets us assign each its own logical SCSI adapter so the OS can handle disk queues better. It also gives us the ability to tune each one to a different CPG; we often put TempDB on RAID 1 and Data on RAID 5, and Logs vary depending on how write-intensive the app is. Our typical SQL VM looks like:
VV1 = VMDK datastore holding the OS C: drive and the data drive for mount points and executables (SCSI 0:x)
VV2 = RDM for Databases (SCSI 1:0)
VV3 = RDM for Logs (SCSI 2:0)
VV4 = RDM for TempDB (SCSI 3:0)
VV5 = Log backups, optional (SCSI 0:x); for large SQL systems this may be an RDM, otherwise it is a VMDK on the datastore LUN
This is usually overkill, but it gives us a great deal of flexibility. I also use 3PAR System Reporter to generate per-volume reports that we publish so DBAs can see how any given SQL server and volume is performing, so I don't get the "it must be a storage issue" line every time SQL performance sucks.
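For what it's worth, the SQL side of that layout is nothing exotic; the sketch below shows roughly how the files land on those drives. The drive letters (D:, L:, T:), the database name and the sizes are placeholders for illustration only, not what we actually run.

-- Sketch only: D:, L: and T: stand in for the Databases, Logs and TempDB RDMs above.
CREATE DATABASE SalesDB
ON PRIMARY
    (NAME = SalesDB_data, FILENAME = 'D:\Data\SalesDB.mdf', SIZE = 50GB, FILEGROWTH = 5GB)
LOG ON
    (NAME = SalesDB_log,  FILENAME = 'L:\Logs\SalesDB.ldf', SIZE = 10GB, FILEGROWTH = 1GB);
GO

-- Relocate tempdb onto its own drive; the move takes effect after the next service restart.
ALTER DATABASE tempdb MODIFY FILE (NAME = tempdev, FILENAME = 'T:\TempDB\tempdb.mdf');
ALTER DATABASE tempdb MODIFY FILE (NAME = templog, FILENAME = 'T:\TempDB\templog.ldf');
GO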
Re: SQL Guidance
Thanks, that's pretty much the layout I'm going to go for, more for neatness than anything else, I think!
Have you experimented with the step size on the CPGs? I thought a VV formatted with a 64 KB block size on a CPG with a 64 KB step size might yield better performance, but from a range of SQLIO tests I'm not seeing anything that makes me think it's actually worth doing. A 64 KB block size on the default step size produces very similar results, sometimes marginally better, sometimes worse, depending on the SQLIO test being run.
Re: SQL Guidance
I have not played with CPG step sizes, but we format all SQL volumes with 64K clusters. I think there is some discussion that log drives and maybe tempdb should be 8K, but for now we make them all 64K.
Re: SQL Guidance
Howdy,
Ideas about disk layouts for optimum SQL performance have formed over many years of experience. A significant portion of the tenets of good SQL infrastructure is based on hardware and software that simply doesn't exist anymore, yet they get handed down from the senior to the junior DBA.
I've struggled with splitting spindles in the MSA 1000 to give 50% of the IOPS to data and 50% to logs, and I never could find any test results (on my equipment and workload) that supported that route.
I've tried my workload both ways, and letting the two LUNs compete over a wider stripe always wins. This workload is a web-based transaction processing application with about 250 handheld merge subscribers.
Under normal circumstances, if you split the spindles the way you're thinking, the best you can ever hope to do is break even, and that only happens if there are exactly as many IOs for data as there are for logs. If, say, 70% of the IOs hit data, the data half of the spindles becomes the bottleneck while the log half sits partly idle; pooled, every spindle is available to whichever load needs it.
The kind of application which might justify splitting the disks into two groups will be the sole owner of the disks. This is because mixing sequential and random access can cause a reduction in overall performance. In fact, the performance justification of putting data and logs on different spindle sets is due to this very notion.
This is somewhat old-school thinking, because a good cache and a really good RAID algorithm smooth a lot of that out. You're building a general-purpose cluster that will serve more than just SQL (IIRC), in which case the IO is going to be mixed no matter what you do with SQL. That will all but eliminate any benefit from the split.
My advice is to bet on the 3PAR and its wide striping.
jeff
Re: SQL Guidance
Thanks for the advice. Indeed, I've abandoned the odd/even approach and I'm using all the disks for every CPG. I've stuck with the 64 KB step size on the SQL VVs, however; I kind of feel that if the option is there, there must be a use for it!