
Snapshot space question

Posted: Fri Feb 21, 2014 11:53 am
by slink
Quick question: with the way the 3PAR does its snapshots, does one snapshot every 7 days consume pretty much the same amount of space as 7 snapshots of the same VV taken once a day?

It's just copying the changes into a PiT area, right? So the changes captured over that 7-day period are the same as those that occurred day by day? I'm assuming 3PAR does some black magic so that it doesn't duplicate the changes across snapshots 1, 2, 3, 4, etc.?

So what matters is how long you keep snapshots for, not how many you take?

Re: Snapshot space question

Posted: Mon Feb 24, 2014 11:51 am
by slink
anyone?

Re: Snapshot space question

Posted: Mon Feb 24, 2014 12:31 pm
by hdtvguy
That is a theoretical question and the answer can vary. It depends on which blocks churn and how often. My gut tells me that daily snaps would occupy more than weekly, just because you have to track blocks that may change more than once during the 7-day cycle. I do not know that it does any black magic, because a changed block is a changed block. I don't think it knows whether the block it has in a snapshot already exists; that is de-duplication, and the 3PAR does not dedupe.

Re: Snapshot space question

Posted: Mon Feb 24, 2014 4:49 pm
by slink
Thanks for the reply. I might have to actually do some testing on this to find out for sure.
hdtvguy wrote: I do not know that it does any black magic, because a changed block is a changed block. I don't think it knows whether the block it has in a snapshot already exists; that is de-duplication, and the 3PAR does not dedupe.

I was basing that remark on this information from the Virtual Copy documentation:
3PAR wrote: Highly Efficient, Thin-aware Implementation

Only fine-grained capacity is consumed for changed data. Copy-on-write operations are thin and non-duplicative; changed data is never duplicated within a snapshot tree
http://www8.hp.com/us/en/products/storage-software/product-detail.html?oid=5044626#!tab=features

Re: Snapshot space question

Posted: Wed Feb 26, 2014 6:11 am
by Cleanur
The snaps are non-duplicative (de-duplication by another name); HP may be missing a marketing trick there ;-) So multiple snaps effectively use the same space for shared blocks; only the deltas between individual snapshots will consume additional space.

Re: Snapshot space question

Posted: Wed Feb 26, 2014 7:35 am
by slink
hdtvguy wrote: My gut tells me that daily snaps would occupy more than weekly, just because you have to track blocks that may change more than once during the 7-day cycle.
I think you are right here. If I use an Excel sheet as an analogy for some block data, with the contents of cell A1 being changed daily so that it was all 1's on Monday, all 2's on Tuesday, all 3's on Wednesday, etc., then that changed data would have to be copied for each previous day's snapshot, and as it is different each time it is not shared between them.

i.e.,

    Monday evening snapshot = just metadata
    Tuesday working day the cell changes, old data copied to PiT area and Monday snapshot updated to point to it and new data written to "live" VV.
    Tuesday evening snapshot = just metadata
    Wednesday working day the cell changes, old data copied to PiT area and Tuesday snapshot updated to point to it and new data written to "live" VV.

etc.

But a weekly snapshot taken on, say, a Sunday would only contain one change, i.e., the data written the next day on the Monday, which would cause the old data to be copied to the PiT area. The subsequent writes on the following weekdays wouldn't cause a further write to the PiT area because, in this simplified example, it is the same blocks which are changing.
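To put numbers on the daily vs weekly reasoning above, here is a toy copy-on-write counter. It is only a sketch under assumptions, not how 3PAR actually implements Virtual Copy: the function name `count_cow_copies`, the "snapshot taken just before a day's writes" convention and the block labels are all made up for illustration. The only rule it models is that the old contents of a block get copied out once, on the first overwrite after the most recent snapshot.

```python
def count_cow_copies(days, snapshot_before_day):
    """Count old-block copies under a simple copy-on-write model.

    days: list with one entry per day, each a set of block addresses written that day.
    snapshot_before_day: set of day indexes; a snapshot is taken just before
        that day's writes start (i.e. the evening before).
    """
    copies = 0
    copied_since_snap = None  # stays None until the first snapshot exists
    for day, written_blocks in enumerate(days):
        if day in snapshot_before_day:
            # A new snapshot means every block must be preserved again on its next overwrite
            copied_since_snap = set()
        for block in written_blocks:
            if copied_since_snap is not None and block not in copied_since_snap:
                copies += 1  # old contents copied to the PiT area once
                copied_since_snap.add(block)
    return copies


# Cell A1 rewritten every day for a week
same_block = [{"A1"}] * 7
daily = count_cow_copies(same_block, set(range(7)))   # snapshot every evening
weekly = count_cow_copies(same_block, {0})            # one snapshot at the start

# A different block written each day
different_blocks = [{f"B{d}"} for d in range(7)]
daily_spread = count_cow_copies(different_blocks, set(range(7)))
weekly_spread = count_cow_copies(different_blocks, {0})
```

In this model the same block rewritten daily costs 7 copies with daily snapshots but only 1 with a weekly snapshot, while 7 different blocks cost 7 copies either way, so the answer depends on where the changes land.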
Cleanur wrote: So multiple snaps effectively use the same space for shared blocks; only the deltas between individual snapshots will consume additional space.

So if the cell contents alternated between 1's and 2's throughout the week then there would only be 2 blocks of changes shared between the daily snapshots. So it would be:

    Monday evening snapshot = just metadata
    Tuesday working day the cell changes to all 2's, old data copied to PiT area and Monday snapshot updated to point to it and new data written to "live" VV.
    Tuesday evening snapshot = just metadata
    Wednesday working day the cell changes back to all 1's, old data copied to PiT area and Tuesday snapshot updated to point to it and new data written to "live" VV.
    Wednesday evening snapshot = just metadata
    Thursday working day the cell changes back to all 2's again, old data NOT copied to PiT area as the "all 1's" data is already there so snapshot is just updated to point to that but new data written to "live" VV.

Does this sound about right? If so, I think I understand it better now.

Sorry if this is massively obvious, just trying to get my head around it as I've never really focused on the actual snapshot mechanism before. It's just something you do isn't it, tell the storage to "remember where you are right now" so you can rewind later but how that actually works I've never really given much thought to until now.

Re: Snapshot space question

Posted: Thu Mar 06, 2014 3:26 pm
by Richard Siemers
To your original question: no, it's not the same, but it depends on your "change rate" or deltas between snaps, and where the change occurs.

Let's say you have a 100 GB VV, and the only deltas during the week are changes to a small 5 MB tempdb database with a circular log file that completely rewrites itself about once an hour. In this case, yes, your 7 snaps in one day will take about as much space as 7 in one week.

However, if that same scenario took a whole day to fill/roll the log file... then 7 days of snaps would be considerably larger than 7 snaps in one day.
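A back-of-the-envelope version of the two log file scenarios, as a rough sketch only. It assumes steady state (every snapshot is followed by a full interval of writes), that the log is overwritten evenly over time, and that nothing else on the VV changes; the function name `snap_space_mb` and its parameters are made up for the illustration.

```python
def snap_space_mb(log_size_mb, hours_per_full_rewrite, hours_between_snaps, num_snaps):
    """Estimate snapshot space for a circular log that is the only thing changing."""
    # Fraction of the log overwritten between two consecutive snapshots, capped at the whole log
    fraction = min(1.0, hours_between_snaps / hours_per_full_rewrite)
    return num_snaps * log_size_mb * fraction


# Scenario 1: 5 MB log rewrites itself about once an hour
fast_7_in_a_day = snap_space_mb(5, 1, 24 / 7, 7)
fast_7_in_a_week = snap_space_mb(5, 1, 24, 7)

# Scenario 2: the same log takes a whole day to fill/roll
slow_7_in_a_day = snap_space_mb(5, 24, 24 / 7, 7)
slow_7_in_a_week = snap_space_mb(5, 24, 24, 7)
```

With the hourly log both schedules come out at 35 MB. With the log that takes a day to roll, 7 snaps in one day come out at about 5 MB while 7 daily snaps still come out at 35 MB.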

slink wrote: Thursday working day the cell changes back to all 2's again, old data NOT copied to PiT area as the "all 1's" data is already there so snapshot is just updated to point to that but new data written to "live" VV.


I am not sure that is correct... I do not believe the snapshot will know the new data is a match to an old snapshot and essentially dedupe it... it just knows the block is being changed and the block needs to be copied out to PIT before writing the new block.

Re: Snapshot space question

Posted: Thu Mar 06, 2014 4:10 pm
by slink
Richard Siemers wrote:I am not sure that is correct... I do not believe the snapshot will know the new data is a match to an old snapshot and essentially dedupe it... it just knows the block is being changed and the block needs to be copied out to PIT before writing the new block.


The quote from the 3PAR site I listed above does seem to say it is effectively deduped and Cleanur also seems to think so.

Re: Snapshot space question

Posted: Thu Mar 06, 2014 5:31 pm
by Richard Siemers
I interpreted the documented feature as being duplication prevention, as in when you have a VV with 10 snapshots, and 1 block changes on the original... it won't copy the old block to 10 snap areas... just 1 common area they all share. Typical change block tracking.

Original VV
- Snap1 = Snap of VV today = new deltas
--Snap2 = Snap of VV yesterday = old deltas + pointers to new deltas
---Snap3 = Snap of VV 3 days ago = older deltas + pointers to new(er) deltas.
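The "1 common area they all share" idea can be sketched as a toy data structure. This is an illustration under assumptions, not 3PAR's actual on-disk layout: the class `Volume` and its methods are made up, and instead of chaining snapshots through each other as in the diagram above it flattens the chain so every snapshot holds direct pointers into the shared area.

```python
class Volume:
    """Toy VV with snapshots that share one snap area (hypothetical model)."""

    def __init__(self, blocks):
        self.live = dict(blocks)   # address -> data on the "live" VV
        self.snap_area = []        # shared area: one entry per preserved old block
        self.snapshots = []        # each snapshot: address -> index into snap_area

    def snapshot(self):
        self.snapshots.append({})  # just metadata at creation time
        return len(self.snapshots) - 1

    def write(self, addr, data):
        # Snapshots that still rely on the live copy of this block
        needing = [s for s in self.snapshots if addr not in s]
        if needing:
            self.snap_area.append(self.live.get(addr))  # copy old data out once
            idx = len(self.snap_area) - 1
            for s in needing:
                s[addr] = idx      # all of them point at the same preserved copy
        self.live[addr] = data

    def read_snapshot(self, snap_id, addr):
        s = self.snapshots[snap_id]
        return self.snap_area[s[addr]] if addr in s else self.live.get(addr)


vv = Volume({0: "old"})
snap_ids = [vv.snapshot() for _ in range(10)]  # 10 snapshots of the same VV
vv.write(0, "new")                             # 1 block changes on the original
copies_made = len(vv.snap_area)                # only one old block is preserved
```

After the single write, all 10 snapshots still read "old" for block 0, but the snap area holds just one copy of it.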

I would be delighted and surprised to learn that 3PAR snapshots will try to dedupe and re-use blocks inside the snap area that match new incoming blocks. Theoretically, that seems within the scope of reality... the challenge being a catalog/index of blocks inside the shared snap area and how quickly the system could look up whether a new block headed into the snap area is a match for one already existing in there, since there is a new write pending while it decides whether to copy out or not.
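To make the difference between the two readings concrete, here is a sketch that counts snap-area blocks for the alternating 1's and 2's example under both models. The content-aware branch is purely hypothetical (a SHA-256 index over the snap area that I made up for illustration), and nothing in this thread confirms that 3PAR does anything like it. The function name `snap_area_blocks` is also invented.

```python
import hashlib


def snap_area_blocks(daily_contents, content_aware):
    """Count blocks stored in the snap area for one block rewritten daily,
    with a snapshot taken every evening before the next day's write."""
    stored = 0
    seen_hashes = set()  # hypothetical catalog/index of blocks already in the snap area
    current = daily_contents[0]
    for new in daily_contents[1:]:
        # Last evening's snapshot still needs the old contents before they are overwritten
        digest = hashlib.sha256(current).hexdigest()
        if content_aware and digest in seen_hashes:
            pass  # hypothetical dedupe: point the snapshot at the matching block
        else:
            stored += 1
            seen_hashes.add(digest)
        current = new
    return stored


# Monday to Friday: all 1's, all 2's, all 1's, all 2's, all 1's
week = [b"1" * 16, b"2" * 16, b"1" * 16, b"2" * 16, b"1" * 16]
plain_cow = snap_area_blocks(week, content_aware=False)
with_index = snap_area_blocks(week, content_aware=True)
```

Plain copy-on-write stores 4 old blocks over the week, while the hypothetical content-aware variant stores only 2, at the cost of a hash and index lookup on every copy-out while the new write is pending.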

I hope you're correct and I am misreading the feature. I meet with my 3PAR pre-sales guys tomorrow, I will hit them up with the topic!

Re: Snapshot space question

Posted: Fri Mar 07, 2014 6:56 am
by hdtvguy
I tend to go with Richard; I do not know of any dedupe that happens. I think they mean that each snap stands on its own and just points back to the blocks, so it does not recreate/copy all the previously changed blocks into subsequent snaps. It is also intelligent enough to remove an intermediary snap without losing the subsequent snaps' information.