HPE Storage Users Group

A Storage Administrator Community




 Post subject: 3PAR 7200 replacement disk marked as Slow Drive and fails
Posted: Thu Feb 17, 2022 3:16 am

Joined: Thu Feb 17, 2022 2:39 am
Posts: 2
Hi all,

I have a weird issue with a 3PAR 7200. One disk has failed; its specs are 900GB FC 10K 6G Encrypted HDD.

Each time I replace it, servicemag resume succeeds.
However, after a couple of hours the replacement disk fails again, and the servicemag start that follows also succeeds. I have already tried 3 disks, each with a different DOM (date of manufacture: 2013, 2014, 2015), and the result is still the same.

I then dug further into the logs. For every replacement, I noticed that after servicemag completes, the replacement disk is always marked as a candidate by the check_slow_disk task.

The IOPS for the replacement disk range from 105 to 135, while the ideal should be 140 for a 10K HDD.

This is the last extract of the check_slow_disk task before the disk failed for the 4th time.

2022-02-05 20:07:01 +08 Updated Executing "check_slow_disk" as 0:29843
2022-02-05 20:07:01 +08 Updated RPM 100 -> Good IOPS 2000
2022-02-05 20:07:01 +08 Updated RPM 10 -> Good IOPS 140
2022-02-05 20:07:01 +08 Updated RPM 150 -> Good IOPS 2000
2022-02-05 20:07:01 +08 Updated RPM 15 -> Good IOPS 180
2022-02-05 20:07:01 +08 Updated RPM 7 -> Good IOPS 60
2022-02-05 20:07:01 +08 Updated Running at interval 840 for 3360 seconds
2022-02-05 20:21:01 +08 Updated
2022-02-05 20:21:01 +08 Updated Starting next iteration
2022-02-05 20:21:01 +08 Updated
2022-02-05 20:21:01 +08 Updated Checking speed 7 drives
2022-02-05 20:21:01 +08 Updated Candidate:PDID: 27, adj_svct: 7.0, idle%: 99.7, iops: 0.5, kbps: 15.4, svct: 7.2
2022-02-05 20:21:01 +08 Updated Next:PDID: 19, adj_svct: 6.6, idle%: 99.8, iops: 0.4, kbps: 12.6, svct: 6.8
2022-02-05 20:21:01 +08 Updated Checking speed 10 drives
2022-02-05 20:21:01 +08 Updated Candidate:PDID: 64, adj_svct: 59.4, idle%: 7.6, iops: 109.7, kbps: 3027.2, svct: 98.3
2022-02-05 20:21:01 +08 Updated Next:PDID: 11, adj_svct: 15.3, idle%: 19.6, iops: 122.1, kbps: 3355.2, svct: 58.6
2022-02-05 20:35:01 +08 Updated
2022-02-05 20:35:01 +08 Updated Starting next iteration
2022-02-05 20:35:01 +08 Updated
2022-02-05 20:35:01 +08 Updated Checking speed 7 drives
2022-02-05 20:35:01 +08 Updated Candidate:PDID: 26, adj_svct: 4.2, idle%: 99.7, iops: 0.8, kbps: 33.2, svct: 4.6
2022-02-05 20:35:01 +08 Updated Next:PDID: 19, adj_svct: 3.9, idle%: 99.9, iops: 0.3, kbps: 11.6, svct: 4.1
2022-02-05 20:35:01 +08 Updated Checking speed 10 drives
2022-02-05 20:35:01 +08 Updated Candidate:PDID: 64, adj_svct: 113.1, idle%: 1.8, iops: 129.2, kbps: 3842.5, svct: 159.6
2022-02-05 20:35:01 +08 Updated Next:PDID: 36, adj_svct: 45.9, idle%: 10.5, iops: 143.5, kbps: 3871.3, svct: 96.7
2022-02-05 20:49:02 +08 Updated
2022-02-05 20:49:02 +08 Updated Starting next iteration
2022-02-05 20:49:02 +08 Updated
2022-02-05 20:49:02 +08 Updated Checking speed 7 drives
2022-02-05 20:49:02 +08 Updated Candidate:PDID: 19, adj_svct: 4.9, idle%: 99.8, iops: 0.4, kbps: 13.5, svct: 5.1
2022-02-05 20:49:02 +08 Updated Next:PDID: 27, adj_svct: 4.1, idle%: 99.8, iops: 0.5, kbps: 17.3, svct: 4.4
2022-02-05 20:49:02 +08 Updated Checking speed 10 drives
2022-02-05 20:49:02 +08 Updated Candidate:PDID: 64, adj_svct: 96.6, idle%: 2.0, iops: 128.5, kbps: 3936.1, svct: 143.0
2022-02-05 20:49:02 +08 Updated Next:PDID: 36, adj_svct: 29.2, idle%: 11.7, iops: 136.4, kbps: 3825.2, svct: 77.8
2022-02-05 21:03:02 +08 Updated
2022-02-05 21:03:02 +08 Updated Starting next iteration
2022-02-05 21:03:02 +08 Updated
2022-02-05 21:03:02 +08 Updated Checking speed 7 drives
2022-02-05 21:03:02 +08 Updated Candidate:PDID: 29, adj_svct: 12.5, idle%: 99.2, iops: 1.6, kbps: 289.4, svct: 13.7
2022-02-05 21:03:02 +08 Updated Next:PDID: 21, adj_svct: 12.5, idle%: 99.2, iops: 1.5, kbps: 282.6, svct: 13.6
2022-02-05 21:03:02 +08 Updated Checking speed 10 drives
2022-02-05 21:03:02 +08 Updated Candidate:PDID: 64, adj_svct: 105.8, idle%: 1.6, iops: 130.9, kbps: 4184.9, svct: 153.4
2022-02-05 21:03:02 +08 Updated Next:PDID: 35, adj_svct: 22.6, idle%: 12.0, iops: 142.0, kbps: 4152.3, svct: 73.5
2022-02-05 21:03:02 +08 Updated
2022-02-05 21:03:02 +08 Updated FOUND SLOW DRIVE: PDID: 64, adj_svct: 105.8, idle%: 1.6, iops: 130.9, kbps: 4184.9, svct: 153.4
2022-02-05 21:03:02 +08 Updated Marking slow disk 64 failed
2022-02-05 21:03:02 +08 Updated Failed PDID 64
2022-02-05 21:03:02 +08 Updated
2022-02-05 21:03:02 +08 Completed.
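
For anyone who wants to compare the Candidate and Next lines across iterations, here is a minimal Python sketch that pulls them out of a pasted task log. It assumes the line layout is exactly as in the extract above; the function name `parse_drive_lines` and the regular expression are illustrative only and are not part of any 3PAR tooling.

```python
import re

# Matches lines such as:
# "2022-02-05 20:21:01 +08 Updated Candidate:PDID: 64, adj_svct: 59.4, ..."
# The layout is assumed from the extract above, not from HPE documentation.
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) \S+ Updated "
    r"(?P<role>Candidate|Next):PDID: (?P<pdid>\d+), (?P<metrics>.*)$"
)


def parse_drive_lines(log_text):
    """Return one dict per Candidate/Next line found in log_text."""
    records = []
    for line in log_text.splitlines():
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # skip headers, blank "Updated" lines, and so on
        record = {
            "timestamp": match["ts"],
            "role": match["role"],
            "pdid": int(match["pdid"]),
        }
        # Metrics look like "adj_svct: 59.4, idle%: 7.6, iops: 109.7, ..."
        for pair in match["metrics"].split(", "):
            key, value = pair.split(": ")
            record[key] = float(value)
        records.append(record)
    return records


sample = """\
2022-02-05 20:21:01 +08 Updated Checking speed 10 drives
2022-02-05 20:21:01 +08 Updated Candidate:PDID: 64, adj_svct: 59.4, idle%: 7.6, iops: 109.7, kbps: 3027.2, svct: 98.3
2022-02-05 20:21:01 +08 Updated Next:PDID: 11, adj_svct: 15.3, idle%: 19.6, iops: 122.1, kbps: 3355.2, svct: 58.6
"""

for rec in parse_drive_lines(sample):
    print(rec["role"], rec["pdid"], rec["iops"], rec["svct"])
```

Each record keeps the metric names as they appear in the log (`adj_svct`, `idle%`, `iops`, `kbps`, `svct`), so the output can be sorted or filtered by PDID without further mapping.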


The latest servicemag start:

2022-02-06 00:30:36 +08 Updated Executing "sstart_pd_64" as 1:15777
2022-02-06 00:30:36 +08 Updated servicemag start -wait -pdid 64
2022-02-06 00:30:36 +08 Updated ... servicing disks in mag: 3 0
2022-02-06 00:30:36 +08 Updated ... normal disks:
2022-02-06 00:30:36 +08 Updated ... not normal disks: WWN [5000C5007F6EFABC] Id [64] diskpos [0]
2022-02-06 00:30:36 +08 Updated ... relocating chunklets to spare space...
2022-02-06 00:30:47 +08 Updated ... bypassing mag 3 0
2022-02-06 00:31:27 +08 Updated ... bypassed mag 3 0
2022-02-06 00:31:27 +08 Updated servicemag start -wait -pdid 64 -- Succeeded
2022-02-06 00:31:27 +08 Completed scheduled task.


I noticed that once the replacement disk has been the candidate for 10 consecutive checks, the system marks it as Failed.
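
To make the "consecutive candidate" observation easy to check against a longer log, here is a small Python sketch that counts how many times in a row the same PDID was the candidate. The PDID lists are copied by hand from the four iterations in the extract above; the helper name `trailing_streak` is made up for illustration, and this only mirrors what I see in the log, not how check_slow_disk is actually implemented.

```python
def trailing_streak(candidates):
    """Return (pdid, count) for the run of identical PDIDs at the end of the list."""
    if not candidates:
        return None, 0
    last = candidates[-1]
    count = 0
    for pdid in reversed(candidates):
        if pdid != last:
            break
        count += 1
    return last, count


# Candidate PDIDs per iteration, in order, from the extract above.
speed10 = [64, 64, 64, 64]   # "Checking speed 10 drives"
speed7 = [27, 26, 19, 29]    # "Checking speed 7 drives"

print(trailing_streak(speed10))  # (64, 4)
print(trailing_streak(speed7))   # (29, 1)
```

In the extract, the speed 7 candidates rotate between drives, while PDID 64 is the speed 10 candidate in every iteration shown.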

Has anyone experienced the same issue? Is there a way to keep the disk in this specific slot from being marked as slow?


 Post subject: Re: 3PAR 7200 replacement disk marked as Slow Drive and fail
Posted: Thu Feb 17, 2022 4:50 pm

Joined: Mon Sep 21, 2015 2:11 pm
Posts: 1388
Location: Europe
Just asking: could the issue be the cage slot and not the PDs? Are you seeing SAS errors or similar on that slot?

From what I see, the drive has a very high svct (service time, or latency in plain English), which is probably why it is always a candidate.
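
As a rough illustration of how far apart the service times are, here is a short Python sketch that divides the candidate's svct by the svct of the "Next" drive for each speed 10 iteration. The numbers are copied from the log extract in the first post; the function name `svct_ratios` is just for this example, and comparing against the "Next" drive is my own choice of baseline, not something the array is known to do.

```python
# (candidate_svct, next_svct) pairs for the "speed 10" class, copied from the
# four iterations in the log extract in the first post.
pairs = [
    (98.3, 58.6),
    (159.6, 96.7),
    (143.0, 77.8),
    (153.4, 73.5),
]


def svct_ratios(samples):
    """Return candidate/next service-time ratios, one per iteration."""
    return [cand / nxt for cand, nxt in samples]


ratios = svct_ratios(pairs)
for (cand, nxt), ratio in zip(pairs, ratios):
    print(f"candidate {cand:6.1f}  next {nxt:6.1f}  ratio {ratio:.2f}")
```

With these values, PDID 64 shows roughly 1.6 to 2.1 times the service time of the next-slowest drive in every iteration.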

_________________
The views and opinions expressed are my own and do not necessarily reflect those of my current or previous employers.


 Post subject: Re: 3PAR 7200 replacement disk marked as Slow Drive and fail
Posted: Wed Mar 09, 2022 12:35 am

Joined: Thu Feb 17, 2022 2:39 am
Posts: 2
Hi,

Just an update on this.
I searched and found that HPE has actually phased out the 900GB Encrypted HDDs that we are currently using and issued an advisory to use 1.2TB Encrypted HDDs instead:

Advisory: (Revised) HPE 3PAR StoreServ 7000 Storage And HPE 3PAR StoreServ 10000 Storage - Transitioning From HCBRE, HCEP, And Certain SLTN HDD Spare Parts To Alternate Replacement HDD Spare Parts

https://support.hpe.com/hpesc/public/do ... 28695en_us

I ordered the 1.2TB disk instead, which has a DOM of 2018. It has now been working for 5 days since the replacement, with no signs of being a "slow drive".

It seems the 900GB Encrypted HDDs we were using as replacements were just old and bad, even though the parts were bought from multiple suppliers.


 Post subject: Re: 3PAR 7200 replacement disk marked as Slow Drive and fail
Posted: Wed Mar 09, 2022 3:12 am

Joined: Mon Sep 21, 2015 2:11 pm
Posts: 1388
Location: Europe
I was told back in the day that the 900 GB drives were discontinued because no vendor kept making them once they released their new series of drives.


