
need How-To replacing failed drive in E200

Posted: Fri Aug 09, 2013 11:34 am
by onesoul
Help!

I'm the new hire at my company and we have a 7-year-old E200 in service. A drive has failed in cage 0, position 14, and I cannot figure out how to replace it. We have zero maintenance documentation.

I found the HP docs for v2.2.4 (we're running v2.2.2), and they say to contact my VAR for drive replacement. Further reading suggests using 'servicemag start cageid pdid', but I haven't gotten that to work.

In my first attempt, I removed the failed drive and inserted a new one expecting it to auto-rebuild.
Failed attempt #1:
- remove failed drive
- insert new drive
- showpd shows the new drive as id --- and the failed drive in cage 0 position 14.

Failed attempt #2:
- servicemag start 0 14
- servicemag status (until it succeeds)
- remove failed drive 0 14 and insert new drive
- showpd shows the new drive as id --- and the failed drive in cage 0 position 14.

Failed attempt #3:
- servicemag start -log -pdid 14

3par1 cli% showversion
Release version 2.2.2.158 (MU5)
Patches: None

Component Name    Version
CLI Server        2.2.2 (MU5)
CLI Client        2.2.2 (MU5)
GUI Server        2.2.2 (MU5)
System Manager    2.2.2 (MU5)
Kernel            2.2.2 (MU5)
TPD Kernel Code   2.2.2 (MU5)
Utilities         2.2.2 (MU5)
Software Updater  2.2.2 (MU5)

3par1 cli% showalert
Id : 47
State : New
Time : Sat Jul 13 08:49:26 PDT 2013
Severity : Major
Type : Component state change
Message : PD 14 Failed (Invalid Media, Increased Error Count, Missing, Missing A Port, Missing B Port)
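For anyone scripting around these alerts, here is a minimal Python sketch that pulls the PD id and failure reasons out of the Message line. The function name `parse_pd_alert` is hypothetical, and the message layout is inferred only from the single alert shown above, so other alert types may well look different.

```python
import re

# Alert text copied from the showalert output above.
ALERT_MESSAGE = (
    "PD 14 Failed (Invalid Media, Increased Error Count, "
    "Missing, Missing A Port, Missing B Port)"
)


def parse_pd_alert(message):
    """Split a 'PD <id> <state> (<reasons>)' message into its parts.

    Hypothetical helper: the layout is an assumption based on one alert.
    Returns (pd_id, state, reasons), or None if the text does not match.
    """
    match = re.match(r"PD (\d+) (\w+) \((.*)\)$", message.strip())
    if match is None:
        return None
    pd_id = int(match.group(1))
    state = match.group(2)
    reasons = [reason.strip() for reason in match.group(3).split(",")]
    return pd_id, state, reasons


if __name__ == "__main__":
    print(parse_pd_alert(ALERT_MESSAGE))
```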

3par1 cli% servicemag status
Cage 0, magazine 14:
The magazine was successfully brought offline by a servicemag start command.
The command completed Thu Aug 8 17:57:09 2013.
The output of the servicemag start was:
... servicing disks in mag: 0 14
... valid disks:
... not valid disks: WWN [2000001862476B3B] Id [14]
... relocating chunklets to spare space...
... relocating chunklets from degraded raid sets to spare space
... bypassing mag 0 14
... bypassed mag 0 14
servicemag start -log -pdid 14 -- Succeeded
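The status text above can be checked mechanically instead of by eye. The sketch below is only an illustration: `summarize_servicemag_status` is a made-up name, and the patterns are guessed from this one status dump, not from any documented output format.

```python
import re

# Status text copied from the servicemag status output above.
STATUS_OUTPUT = """\
Cage 0, magazine 14:
The magazine was successfully brought offline by a servicemag start command.
The command completed Thu Aug 8 17:57:09 2013.
The output of the servicemag start was:
... servicing disks in mag: 0 14
... valid disks:
... not valid disks: WWN [2000001862476B3B] Id [14]
... relocating chunklets to spare space...
... relocating chunklets from degraded raid sets to spare space
... bypassing mag 0 14
... bypassed mag 0 14
servicemag start -log -pdid 14 -- Succeeded
"""


def summarize_servicemag_status(text):
    """Collect cage, magazine, invalid PD ids and the final result word.

    Hypothetical helper; the line patterns are assumptions taken from
    the sample above.
    """
    summary = {"cage": None, "magazine": None, "result": None, "invalid_pd_ids": []}
    for line in text.splitlines():
        header = re.match(r"Cage (\d+), magazine (\d+):", line)
        if header:
            summary["cage"] = int(header.group(1))
            summary["magazine"] = int(header.group(2))
            continue
        if line.startswith("... not valid disks:"):
            summary["invalid_pd_ids"] = [int(x) for x in re.findall(r"Id \[(\d+)\]", line)]
            continue
        # The last line echoes the command followed by "-- <result>".
        outcome = re.match(r"servicemag .* -- (\w+)$", line)
        if outcome:
            summary["result"] = outcome.group(1)
    return summary


if __name__ == "__main__":
    print(summarize_servicemag_status(STATUS_OUTPUT))
```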

3par1 cli% showpd
 Id Cage_Pos SizeMB Chunk Free Spare ----Node_WWN---- State   APort BPort  LdA
--- 0:14:0   476837     0    0     0 20000014C3B8A638 new     0:0:1 1:0:1* Y
 14 0:14:0?  476837  1826  669   114 2000001862476B3B missing ----- -----  Y

3par1 cli% dismisspd 14
Error : Pd id 14 has 114 chunklet marked as spare
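To make the situation in the showpd output explicit: the replacement drive shows up with Id `---` in state `new`, while the old PD 14 is `missing` and still holds 114 spare chunklets, which is what dismisspd complains about. A rough Python sketch that picks out both rows follows. The helper names are hypothetical, and treating "missing with Spare > 0" as the thing that blocks dismisspd is an inference from the error message above, not a documented rule.

```python
# showpd text copied from the output above.
SHOWPD_OUTPUT = """\
Id Cage_Pos SizeMB Chunk Free Spare ----Node_WWN---- State APort BPort LdA
--- 0:14:0 476837 0 0 0 20000014C3B8A638 new 0:0:1 1:0:1* Y
14 0:14:0? 476837 1826 669 114 2000001862476B3B missing ----- ----- Y
"""


def parse_showpd(text):
    """Turn whitespace-separated showpd rows into dicts keyed by column name."""
    lines = [line for line in text.splitlines() if line.strip()]
    # Header names such as ----Node_WWN---- are padded with dashes.
    columns = [name.strip("-") for name in lines[0].split()]
    rows = []
    for line in lines[1:]:
        fields = line.split()
        if len(fields) != len(columns):
            continue  # skip anything that does not line up with the header
        rows.append(dict(zip(columns, fields)))
    return rows


def unadmitted_drives(rows):
    """Drives the system sees but has not assigned a PD id yet."""
    return [row for row in rows if row["Id"] == "---"]


def missing_with_spares(rows):
    """Missing PDs that still own spare chunklets (assumed dismisspd blocker)."""
    return [row for row in rows if row["State"] == "missing" and int(row["Spare"]) > 0]


if __name__ == "__main__":
    rows = parse_showpd(SHOWPD_OUTPUT)
    print("unadmitted:", [row["Cage_Pos"] for row in unadmitted_drives(rows)])
    print("missing with spares:", [row["Id"] for row in missing_with_spares(rows)])
```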

Re: need How-To replacing failed drive in E200

Posted: Sat Aug 10, 2013 3:12 am
by slu
Hi there,

Which version of the service processor do you have installed? Can you log in there, go to "Support", and then open "Guided Maintenance"?

This should guide you through the disk replacement and run all the CLI commands for you in the correct order, but I don't know if your SP already has this feature.

Re: need How-To replacing failed drive in E200

Posted: Tue Aug 13, 2013 12:17 pm
by onesoul
I can't easily access the GUI. The previous Operations staff set the service processor firewall to allow SSH and HTTP(S) only from a single Linux bastion host (which has no GUI).

How can I change the service processor firewall settings?

1.1 Display SP Version

SP Software Version

tpdSPbase-2.2.2.GA-52:1190871622

tpdSPInFormOS2.1.4.47-2.1.4.47-1:1190871996
tpdSPInFormOS2.2.1.156-2.2.1.156-1:1190872072
tpdSPInFormOS2.2.2.126-2.2.2.126-23:1168442855
tpdSPInFormOS2.2.2.140-2.2.2.140-31:1175891216
tpdSPInFormOS2.2.2.158-2.2.2.158-52:1190872155
tpdSPPI-2.2.2.GA-52:1190871638
tpdSPUI-2.2.2.GA-52:1190871640
tpdSPclmaint-2.2.2.GA-52:1190871641
tpdSPcommon-2.2.2.GA-52:1190871643
tpdSPcommunications-2.2.2.GA-52:1190871645
tpdSPdiag-2.2.2.GA-52:1190871647
tpdSPgdda-2.2.2.GA-52:1190871637
tpdSPgm-2.2.2.GA-52:1190871650
tpdSPlib-2.2.2.GA-52:1190871636
tpdSPnresc-2.2.2.GA-52:1190871656
tpdSPprod-2.2.2.GA-52:1190871657
tpdSPsarge-2.2.2.GA-52:1190871720
tpdSPtpddump-2.2.2.GA-52:1190871789
tpdSPtpdupdate-2.2.2.GA-52:1190871809
tpdSPtraining-2.2.2.GA-52:1190871817
tpdSPtzdata-2.2.2.GA-52:1190871822
tpdSPupdate-2.2.2.GA-52:1190871829
tpdSPweb-2.2.2.GA-52:1190871831
tpdSPwoody-2.2.2.GA-52:1190871911

Base Image Info
Imaged with 3PAR SP build 3.0-21

Re: need How-To replacing failed drive in E200

Posted: Tue Aug 13, 2013 5:46 pm
by onesoul
3par1 cli% showlicense
License key was generated on Tue Jun 26 13:29:57 2007

License features currently enabled:
InForm Suite
System Reporter
Thin Provisioning (100G)

Re: need How-To replacing failed drive in E200

Posted: Tue Aug 13, 2013 6:01 pm
by onesoul
FOUND THE service processor FIREWALL SETTINGS!! Now I can open them up to allow GUI.
ssh spvar@sp (passwd 3parvar)
2 ==> Network Configuration
3 ==> Firewall Manipulation

1 ==> Display Firewall Status
2 ==> Alter Private network firewall rules
3 ==> Alter Public network firewall rules

X Return to previous menu

Holy crap. ipchains from hell.

Back on the disk topic, looks like I cannot dismiss the drive, and servicemag doesn't move the spare chunklets.
3par1 cli% dismisspd 14
Error : Pd id 14 has 114 chunklet marked as spare

Also, looks like I didn't know about admitpd.

Here's what I'll try:
- remove failed drive from cage 0 position 14
- install new drive in cage 0 position 14
- ssh spvar@sp (passwd 3parvar)
- admitpd
- showpd
- (wait for rebuild)
- dismisspd 14 (if it doesn't remove from showpd)?

Re: need How-To replacing failed drive in E200

Posted: Thu Aug 22, 2013 5:52 pm
by trireed
Here is how to replace a failed drive:

showpd -failed -degraded

Make sure you do not have two failed drives in the same mag. Assuming you don't, proceed.

servicemag start -log -wait -pdid <pdid> ! wait until it succeeds (<pdid> is the failed drive's Id from showpd)

replace failed drive

servicemag resume 0 14 ! this brings the mag back online along with the good drive

servicemag status -d 0 14 ! verify the status of bringing the drive back online
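The steps above can be written down as an ordered checklist. This Python sketch only builds the command strings and never runs anything against the array. The function name `replacement_steps` is hypothetical, the commands and flags are copied from this thread as posted, and the example values (cage 0, mag 14, PD id 14) are taken from the original question, so check them against your own showpd output first.

```python
def replacement_steps(cage, mag, pd_id):
    """Return the drive replacement steps from this thread, in order.

    Each step is a (kind, text) pair, where kind is "cli" for a command
    to type at the InForm CLI prompt or "manual" for a hands-on action.
    Hypothetical helper: it formats strings only and executes nothing.
    """
    return [
        ("cli", "showpd -failed -degraded"),
        ("manual", "confirm there are not two failed drives in the same mag"),
        ("cli", f"servicemag start -log -wait -pdid {pd_id}"),
        ("manual", f"replace the failed drive in cage {cage}, mag {mag}"),
        ("cli", f"servicemag resume {cage} {mag}"),
        ("cli", f"servicemag status -d {cage} {mag}"),
    ]


if __name__ == "__main__":
    # Values assumed from the original post: cage 0, magazine 14, PD id 14.
    for number, (kind, text) in enumerate(replacement_steps(0, 14, 14), start=1):
        print(f"{number}. [{kind}] {text}")
```

Keeping the manual steps in the same list as the commands makes it harder to run servicemag resume before the drive has actually been swapped.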