Case Analysis Report
SR#: 482626-181966951
Reported Symptom: 3par_pcmke kernel extension causes AIX servers to crash
Reported By: Pier 1 Services Company
Description:
New MPIO paths were successfully added to an AIX 5.3 server, and the old paths were then deleted. About an hour later, the server crashed. Crash dumps were collected and sent to IBM, who responded with the following:
Subject: PMR 18921,004,000 3par_pcmke kernel extension
CRASH INFORMATION: CPU 0 CSA F00000002FF47600 at time of crash, error code for LEDs: 30000000
pvthread+03AB00 STACK: [04167B70]3par_pcmke:pcmSelectIoctlPath+0000DC (F1000110104AD350, F100011010433800)
--
The problem is due to some issue in the 3par_pcmke kernel extension.
The owner of this kernel extension is 3PAR company.
Findings:
3PAR's investigation found that a similar kernel extension crash had previously been reported to 3PAR engineering and was determined to be caused by the HBA dynamic tracking and fast fail attribute settings.
Per 3PAR engineering, when dynamic tracking is not enabled on the HBA, the 3PAR MPIO path pointers can be set to null values, which can cause problems similar to the one you reported.
Further research with IBM shows that, for host systems running AIX® 5.2 or later, the fast fail and dynamic tracking attributes must be enabled.
See link:
IBM Aix Config for Fast Fail and Dynamic Tracking
A review of the log lsattr_fscsi.out you provided confirms that the dynamic tracking and fast fail attributes are not enabled on this host as recommended.
From lsattr_fscsi.out:
### fscsi0
attach       switch       How this adapter is CONNECTED         False
dyntrk       no           Dynamic Tracking of FC Devices        True
fc_err_recov delayed_fail FC Fabric Event Error RECOVERY Policy True
scsi_id      0xa30024     Adapter SCSI ID                       False
sw_fc_class  3            FC Class for Fabric                   True
…
### fscsi2
attach       switch       How this adapter is CONNECTED         False
dyntrk       no           Dynamic Tracking of FC Devices        True
fc_err_recov delayed_fail FC Fabric Event Error RECOVERY Policy True
scsi_id      0xa20020     Adapter SCSI ID                       False
sw_fc_class  3            FC Class for Fabric                   True
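The check performed on the listing above can be repeated on other hosts with a small filter. The following is a minimal sketch, not a 3PAR or IBM tool: the function name check_fscsi_attrs is hypothetical, and it assumes the input uses the same layout as lsattr_fscsi.out (a "### <adapter>" header line followed by rows of attribute, value, description).

```shell
# Hypothetical helper: reads lsattr_fscsi.out-style text on stdin and prints
# one line for every adapter setting that differs from the recommended value.
# Assumes "### <adapter>" header lines, as in the collected log.
check_fscsi_attrs() {
  awk '
    $1 == "###"                               { dev = $2; next }
    $1 == "dyntrk"       && $2 != "yes"       { print dev ": dyntrk=" $2 " (expected yes)" }
    $1 == "fc_err_recov" && $2 != "fast_fail" { print dev ": fc_err_recov=" $2 " (expected fast_fail)" }
  '
}

# Example invocation against the collected log:
#   check_fscsi_attrs < lsattr_fscsi.out
```

An adapter that already has both attributes set as recommended produces no output.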
When dynamic tracking of FC devices is enabled, the FC adapter driver can detect when the Fibre Channel N_Port ID of a device changes and reroute traffic destined for that device to the new address while the device remains online.
The 3PAR Implementation Guide for AIX also contains additional information on these settings, including a list of events that can cause the N_Port ID to change. See section 3.2.5 of the attached AIX implementation guide.
Solution:
The dynamic tracking and fast fail attributes can be enabled by running these commands:
chdev -l fscsi0 -a fc_err_recov=fast_fail
chdev -l fscsi0 -a dyntrk=yes
Notes:
1. Change the settings on all applicable HBAs in the system.
2. A reboot may be required for these changes to take effect.
***Please take all necessary precautions before rebooting your host.***
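To apply note 1 consistently, the two chdev commands can be generated for each adapter and reviewed before anything is run. This is a minimal dry-run sketch: the function name fscsi_fix_cmds is hypothetical, and it only prints commands; it does not change any device.

```shell
# Hypothetical helper: print (do not run) the two chdev commands from the
# Solution section for every adapter name passed as an argument.
fscsi_fix_cmds() {
  for dev in "$@"; do
    echo "chdev -l $dev -a fc_err_recov=fast_fail"
    echo "chdev -l $dev -a dyntrk=yes"
  done
}

# Example with the two adapters shown in lsattr_fscsi.out:
fscsi_fix_cmds fscsi0 fscsi2
```

On the host itself, the adapter names could probably be gathered with something like lsdev -C -F name | grep '^fscsi' (an assumption; confirm the device names on your system). If chdev reports that a device is busy, the chdev -P flag records the change in the device database so that it takes effect at the next reboot, which is consistent with note 2. After the change, lsattr -El fscsi0 should show dyntrk as yes and fc_err_recov as fast_fail.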
For further details and other considerations, please refer to the IBM documentation on how to implement these changes.
IBM Aix Config Fast Fail and Dynamic Tracking