I've been noticing a strange issue in our clusters.
We have a VMFS datastore that a small cluster of 6 hosts uses for their scratch partitions. Each host writes its scratch to a different folder.
What I have noticed is that when all the hosts in the cluster are rebooted at the same time, hosts randomly lose access to the datastore, with the following error in vmkernel.log:
Failed to open device naa.60002ac0000000000000000200020cb8:1 : Atomic test and set of disk block returned false for equality
Straight after that the logs show other datastores being mounted OK.
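In case it helps anyone compare notes, here is a rough sketch of how the failures could be tallied per device across hosts. It assumes you have copied each host's vmkernel.log somewhere Python can read it; the regex is built only from the one line quoted above, and the function name count_ats_failures is just something I made up for illustration, so other builds may need a different pattern.

```python
import re
from collections import Counter

# Pattern derived from the single vmkernel.log line quoted in this post.
# Assumption: other ESXi builds may word or space the message differently.
ATS_FAILURE = re.compile(
    r"Failed to open device (?P<device>naa\.[0-9a-f]+):(?P<partition>\d+)\s*:\s*"
    r"Atomic test and set of disk block returned false for equality"
)


def count_ats_failures(lines):
    """Return a Counter mapping (device, partition) to the number of matching lines."""
    hits = Counter()
    for line in lines:
        match = ATS_FAILURE.search(line)
        if match:
            key = (match.group("device"), int(match.group("partition")))
            hits[key] += 1
    return hits
```

Feeding it an open log file (for example `count_ats_failures(open("vmkernel.log"))`, where the file name is whatever you saved the copy as) gives a quick count per device and partition, which should make it easier to see whether the same hosts fail on each simultaneous reboot.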
I am struggling to fix this, as it seems totally random, but it only seems to occur when all 6 hosts are rebooted at the same time. Something is not right if that's all it takes!
Has anyone seen this before? Google turns up literally nothing.