Slow boots and chkdsk errors: bad hard drives that the hardware RAID thinks are OK

I got a call a month ago on a Tuesday night about a server that would not reboot. It kept popping up errors and wanted to run chkdsk. It appeared to finally come up, and the person who called went home. I tried to remote in to the server and it would not answer, so I knew I would be making a morning visit. I showed up the next morning and tried every variation of chkdsk I could come up with. That is a bit of a joke, as the only switches I know are /f and /r. I called Microsoft tech support and we booted off the install CD. Running chkdsk with the installed OS out of the picture nets better results. Well, not this time.


This was hardware RAID 1, and the BIOS reported the array as good during boot. I unplugged one of the drives and tried to run chkdsk again and again. It seemed that I might get a clean /f, but the next /r would find issues. I could get the server to boot, but not very fast, and I never got two clean scans with chkdsk /f. Once again, it is RAID 1, so you have five choices: bad disk 1, bad disk 2, both disks bad, bad controller card, or bad motherboard. There might be other problems, but those choices cover the likely scenarios. I gave up on the first hard drive and started working on the other one. That second hard drive cleaned up nicely and booted happy as can be. We lost some email, as the backup did not run correctly over the three-day weekend. I recovered files from Tuesday but had no luck with email. That was painful. What was also painful was my choice of drives. If I had picked the other drive first, a 50/50 guess, I would have ended the Microsoft call in half an hour instead of 3 or 4 hours. The support engineer was great. He had about 5 years of support experience.
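
The elimination I stumbled through can be written down as a small decision table. This is only a rough sketch in Python, not a diagnostic tool: the function name `likely_fault` and its two inputs are made up for illustration, and it assumes you have already tested each drive on its own (other drive unplugged) and noted whether chkdsk came back clean.

```python
def likely_fault(disk1_alone_clean: bool, disk2_alone_clean: bool) -> str:
    """Narrow down the five RAID 1 suspects from two single-drive tests.

    Each argument is True if the server booted and chkdsk ran clean with
    only that drive connected. Hypothetical helper, for illustration only.
    """
    if disk1_alone_clean and disk2_alone_clean:
        # Neither drive misbehaves alone, so look at what they share.
        return "suspect controller card or motherboard"
    if disk1_alone_clean:
        return "suspect disk 2"
    if disk2_alone_clean:
        return "suspect disk 1"
    # Both fail alone: either both drives are bad or a shared part is.
    return "suspect both disks, controller card, or motherboard"


if __name__ == "__main__":
    # My case: the first drive never scanned clean, the second one did.
    print(likely_fault(disk1_alone_clean=False, disk2_alone_clean=True))
```

The point is that two single-drive tests are enough to split the five suspects into groups, and the order you test the drives in decides how long the call to support lasts.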


A month later, a fellow who used to work with me called to say that his old SQL server was slow to boot and appeared to be running, but users were not able to work. I suggested unplugging one hard drive or the other, as he is also using RAID 1. He never called back, so I bet one of his hard drives was acting up.


You ask, “Would a hardware RAID 1 with hot swap have helped?” The problem was that the hardware thought the drives were fine, while the operating system did not see the drives as happy. With no drive flagged as failed, nothing would have told me which one to swap. So I am guessing that no, a hot swap would not have helped.

