Today I was onsite with one of my clients and experienced a strange problem with one PC that was being ghosted from an image. The client runs a chain of retail outlets, and when they open a new outlet they use Ghost to prepare a machine from an image. They have a run-up area at the back of their warehouse, which also houses two older servers. The problem was that when they went to use the PC to do the build, the Ghost image (1.5GB in size) would not come down at all. The target system would log on to the network via the DOS (yes, DOS) boot disk, and we could connect to the shares on the server in the main server room, but the image would not transfer. Ok – so how do you diagnose this problem?
The first question needs to be – has this ever worked before? And if so, when did it last work? The answer was "yes, it has worked, and it last worked 2 weeks ago".
Ok – so next question – what changed in those two weeks? We put in a new Netgear FSM7326P Layer 3 Managed Switch. We did this as part of an overall network upgrade and now have web access to the management consoles on all of the switches in the network. The run-up area worked on an old 100Mbps hub before that. The new switch is connected via a 1Gbps link through to the main switches in the server room. Connection to the PCs in the run-up area is via 100Mbps links, and we have two servers in this area too – one connects to this switch at 1Gbps and the other at 100Mbps.
So I thought I’d try a few tests. First up I tested copying the 1.5GB file from the main server in the server room (SVR1) down to one of the servers in the run-up area (SVR3) – it copied the 1.5GB file in about 1-2 minutes. Ok – so that ruled out the link between the new switch and the main switches in the server room. Next I plugged my laptop into one of the 100Mbps ports and copied the file down to it – it came down in about 4 minutes – not bad given I was on a 100Mbps port. No problems so far. I then connected my laptop to the port that the target PC was using – same test – same results. Ok – at this point it looked like the entire switch structure was fine and not at fault.
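For anyone wondering whether "about 4 minutes" really is reasonable on a 100Mbps port, here is a rough back-of-the-envelope sketch in Python. The function names are my own for illustration, the 1.5GB is treated as decimal gigabytes, and the 90 seconds is just an assumed midpoint of the "1-2 minutes" I saw. It ignores protocol and disk overhead, so treat the numbers as a sanity check rather than a benchmark.

```python
# Rough sanity check of the copy times (a sketch; names are illustrative).

IMAGE_BYTES = 1.5e9  # assumption: 1.5GB taken as decimal gigabytes


def ideal_seconds(size_bytes: float, link_mbps: float) -> float:
    """Best-case transfer time on a link, ignoring all overhead."""
    return size_bytes * 8 / (link_mbps * 1_000_000)


def effective_mbps(size_bytes: float, seconds: float) -> float:
    """Throughput actually achieved, given an observed copy time."""
    return size_bytes * 8 / seconds / 1_000_000


if __name__ == "__main__":
    # (label, link speed in Mbps, observed copy time in seconds)
    tests = [
        ("SVR1 -> SVR3 on 1Gbps", 1000, 90),      # assumed midpoint of 1-2 minutes
        ("SVR1 -> laptop on 100Mbps", 100, 240),  # about 4 minutes
    ]
    for label, link_mbps, observed in tests:
        best = ideal_seconds(IMAGE_BYTES, link_mbps)
        rate = effective_mbps(IMAGE_BYTES, observed)
        print(f"{label}: best case {best:.0f}s, observed {observed}s, about {rate:.0f}Mbps")
```

The best case for 1.5GB at 100Mbps is 120 seconds, so a 4 minute copy works out to roughly half of line rate. That is consistent with a healthy port, which is why I was happy to rule the switch path out at this point.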
I connected to the web management console and looked at the ports on the switch – nothing sinister there either. An idea struck me… all ports by default are set to automatically detect the port speed and duplex options. The machines we are using for the stores were connecting at 100Mbps Full Duplex. I suspected that they might not like this. I took the port used by the target machine and changed it to 100Mbps Half Duplex via the web console. Rebooted the PC and tried again – it worked like a charm. So I used the web console to set all the ports in the run-up area (used by other machines like that one) to half duplex as well to prevent future issues.
Hopefully these little tidbits of HOW I diagnosed a problem provide more value than the resolution itself.