Real-Life Debugging, Part 2

On Thursday, I talked a little about a problem I’m facing with one of my drivers, which causes the machine running my code to disappear from the (Windows file-sharing) network. My first theory on this had to do with name resolution.

Windows uses an old, complicated process to make name resolution work, dating waaaay back to the Windows for Workgroups and LAN Manager days. Each computer got (and still gets) a name called a “NetBIOS Name”, which is up to 15 characters in length. Larry Osterman has more information about it in a recent post.

The issue here is how these names are resolved. The original way to make this resolution work was to just broadcast a datagram on the local subnet asking who owned that particular name. This, of course, had the effect of limiting NetBIOS-based networks to a single physical subnet (lmhosts files notwithstanding). In conjunction with this, Microsoft implemented a “browser” protocol that collected names and kept them around for resolution on the subnet. The details are esoteric, but suffice it to say that name resolution can take a while to get working, or to break after a computer goes offline, due to the protocol used to update browsers. All of this was decided in a vacuum, with another engineer and me just kind of thinking about it and wondering what could be up.
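To make the broadcast lookup a bit more concrete, here is a rough Python sketch of what one of those "who owns this name?" datagrams looks like on the wire, following the packet layout in RFC 1001/1002. It has nothing to do with my driver's code; the function names, the default transaction ID, and the choice of the 0x20 (file server) suffix byte are all just assumptions for illustration.

```python
import struct

def encode_netbios_name(name: str, suffix: int = 0x20) -> bytes:
    """First-level encode a NetBIOS name (RFC 1001): pad to 15 characters,
    append a one-byte suffix, then split every byte into two nibbles and
    add each nibble to ord('A')."""
    if len(name) > 15:
        raise ValueError("NetBIOS names are at most 15 characters")
    raw = name.upper().ljust(15).encode("ascii") + bytes([suffix])
    out = bytearray()
    for b in raw:
        out.append(0x41 + (b >> 4))    # high nibble
        out.append(0x41 + (b & 0x0F))  # low nibble
    return bytes(out)

def build_name_query(name: str, transaction_id: int = 0x1234) -> bytes:
    """Build a broadcast NetBIOS name query (hypothetical helper)."""
    # Header: id, flags (recursion desired + broadcast = 0x0110),
    # one question, no answer/authority/additional records.
    header = struct.pack("!HHHHHH", transaction_id, 0x0110, 1, 0, 0, 0)
    encoded = encode_netbios_name(name)
    # Question: length-prefixed encoded name, zero terminator,
    # type NB (0x0020), class IN (0x0001).
    question = (bytes([len(encoded)]) + encoded + b"\x00"
                + struct.pack("!HH", 0x0020, 0x0001))
    return header + question

packet = build_name_query("FRED")
```

Actually asking the subnet would mean sending `packet` from a UDP socket with broadcast enabled to port 137 on the subnet's broadcast address, and then waiting to see whether anybody answers.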

So, we went looking through the code to see if we could figure out what was going on. Indeed, we found some old code that had been put in to eliminate 137/UDP datagrams, due to an issue with WINS registration that I’ll go into in another post someday. So, we took the code out of the driver and triumphantly gave it to the affected customer.
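For anyone wondering what "eliminating 137/UDP datagrams" amounts to at the packet level, the check is roughly the one sketched below. This is Python for readability only; the real code lived in a kernel driver and looked nothing like this, and the function name and the decision to match on either port are my own assumptions for the sketch.

```python
import struct

NETBIOS_NS_PORT = 137  # NetBIOS name service

def is_netbios_name_datagram(udp_datagram: bytes) -> bool:
    """Return True if a raw UDP datagram (header + payload) uses port 137
    as its source or destination port. Hypothetical helper."""
    if len(udp_datagram) < 8:
        return False  # too short to hold a UDP header
    # UDP header: source port, destination port, length, checksum.
    src_port, dst_port, _length, _checksum = struct.unpack(
        "!HHHH", udp_datagram[:8])
    return NETBIOS_NS_PORT in (src_port, dst_port)
```

A filter that drops everything this function matches would swallow name registrations and name queries alike, which is why it looked like such a promising suspect.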

And we were wrong.

So much for debugging in a vacuum. Next time: the road to the final solution (we think).
