Fresh off two days' worth of board meetings at my company, and two days away from leaving town for two weeks of vacation (Switzerland, Germany, and France), I’m exceedingly low on time, so please accept my apologies in advance for being slow to post. During my absence, I have a couple of guest bloggers lined up to discuss Interesting Things(TM) to tide you over until I get back.
Today I want to talk about the checked build. As you may know, there are two different builds of the OS: the free build and the checked build. The difference is that the checked build has some additional checks compiled in (usually in the form of ASSERT() macros) and lots of debug logging (via KdPrint()). You can get a good feel for how checked build code differs from free build code simply by reading the Microsoft-supplied samples in the DDK. Obviously, these extra checks can help a great deal during the development of driver projects.
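To make the free/checked split concrete, here is a minimal user-mode sketch of the pattern. It is not the DDK's actual definition of ASSERT() or KdPrint(); every name beginning with SKETCH_ or sketch_ is made up for illustration, and SKETCH_DBG merely stands in for the DBG symbol that the build environment sets for checked builds.

```c
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical stand-in for DBG: 1 = "checked" build, 0 = "free" build. */
#ifndef SKETCH_DBG
#define SKETCH_DBG 1
#endif

static int sketch_log_lines = 0;        /* debug messages emitted so far */
static int sketch_assert_failures = 0;  /* failed assertions seen so far */

/* Stand-in for the kernel debug print routine: writes to stderr. */
static void sketch_dbg_print(const char *fmt, ...)
{
    va_list args;
    va_start(args, fmt);
    vfprintf(stderr, fmt, args);
    va_end(args);
    sketch_log_lines++;
}

/* Stand-in for the assertion-failure handler: report and keep going. */
static void sketch_assert_failed(const char *expr, const char *file, int line)
{
    fprintf(stderr, "*** Assertion failed: %s (%s:%d)\n", expr, file, line);
    sketch_assert_failures++;
}

#if SKETCH_DBG
/* The doubled parentheses at the call site pass a whole argument list
   through a one-parameter macro, the same trick KdPrint(()) relies on. */
#define SKETCH_KDPRINT(args) sketch_dbg_print args
#define SKETCH_ASSERT(expr) \
    ((expr) ? (void)0 : sketch_assert_failed(#expr, __FILE__, __LINE__))
#else
/* "Free" build: both macros compile away to nothing. */
#define SKETCH_KDPRINT(args) ((void)0)
#define SKETCH_ASSERT(expr)  ((void)0)
#endif

/* Hypothetical helper: clamp a requested copy length to a buffer's capacity. */
static int sketch_copy_length(int requested, int capacity)
{
    SKETCH_ASSERT(requested >= 0);
    SKETCH_KDPRINT(("copy: requested=%d capacity=%d\n", requested, capacity));
    if (requested < 0) {
        return 0;
    }
    return requested < capacity ? requested : capacity;
}
```

Calling sketch_copy_length(-1, 4) with SKETCH_DBG set to 1 reports the failed assertion and logs the call; compiled with -DSKETCH_DBG=0, the same call returns quietly, which is roughly how the same source behaves on the two builds.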
The problem with the checked build is that it’s a pain to deal with. It is hard to find, although it is available on all MSDN subscriptions from “Operating Systems” on up. Also, all service packs are released with checked build counterparts that can typically be downloaded from microsoft.com. Once you have the build, you have a couple of options for how to install it. I prefer to run with a full checked build at this point on one of my test boxes, but the full checked build has the disadvantage of being considerably slower than the free build, so it takes a lot of horsepower to run. The other problem with the full checked build is that debug messages can get to be amazingly verbose, which is oftentimes not helpful.
The solution to these problems is to use a partial checked build. This means using a checked kernel and hal (the kernel and the hal must always go together – they’re a matched pair), and any additional checked kernel components that are relevant. For example, when I’m developing an NDIS driver, I typically run with checked versions of ndis.sys, tdi.sys, tcpip.sys, and afd.sys. FSD and FS filter development call for checked versions of ntfs and fastfat. Use your head; the right checked binaries are usually obvious.
Getting the checked build onto the system is another matter, however. Due to the Fantastic Miracle of System File Protection, driver development has become slightly more painful in this area. To get Windows to allow you to replace the free files with checked ones, you must disable SFP and reboot with a debugger attached. The good news is that the kind folks at OSR have a tool that twiddles the registry keys for SFP automatically, and seems to work across lots of versions of the OS.
Once you have SFP disabled, simply back up ntoskrnl, hal, and whatever other binaries you’re replacing, and copy over the new ones. Keep a debugger attached to the system at all times, as things just won’t work right without it. ASSERT() macros will crash the system with a bugcheck if there is no debugger, which is seldom helpful. There’s always some idiotic driver (VMWare, are you listening?) that ASSERTs in ntio during boot-up, so you’ll need the debugger to dismiss the assert.
I find that developing and testing very early on with a checked build can be a big help in preventing the introduction of bugs in your drivers. And, what’s more, the newer you are at driver development, the bigger the payoff.
Note that you don’t have to overwrite ntoskrnl.exe and hal.dll; boot.ini is happy to let you specify an alternate kernel and HAL. Just extract the checked versions to %SystemRoot%\System32\ntoskrnl.chk and %SystemRoot%\System32\hal.chk and add "/kernel=ntoskrnl.chk /hal=hal.chk" to your boot.ini. You can find more detailed info under "Installing Just the Checked Operating System and HAL" in the DDK.
By doing it this way you have an easy way to back out if you happened to grab the wrong checked images (which happens to me a lot nowadays with all of the hotfixes).
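As a rough illustration, a boot.ini set up this way might look something like the following. The ARC path, the entry names, the timeout, and the serial debug settings are assumptions for the sake of the example; keep whatever your existing boot.ini already uses and only add the second entry.

```
[boot loader]
timeout=30
default=multi(0)disk(0)rdisk(0)partition(1)\WINDOWS

[operating systems]
multi(0)disk(0)rdisk(0)partition(1)\WINDOWS="Free build" /fastdetect
multi(0)disk(0)rdisk(0)partition(1)\WINDOWS="Checked kernel and HAL" /fastdetect /kernel=ntoskrnl.chk /hal=hal.chk /debug /debugport=com1 /baudrate=115200
```

Leaving the free entry as the default means a bad set of checked images costs you nothing more than picking the other line in the boot menu.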
-scott
OSR
You know, I completely forgot to point to that article, but it is a great one. I believe it appeared in the 50+ page NT Insider I got last month too. The whole thing is totally worth a read.
I’m just too lazy (for whatever reason) to set up a boot.ini, but as I think about it, it’s like 15 seconds of extra work, so that’s probably short-sighted on my part.
Thanks for the feedback!
Yes, VMWare drivers suck… They do a VERY STUPID thing: they fill the unused dispatch table entries with NULL.