My Blog is Moving

Hi folks – It’s that time again. Time to move my blog. Currently, my blog is hosted on the MSMVPs site. It’s been a good home overall, but lately I’ve decided that I really want to take blogging more seriously, and the only way to do that is to do it right. The previous host had some stability issues, but I was happy there. Now my blog is on my own domain, http://vcsjones.com. For now I will continue to cross-post to my old blog, but if you have RSS feeds or bookmarks pointing to my blog – please update them!

Writing a Managed Internet Explorer Extension: Part 5 – Working with the DOM

Internet Explorer is known for having a quirky rendering engine. Most web developers are familiar with the concept of a rendering engine: most know that Firefox uses Gecko, and Chrome / Safari use WebKit. WebKit itself has an interesting history, originally forked from the KHTML project by Apple. When pressed, however, not many can name Internet Explorer’s engine. Most browsers also indicate their rendering engine in their User Agent; for example, my current Chrome one is “Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US) AppleWebKit/534.7 (KHTML, like Gecko) Chrome/7.0.517.44 Safari/534.7”. Not as many web developers could name Internet Explorer’s; it was simply referred to as “Internet Explorer”. The actual name of IE’s rendering engine is Trident. It’s been part of Internet Explorer since 4.0 – it was just deeply integrated into Internet Explorer. At its heart, Trident lives in the mshtml.dll and shdocvw.dll libraries in the system32 directory. Earlier in this series, you referenced these libraries as a COM type library.

When accessing IE’s DOM from a BHO, it’s in some regards very similar to doing it from JavaScript. It has the oh-so-familiar getElementById, and the rest of the gang. You’re also constrained, like JavaScript, by the minimum version of IE you plan to support with your BHO. If your BHO is going to be commercial, it isn’t unreasonable to still support IE6. In many respects, you will be using OLE Automation to manipulate the DOM.

As with JavaScript, it’s desirable to know what version of IE you are working against. Many JavaScript developers will tell you it’s poor practice to code against versions of a browser; rather, test whether the feature is available in the browser. That keeps the JavaScript agnostic to the browser. However, we know we are coding against IE only. I have no strong recommendation one way or the other, but I’ll show you both. This is probably the simplest way to get IE’s version:

var version = Process.GetCurrentProcess().MainModule.FileVersionInfo;

That provides a plethora of information about IE’s version. The ProductMajorPart will tell you if it’s 6, 7, or 8. There are many other details in there – it can tell you if it’s a debug build, the service pack, etc. You may have surmised that if JavaScript can do it, then we can do it the same way JavaScript does, using the appVersion property. Before you start going crazy looking for it on the IWebBrowser2 interface though – I’ll tell you it’s not there. Nor is it on any of the HTMLDocument interfaces. It has its own special interface, called IOmNavigator. That interface is defined in mshtml.dll – since you have already referenced that type library, you should already have access to it – but how do you get an instance of that thing?

It isn’t difficult, but this is where the interface complexity has its disadvantages. IOmNavigator hangs off the window, and the IHTMLDocument2 interface can provide a path to the window.

var document = (IHTMLDocument2) _webBrowser2;
var appVersion = document.parentWindow.navigator.appVersion;
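
Putting either value to use is then a simple comparison. Here is a minimal sketch of a version gate built on the FileVersionInfo snippet above (this assumes the BHO is loaded into the IE process, so MainModule is iexplore.exe):

// Gate a feature on the browser version; inside a BHO, the current
// process is Internet Explorer itself, so its file version is IE's.
var version = Process.GetCurrentProcess().MainModule.FileVersionInfo;
if (version.ProductMajorPart >= 7)
{
    // IE 7 or later.
}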

However, if we wanted to do the right thing and test for feature availability rather than relying on version numbers, how do we do that?

The most straightforward is determining which interfaces an object supports. Most of your DOM work is going to be done through the Document property of IWebBrowser2. This is of type HTMLDocument, but there are several different interfaces available. Every time a change was made to the Document API, a new interface was created to maintain backward compatibility. (Remember, COM uses interface querying, so it makes more sense in that respect.)

In .NET we can do something similar using the “is” keyword.

private void _webBrowser2Events_DocumentComplete(object pdisp, ref object url)
{
    if (!ReferenceEquals(pdisp, _pUnkSite))
    {
        return;
    }
    if (_pUnkSite.Document is IHTMLDocument5)
    {
        //IHTMLDocument5 was introduced in IE6, so we are at least IE6
    }
}

There are several IHTMLDocumentX interfaces, currently up to IHTMLDocument7, which is part of the IE9 Beta.

WAIT! Where is IHTMLDocument6?

The MSDN Documentation for IHTMLDocument6 says it’s there for IE 8. Yet there is a good chance you won’t see it even if you have IE 8 installed.

This is a downside of the automatically generated COM wrapper. If you look at the reference named MSHTML and view its properties, you’ll notice that its Path is actually in the GAC, something like this: C:\Windows\assembly\GAC\Microsoft.mshtml\7.0.3300.0__b03f5f7f11d50a3a\Microsoft.mshtml.dll

Microsoft shipped a GAC’ed version of this COM wrapper, which is used within the .NET Framework itself. However, the one in the GAC is sorely out of date, and we can’t take that assembly out of the GAC without risking a lot of problems.

What to do?

We are going to manually generate a COM wrapper around MSHTML without the Add Reference dialog. Pop open the Visual Studio 2010 Command Prompt. The tool we will be using, tlbimp, is part of the .NET Framework SDK.

The resulting command should look something like this:

tlbimp.exe /out:mshtml.dll /keyfile:key.snk /machine:X86 mshtml.tlb

This will generate a new COM wrapper explicitly and write it out to mshtml.dll in the current working directory. The /keyfile switch is important – the wrapper should be strong-name signed, and you should already have a strong name key since one is required for regasm. mshtml.tlb is a type library found in your system32 directory. This newly generated assembly will contain the IHTMLDocument6 interface, as we expect. If you have the IE 9 beta installed, you will see IHTMLDocument7 as well. NOTE: This is a pretty hefty type library. It might take a few minutes to generate the COM wrapper. Patience.
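
If you don’t already have a strong name key, the Strong Name tool from the same SDK can generate one – the file name key.snk just has to match what you pass to /keyfile:

sn.exe -k key.snk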

If you are happy just being able to access the DOM using IE 6’s interfaces, then I wouldn’t bother with this. There are advantages to using the one in the GAC (smaller distributable, etc).

In summary, you have two different means of detecting a browser’s features: checking the browser’s version, or testing whether an interface is implemented. I would personally recommend testing against interfaces, because there is always a tiny chance that Microsoft may remove functionality in a future version. That’s doubtful for the IHTMLDocument interfaces, but for other things it’s a reality.

Now that we have a way of knowing which APIs are at our disposal, we can manipulate the DOM however we see fit. There isn’t much to explain there – if you think it’s hard, it’s probably because it is. It’s no different than trying to do it in JavaScript.

This is an extremely resourceful page when trying to figure out which interface you should be using based on a markup tag: http://msdn.microsoft.com/en-us/library/aa741322(v=VS.85).aspx

Writing a Managed Internet Explorer Extension: Part 4–Debugging

Picking up where we left off with Writing a Managed Internet Explorer Extension, debugging is where I wanted to go next. I promise I’ll get to more “feature” level stuff, but when stuff goes wrong, and it will, you need to know how to use your toolset. .NET developers typically write some code and press F5 to see it work. When an exception occurs, the debugger, already attached, steps up to the plate and tells you everything that is wrong. When you write an Internet Explorer extension it isn’t as simple as that. You need to attach the debugger to an existing process, and even then it won’t treat you like you’re used to. Notably, breakpoints aren’t going to launch the debugger until the debugger is already attached. So we have a few options, and some tricks up our sleeves, to get the debugger to aid us.

Explicit “Breakpoints”

The simplest way to emulate a breakpoint is to put the following code in there:

System.Diagnostics.Debugger.Break();

Think of that as a breakpoint that is baked into your code. One thing to note if you’ve never used it before is that the Break method has a [Conditional(“DEBUG”)] attribute on it – so it’ll only work if you are compiling in Debug. When this code gets hit, a fault will occur. It will ask you if you want to close, or attach a debugger. Now is your opportunity to say “I want a debugger!” and attach.

It’ll look like just a normal Internet Explorer crash, but if you probe at the details, “Problem Signature 09” will tell you if it’s a break. When working on a BHO, check this every time IE “crashes” – it’s very easy to forget that these are in there. It’s also important that you compile in Release mode when releasing to ensure none of these sneak out into the wild. The user isn’t going to look at the details and say, “Oh it’s just a breakpoint. I’ll attach and hit ‘continue’ and everything will be OK”. Once that’s done, choose Visual Studio as your debugger of choice (more on that later) and you should feel close to home.

This is by far one of the easiest ways to attach a debugger. The problem is that it requires a code change to get working, meaning you need to change the code, close all instances of IE, drop in the new DLL, restart Internet Explorer, and get it back into the state it was in. A suggestion would be to attach in SetSite when the site isn’t null. (That’s when the BHO is starting up – refresher here.) That way, your debugger is attached throughout the lifetime of the BHO. The disadvantage is that it gets intrusive if you like IE as just a browser. You can always disable the extension or run IE in Safe Mode when you want to use it as an actual browser. If you take this approach, I recommend using Debugger.Launch(). I’ll leave you to the MSDN documents for the details, but Launch won’t fault the application; it will skip straight to the “Which debugger do you want to use?” dialog.
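
Here is a minimal sketch of that approach. The shape follows the IObjectWithSite pattern from the earlier parts of this series; treat the exact signature as illustrative:

public void SetSite(object pUnkSite)
{
    if (pUnkSite != null)
    {
#if DEBUG
        // Offer to attach a debugger as soon as the BHO spins up.
        // Launch() goes straight to the debugger-selection dialog - no fault.
        System.Diagnostics.Debugger.Launch();
#endif
        // ... the normal SetSite work: grab IWebBrowser2, wire events, etc. ...
    }
}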

Attaching to an Existing Process

You can just as well attach to an existing process like you normally would, but there is one drawback: which process do I want to attach to? In IE 8 that can be a difficult question to answer. Each tab has its own process (a trend in new generation browsers – IE was the first to support it). You will have a minimum of two IE processes: one for each tab, and one for the actual instance of IE acting as a conductor for the other processes. Already, with just a single tab open, you have a 50/50 chance of getting it right if you guess. Visual Studio can give us some help though. If you pull up the Attach to Process dialog, you should see your two instances of IE. The “Type” column should give it away. We want the one with Managed code in it (after all, the title of this blog series is “Writing a Managed Extension”).

Once you’re attached, you can set regular breakpoints the normal way and they’ll get hit. Simple!

It isn’t quite as easy when you have multiple tabs open – sometimes that’s required when debugging a tricky issue. You have a few options here:

  1. When building a UI for your BHO (it’s a catch-22 – I know I haven’t gotten there yet), have it display the PID of the current process. That’s easy enough to do using the Process class – see the sketch after this list. You can dumb it down a little more and write a log file to a safe location (IE is picky about where BHOs can write on the file system – refresher here).
  2. Attach to all tab processes. That can lead to a lot of confusion about which tab you are currently in, because if you have two tabs open and a breakpoint gets hit – which tab did it? The Threads window should help you there if that is the route you choose.
  3. Always debug with a single tab, if you can.
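
Here’s the sketch for option 1 – a hypothetical helper whose result you can show in your UI or append to a log:

using System.Diagnostics;

public static class PidReporter
{
    // Returns the PID of the process hosting this BHO instance.
    // In IE 8, that is the tab's own process.
    public static int CurrentPid()
    {
        return Process.GetCurrentProcess().Id;
    }
}
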
Power Debugging

There is a trick you can do in Visual Studio to gain access to some additional debugging features. Hopefully this isn’t brand new material to everyone, but for some I suspect it is. If you manually choose what you want to attach to, include both Managed and Native code. Attaching to Native is very helpful if you are trying to debug a COM marshaling issue, among plenty of other things. We can also use SOS, the managed debugger extension, to diagnose issues at the CLR level, and even poke at the CLR’s native memory and objects. Once attached in Visual Studio with Native code, get to a breakpoint or pause, and launch the Immediate Window. Type .load sos and hit enter. If it worked, you should get a message like “extension C:\Windows\Microsoft.NET\Framework\v4.0.30319\sos.dll loaded”. There are many blogs out there about SOS (Son of Strike); I may blog on it later. Some useful commands are:

  • !help
    Pretty self explanatory. Shows you some available commands.
  • !dso (!DumpStackObjects)
    Dumps all CLR objects that are on the stack.
  • !dumpheap
    Dumps the entire heap. Careful! – that really means the entire heap. A more generally useful form is the -stat flag (!dumpheap -stat), which gives you general statistics of the heap: which objects are in memory, and how many of them there are. This is a useful starting point if you believe there is a memory leak – it can at least tell you what you are leaking.
  • !soe (!StopOnException)
    Again, I feel the name is pretty self explanatory. The usage is a little tricky for beginners. A simple example would be, “I want to stop whenever there is an OutOfMemoryException”. This is useful for some types of exceptions, and OOM is a good example: the problem with debugging an OOM in a purely managed way is that the CLR is dying by the time the exception happens, so you will get limited information by slapping a debugger on the managed code. For an OOM, a !dumpheap -stat is a good place to start. Other examples where this is useful are access violations (more common when doing marshaling or platform invoke), stack overflows, and thread aborts. The usage is !soe -create System.AccessViolationException.
  • !CLRStack
    This dumps the CLR’s stack only; the native stack is left out. It looks like the managed stacks you are used to, but it has some cool parameters. The -p parameter shows the values of the parameters that were passed in – often that will be the address of what was passed in, so use !DumpObject on the address to figure out exactly what it was. The -l flag dumps the locals in the frame, and -a dumps both parameters and locals.
  • !DumpStack
    This is like !CLRStack on steroids: it has the managed stack, but also the native stack. It’s useful if you use platform invoke. This command is best used outside of Visual Studio, in something like WinDbg – more on that further down.

That’s the tip of the iceberg though. The complete documentation on MSDN is here; it lists commands that !help doesn’t – so have a look. However, you’re not getting your money’s worth by doing this in Visual Studio. Visual Studio is great for managed debugging and using SOS, but when you want to use the native commands, such as !analyze, Visual Studio falls short. In addition, SOS is limited to the debugging functionality that Visual Studio provides it – you may often see a message like “Error during command: IDebugClient asked for unimplemented interface” because Visual Studio doesn’t fully implement the features that SOS is asking for.

Other debuggers, like WinDbg, are significantly more powerful, at the cost of not being as simple to use. If there is demand for further details, I’ll post them. Using WinDbg is fairly similar: once you are attached, run .load C:\Windows\Microsoft.NET\Framework64\v4.0.30319\sos.dll. In WinDbg, you need to specify the full path. In addition, you will want to get symbols from Microsoft’s Symbol Server. There are symbols for mscoree and mscorwks, and having them can significantly help diagnose native-to-managed (and vice-versa) transitions.

Happy Debugging!

A Really Super Light and Simple IoC Container for Windows Phone 7

I finally managed to get the Windows Phone 7 tools installed. I’m not going to vent about that anymore because I feel like I’ve done that enough already – and once they did install correctly they’ve been very pleasurable to use. I started working on an application, and old habits die hard. I like Inversion of Control; it’s a major need for me simply because I’ve forced my mind to work like that. I’ve previously worked with Unity and Windsor. Unity has grown on me a lot, and I like it. However, neither of them seems to work on Windows Phone 7, or at least they weren’t designed to. So I wrote my own really super simple IoC container for Windows Phone 7. I wanted the following features, and nothing else (for now):

  1. Able to register types such that the arguments of their constructors are resolved
  2. Able to register an interface and a type such that when the interface is resolved, the component is resolved along with its constructor arguments
  3. Assume that there is either a default constructor or a single constructor that takes parameters
  4. Everything has a singleton lifetime.

Super simple requirements. This will be running on phone hardware, so it needs to be lightweight, too. It fits in a single class of about 65 lines. I’ve split it into two different files using a partial class to keep the registration separate. I can imagine additional features that other people might want: a transient lifestyle, using WeakReferences if you are registering a lot of things, etc.
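
To give a feel for how little code those requirements take, here is a minimal sketch of the same idea – illustrative only, not the code from the Gist linked below:

using System;
using System.Collections.Generic;
using System.Linq;

public class TinyContainer
{
    private readonly Dictionary<Type, Type> _map = new Dictionary<Type, Type>();
    private readonly Dictionary<Type, object> _instances = new Dictionary<Type, object>();

    // Map an interface to the concrete type that satisfies it.
    public void Register<TContract, TImplementation>() where TImplementation : TContract
    {
        _map[typeof(TContract)] = typeof(TImplementation);
    }

    public T Resolve<T>()
    {
        return (T)Resolve(typeof(T));
    }

    private object Resolve(Type contract)
    {
        object existing;
        if (_instances.TryGetValue(contract, out existing))
        {
            return existing; // everything is a singleton
        }

        Type implementation;
        if (!_map.TryGetValue(contract, out implementation))
        {
            implementation = contract; // concrete types resolve to themselves
        }

        // Assume a default constructor or a single constructor with
        // parameters, and resolve each parameter recursively.
        var ctor = implementation.GetConstructors().First();
        var args = ctor.GetParameters().Select(p => Resolve(p.ParameterType)).ToArray();

        var instance = ctor.Invoke(args);
        _instances[contract] = instance;
        return instance;
    }
}

Registering an interface against a type and then resolving the interface builds the component once, resolving its constructor arguments along the way.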

For the source, see the Gist on GitHub.

My Netbook Vaio P running Chrome OS – The Experience

I have a small little netbook, a Vaio P. It’s a cool little laptop, and I took it in favor of a clunker laptop. It handles 95% of my needs, and it’s actually pretty fast: a 128 GB solid state drive and 2 GB of (non-upgradable) RAM. It ran Windows 7 perfectly, only hiccupping on performance when it came to video and medium-to-large Visual Studio solutions. But I didn’t get it for development, I got it for a browser and a chat client (this was pre-iPad days). I figured if all I used it for was that, most of the time just a browser, I owed it to myself to try installing Chrome OS.

I’d previously played with it in a virtual machine, and was very impressed by the boot time (it boots faster than the BIOS can do a POST). It’s borderline an instant-on operating system. That’s typically what I use the netbook for – open it up, use a browser for 30 seconds to figure something out, and then I am done. Under Windows, it’ll go into standby, but it ends up in hibernate pretty quickly because my battery desperately needs to be replaced. I’m impatient; I want it on now. That’s what Chrome OS seemed to offer.

Installing Chrome OS is not nearly as simple as installing another distribution. In fact, it has no installer. Going even further, there isn’t an official build of it yet. I ended up getting the Flow build from Hexxeh. Basically you are given an IMG file, which is a drive image. You have to image it to something; I used a USB key per the instructions. Having another Linux distribution is close to required – you need grub to boot, and an easy way to copy partitions (dd). First, in Windows 7, I used Computer Management to get rid of all of the partitions I didn’t need (like the recovery one, and the other one that Windows 7 likes to use for BitLocker purposes). I just used BCDEdit to make the main partition bootable. Doing this from Windows 7 is highly preferable because while GParted can resize an NTFS partition, it will often make Windows unhappy and require you to repair the boot loader. I ended up with a drive that had 7 GB free at the front, then the NTFS volume for Windows 7. I shrunk the Windows 7 partition by 3 GB and left those 3 GB unallocated. I installed Ubuntu on the 7 GB space at the front (which works great by the way, more on that later), and the other 3 GB would be for Chrome. I won’t go into the details of installing Chrome because I think the documentation is pretty good, but if you fear spending an angering hour with grub when something goes wrong, or don’t want to risk destroying all of your partitions, this may not be for you.

In the end, I got everything working correctly. The first thing you will notice is that Chrome OS requires an internet connection to log in – but you need to log in to configure WiFi. Fortunately, I have the Ethernet / VGA adapter for it, so I just plugged in, logged in, then configured WiFi. Alternatively, you could use the username “facepunch” and the password “facepunch” to log in and configure your WiFi… which leads me to my first point. Hexxeh, as he goes by, is a 17-year-old kid. A brilliant one at that. However, I have reservations about the safety of these distributions. I’m not doubting him, but he could, like any other person including me, make a mistake. Or he could be doing worse and harvesting passwords. The 17-year-old me would find “facepunch” funny (and the current me does, too) but it sends a bit of a mixed message.

After getting logged in, you’ll go through a few EULAs, all of which are for Java. After that, you’re done! Unfortunately, that’s as far as I got, really. It was dirt slow – actually unusable. This isn’t Hexxeh’s fault either. I’d put most of the blame, if not all of it, on the lack of proper video drivers for the Intel GMA 500. If you’ve installed a Linux distribution on your Vaio P, you know you have to do a few magic tricks to get your video card working at something other than awful. With Chrome OS, that isn’t really an option due to the partition layout. It was disappointing – quite a bit of work for a net loss. When Chrome OS goes live, hopefully I can try again. This little laptop will always have a home in my tech closet.

On a related topic, Ubuntu 10.10 works great on it. There is one trick to getting it installed: the only way I could get the installer to work correctly was from a NetInstall using UNetbootin. Booting directly from a USB key resulted in the error “(initramfs) Unable to find a medium containing a live file system.” Basically, even though it had just booted from USB, it couldn’t read from it. On top of that, once I was booted into the setup, it couldn’t figure out my WiFi, so everything had to be downloaded over LAN. Once I got that far, everything went smoothly. Take the time to straighten out the video drivers and you’ll be happy. My feeling on performance was basically “Ubuntu boots faster than Windows 7 but runs noticeably slower once booted.” I’m sure there are a lot more tweaks I can do to get this working better. We’ll see! I know, “real” Linux guys will tell you, “If you are after performance, why the hell did you install Ubuntu?” Yeah, I know. Maybe I’ll try another one later.

The verdict on Chrome OS is don’t bother with Hexxeh’s flow build. Wait for a newer one. Ubuntu is worth a shot though.

The vCard

My MVC site will have a little bit of social networking in it. One of the requirements is to be able to export a contact as a vCard. vCard is an exchange format used like an electronic business card. It’s the format used by Outlook, Thunderbird, and even Apple’s iPhone when exchanging contacts. It’s an open standard; the latest version is 3.0, specified in RFC 2426. It’s been around for a while, too – the 3.0 specification was put out in 1998. It’s been 12 years, and not much has changed with the vCard. This specification was put out before XML came to true fruition, so it’s got a bit of an odd format. Here is an example of a vCard, with the bare minimum specified.

BEGIN:VCARD
VERSION:3.0
N:Jones;Kevin;George
FN:Kevin Jones
REV:2010-08-01T07-24-04Z-04:00
END:VCARD

It’s a set of “types” and values that are delimited by a colon. Optional parameters can be specified with the type as well, such as an indicator for special encoding rules – for example, TEL;TYPE=work:+1-202-555-0100 is a TEL type carrying a TYPE=work parameter. The types can be looked up in the documentation, but these are:

  • N: The name components.
  • FN: The full name. Possibly could be considered the display name.
  • REV: The revision. At minimum, the date portion must be specified, such as “2010-08-01”. The time and offset portions are optional. The format requires that dates are padded with zeros, like “08” instead of “8”.
  • BEGIN, END: Pretty obvious. The card must begin and end with these values.
  • VERSION: Also obvious. The version of the specification you are sticking to. See the RFC for notes on differences between 3.0 and 2.1.

My implementation sticks to the RFC’s design as closely as possible, and tries to implement all features supported by vCard, including embedding binary objects (like a picture or sound clip). There are two important classes: the VCard class and the RFC2426VCardFormatter. I used the formatting engine built into .NET since people are familiar with it, and we can implement some cool features that way. The VCard class is the card itself, and RFC2426VCardFormatter implements IFormatProvider and ICustomFormatter and converts a VCard instance to a string. Here is a sample:

VCard card = new VCard
{
    NameFamily = "Jones",
    NameGiven = "Kevin",
    NameMiddle = "George",
    NameFormatted = "Kevin Jones",
    JobTitle = "Team Lead",
    JobCompany = "Thycotic Software Ltd",
    JobRole = "Programmer",
    DateBirth = new DateTimeOffset(1987, 8, 7, 4, 30, 0, new TimeSpan(-5, 0, 0)),
    NameSort = "Jones",
    NameNickname = "Kev",
    Url = "http://www.thycotic.com/",
    Mailer = "Outlook 2010",
    Note = "Is a connoisseur of cheese." + Environment.NewLine + "And poor speller."
};
Console.Out.WriteLine(card);

The output will be:

BEGIN:VCARD
VERSION:3.0
N:Jones;Kevin;George
FN:Kevin Jones
SORT-STRING:Jones
NICKNAME:Kev
BDAY:1987-08-07T04-30-00Z-05:00
TITLE:Team Lead
ORG:Thycotic Software Ltd
ROLE:Programmer
MAILER:Outlook 2010
NOTE:Is a connoisseur of cheese.\nAnd poor speller.
URL:http://www.thycotic.com/
REV:2010-08-01T09-21-16Z-04:00
END:VCARD

The formatting engine also supports emitting just portions of the vCard object. Changing the Console.Out.WriteLine to

Console.Out.WriteLine("{0:BEGIN;VERSION;N;FN;REV;END}", card);

will only output the BEGIN, VERSION, N, FN, REV, and END portions of the vCard. A link to the code library is here. It’s not 100% implemented yet, but it supports all of the features that Outlook and most other mailing software are going to care about, including the PHOTO type, addresses, emails, phone numbers, etc. The “AGENT” type may be the only one I never get around to implementing. See the XML comments for additional usage and functionality. Let me know if there are any bugs!
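
For the curious, that format-string hookup is the standard .NET composite-formatting pattern: WriteLine hands the text between “{0:” and “}” to the object, which can render only the requested types. A bare-bones sketch of the shape – the library’s actual plumbing may differ:

using System;

// Sketch: composite formatting ("{0:...}") calls IFormattable.ToString,
// passing the format string, which can name the vCard types to emit.
public class VCardLikeSketch : IFormattable
{
    public string ToString(string format, IFormatProvider formatProvider)
    {
        // 'format' is the text between "{0:" and "}", e.g. "BEGIN;VERSION;N".
        string[] requestedTypes = (format ?? "ALL").Split(';');
        // A real formatter would look up and render each requested type;
        // here we just echo them back to show the plumbing.
        return string.Join(Environment.NewLine, requestedTypes);
    }

    public override string ToString()
    {
        return ToString(null, null);
    }
}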

NOTE: If your browser insists that the ZIP file is a TGZ extension, just change it back to ZIP. Not sure why the MIME type is wrong.

The Power Struggle of FilterAttribute

I’ve been doing a lot of MVC2 work lately, and have been indescribably thrilled with how easy it is to write clean code with it (or at least what I consider clean code). Being able to unit test my controllers and have separation from everything else is like magic. OK, maybe I am a little late to this ballgame. I discovered a very cool feature of MVC2: the FilterAttribute. When reading documentation about how to ensure controller actions could only be run if the user was authenticated, I naturally came to the AuthorizeAttribute. It was that simple! I read the documentation, and to my delight it is extensible, letting you make your own FilterAttributes. It becomes even more powerful when you put the IAuthorizationFilter interface on your attribute, too. Now I can preemptively short-circuit the execution of an action.

I wanted an attribute that would allow me to say, “Hey, if you are already logged in, just go here instead.” It doesn’t make sense to show a Sign Up page if the user is already logged in, just take them Home. Here is what I ended up with:

public class RedirectToActionIfAuthenticatedAttribute : FilterAttribute, IAuthorizationFilter
{
    private readonly string _controller;
    private readonly string _action;

    public RedirectToActionIfAuthenticatedAttribute(string actionName, string controllerName)
    {
        _controller = controllerName;
        _action = actionName;
    }

    public void OnAuthorization(AuthorizationContext filterContext)
    {
        var authenticationService = IoC.Resolve<IAuthenticationService>();
        if (authenticationService.GetCurrentUser().HasValue)
        {
            filterContext.Result = new RedirectToRouteResult(
                new RouteValueDictionary
                {
                    {"controller", _controller},
                    {"action", _action}
                });
        }
    }
}

The implementation is easy: the OnAuthorization method comes from IAuthorizationFilter. It seems a bit odd to be using this for things that aren’t purely authorization related, but the built-in attributes in the MVC2 kit also use it this way, so I relaxed a bit. At this point, filterContext has a property called Result. If you leave it null, the attribute has no effect. If you set it to something by the time OnAuthorization exits, that result trumps the execution of your controller action. In this case, I am assuming you are using the out-of-the-box default route and populating the controller and action.

It has a verbose name, but I tend to like these kinds of names. Regardless, I can now throw this attribute on controller actions to redirect them wherever I want if the user is already signed in. Its usage is like so:

[RedirectToActionIfAuthenticated("MyAccount", "Home")]
public ActionResult SignUp()
{
    return View();
}

If a user tries to view the SignUp view and they are already authenticated, then just take them to the MyAccount action, which in this case is a view, on the Home controller.

At this point, there was a slew of applications I could think of for this kind of attribute. However…

Is it Metadata Abuse?

I was debating, and leaning towards no. One of the things that always came back to mind is that attributes are just metadata – or so I was taught originally. I can easily see this being taken to extents that exceed its purpose. I only need this attribute twice (so far) in my code base: the SignUp and LogOn views. It’s not a security thing, more of a usability thing, so I am not worried about putting them on the HttpPost actions – just the HttpGet for now. Would it be more correct to just make the decision inside of the action itself? For some applications, like the AuthorizeAttribute, I can easily see the need. If you have a few dozen controller actions – and you probably do – duplicating the authentication logic that many times would draw big criticism. The other part is, I can test the attribute using my favorite testing tools, but testing the controller is now a bit different – do I test that the controller has the attribute? Can I test it without it being an integration test? A test that just asserts an attribute is present doesn’t give much meaning. I know the attribute is there – I can see it.
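
For what it’s worth, such a presence test is plain reflection. A sketch, assuming NUnit and a hypothetical AccountController:

using System.Reflection;
using NUnit.Framework;

[TestFixture]
public class AccountControllerAttributeTests
{
    [Test]
    public void SignUp_is_marked_with_the_redirect_attribute()
    {
        // Hypothetical controller and action names; swap in your own.
        MethodInfo action = typeof(AccountController).GetMethod("SignUp");
        object[] attributes = action.GetCustomAttributes(
            typeof(RedirectToActionIfAuthenticatedAttribute), false);
        // This proves the attribute is applied - it says nothing about its behavior.
        Assert.AreEqual(1, attributes.Length);
    }
}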

I’m still not 110% sure of my “style” of using MVC2 and a pattern I can stick to. I really like that MVC2 has all of the extension points I need. There is a popular saying: “Just because you can, doesn’t mean you should”.

Fading Controls with Mouse Movement in WPF

This is an off-topic post from my IE Extension Writing series (which I am working on, I promise!). I was playing with a WPF app; it’s a simple photo viewer. I wanted the UI to be “control-less” and only show the picture. However, I also wanted some user interface elements. I decided to take the approach of using controls that sit over top of the image, fade in when there is mouse movement, and fade out when the mouse is idle – sort of like how Windows Media Player works when viewing a video in full screen.

It’s pretty quick, but it might save someone some time. This can be done easily and purely in XAML using Event Triggers. Here is the markup I used:

<EventTrigger RoutedEvent="Mouse.MouseMove">
    <BeginStoryboard HandoffBehavior="Compose">
        <Storyboard AutoReverse="False">
            <DoubleAnimation To="1" Storyboard.TargetName="ScaleSlider" Storyboard.TargetProperty="Opacity" Duration="0:0:0.1875" />
            <DoubleAnimation To="0" Storyboard.TargetName="ScaleSlider" Storyboard.TargetProperty="Opacity" BeginTime="0:0:3.0" Duration="0:0:0.1875" />
        </Storyboard>
    </BeginStoryboard>
</EventTrigger>

This EventTrigger goes in Window.Triggers. ScaleSlider is a slider control on the current window; you can use whatever control you’d like. When the mouse moves, the control fades in, and it fades out after 3 seconds of the mouse being idle.

This is a quick and dirty app, but it should work for most people. The caveat to this simple example is that all you are doing is hiding the control: users can still interact with it even though it isn’t visible, say through the keyboard. My workaround for this is to bind the IsEnabled of the control to its own Opacity. Here is the markup for my slider:

<Slider Orientation="Vertical" Margin="0, 20, 30, 20" Maximum="5" Minimum="0.05" Value="1" TickFrequency="0.3"
        Name="ScaleSlider" HorizontalAlignment="Right" Opacity="0" TickPlacement="BottomRight" ValueChanged="ScaleSlider_ValueChanged"
        IsEnabled="{Binding Mode=OneWay, RelativeSource={x:Static RelativeSource.Self}, Path=Opacity}" />

This works nicely; the slider disables itself when the Opacity is zero.

Writing a Managed Internet Explorer Extension: Part 3

I’m debating where to take this little series, and I think I am at a point where we need to start explaining Internet Explorer, and why writing these things can be a bit tricky. I don’t want to write a blog series where people are blindly copying and pasting code and not knowing what IE is doing.

I am not a professional at it, but I’ve written browser extensions for most popular browsers: IE, Chrome, Firefox, and Safari. In terms of difficulty, IE takes it. That’s probably why there isn’t a big extension community for IE. Let’s go in the Way Back Machine…

At its pinnacle, IE 5 and IE 6 were used by 95% of web surfers. If you are a developer, you probably hear a lot of criticism of IE 6, and rightly so. Back then, IE supported a plug-in model with that notorious name: ActiveX. It was criticized for allowing web pages to just run arbitrary code. Of course, all of that changed, and now IE really gets in your face before one of those things runs. In fact, ActiveX is one of the reasons why intranet apps still require IE 6. Regardless, the message to Microsoft was clear: we need security!

Security was addressed in IE 7, and even more so in IE 8 with the help of Windows Vista and Windows 7.

Hopefully by now you’ve had the opportunity to play around with writing IE add-ons, but you may have noticed some odd behavior, such as when accessing the file system.

UAC / Integrity Access

UAC (User Account Control) was introduced in Windows Vista. There was a lot of noise over it, but it does make things more secure, even if that lousy dialog is turned off – it’s just transparent to the user then. The purpose of UAC is the principle of least privilege: don’t give a program access to a securable object, like a file, unless it needs access to it. Even if your application will never touch a specific file, another application might figure out a way to exploit your application into doing dirty deeds for it. UAC provides a mechanism for temporarily granting access to securable objects that the application would normally not have permission to. UAC introduced the concepts of Elevated and Normal. Normal is what the user operates under until a UAC prompt shows up.

Those two names are just used on the surface though… there are actually three integrity access levels. Aptly named, they are Low, Medium, and High. Medium is Normal, and High is Elevated.

IE is a program that runs at Low by default. Low applies to threads and process tokens just like the other levels. In theory, you could run your own application at Low. Low has its own SID: “S-1-16-4096”. If we start a process using this SID, it will be low integrity. You can see this article for a chunk of code that does that. It’s hard to do in managed code and requires a good amount of platform invoke. You can also use this technique with threads.
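
As a quick way to see integrity levels in action (purely an observation aid, not part of the BHO), a command prompt can print the mandatory label of its own token:

whoami /groups | findstr /i "mandatory"

A normal Vista/7 console shows the Medium mandatory level; run the same thing from a low-integrity process and it reports Low.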

Ultimately, Low mode has some really hard-core security limitations. You have no access to the file system, except a few useful places:

  • %USERPROFILE%\Local Settings\Temporary Internet Files\Low
  • %USERPROFILE%\Local Settings\Temp\Low
  • %USERPROFILE%\AppData\LocalLow
  • %USERPROFILE%\Cookies\Low
  • %USERPROFILE%\Favorites\Low
  • %USERPROFILE%\History\Low

That’s it. No user documents, nada. Some of those directories may not even exist if a Low process hasn’t attempted to create them yet. If your extension is only going to store settings, I recommend putting them in %USERPROFILE%\AppData\LocalLow. This directory only exists in Windows Vista and up. Windows XP has no UAC and no Protected Mode, so you are free to do as you please on Windows XP!

To determine the path of LocalLow, I use the code below. A domain policy might move it elsewhere, or it might change in a future version of Windows:

using System;
using System.Runtime.InteropServices;
using System.Threading;

public static class LocalLowDirectoryProvider
{
    // S_OK, as returned by SHGetKnownFolderPath.
    private const uint S_OK = 0;

    private static readonly Lazy<string> _lazyLocalLowDirectory = new Lazy<string>(LazyGetLocalLowDirectory, LazyThreadSafetyMode.ExecutionAndPublication);

    public static string LocalLowDirectory
    {
        get
        {
            return _lazyLocalLowDirectory.Value;
        }
    }

    private static string LazyGetLocalLowDirectory()
    {
        // Make sure SHGetKnownFolderPath actually exists (Vista and up).
        var shell32Handle = LoadLibrary("shell32.dll");
        try
        {
            var procAddress = GetProcAddress(shell32Handle, "SHGetKnownFolderPath");
            if (procAddress == IntPtr.Zero)
            {
                return null;
            }
        }
        finally
        {
            FreeLibrary(shell32Handle);
        }
        var localLowSavePath = IntPtr.Zero;
        try
        {
            // The GUID is FOLDERID_LocalAppDataLow.
            if (SHGetKnownFolderPath(new Guid("A520A1A4-1780-4FF6-BD18-167343C5AF16"), 0, IntPtr.Zero, out localLowSavePath) != S_OK)
            {
                return null;
            }
            return Marshal.PtrToStringUni(localLowSavePath);
        }
        finally
        {
            if (localLowSavePath != IntPtr.Zero)
            {
                Marshal.FreeCoTaskMem(localLowSavePath);
            }
        }
    }

    [DllImport("shell32.dll", CallingConvention = CallingConvention.StdCall, EntryPoint = "SHGetKnownFolderPath")]
    private static extern uint SHGetKnownFolderPath([MarshalAs(UnmanagedType.LPStruct)] Guid rfid, uint dwFlags, IntPtr hToken, out IntPtr pszPath);

    [DllImport("kernel32.dll", CallingConvention = CallingConvention.StdCall, EntryPoint = "GetProcAddress", CharSet = CharSet.Ansi)]
    private static extern IntPtr GetProcAddress([In] IntPtr hModule, [In, MarshalAs(UnmanagedType.LPStr)] string lpProcName);

    [DllImport("kernel32.dll", CallingConvention = CallingConvention.StdCall, EntryPoint = "LoadLibrary", CharSet = CharSet.Auto)]
    private static extern IntPtr LoadLibrary([In, MarshalAs(UnmanagedType.LPTStr)] string lpFileName);

    [DllImport("kernel32.dll", CallingConvention = CallingConvention.StdCall, EntryPoint = "FreeLibrary")]
    private static extern IntPtr FreeLibrary([In] IntPtr hModule);
}

It returns null if there is no LocalLow directory. The Lazy&lt;T&gt; class provides thread-safe caching of the value, which will never change (at least it shouldn’t).

However, if you need to access the file system outside of one of these white listed directories, you have a couple of options later on down our journey:

  1. Use IE’s built in Open File and Save File dialogs. They will give you access to the file.
  2. Use a broker process / COM server. We’ll discuss this one later.

Loose Coupling

This tends to trip up managed developers. Starting with IE 8, each tab is its own process. That breaks things developers get comfortable with, like the fact that a static / shared variable is now unique per tab. That was one of the design goals of decoupling tabs – they can only talk to each other through securable means, like RPC. Even IE 7, which does not have a process per tab, still isolates the BHO instances from one another. As far as the BHO knows, a tab is a window.

Every time a new tab is opened, that tab gets its own instance of the BHO. This was originally done to keep IE 7 as backward compatible with BHOs as possible. In IE 6, each window was its own process, and BHOs got comfortable assuming there would only be one instance of themselves running. This loose coupling also changes how dialogs might be shown from a BHO. We’ll get into that when we discuss UI design and interaction.

In Part 4, we will get back to making a BHO do useful things. I just felt I had to get this off my chest.

Writing a Managed Internet Explorer Extension: Part 2.5

When we last discussed wiring events in part 2, we covered how events work, how to wire them, and more importantly, how to unwire them. I also mentioned that we could use attachEvent and detachEvent rather than the events on interfaces. This is useful if you don’t know what type of element you are attaching an event to.

attachEvent and detachEvent

attachEvent is part of the IHTMLElement2 interface, and fortunately all elements and tags implement this interface, so long as you are targeting Internet Explorer 5.0+. attachEvent takes two parameters: a string indicating which event to attach to, and the handler itself. Its signature looks like this:

attachEvent(
    BSTR event,              //in
    IDispatch *pDisp,        //in
    VARIANT_BOOL *pfResult   //out, retval
);

The IDispatch is the event handler. Unfortunately this means it isn’t as simple as passing a delegate to it and “it just works”. We need to implement a class that will marshal the handler to COM correctly.

Also note that, like in Part 2, we need to call detachEvent to stop memory from leaking. Rather than implement the IDispatch interface ourselves, we can use the IReflect interface to give the COM marshaler all of the help it needs. When IDispatch (in this case) invokes the event, it will use the name “[DISPID=0]”. We handle that in the InvokeMember implementation of IReflect; everything else we just pass through to our own type. The nice thing about this approach is that it uses a common delegate style we are probably already familiar with, like EventHandler. In this case, I’ve called the class EventProxy.

using System;
using System.Globalization;
using System.Reflection;
using mshtml; // namespace of the generated MSHTML COM wrapper; yours may differ

public class EventProxy : IReflect
{
    private readonly string _eventName;
    private readonly IHTMLElement2 _target;
    private readonly Action<CEventObj> _eventHandler;
    private readonly Type _type;

    public EventProxy(string eventName, IHTMLElement2 target, Action<CEventObj> eventHandler)
    {
        _eventName = eventName;
        _target = target;
        _eventHandler = eventHandler;
        _type = typeof(EventProxy);
    }

    public IHTMLElement2 Target
    {
        get { return _target; }
    }

    public string EventName
    {
        get { return _eventName; }
    }

    public void OnHtmlEvent(object o)
    {
        InvokeClrEvent((CEventObj)o);
    }

    private void InvokeClrEvent(CEventObj o)
    {
        if (_eventHandler != null)
        {
            _eventHandler(o);
        }
    }

    public MethodInfo GetMethod(string name, BindingFlags bindingAttr, Binder binder, Type[] types, ParameterModifier[] modifiers)
    {
        return _type.GetMethod(name, bindingAttr, binder, types, modifiers);
    }

    public MethodInfo GetMethod(string name, BindingFlags bindingAttr)
    {
        return _type.GetMethod(name, bindingAttr);
    }

    public MethodInfo[] GetMethods(BindingFlags bindingAttr)
    {
        return _type.GetMethods(bindingAttr);
    }

    public FieldInfo GetField(string name, BindingFlags bindingAttr)
    {
        return _type.GetField(name, bindingAttr);
    }

    public FieldInfo[] GetFields(BindingFlags bindingAttr)
    {
        return _type.GetFields(bindingAttr);
    }

    public PropertyInfo GetProperty(string name, BindingFlags bindingAttr)
    {
        return _type.GetProperty(name, bindingAttr);
    }

    public PropertyInfo GetProperty(string name, BindingFlags bindingAttr, Binder binder, Type returnType, Type[] types, ParameterModifier[] modifiers)
    {
        return _type.GetProperty(name, bindingAttr, binder, returnType, types, modifiers);
    }

    public PropertyInfo[] GetProperties(BindingFlags bindingAttr)
    {
        return _type.GetProperties(bindingAttr);
    }

    public MemberInfo[] GetMember(string name, BindingFlags bindingAttr)
    {
        return _type.GetMember(name, bindingAttr);
    }

    public MemberInfo[] GetMembers(BindingFlags bindingAttr)
    {
        return _type.GetMembers(bindingAttr);
    }

    public object InvokeMember(string name, BindingFlags invokeAttr, Binder binder, object target, object[] args, ParameterModifier[] modifiers, CultureInfo culture, string[] namedParameters)
    {
        // "[DISPID=0]" is how IDispatch invokes the default member - our event callback.
        if (name == "[DISPID=0]")
        {
            OnHtmlEvent(args == null ? null : args.Length == 0 ? null : args[0]);
            return null;
        }
        return _type.InvokeMember(name, invokeAttr, binder, target, args, modifiers, culture, namedParameters);
    }

    public Type UnderlyingSystemType
    {
        get { return _type.UnderlyingSystemType; }
    }
}

Instances of EventProxy can be handed as the IDispatch to attachEvent. We would use it like so:

private void _webBrowser2Events_DocumentComplete(object pdisp, ref object url)
{
    var document = (HTMLDocument)_webBrowser2.Document;
    var htmlElements = document.getElementsByTagName("input").OfType<IHTMLElement2>();
    foreach (var htmlElement in htmlElements)
    {
        Action<CEventObj> handler = s => MessageBox.Show("Clicked: " + s.srcElement.id);
        var proxy = new EventProxy("onclick", htmlElement, handler);
        htmlElement.attachEvent("onclick", proxy);
    }
}

Like in part 2, we need to keep a running collection of all of the events that we attach, then call detachEvent on BeforeNavigate2. I’ll leave the full details to you – at the end of the series I will post the entire working solution as a VS 2010 project – but a sketch is below.
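
A minimal sketch of that bookkeeping, building on the EventProxy class above (the member names are illustrative, and List&lt;T&gt; needs System.Collections.Generic):

private readonly List<EventProxy> _attachedProxies = new List<EventProxy>();

private void AttachClickHandler(IHTMLElement2 element, Action<CEventObj> handler)
{
    var proxy = new EventProxy("onclick", element, handler);
    element.attachEvent("onclick", proxy);
    _attachedProxies.Add(proxy); // remember it so we can detach later
}

private void _webBrowser2Events_BeforeNavigate2(object pDisp, ref object url, ref object flags, ref object targetFrameName, ref object postData, ref object headers, ref bool cancel)
{
    // Detach everything we attached on the page we are leaving.
    foreach (var proxy in _attachedProxies)
    {
        proxy.Target.detachEvent(proxy.EventName, proxy);
    }
    _attachedProxies.Clear();
}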

DocumentComplete Fired Multiple Times

If you use the code from above or from part two, you may notice that clicking an input element causes the dialog to show several times. That is because DocumentComplete is being called more than once. DocumentComplete fires whenever any document completes, not just the top-level one, so content from an &lt;iframe&gt; will cause DocumentComplete to fire again. Sometimes this behavior is desirable, but in this case it is not. How do we ensure it’s only handled once, for the main document?

DocumentComplete gives us two things: the URL of the document that was loaded, and a pointer to a dispatch object. Simply put, if the pdisp is the same reference as the IWebBrowser2 instance you set up from SetSite, then it’s the “root” document. This ensures the handler only runs once, when the main document is complete:

if (!ReferenceEquals(pdisp, _webBrowser2))
{
    return;
}

As I mentioned in part 2, part 3 will be about manipulating the DOM. I just wanted to cover this first.