Writing a Managed Internet Explorer Extension: Part 2

This continues my miniseries from Writing a Managed Internet Explorer Extension: Part 1, where we discussed how to set up a simple Internet Explorer Browser Helper Object in C# and got a basic, but somewhat useless, example working. Now we want to interact with the Document Object Model a bit more, including listening for events, such as a button being clicked. I’ll assume that you are all caught up on the basics from my previous post, and we will continue to use the sample solution.

Elements in the HTMLDocument can be accessed with getElementById, getElementsByName, getElementsByTagName, and so on. We’ll use getElementsByTagName, and then filter the results on a “type” attribute of “button” or “submit”.
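The DocumentComplete handler later in this post attaches to every input element and leaves the “type” filter out. A minimal sketch of that filter might look like the following. InputFilter and IsButtonLike are hypothetical names used for illustration, and the commented usage assumes the interop HTMLInputElement exposes the DOM “type” property as a string.

```csharp
using System;

public static class InputFilter
{
    // Hypothetical helper: true for the input types we want to wire up.
    // HTML attribute values are case-insensitive, so compare accordingly.
    public static bool IsButtonLike(string inputType)
    {
        return string.Equals(inputType, "button", StringComparison.OrdinalIgnoreCase)
            || string.Equals(inputType, "submit", StringComparison.OrdinalIgnoreCase);
    }
}

// Assumed usage inside DocumentComplete (needs System.Linq and the mshtml interop reference):
//   var buttons = document.getElementsByTagName("input")
//                         .Cast<HTMLInputElement>()
//                         .Where(e => InputFilter.IsButtonLike(e.type));
```

Keeping the predicate separate from the COM objects means it can be checked without loading Internet Explorer.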

An issue that regularly comes up with using the generated .NET MSHTML library is its endless web of delegates, events, and interfaces. Looking at the object explorer, you can see that there are several delegates per type. This makes it tricky to say “I want to handle the ‘onclick’ event for all elements.” You can’t do that because there is no common interface they all implement with a single onclick event. However, if you are brave, you can let dynamic types in .NET Framework 4.0 solve that for you. Otherwise you have a complex web of casting ahead of you.
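To illustrate the dynamic approach, here is a small sketch that does not touch MSHTML at all. FakeInput and FakeAnchor are made-up stand-ins for two element types that each expose their own onclick event with no shared interface; with the real COM objects the delegate types would differ. It assumes a reference to Microsoft.CSharp, which dynamic requires.

```csharp
using System;
using System.Collections.Generic;

// Two unrelated types: same member name, no common interface.
public class FakeInput
{
    public event Func<string, bool> onclick;
    public bool Click() { return onclick != null && onclick("input"); }
}

public class FakeAnchor
{
    public event Func<string, bool> onclick;
    public bool Click() { return onclick != null && onclick("anchor"); }
}

public static class DynamicWiring
{
    // Records which stand-in raised the event, in order.
    public static readonly List<string> Log = new List<string>();

    static bool OnClick(string source)
    {
        Log.Add(source);
        return false;
    }

    public static void WireAll(IEnumerable<object> elements)
    {
        // With dynamic, the handler has to be a real delegate instance;
        // a bare method group will not compile here.
        Func<string, bool> handler = OnClick;
        foreach (dynamic element in elements)
        {
            // Bound at run time: throws RuntimeBinderException if the
            // element has no accessible "onclick" member.
            element.onclick += handler;
        }
    }
}
```

The trade-off is that a misspelled member name or a mismatched delegate type only surfaces at run time, which is why I call this the brave option.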

Another issue that you may run into is conflicting member names. Yes, you would think this isn’t possible, but the CLR allows it; I just don’t believe the C# and VB.NET compilers do. For example, on the interface HTMLInputElement, there is a property called “onclick” and an event called “onclick”. This interface will not compile under C# 4:

    public interface HelloWorld
    {
        event Action HelloWorld;
        string HelloWorld { get; }
    }

However, an interesting fact about the CLR is that it allows methods and properties to be overloaded by return type. Crazy, huh? Here is some bare-bones MSIL you can compile on your own using ilasm to see it in action:

    .assembly extern mscorlib
    {
      .publickeytoken = (B7 7A 5C 56 19 34 E0 89 )
      .ver 4:0:0:0
    }

    .module MislExample.dll
    .imagebase 0x00400000
    .file alignment 0x00000200
    .stackreserve 0x00100000
    .subsystem 0x0003
    .corflags 0x0000000b

    .class interface public abstract auto ansi MislExample.HelloWorld
    {
      .method public hidebysig newslot specialname abstract virtual
              instance void  add_HelloWorld
                (class [mscorlib]System.Action 'value') cil managed
      {
      }

      .method public hidebysig newslot specialname abstract virtual
              instance void  remove_HelloWorld
                (class [mscorlib]System.Action 'value') cil managed
      {
      }

      .method public hidebysig newslot specialname abstract virtual
              instance string  get_HelloWorld() cil managed
      {
      }

      .event [mscorlib]System.Action HelloWorld
      {
        .addon instance void MislExample.HelloWorld::
                add_HelloWorld(class [mscorlib]System.Action)
        .removeon instance void MislExample.HelloWorld::
                remove_HelloWorld(class [mscorlib]System.Action)
      }
      .property instance string HelloWorld()
      {
        .get instance string MislExample.HelloWorld::get_HelloWorld()
      }
    }

That MSIL isn’t fully complete as it lacks any sort of manifest, but it will compile and .NET Reflector will be able to see it. You might have trouble referencing it from a C# or VB.NET project.

You can work around this issue by being explicit in this case: cast it to the interface to gain access to the event or do something clever with LINQ:

    void _webBrowser2Events_DocumentComplete(object pDisp, ref object URL)
    {
        // Document is not always HTML (a PDF viewer, for example), so bail out if the cast fails.
        var document = _webBrowser2.Document as HTMLDocument;
        if (document == null) return;

        var inputElements = from element in document.getElementsByTagName("input").Cast<HTMLInputElement>()
                            select new { Class = element, Interface = (HTMLInputTextElementEvents2_Event)element };
        foreach (var inputElement in inputElements)
        {
            // Go through the events interface so "onclick" resolves to the event, not the property.
            inputElement.Interface.onclick += inputElement_Click;
        }
    }

    static bool inputElement_Click(IHTMLEventObj htmlEventObj)
    {
        htmlEventObj.cancelBubble = true;
        MessageBox.Show("You clicked an input element!");
        return false;
    }

This is pretty straightforward: whenever the document is complete, loop through all of the input elements and attach an onclick handler to each one. Despite the name of the interface, this will work with all HTMLInputElement objects.

Great! We have events wired up. Unfortunately, we’re not done. This appears to work at first. However, go ahead and load the add-on and use IE for a while. It’s going to start consuming more and more memory. We have written a beast with an unquenchable thirst for memory! We can see that in Son of Strike, too.

          MT    Count    TotalSize Class Name
    03c87ecc     3502       112064 mshtml.HTMLInputTextElementEvents2_onclickEventHandler
    06c2aac0      570         9120 mshtml.HTMLInputElementClass

This is a bad figure, because it never goes down, even when we force a garbage collection. After just a few minutes of using Internet Explorer, there is a huge number of event handlers. The reason is that we never unwire the event handlers, so we are leaking them. We need to unwire them. Many people have bemoaned this problem in .NET: an event subscription is a strong reference, so the handler stays alive for as long as the subscription does. Many people have written wrappers for events to use “Weak Events”, or events whose subscriptions don’t keep the handler alive. Both strong and weak references have their advantages.

I’ve found the best way to do this is to keep a running Dictionary of all the events you subscribed to, and unwire them in BeforeNavigate2 by looping through the dictionary, then removing the element from the dictionary, allowing it to be garbage collected.

Here is my final code for unwiring events:

    using System;
    using System.Collections.Generic;
    using System.Linq;
    using System.Runtime.InteropServices;
    using System.Windows.Forms;
    using mshtml;
    using SHDocVw; // assumed home of IWebBrowser2 and DWebBrowserEvents2_Event, as set up in Part 1

    // IObjectWithSite is the COM interface we declared in Part 1.
    [ComVisible(true),
    Guid("9AB12757-BDAF-4F9A-8DE8-413C3615590C"),
    ClassInterface(ClassInterfaceType.None)]
    public class BHO : IObjectWithSite
    {
        private object _pUnkSite;
        private IWebBrowser2 _webBrowser2;
        private DWebBrowserEvents2_Event _webBrowser2Events;

        // Keyed by element rather than by handler: delegates that wrap the same static
        // method compare equal, so using them as keys would make the second Add throw.
        private readonly Dictionary<HTMLInputTextElementEvents2_Event, HTMLInputTextElementEvents2_onclickEventHandler>
            _wiredEvents = new Dictionary<HTMLInputTextElementEvents2_Event, HTMLInputTextElementEvents2_onclickEventHandler>();

        public int SetSite(object pUnkSite)
        {
            if (pUnkSite != null)
            {
                _pUnkSite = pUnkSite;
                _webBrowser2 = (IWebBrowser2)pUnkSite;
                _webBrowser2Events = (DWebBrowserEvents2_Event)pUnkSite;
                _webBrowser2Events.DocumentComplete += _webBrowser2Events_DocumentComplete;
                _webBrowser2Events.BeforeNavigate2 += _webBrowser2Events_BeforeNavigate2;
            }
            else
            {
                // Guard against SetSite(null) arriving before a site was ever set.
                if (_webBrowser2Events != null)
                {
                    _webBrowser2Events.DocumentComplete -= _webBrowser2Events_DocumentComplete;
                    _webBrowser2Events.BeforeNavigate2 -= _webBrowser2Events_BeforeNavigate2;
                }
                _pUnkSite = null;
            }
            return 0;
        }

        void _webBrowser2Events_BeforeNavigate2(object pDisp, ref object URL, ref object Flags,
            ref object TargetFrameName, ref object PostData, ref object Headers, ref bool Cancel)
        {
            // Unwire everything we subscribed to on the page we are leaving.
            foreach (var wiredEvent in _wiredEvents)
            {
                wiredEvent.Key.onclick -= wiredEvent.Value;
            }
            _wiredEvents.Clear();
        }

        void _webBrowser2Events_DocumentComplete(object pDisp, ref object URL)
        {
            // Document is not always HTML (a PDF viewer, for example), so bail out if the cast fails.
            var document = _webBrowser2.Document as HTMLDocument;
            if (document == null) return;

            var inputElements = from element in document.getElementsByTagName("input").Cast<HTMLInputElement>()
                                select new { Class = element, Interface = (HTMLInputTextElementEvents2_Event)element };
            foreach (var inputElement in inputElements)
            {
                // DocumentComplete can fire more than once per page; don't wire an element twice.
                if (_wiredEvents.ContainsKey(inputElement.Interface)) continue;

                HTMLInputTextElementEvents2_onclickEventHandler interfaceOnOnclick = inputElement_Click;
                inputElement.Interface.onclick += interfaceOnOnclick;
                _wiredEvents.Add(inputElement.Interface, interfaceOnOnclick);
            }
        }

        static bool inputElement_Click(IHTMLEventObj htmlEventObj)
        {
            htmlEventObj.cancelBubble = true;
            MessageBox.Show("You clicked an input!");
            return false;
        }

        public int GetSite(ref Guid riid, out IntPtr ppvSite)
        {
            var pUnk = Marshal.GetIUnknownForObject(_pUnkSite);
            try
            {
                return Marshal.QueryInterface(pUnk, ref riid, out ppvSite);
            }
            finally
            {
                // GetIUnknownForObject added a reference; release it.
                Marshal.Release(pUnk);
            }
        }
    }

After performing the same level of stress as before, there were only 209 instances of HTMLInputTextElementEvents2_onclickEventHandler. That is still a bit high, but that’s because the Garbage Collector hasn’t done its cleanup yet. The Garbage Collector makes counting how many objects are in memory a bit subjective. If we really cared, we could check which of those are still referenced to get the full result, but I think that’s above the call of duty, right?

There are alternative ways to wire events. If the strong typing and plethora of interfaces are getting to you, it’s possible to use attachEvent and detachEvent, although that requires wrapping your handlers in objects that COM can understand.

In Part 3 we will look into manipulating the DOM.
