Summary of ETW Support in .NET

Wow, what a year for tracing, what a month, what a week!

If you’re in the midst of what’s happening in tracing, your head is spinning with how much the teams have accomplished. If you aren’t in the midst of it, I suspect you’re either unaware or you feel lost.

Trust me, in the midst of the decade of change in tracing, this last week was the tipping point. In the last seven days, we’ve seen the release of Microsoft.Diagnostics.EventSource, Microsoft.Diagnostics.TraceEvent and the WPR/WPA announcement for Windows 8.1.

There’s a lot of work still to accomplish for a truly mature diagnostics story. But going forward I think the big questions are guidelines for EventSource style tracing, what consumers should look like for specific tasks, and alleviating pain points. This week the supporting tools became adequate.

It is nearly impossible to keep straight which component is contributing which key piece of the puzzle. In this post, I’ll summarize and provide what I think are the most important links. If I make a mistake, please comment and I’ll update. If I left out one of your favorite links, please add it in the comments.

At the end of the post, I’ll describe channels and manifests because I can’t find concise descriptions.

System.Diagnostics (except System.Diagnostics.Tracing)

· All usage of Trace and TraceSource classes should be reevaluated going forward

Event Tracing for Windows (ETW) (info here and here)

· Very fast, common, tracing subsystem

· Strongly typed events

· Component model with interface driven controllers, providers and consumers

· By implication independent core (Silverlight potential)

System.Diagnostics.Tracing.EventSource (.NET 4.5) (info here)

· Easy placement of your trace events into the ETW debug channel independent of System.Diagnostics overhead

· Strongly typed events available in .NET

· In-line manifests

· In-proc EventSource collection (not the common case)
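
As a rough illustration of the points above, here is a minimal sketch of a strongly typed EventSource together with in-proc collection through an EventListener. The provider name, class names, and events are hypothetical and chosen only for this example; the EventSource, EventListener, and EventWrittenEventArgs types come from System.Diagnostics.Tracing.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics.Tracing;

// Hypothetical provider: the name and both events are invented for illustration.
[EventSource(Name = "Sample-Demo-Orders")]
public sealed class OrderEventSource : EventSource
{
    public static readonly OrderEventSource Log = new OrderEventSource();

    // Each method is one strongly typed event; the id passed to WriteEvent
    // must match the id in the Event attribute.
    [Event(1, Level = EventLevel.Informational, Message = "Order {0} received from {1}")]
    public void OrderReceived(int orderId, string customer)
    {
        if (IsEnabled()) WriteEvent(1, orderId, customer);
    }

    [Event(2, Level = EventLevel.Error, Message = "Order {0} failed: {1}")]
    public void OrderFailed(int orderId, string reason)
    {
        if (IsEnabled()) WriteEvent(2, orderId, reason);
    }
}

// In-proc collection (not the common case): the listener receives events
// inside the same process, without going through an ETW session.
public sealed class CollectingListener : EventListener
{
    public readonly List<string> Lines = new List<string>();

    protected override void OnEventWritten(EventWrittenEventArgs eventData)
    {
        Lines.Add(eventData.EventId + ": " + string.Join(", ", eventData.Payload));
    }
}

public static class InProcDemo
{
    public static List<string> Run()
    {
        using (var listener = new CollectingListener())
        {
            // Nothing is delivered until the listener enables the source.
            listener.EnableEvents(OrderEventSource.Log, EventLevel.Verbose);
            OrderEventSource.Log.OrderReceived(42, "alice");
            OrderEventSource.Log.OrderFailed(42, "out of stock");
            return listener.Lines;
        }
    }
}
```

Calling InProcDemo.Run() returns one line per event written. The IsEnabled() guard keeps the cost close to zero when nothing is listening, which is the usual reason for writing the check by hand.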

Semantic Logging Application Block (SLAB) (info here and here)

· Alternate output to ETW allowing a natural logging experience during development that morphs to production ETW tracing via configuration (debug channel)

· Tools to test correctness of EventSource classes

Microsoft.Diagnostics.EventSource (info here)

· Adds EventSource support for alternate channels, including the admin channel, which means you can write strongly typed events to the Event Log

· Provides EventSource support in .NET 4.0

· Installed manifest support for EventSource (manifest creation and installer created as build step): allows manifest installation for Event Viewer support

· Build time validation of EventSource classes

WPA/WPR Support (info here)

· Contained in the Windows Assessment and Deployment Kit (Windows ADK) for Windows 8.1 Preview

· Support for in-line manifests in mainstream tools as controllers and consumers

Microsoft.Diagnostics.TraceEvent (info here)

· Out of proc (the common case) ETW support from within .NET

· Library for creating ETW controllers (start/stop), including for EventSource/in-line manifest providers

· Library for creating ETW consumers (filter, load, evaluate, display), including for EventSource/in-line manifest events

· While you may not use this directly, it’s a critical step in providing better tools, particularly for consuming ETW

A Few Definitions

If you’re not familiar with ETW, the discussion above may require a few definitions.

Channel

ETW supports four channels as the highest level of filtering breakdown according to target audience. The definitions from here:

Admin type channels support events that target end users, administrators, and support personnel. Events written to the Admin channels should have a well-defined solution on which the administrator can act. An example of an admin event is an event that occurs when an application fails to connect to a printer. These events are either well-documented or have a message associated with them that gives the reader direct instructions of what must be done to rectify the problem.

Operational type channels support events that are used for analyzing and diagnosing a problem or occurrence. They can be used to trigger tools or tasks based on the problem or occurrence. An example of an operational event is an event that occurs when a printer is added or removed from a system.

Analytic type channels support events that are published in high volume. They describe program operation and indicate problems that cannot be handled by user intervention.

Debug type channels support events that are used solely by developers to diagnose a problem for debugging.

The admin and debug channels appear to be the most common. The debug channel is sometimes called the diagnostic channel.
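
To make the channel idea concrete, here is a minimal sketch of how an event can be assigned to a channel from an EventSource. It assumes a framework version where System.Diagnostics.Tracing exposes EventChannel and the Channel property on the Event attribute (in the NuGet package the equivalent types live under Microsoft.Diagnostics.Tracing). The provider, class, and event names are hypothetical, loosely following the printer examples in the definitions above.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics.Tracing;

// Hypothetical provider, for illustration only.
[EventSource(Name = "Sample-Demo-Printing")]
public sealed class PrintingEventSource : EventSource
{
    public static readonly PrintingEventSource Log = new PrintingEventSource();

    // Admin: something an administrator can act on, so the message says what to do.
    [Event(1, Channel = EventChannel.Admin, Level = EventLevel.Error,
           Message = "Could not connect to printer {0}. Check that it is powered on.")]
    public void PrinterConnectFailed(string printerName) { WriteEvent(1, printerName); }

    // Operational: an occurrence that is useful for diagnosis.
    [Event(2, Channel = EventChannel.Operational, Level = EventLevel.Informational,
           Message = "Printer {0} was added.")]
    public void PrinterAdded(string printerName) { WriteEvent(2, printerName); }

    // Debug: developer-only detail.
    [Event(3, Channel = EventChannel.Debug, Level = EventLevel.Verbose,
           Message = "Spooler has {0} queued jobs.")]
    public void SpoolerState(int queuedJobs) { WriteEvent(3, queuedJobs); }
}

// Records the channel of every event it sees, so the assignment can be inspected in-proc.
public sealed class ChannelListener : EventListener
{
    public readonly List<EventChannel> Channels = new List<EventChannel>();

    protected override void OnEventWritten(EventWrittenEventArgs eventData)
    {
        Channels.Add(eventData.Channel);
    }
}

public static class ChannelDemo
{
    public static List<EventChannel> Run()
    {
        using (var listener = new ChannelListener())
        {
            listener.EnableEvents(PrintingEventSource.Log, EventLevel.Verbose);
            PrintingEventSource.Log.PrinterConnectFailed("LobbyPrinter");
            PrintingEventSource.Log.PrinterAdded("LobbyPrinter");
            PrintingEventSource.Log.SpoolerState(3);
            return listener.Channels;
        }
    }
}
```

Note that seeing admin events in Event Viewer still depends on an installed manifest, as described in the Manifest section; this sketch only shows the attribute-level assignment.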

Manifest

To fulfill the goal of superfast tracing, ETW produces a binary stream containing a limited amount of data that is not human-readable. Strongly typed events cannot be interpreted without a definition of their contents. This definition is contained in a manifest.

Historically this manifest was created as an XML document and installed onto the computer containing the ETW consumer. This model presented a host of issues: it was brittle, it caused difficulties when multiple manifest versions existed on different production machines, and it required complex installation. Some tools, including Event Viewer, support only installed manifests.

EventSource introduced in-line manifests. The manifest is simply another event in the ETW stream and is installed in memory only when it appears. This is a much more flexible model and will be important going forward.
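
To see what such a manifest looks like, one option is to ask EventSource for the XML it derives from a class. The sketch below uses the static EventSource.GenerateManifest method; the provider name, class names, event, and the "ManifestDemo.dll" path are hypothetical placeholders for this example.

```csharp
using System;
using System.Diagnostics.Tracing;

// Hypothetical provider, for illustration only.
[EventSource(Name = "Sample-Demo-Manifest")]
public sealed class ManifestDemoEventSource : EventSource
{
    public static readonly ManifestDemoEventSource Log = new ManifestDemoEventSource();

    [Event(1, Level = EventLevel.Informational)]
    public void RequestStarted(string url, int attempt) { WriteEvent(1, url, attempt); }
}

public static class ManifestDemo
{
    // Returns the manifest XML that describes the events of the class above.
    // The second argument is the assembly path recorded in the manifest;
    // "ManifestDemo.dll" is a placeholder here.
    public static string GetManifestXml()
    {
        return EventSource.GenerateManifest(typeof(ManifestDemoEventSource), "ManifestDemo.dll");
    }
}
```

Printing the result of ManifestDemo.GetManifestXml() shows the provider name and a description of RequestStarted with its url and attempt fields, which is the information a consumer needs to decode the binary payload.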

3 thoughts on “Summary of ETW Support in .NET”

  1. It seems to me I don’t completely understand the meaning of this method and the “relatedActivityId” parameter. Please correct me if I’m wrong.

    Assume I have two components, A and B. Component A is an infrastructure component and has some entry point (it receives messages from some transport and dispatches them to component B). Because it’s the real entry point of the whole application, component A assigns ActivityIDs at the beginning of its message processing pipeline. Assume component B wants, for some reason, to have its own ActivityIDs for its tasks.

    Assume that component B works in the same thread with component A.

    1. It should get and store component A’s ActivityId, set new ActivityId to the thread and write some transfer event:

    Guid relatedActivityId;
    ComponentBEventSource.SetCurrentThreadActivityId(Guid.NewGuid(), out relatedActivityId);
    ComponentBEventSource.Log.WriteEventWithRelatedActivityId(eventId, relatedActivityId, …);

    2. When component B’s part of work is completed it should simply restore component A’s ActivityID:

    ComponentBEventSource.SetCurrentThreadActivityId(relatedActivityId);

    Is this a correct usage scenario?

    In the case when components A and B work in different threads, or even in different processes on different computers, I should simply send the original ActivityId (maybe “over the wire”) to component B and write a transfer event on component B’s side in order to provide end-to-end tracing. Of course step 2 is not required in this case.

    Is that correct?

    So I initially considered “relatedActivityId” to be the activityId of the “parent” or previous component’s activity. But when I was looking for examples in the .NET BCL, I found that TPL is instrumented with EventSource. And I see you use “relatedActivityId” as the activityId of the child activities, so the parent component writes the transfer events. In TPL this is possible because the parent component knows the child activityId (it’s explicitly constructed from the TaskId). The child component still has to replace the current thread’s activityId and then restore it.

    And so I came to the conclusion that both are possible, and the usage style depends on the specific solution context. From the event consumer’s point of view, it will not be too complex to recognize which pattern the component’s developers used and to perform end-to-end trace analysis.

    Are there any other considerations?
