Performance in Tracing

 

I talk about tracing in other posts, including here and here. I talk about semantic tracing here. I also have a Pluralsight video on tracing with ETW and EventSource here.

Bill Chiles has a great article here. His article is based on the Roslyn team’s experience. The Roslyn compilers needed massive tuning so that the managed compilers would have similar performance to the older unmanaged compilers. His article is great on overall application performance, and I use it here in relation to tracing.

Bill lists four issues, and then a few miscellaneous items:

  • Don’t prematurely optimize – be productive and tune when you spot problems.
  • Profiles don’t lie – you’re guessing if you’re not measuring.
  • Good tools make all the difference – download PerfView and look at the tutorials.
  • Allocations are king for app responsiveness – this is where the new compilers’ perf team spent most of their time.

When tracing adversely impacts performance, it’s often due to I/O or I/O binding. I/O issues occur when you are writing to trace storage on the main thread – on Windows, you can avoid this by using ETW. In .NET, you use ETW by using EventSource. The rest of this article covers adverse performance impacts of tracing unrelated to I/O issues.

Premature optimization and profiling

Outside tracing, I almost always agree with the recommendation that premature optimization wastes programmer time and results in unnecessarily complex applications.

But tracing is different. Optimizing tracing is not premature and considering performance throughout your discussions of trace strategies makes sense. If your application is well traced, you have a lot of tracing calls. You never want a programmer to consider whether adding a trace call will hurt performance.

One of the goals of tracing is to discover problems as quickly as possible – so you don’t want to implement a strategy you’re afraid to leave on in production.

Tracing is part of your profiling strategy and tracing that perceptibly slows your application in production may lead you astray in evaluating profile results.

Profiles don’t lie. Knowing the performance metrics of your application with tracing turned off and with various sets of traces turned on via level (error, warning, information), keywords, etc. is important. Unless I/O is slowing down your traces, you’ll find that tracing is a very, very small percentage of your application’s effort. If it’s a significant portion, your tracing strategy is flawed and you need to fix it.

Because the impact is so small, it’s unlikely that you can use profiling to improve performance of your tracing.

Assuming a fast trace infrastructure (ETW with EventSource or out-of-proc SLAB in .NET) the goal of improving trace performance is not to improve overall application performance. The goal is to provide confidence that you can turn tracing on whenever you want, or ideally leave a set of traces on at all times in a “flight recorder” mode. “Flight recorder” mode means traces are recorded in a rolling fashion that’s made permanent when an interesting event happens.

You always want tracing to be as fast as possible, as long as it’s not causing undo complexity or extra effort.

Happily, with ETW and SLAB using ETW you can have high performance tracing. And happier still, performance considerations will improve your tracing design by pushing you more firmly into the semantic tracing camp.

Performance considerations for tracing

Performance considerations make for better tracing. Here are a few examples:

  • Isolate all trace calls behind semi-permanent signatures with no artifacts of the underlying technology
    • Allows evolution in response to technology improvements (such as implementing an out-of-proc strategy
    • Also helps discoverability
  • Make these methods very granular and very specific to (and descriptive of) the action you are tracing
    • Allows more granular enabling of trace events
    • Less code to run within individual trace methods
    • Also documents application actions
    • Also provides consistency in details like level, keywords and channel
    • Also simplifies usage (IntelliSense)
  • Use strongly typed parameters for these methods
    • Avoids boxing (heap allocations and resulting GC load)
    • Also simplifies usage (IntelliSense) and documents actions
  • Avoid large structs as parameters
    • Avoids copying on the stack
    • Also simplifies usage (IntelliSense) and documents actions
  • Avoid concatenating strings and string.Format() as well as Concat(), Split(), Join() and Substring()
    • Avoids allocations
    • Also results in a trace that can be interpreted without parsing
  • Avoid retrieving data or any expensive operations
    • Obviously, avoids spending that time
    • Also ensures trace is just recording current conditions
  • Get to know the information provided by the trace system for free
    • Avoids tracing things that are already available
    • Also allows for fewer trace calls and a simpler system
  • Consider having tracing on during testing and not using DI (Dependency Injection) for tracing
    • Avoids running DI code, and possibly allocations
    • Also simplifies your application
    • Also allows programmers to use tracing during initial debugging
    • Also gets programmers in the habit of using traces

Semantic tracing is the style of tracing that easily fulfills these goals. Semantic tracing is a style, it can be used on any platform in any language.

Semantic tracing fills many of these performance focused goals

Semantic tracing is by its nature strongly typed. This has enormous benefits in clarity of purpose and IntelliSense support. It can also have a significant impact on performance because value types are not boxed. And, by isolating your tracing code, additional optimization can be done later, and only if needed. For example, you may need to do extra work to avoid the object overload of EventSource.WriteEvent(). But this code is a pain and adding it in all cases would be a premature optimization.

If you are using .NET and you are not yet using EventSource and ETW, no other performance improvement will be as great as moving to ETW. You can use ETW when you use EventSource directly without alternate listeners, and when you use out-of-proc SLAB (Semantic Logging Application Block from Microsoft Patterns and Practices which is an enhancement of EventSource). Isolating your calls allows you to make the change to ETW at your convenience.

I talked about semantic tracing here. The rest of this article explores details in Bill Chile’s article from a semantic perspective and assumes you are tracing outside the current process (EventSource or out-of-proc SLAB in NET).

Avoiding boxing allocations with semantic tracing

Heap allocations themselves have a tiny impact. The bigger problem is that each allocation must eventually be garbage collected. Garbage collection is also relatively fast, but it occurs in spurts. In most normal application code, allocations occur at a rate that can be smoothly garbage collected and you don’t need to worry about this issue, unless profiling shows that you have excessive allocations or garbage collection.

Since a well-traced application has lots and lots of trace calls, any allocations you have will occur a large, possibly massive, number of times. Avoiding unnecessary allocations in tracing is a very good thing.

Bill shows a great example of the allocation problems tracing can cause with unnecessary boxing:

public class BoxingExample
{
public void Log(int id, int size)
{
var s = string.Format("{0}:{1}", id, size);
Logger.WriteLine(s);
}
}



I see five heap allocations in this code. The string s is allocated. The two integers are boxed to object to pass to the Format() method. A string is then created for each to perform the concatenation.



These allocations occur whether or not tracing is turned on.



Replace this code with the following:



public class SemanticLoggingExample() : EventSource
{
public void ProcessStarted(int ProcId, int SampleSize)
{
WriteEvent(ProcId, SampleSize);
}
}



This code has zero allocations.



It is also more readable, discoverable, IntelliSense friendly.



Boxing and logging with strings



Semantic tracing discourages the use of strings in tracing. If you’re trace technology requires them, you can create the strings within your semantic trace method. The advantage is that you can avoid the resulting allocations when tracing is turned off, and you can replace your technology with one that does not require string creation when it’s available.



EventSource() in .NET does not require any string creation. You might want a message reported to consumers of your trace, but you can do this with the Message property of the Event attribute (the message parameter to the constructor). This is similar to a format string and is included in the manifest for the events. The ETW consumer application can build the string for the human user; the common consumers already use this string to display the message to the user. Your application does not ever build this string.



This saves I/O load, CPU cycles, boxing, and GC load.



As a side bonus, the Message property of the Event attribute can be localized. See my course or the ETW specification.



Bill’s advice on using ToString() prior to calling String.Format() is good when you are using the String.Format() method in .NET. But if you are building strings as arguments in calls to your trace system, you are almost certainly doing something wrong. Instead, maintain value types as value types throughout as much of the tracing pipeline as possible, and always to the point you can check whether tracing is turned on. And then, use the ToString() trick . It’s a cheap improvement since extra ceremony at that point is not distracting – the point of the method is to create the trace.



Avoiding other allocations



Bill’s article is a great source of other ways to avoid allocations and other performance tips. These issues are rare in tracing, but worth considering when they occur:



  • GetHashCode: cast enums to the underlying type
    • You might use a hash code as part of a strategy to hide sensitive information; hashing an enum might rarely be part of this, and it’s easy enough to do the cast in that scenario
  • HasFlag: boxes – use bitwise comparison in frequent calls
    • HasFlag with tracing is almost certainly in a section of a semantic method that doesn’t run unless tracing is turned on, and at that location, the extra clarity of HasFlag is probably not that important
  • String operations – watch Format(), Concat(), Split(), Join() and Substring() in frequent calls
    • Avoid strings in tracing as much as possible, and at least ensure ensure it’s in a section of a semantic method that doesn’t run unless tracing is turned on
  • Unnecessary literal allocations – see Bill’s WriteFormattedDocComment sample
    • If you have literals – like a table name – create it once
  • StringBuilder – still requires an allocation
    • Try to avoid creating any string that’s complex enough to make string builder appropriate
    • If you think you need a string builder in tracing, at least ensure it’s in a section of a semantic method that doesn’t run unless tracing is turned on.
  • Clever caching – Bill’s example is caching a StringBuilder
    • Avoid creating new object instances in your tracing
  • Closures in lambdas – a class is generated on compile and an instance allocated
    • Avoid lambda closures in tracing
  • LINQ – in addition to common lambda closures, there are extra allocations for delegates
    • Avoid LINQ in tracing
  • Inefficient async caching – cache the task, not just the result
    • Use tracing that is fast and let the technology (ETW) get off the thread – avoid async in tracing. You can’t build an async strategy that is as fast as the one Microsoft built with ETW (assuming .NET)
  • Dictionaries – often mis-used when simpler structures would work
    • You might use a lookup to sensitive information in a few cases (keep sensitive information out of traces, regardless of technology) and use a dictionary only if there will be many lookups and the list isn’t small
  • Class vs struct – classic space/time tradeoff
    • Consider this in tracing calls, generally pass specific data items rather than large structs, although this can undermine the ideal of semantic tracing
  • Caching without a disposal plan (like a capacity limit) – avoid this because it’s also called a memory leak
    • This can happen if you create a lookup for sensitive information, create a cache plan


You can read Bill’s article for more information on each of these issues.



Boxing and the EventSource.WriteEvent() object array overload



Boxing and an allocation occur whenever you pass a value type as an object parameter, so anytime you avoid calling an object overload it’s a good thing. But like all other methods, EventSource.WriteEvent() method has a finite number of type specific overloads. The problem is, creating additional overloads for WriteEvent() is a rather ugly operation requiring an unsafe code block.



You can determine if you’re using the object parameter overload through IntelliSense or Go To Definition.



I think you can avoid crating this extra overloads unless you either know you’re hammering a particular overload pattern with a very large number of calls, or you know from profiling that you have an allocation or GC problem. In that case, you should absolutely create the alternate overload to avoid the object parameter array version, to avoid boxing the value types in the call to a semantic trace.



In my Pluralsight video, I discuss that creating your own overload has about a 2 fold performance impact. That was a casual test that just measures the boxing in the worst case (a very simple call) and probably didn’t run into generation 2 GC blocking. I’m not confident that I captured the full impact, but it is small for traces that happen less than hundreds (possibly thousands) of times per second.



See the ETW specification for information on building extra overloads.



Tracing on, tracing off



Relying on checks of whether tracing is turned on develops a sense that you don’t care about trace performance when tracing is turned on. I’ve even heard developers state that position. But there is no value to tracing until you turn it on!



If you’re using a slow trace technology that touches a resource on your main application thread (in .NET that’s every trace strategy I know of other than ETW and out-of-proc SLAB using ETW) every optimization in this article is trivial compared to the cost of touching the resource when tracing is turned on. Use strongly typed parameters to semantic trace methods and guard your semantic trace methods with a check of whether tracing is turned on, then ignore all the other optimizations until you can switch to a more efficient form of tracing.



If you’re using EventSource, the first thing the WriteEvent() method does is check IsEnabled(). So using IsEnabled() has a trivial advantage if you are just calling WriteEvent() from your semantic trace method. If you use the IsEabled() method and tracing is turned off, you avoid an extra method call. If you use the IsEnabled() method, and tracing is turned on, there are two method calls. Method calls are very fast, the difference is trivial.



Use IsEnabled() when:



  • You’re doing extra work before calling WriteEvent()
  • You’re passing a non-trivial structure to WriteEvent()
  • You’re using the object overload of WriteEvent()


Summary



If you’re not using semantic tracing with strongly typed parameters, move to it to improve your overall trace strategy and performance.



If you’re not using a high performance out-of-proc tracing infrastructure, improve trace performance by moving to one.



If you’re using a high performance semantic trace strategy, several further tweaks are simple to do. These improvements are especially important if they increase confidence in your ability to using tracing during production.

Leave a Reply

Your email address will not be published. Required fields are marked *

You may use these HTML tags and attributes: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <strike> <strong>