Optimization, BLOB caching and HTTP 304s

There’s been an interesting mini-debate going on recently about where to store the static assets used by your site – images, CSS, JS files and so on. Broadly, the two approaches can be characterized as:

  • Developer-centric – store assets on the filesystem, perhaps in the 12 hive
  • Author-centric – store assets in the content database, perhaps in the Style Library which comes with publishing sites

Needless to say, these options offer different pros and cons depending on your requirements – Servé Hermans offers a good analysis in To package or not to package: that is the question. However, I want to throw another point into the debate – performance, specifically for anonymous users. This is an audience I care deeply about, since some of the WCM sites I work on have forecast ratios of 80% anonymous vs. 20% authenticated users. Recently I was asked to help optimize an under-performing airline site built on MOSS – as usual the problem was a combination of several things, but one of the high-impact items was the decision to store assets in one location rather than the other. In this post I’ll explain the effect on performance and why you should consider it when building your site.

The problem

Once they’ve been loaded the first time, most of the static files a website uses should be served from the user’s local browser cache ("Temporary Internet Files") – without this, the internet would be seriously slow. Consider how much slower a web page loads when you do a hard refresh (Ctrl+F5) compared to normal – this is because every image, CSS and JS file is re-downloaded rather than served from the browser cache. Unfortunately, for files stored in some common SharePoint libraries/galleries (i.e. the author-centric approach), SharePoint doesn’t handle this quite right in some scenarios. Most of the gain is there, because the file itself is not downloaded again, but despite having the image locally the browser still makes a request for it. The conversation goes like this (for EACH image on the page!):

Browser: I need this image please – I cached it last time I came at [date/time], but for all I know it’s changed since then.
Server: No need dude, it’s not changed so just use your local copy (in the form of an HTTP 304 – "Not Modified")
Browser: Fair enough, cheers.

This essentially happens because the file was not served with a "cacheability" HTTP header to begin with, so the browser has no way of knowing how long its copy is good for. Each check is small, but with 30+ images/CSS/JS files referenced on a page the round trips add up – potentially several seconds in my experience (under some circumstances), which of course is a huge deal. The further the user is from the servers (say, the user is in Europe but the servers are in the U.S.), the more this kind of network chatter costs. In the majority of cases we’re happy to cache these files for a period since they don’t change too often, and we get better performance as a result.
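To make the difference concrete, here is a minimal sketch of the decision a browser makes each time a page references a file it already holds. It is a deliberately simplified model, not how any particular browser is implemented: the `CachedFile` type and both function names are made up for illustration, and it ignores heuristic freshness and other cache directives. The delay estimate assumes files are revalidated in batches over a fixed number of parallel connections.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CachedFile:
    """A file the browser already holds locally (hypothetical model)."""
    fetched_at: float        # seconds since epoch when it was downloaded
    max_age: Optional[int]   # seconds from "Cache-Control: max-age", or None if absent


def next_action(entry: CachedFile, now: float) -> str:
    """Return what a simplified browser does when the file is referenced again."""
    if entry.max_age is not None and now - entry.fetched_at < entry.max_age:
        return "use local copy"    # still fresh: no network traffic at all
    return "conditional GET"       # one round trip; the server answers 304 if unchanged


def revalidation_delay(files: int, round_trip_s: float, parallel: int) -> float:
    """Rough extra load time when every file needs a 304 round trip."""
    batches = -(-files // parallel)   # ceiling division
    return batches * round_trip_s


# A file served without max-age is re-checked on every visit.
print(next_action(CachedFile(fetched_at=0, max_age=None), now=10))
# The same file served with max-age=86400 is used straight from disk an hour later.
print(next_action(CachedFile(fetched_at=0, max_age=86400), now=3600))
# 30 files, 150 ms transatlantic round trip, 2 connections (all assumed figures).
print(revalidation_delay(30, 0.15, 2))
```

With those assumed figures the 304 checks alone add a little over two seconds, which is in line with the "several seconds" I have seen in practice.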

The Solution (for some SharePoint libraries *)

Mike Hodnick points us to part of the solution in his highly-recommended article Eliminating "304" status codes with SharePoint web folder resources. Essentially, SharePoint’s BLOB caching feature saves the day since it serves the image with a "max-age" value on the HTTP header, meaning the browser knows it can use its local copy of the file without asking the server until that many seconds have passed. This only happens when BLOB caching is enabled and has the max-age attribute like this (here set to 86400 seconds = 24 hours):

<BlobCache location="C:\blobCache" path="\.(gif|jpg|png|css|js|aspx)$" maxSize="10" enabled="true" max-age="86400" />

When we configure the BLOB cache like this we are, in effect, specifying that it’s OK to cache static files for a certain period, so the "cacheable" header gets added. HOWEVER, what Mike doesn’t cover is that this only happens for authenticated users – files served out of common content DB locations such as the Style Library and Master Page Gallery still do not get served correctly to anonymous users. Note this isn’t all SharePoint libraries though – so we need to be clear on exactly when this problem occurs.
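The `path` attribute in that element is a regular expression over file extensions, so it is worth checking which of your assets it would actually cover. The sketch below applies the same pattern with Python's `re` module; the example URLs are made up, and the match is case-sensitive here, which is an assumption of the sketch and not a statement about how SharePoint evaluates the pattern.

```python
import re

# Same pattern as the path attribute of the BlobCache element above.
BLOB_PATH = re.compile(r"\.(gif|jpg|png|css|js|aspx)$")

# max-age is expressed in seconds; 24 hours works out as:
ONE_DAY_SECONDS = 24 * 60 * 60


def covered_by_blob_cache(url_path: str) -> bool:
    """True if the URL path ends in one of the extensions listed in the pattern."""
    return BLOB_PATH.search(url_path) is not None


# Hypothetical asset URLs, for illustration only.
for path in ["/Style Library/site.css",
             "/SiteCollectionImages/logo.png",
             "/images/photo.jpeg",
             "/Documents/report.pdf"]:
    print(path, covered_by_blob_cache(path))
```

Note that an extension missing from the list (`.jpeg` in this example) is simply not matched, so such files would not pick up the max-age header from the BLOB cache settings.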

* Scope of this problem/solution

Before drilling down any deeper, let’s stop for a moment and consider the scope of what we’re discussing – a site with:

  • Anonymous users
  • Files stored in certain libraries – I’m not 100% sure of the pattern (I discuss it later), but the Style Library and Master Page Gallery are known culprits. Other OOTB libraries such as SiteCollectionImages do not have the problem.

If you don’t have this combination of circumstances, you likely don’t have the problem. For those who do, we’re now going to look closer at what’s going on, before concluding with how we can work around the issue at the end.

Drilling deeper

For a site which does have the above combination of circumstances, we can see the issue with Fiddler – as an anonymous user browsing to a page I’ve already visited, I see a stack of 304s, meaning the browser is re-requesting all these files:

BlobCachingDisabled_304s

However, if I’m authenticated and I navigate to the same page, I only see the HTTP 200 for the actual page, no 304s:

BlobCachingEnabled_No304s

Hence we can conclude it works fine for authenticated users but not for anonymous users.

So what can we do for our poor anonymous users (who might be in the majority) if we’re storing files in the problematic libraries? Well, here’s where I draw a blank unfortunately. Optimizing Office SharePoint Server for WAN environments on TechNet has this to say on the matter:

Some lists don’t work by default for anonymous users. If there are anonymous users accessing the site, permissions need to be manually configured for the following lists in order to have items within them cached:

  • Master Page Gallery
  • Style Library

Aha! So we need to change some permissions – fine. This seems to indicate that it is, in fact, possible to get the correct cache headers added to files served from these locations. Unfortunately, I simply cannot find which permissions need to be changed, and nobody on the internet (including the TechNet article) seems to say. The only logical setting is the Anonymous Access options for the list – these are all clear by default, but adding the ‘View Items’ permission (as shown below) does not change anything:

AnonPermissions

As a sidenote, the setting above is (I believe) effectively granting read permissions to the identity which is used for anonymous access to the associated IIS site. So in IIS 7.0, I’m fairly sure you’d achieve the same thing by doing this:

AddPermsIUsr

So the problem does not go away when anonymous users are granted the ‘View Items’ permission. What I find interesting is that a closer look with Fiddler reveals some inconsistencies. The image below shows me browsing to a page anonymously for the first time; to save you the hassle of reading it, the findings are:

  • Files served from the ‘SiteCollectionImages’ library are given the correct max-age header (perhaps expected, since not one of the known ‘problem libraries’ e.g. Style Library)
  • Files served from the ‘_layouts’ folder are given a different max-age header (expected, settings from the IIS site are used here)
  • Some files in the Style Library are in fact given the correct max-age header! (not expected)

MixedHeaders_Anonymous
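If you want to repeat this check across many files without reading each Fiddler row by hand, the following sketch classifies responses by their Cache-Control header. The function names are my own and the header values in the sample are hypothetical stand-ins shaped like the findings above, not the actual values from the capture.

```python
import re
from typing import Dict, Optional


def max_age_seconds(cache_control: Optional[str]) -> Optional[int]:
    """Extract the max-age value from a Cache-Control header, if present."""
    if not cache_control:
        return None
    match = re.search(r"max-age\s*=\s*(\d+)", cache_control, re.IGNORECASE)
    return int(match.group(1)) if match else None


def summarize(observed: Dict[str, Optional[str]]) -> Dict[str, str]:
    """Map each URL to a verdict based on its Cache-Control header."""
    verdicts = {}
    for url, header in observed.items():
        age = max_age_seconds(header)
        if age:  # a missing header and max-age=0 both mean "ask the server again"
            verdicts[url] = f"cacheable for {age} s"
        else:
            verdicts[url] = "will revalidate (expect 304s)"
    return verdicts


# Hypothetical URLs and header values, for illustration only.
sample = {
    "/SiteCollectionImages/logo.png": "public, max-age=86400",
    "/_layouts/images/blank.gif": "max-age=31536000",
    "/Style Library/site.css": "private, max-age=0",
    "/Style Library/print.css": None,
}
for url, verdict in summarize(sample).items():
    print(url, "->", verdict)
```

To feed it real data, the header for a single anonymous request can be read with `urllib.request.urlopen(url).headers.get("Cache-Control")` from the standard library.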

So the two questions which strike me here are:

  • Why are some files being served from ‘Style Library’ with the correct header when most aren’t?
  • Why can SharePoint add the ‘max-age’ header to files in the ‘SiteCollectionImages’ library but not the ‘Style Library’?

The first one is a mystery to me – it’s perhaps not too important, but I can’t work it out. The second one might be down to how the libraries are provisioned – the ‘Style Library’ is provisioned by declarative XML in the ‘PublishingResources’ Feature, whereas the ‘SiteCollectionImages’ library is provisioned in code using the same Feature’s activation receiver. Could this be the key factor? I don’t know, but I’d certainly be interested if anyone can put me straight – either on this or the mystery "permissions change" required to make BLOB caching deal with libraries such as the ‘Style Library’.

Conclusion

The key takeaway here is that for sites which have anonymous users and want to take advantage of browser caching for static files (for performance reasons), we need to be careful where we put our images/CSS/JS files, as per Mike Hodnick’s general message. If we want to use the author-centric approach and store things in SharePoint libraries, we need to consider which libraries we use, and test whether they suffer from the 304 problem. Alternatively, we can choose to store these files on the filesystem (the developer-centric approach) and use a virtual directory with the appropriate cacheability settings to suit our needs. My suggestion would be to use a custom virtual directory for full control of this, since the default settings on the ‘_layouts’ directory ("cache for 1 year") are unlikely to be appropriate.
