Editing .cmp files to fix lookup field issues

I ran into an interesting (well, it’s all relative) content deployment issue the other day, which I’m pretty sure will apply to both SharePoint 2007 and SharePoint 2010. In preparation for some SharePoint training I was delivering at my current client, I wanted to move some real data from production into the training environment to make the training more realistic. To do this, I used my Content Deployment Wizard tool, which uses SharePoint’s Content Deployment API to export content into a .cmp file. (Quick background – the tool does exactly the same thing as ‘STSADM –o export’ and out-of-the-box content deployment, but allows more control. A .cmp file is actually just a renamed cab file i.e. a compressed collection of files, similar to a .wsp). However, when importing the .cmp file containing my sites/documents etc., the operation failed with the following error:

The element ‘FieldTemplate’ in namespace ‘urn:deployment-manifest-schema’ has invalid child element ‘Field’ in namespace ‘http://schemas.microsoft.com/sharepoint/’. List of possible elements expected: ‘Field’ in namespace ‘urn:deployment-manifest-schema’.

So clearly we have a problem with a field somewhere, and it’s an issue I was vaguely aware of – cross-web lookup fields deployed with a Feature break content deployment. Michael Nemtsev discusses the issue here, saying “There are several samples how to deploy lookup fields via feature (http://www.sharepointnutsandbolts.com/2007/04/feature-to-create-lookup-fields-on.html) but all of them are not suitable for the Content Deployment Jobs. Because you will get the exception…”

Oops that’s a link to an old article of mine. So effectively *my* code to create lookup fields doesn’t work with *my* content deployment tool – don’t you just love it when that happens?! However, I actually have a clear conscience because I know both utilities are doing valid things using only supported SharePoint APIs – this is simply one of those unfortunate SharePoint things. As Michael says, all of the cross-web lookup field samples would have this issue. So what can we do about it?

For fields yet to be created

In this scenario my recommendation would be to use the technique Michael suggests in his post, which is to strip out the extraneous namespace at the end of our code which creates the lookup.

For fields which are already in use (i.e. the problem I ran into)

If your lookup fields have already been deployed, then you have 2 options:

  • develop and test a script to retrospectively find and fix the issue across your web/site collection/farm/whatever scope you need
  • fix the issue in the .cmp file you were trying to import in the first place, so this particular import will succeed

Clearly your decision might depend on how much content deployment you want to do to the site/web/document library or list which has the problem. If you’re anticipating doing it all the time, you should fix the underlying issue. If, as in my scenario, you just need to get an ad-hoc import to succeed, here’s how..

Hacking the .cmp file

The process is effectively to fix-up the .cmp file, then rebuild it with the updated files. I noticed an unanswered question on Stack Overflow about this process, so clearly it’s something that can occasionally arise. Of course even in these WSPBuilder/SP2010 tools days, all SharePoint devs should know you can use makecab.exe with a .ddf file to build a cab file – but what happens when you have hundreds of files? That’s hundreds of lines you’ll need in your .ddf file? Certainly you could write some code to generate it for you, but chances are you’re looking for a quick solution. 

The first process I came up with was:

  1. Rename .cmp file to .cab, extract files to other directory.
  2. Fix-up files (more detail shortly).
  3. Use WSPBuilder to generate the .ddf file from files on filesystem – edit this to ensure paths are correct.
  4. Use makecab.exe + the generated .ddf file to build the .cab file.
  5. Rename extension to .cmp.

However, a slightly better way is to use Cab File Maker, like this:

  1. Rename .cmp file to .cab, extract files to other directory.
  2. Fix-up files – to do this, edit manifest.xml, remove all instances of following string – “xmlns=http://schemas.microsoft.com/sharepoint/
  3. Open Cab File Maker 2.0, drag files in and set options for where to generate ddf file and cmp file, like so:


  4. Voila – your cmp file is now ready to go and should import successfully.

My favorite SharePoint 2007 development techniques (with an eye on SP2010)

As we home in on the release of SharePoint 2010, I wanted to write down a couple of thoughts for posterity on SharePoint 2007 development, mainly for my benefit. One of the reasons for doing this is because I’ve been working with the SP2010/VS2010 Tech Previews recently, and whilst I’ve not done a full “compare and contrast” exercise, I can certainly see that in the future I will want to reference back to how I liked to handle something in the SharePoint 2007 world, and more importantly, why. My experience is that transitioning to a new platform brings on a certain amnesia, where for some reason it’s difficult to remember just how the equivalent thing worked in the last version (CMS2002 anyone?) – undoubtedly we need to avoid restricting our thinking with irrelevant practices and constraints, but sometimes the old way is definitely useful as a reference point.

This isn’t a comprehensive list – many of my points below came out of a “developer sunset review” of my last project (special thanks to ex team-mate Jaap Vossers for some of the ideas/discussions we had around this stuff). Some techniques are in there because we used them and they worked great, others because we didn’t and I thought we suffered from not having them. A couple of others are just “things I’ve still not implemented but think would be the best option” – some of which could still be appropriate under SP2010. Many are what I believe to be the established “baseline approach” by many teams implementing SharePoint 2007 – it’s perhaps stretching it to say “best practice” for some, so I won’t. Even so, I’m *sure* so folks will have different views/techniques, these are the ones I wanted to capture – by all means leave a comment or point me to something else if you have better thoughts:

Visual Studio solution/project structure

  • Every VS project which deploys to the 12 folder contains a 12 folder structure 
  • Use the “class library” project template to start with a clean base – edit the .csproj file to add back menu options for ‘Add user control’ 
  • WSPBuilder as WSP-generation tool of choice
  • One main ‘consolidation’ project which is used to generate a single .wsp (where appropriate) – usually this is my [Company].[Client].Project.Web project
  • Use post-build command on each project to XCOPY 12 hive files into the consolidation project, so that avoid having one .wsp for each VS project – fewer .wsps are preferred to reduce versioning/dependency issues
  • User controls – for publishing sites, consider implementing HTML markup in user controls, not page layouts (as discussed in my Top 5 WCM tips post and by Waldek in Leveraging ASP.NET User Controls in SharePoint development)
  • User controls (if not using the above technique) – to get design surface, consider using a separate web project for initial development of user controls, then either using post-build events to copy the .ascx to your main project or using the ‘Add as link’ technique. (As far as I remember this is the only way to have a functioning design surface for user controls?)
    • Remember that many .ascx artifacts cannot exist in a subfolder of CONTROLTEMPLATES (e.g. custom field controls), they must be at the root for SharePoint to load them
  • Use Visual Studio post-build events to re-GAC assemblies and copy 12 folder – so that default action on compile is the “quick deploy” option. This happens so often in dev that I’d rather have any other option require an explicit action on my part, we since rarely want to compile but NOT load new assemblies
  • Consider creating custom VS build types e.g. “DebugNoDeploy”, “ReleaseNoDeploy”
    • Additionally, create a build type to re-provision your site in dev (if this step happens frequently in development of your implementation)
  • Leverage custom VS tool options where appropriate (e.g. “Tools > MyScript.bat”)
  • Re-bin is much faster than re-GAC (for code which can be tested outside of the GAC) – custom tool script or custom build type. This is useful for the highly-iterative part of development.

SharePoint coding tidbits

A selection of random thoughts I want to hold on to, as I think they’ll likely be relevant in the 2010 world:

  • The impact of app pool resets in dev should always be minimized by disabling the Certificate Revocation List check – I notice significant gains when doing this
  • Think carefully before storing data in SPWeb property bags – data stored here is not accessible to a query!
  • Use constants for SharePoint names, in particular field names
    • This is critical for consistency across project teams, and for providing name changes via Visual Studio refactoring
    • On balance, better to have separate DisplayName & InternalName constants
  • Logging – My preferred logging framework is the excellent log4net
    • If you have a requirement to log to a SharePoint list, creation of a custom log4net appender is the way to go. I haven’t done this yet, and bizarrely it seems no-one else has (publicly). Would be fairly trivial to do though
    • Fellow MVP Dave Mann pointed out to me that log4net causes problems when used in SharePoint workflows, as the logger class cannot be serialized when the workflow sleeps. It might be possible to mitigate this by not storing the logger as a private variable, but instantiating every time used (likely log4net returns same object, but performance unlikely to be critical in a workflow anyway)
  • Managing web.config modifications in dev (when you just don’t have time to SPWebConfigModification yet):
    • I don’t have a good story here – the best I’ve come up with is to have a ‘reference’ web.config stored in source control which can be used to sync changes between devs. As a sidenote, perhaps this issue can be avoided if the first coding week on the project lays down the plumbing code for SPWebConfigModification Feature Receivers as a “mandatory setup task”, so that it’s minimal friction when a new web.config change is required – otherwise I think it’s common to skip this and go into “technical debt” until such a time when the team can catch up on such things. And we all know what can happen there..

So whether you start to look at SharePoint 2010 immediately or your day job remains focused on 2007 for another year, I hope this list has been useful. Speaking personally, I know that in the interests of Kaizen (continual improvement), it will be illuminating to look back and see what’s still relevant and what isn’t. Looking forward, like many other MVPs this blog will now focus more on SP2010 going forward – I’ll most likely revisit some of this in view of my experiences with the VS2o1o Tools for SharePoint in the next couple of weeks (after the NDA is lifted). Stay tuned!

My checklist for optimizing SharePoint sites

Optimization is probably one of my favorite SharePoint topics – it seems there’s always a new trick to learn, and I love the fact that a minor tweak can have a dramatic impact on how well your site works. There are some great resources out there now for information on how to boost performance, but it strikes me that there isn’t one single paper or article which contains all the aspects I personally consider or have been relevant to me on past projects. Hence this article is essentially an aggregation of some of these sources. The aim isn’t for it to be “complete” (if there is such a thing) or authoritative in any way, and I know I wouldn’t be the first person to blog on this subject – it really is just a reminder list for me to refer to, but if it helps you also that’s great.

Key resources:

Before I break into the list, remember that different optimization measures will have different effects depending on your circumstances – having the best checklist in the world is really no substitute for the SharePoint architect’s ability to identify what is likely to be the bottleneck for a given implementation. Additionally my list is quite “developer-focused”, but if the infrastructure aspects of the implementation is the limiting factor (e.g. latency between web servers and SQL, amount of usable RAM etc.) you can probably optimize code until you’re blue in the face without solving your performance problem.

My list is broken down into sections, and some of the items have links to some useful resource or other which you might find helpful if the topic is new to you.


Items in this section typically have  a big impact – in particular output caching, which is the first thing to consider on WCM sites for example.


  • Test with SPDisposeCheck to ensure no memory leaks
  • Measure page payload weight (again, YSlow useful here) – aim to reduce as far as possible
  • Eliminate unnecessary ViewState! A good dev technique may be to turn off ViewState in web.config and then override it in controls/pages which really do need it (haven’t tried this, but keep meaning to).
  • Ensure not downloading core.js etc for anonymous users/delay-loading for authenticated.
  • Ensure image sizes are small/images are optimized for web (still comes up!)
  • Always ensure height/width attributes specified on images
  • Ensure custom CSS is factored correctly
  • Don’t forget client-side code efficiency e.g. CSS, jQuery selectors
  • Consider using a code-profiling tool such as ANTS profiler or dotTrace to identify sub-optimal code
  • Ensure general good coding practice e.g. your optimization work may well be undone if your code is badly-written. Accidentally doing unnecessary processing in a loop (in server-side OR client-side code is one example I’ve seen many times)
  • Be wary of performance impact of custom HTTP modules – I frequently see huge perf gains when running load tests with such modules enabled/disabled. Removing core.js as a post-processing step within a HTTP module is NOT the right approach kids! (No really, I’ve seen it done..)

Code/development – advanced

These tips might only really be necessary if you’re implementing a high-traffic site, but are worth considering:

  • Consider factoring custom JS/CSS into fewer files – fewer HTTP requests is better
  • Consider “minifying” JS
  • Consider using CSS to cluster background images (all images stitched into one for fewer HTTP requests) – AC might have mentioned somewhere an online service to automate this..


You should consider this section absolutely incomplete – hopefully there’s enough here to dissuade anybody of the notion that it’s only about code though! The first two in particular are key.

  • 64-bit/sufficient available RAM etc.
  • Latency between web and SQL – MS suggest a ping response of less than 1 millisecond
  • Consider a CDN solution (e.g. Akamai, Limelight etc.) if users are geographically far away from servers (and y0ur client can afford it!)
  • Ensure not using web garden with BLOB cache (known BLOB cache issue)
  • Are app pool settings correct – check for excessive app pool recycling by enabling logging to the Windows event log using – cscript adsutil.vbs Set w3svc/AppPools/ <YourAppPoolName> /LogEventOnRecycle 255. Once or twice per day is probably the maximum you’re hoping for.
  • Is storage correctly architected e.g. are OS, data, logs on different disk spindles?
  • Is storage correctly configured (e.g. SAN configuration) – are your LUNS giving you sufficient IOPS?!

Using YSlow to help optimize websites

One particular tool worthy of discussion when we’re talking about optimizing websites is YSlow – if you haven’t come across it yet, this is a Firefox plugin which extends Firebug) developed by Yahoo! developers which uses their optimization ruleset (customizable) to report on and help further streamline your site. The tool focuses on communication between the server and browser – clearly a browser tool like this can’t identify any back-end issues, but it’s always interesting to run a site through the checks.

The first tab gives a ‘grade’ for your site based on the ruleset, helping you identify areas for improvement (I’ve highlighted where the page stops and the YSlow console starts with a red box):

The next tab provides a breakdown of files downloaded by the request, and allows me to see their size and headers etc. In particular I can see the expiry date and Etag and so on which the files are being tagged with so they can be cached locally:


The statistics tab provides some good analysis on the page weight and also illustrates the difference between a first visit and subsequent visits where appropriately tagged files will be served from the browser cache:


Finally the Net tab in Firebug is also interesting, as this shows me how files were downloaded (sequential or in parallel) – the most recent browsers do a much better job of this, with IE8 and FF3 being able to open 6 channels per URL domain to download files in parallel, but note that IE7 could only open 2.


From this, I also see the http://sharepoint.microsoft.com site (great site btw) I’ve been analyzing also displays the BLOB cache issue I discussed recently (now confirmed as a bug) where incorrect headers are added to files stored in the Style Library, causing lots of unnecessary HTTP 304s to go across the wire. So YSlow does give a great insight – however I do agree with Jeff Atwood’s reminder that some of the reported “issues” may not be the biggest concern for your site. As always, apply common sense.


Optimization is a deep topic, and this checklist is simply my reminder of some of the things to consider, though in reality a solid understanding of the nuts and bolts is required to really architect and develop high-performing sites. Tools like YSlow can also help with some aspects of optimization.

So which optimization nuggets did I miss which you consider? Feel free to leave a comment and add to this list..

More on optimization, HTTP 304s etc. – a solution?

In my last post Optimization, BLOB caching and HTTP 304s, I did a fairly lengthy walk-through on an issue I’d experienced with SharePoint publishing sites. A few people commented, mainly saying they’d noticed the same thing, but there have been further developments and findings I wanted to share!

Quick recap

Under certain circumstances some files in SharePoint are always re-requested by the browser despite being present in the browser cache (“Temporary internet files”). Specifically this is observed for files stored in the Style Library and Master Page Gallery, for anonymous users. Although SharePoint responds with a HTTP 304 to say the cached file can indeed be used (as opposed to sending the file itself again), we effectively have an unnecessary round-trip to the server for each file – and there could be many such files when all the page’s images/CSS/JS files are considered. This extra network traffic can have a tangible impact on site performance, and this is magnified if the user is geographically far away from the server.

A solution?

Waldek and I have been tossing a few development matters around recently over e-mail, and he was curious enough to investigate this issue for himself. After reproducing it and playing around for some time, Waldek discovered that flushing the disk-based cache seems to cause a change in behaviour – or in layman’s terms, fixes everything. To be more specific, we’re assuming it’s a flush of the BLOB cache which is having the affect – in both Waldek’s test and my subsequent validation, the object cache was also flushed as well:


After the OK button is hit on this page, the problem seems to go away completely, so now when the page is accessed now for the first time as an anonymous user, the correct ‘max-age’ header is added to the files (as per the BLOB cache declaration in web.config) – contrast the ‘max-age=86400’ header on the Style Library files with what I documented in my last post:


This means that on subsequent requests, the Style Library files are served directly from the browser cache with no 304 round-trip:


This is great news, as it means the issue I described is essentially a non-issue, and there is therefore no performance penalty for storing files in the publishing Style Library.

So what gives?

I’m now wondering if this is just a ‘gotcha’ with BLOB caching and publishing sites. I know other people have run into the original issue due to the comments on my previous post, and interestingly enough one poster said they use reverse proxy techniques specifically to deal with this issue. Could it really be that everybody who sees this behaviour just didn’t flush the BLOB cache somewhere along the way, when it’s actually a required step? Or is the testing that Waldek and I did flawed in some way? Or indeed, was my initial investigation flawed despite the fact others reported the same issue?

I am interested to hear from you on this – if you can reproduce the problem I’ve described with a publishing site you’ve developed, does flushing the BLOB cache solve it for you as described here? Leave a comment and let us know!

Good work Waldek 🙂

Optimization, BLOB caching and HTTP 304s

There’s been an interesting mini-debate going on recently in terms of where to store static assets used by your site – images, CSS, JS files and so on. Broadly the two approaches can be characterized as:

  • Developer-centric – store assets on the filesystem, perhaps in the 12 hive
  • Author-centric – store assets in the content database, perhaps in the Style Library which comes with publishing sites

Needless to say these options offer different pros and cons depending on your requirements – Servé Hermans offers a good analysis in To package or not to package: that is the question. However, I want to throw another point into the debate – performance, specifically for anonymous users. Frequently, this is an audience I care deeply about since some of the WCM sites I work on often have forecast ratios of 80% anonymous vs. 20% authenticated users. Recently I was asked to help optimize an under-performing airline site built on MOSS – as usual the problem was a combination of several things, but one of the high-impact items was this decision to store assets in one location over the other. In this post I’ll explain what the effect on performance is and why you should consider this when building your site.

The problem

Once they’ve been loaded the first time, most of the static files a website uses should be served from the user’s local browser cache ("Temporary internet files") – without this, the internet would be seriously slow. Consider how much slower a web page loads when you do a hard refresh (ctrl+F5) compared to normal – this is because all the images are forced to be re-downloaded rather than served from the browser cache. Unfortunately, for files stored in some common SharePoint libraries/galleries (i.e. the author-centric approach) SharePoint doesn’t deal with this quite right in some scenarios – most of the gain is there, but despite having the image locally, the browser still makes a request for the image – the conversation goes like this (for EACH image on the page!):

Browser: I need this image please – I cached it last time I came at [date/time], but for all I know it’s changed since then.
Server: No need dude, it’s not changed so just use your local copy (in the form of a HTTP 304 – "Not modified")
Browser: Fair enough, cheers.

This essentially happens because the file was not served with a "cacheability" HTTP header to begin with. Needless to say, this adds significant time to the page load when you have 30+ images/CSS/JS files referenced on your page – potentially several seconds in my experience (under some circumstances), which of course is a huge deal. If say, the user is in Europe but the servers are in the U.S., then suddenly this kind of network chatter is something we need to address. Needless to say, in the majority of cases we’re happy to cache these files for a period since they don’t all change too often, and we get better performance as a result.

The Solution (for some SharePoint libraries *)

Mike Hodnick points us to part of the solution in his highly-recommended article Eliminating "304" status codes with SharePoint web folder resources. Essentially, SharePoint’s BLOB caching feature saves the day since it serves the image with a "max-age" value on the HTTP header, meaning the browser knows it can use it’s local copy of the file until this date. This only happens when BLOB caching is enabled and has the max-age attribute like this (here set to 84600 seconds = 24 hours):

<BlobCache location="C:\blobCache" path="\.(gif|jpg|png|css|js|aspx)$" maxSize="10" enabled="true" max-age="86400" />

When we configure the BLOB cache like this we are, in effect, specifying that it’s OK to cache static files for a certain period, so the "cacheable" header gets added. HOWEVER, what Mike doesn’t cover is that this only happens for authenticated users – files served out of common content DB locations such as the Style Library and Master Page Gallery still do not get served correctly to anonymous users. Note this isn’t all SharePoint libraries though – so we need to be clear on exactly when this problem occurs.

* Scope of this problem/solution

Before drilling down any deeper, let’s stop for a moment and consider the scope of what we’re discussing – a site with:

  • Anonymous users
  • Files stored in some libraries – I’m not 100% sure of the pattern but discuss it later – the Style Library and Master Page Gallery are known culprits however. Other OOTB libraries such as SiteCollectionImages do not have the problem.

If you don’t have this combination of circumstances, you likely don’t have the problem. For those who do, we’re now going to look closer at what’s going on, before concluding with how we can work around the issue at the end.

Drilling deeper

For a site which does have the above combination of circumstances, we can see the issue with Fiddler – as an anonymous user browsing to page I’ve already visited, I see a stack of 304s meaning the browser is re-requesting all these files:


However, if I’m authenticated and I navigate to the same page, I only see the HTTP 200 for the actual page, no 304s:


Hence we can conclude it works fine for authenticated users but not for anonymous users.

So what can we do for our poor anonymous users (who might be in the majority) if we’re storing files in the problematic libraries? Well, here’s where I draw a blank unfortunately. Optimizing Office SharePoint Server for WAN environments on TechNet has this to say on the matter:

Some lists don’t work by default for anonymous users. If there are anonymous users accessing the site, permissions need to be manually configured for the following lists in order to have items within them cached:

  • Master Page Gallery
  • Style Library

Aha! So we need to change some permissions – fine. This seems to indicate that it is, in fact, possible to get the correct cache headers added to files served from these locations. Unfortunately, I simply cannot find what permissions need to be changed, and nobody on the internet (including the TechNet article) seems to detail what. The only logical setting is the Anonymous Access options for the list – these are all clear by default, but adding the ‘View Items’ permission (as shown below) does not change anything:


As a sidenote, the setting above is (I believe) effectively granting read permissions to the identity which is used for anonymous access to the associated IIS site. So in IIS 7.0, I’m fairly sure you’d achieve the same thing by doing this:


So the problem does not go away when anonymous users are granted the ‘View Items permission, and what I find interesting about this is that a closer look with Fiddler reveals some inconsistencies. The image below shows me browsing to a page anonymously for the first time, and to save you the hassle we can derive the following findings:

  • Files served from the ‘SiteCollectionImages’ library are given the correct max-age header (perhaps expected, since not one of the known ‘problem libraries’ e.g. Style Library)
  • Files served from the ‘_layouts’ folder are given a different max-age header (expected, settings from the IIS site are used here)
  • Some files in the Style Library are in fact given a the correct max-age header! (not expected) 


So the 2 questions which strike me here are:

  • Why are some files being served from ‘Style Library’ with the correct header when most aren’t?
  • Why can SharePoint add the ‘max-age’ header to files in the ‘SiteCollectionImages’ library but not the ‘Style Library’?

The first one is a mystery to me – it’s perhaps not too important, but I can’t work it out. The second one might be down to how the libraries are provisioned – the ‘Style Library’ is provisioned by declarative XML in the ‘PublishingResources’ Feature, whereas the ‘SiteCollectionImages’ library is provisioned in code using the same Feature’s activation receiver. Could this be the key factor? I don’t know, but I’d certainly be interested if anyone can put me straight – either on this or the mystery "permissions change" required to make BLOB caching deal with libraries such as the ‘Style Library’.


The key takeaway here is that for sites which want to take advantage of the browser caching for static files (for performance reasons) and have anonymous users, we need to be careful where we put our images/CSS/JS files as per Mike Hodnick’s general message. If we want to use the author-centric approach and store things in SharePoint libraries, we need to consider which libraries (and test) if we will have the 304 problem. Alternatively, we can choose to store these files on the filesystem (the developer-centric approach) and use a virtual directory with the appropriate cacheability settings to suit our needs. My suggestion would be to use a custom virtual directory for full control of this, since the default settings on the ‘_layouts’ directory ("cache for 1 year") are unlikely to be appropriate.

Fix to my Config Store framework and list provisioning tips

Had a couple of reports recently of an issue with my Config Store solution, which provides a framework for using a SharePoint list to store configuration values. If you’re using the Config Store this article will definitely be of interest to you, but I’ve also picked up a couple of general tips on list provisioning which I want to pass on. I have to thank Richard Browne (no blog) of my old company cScape, as the fix and several of the tips have come from him – as well as alerting me to the problem, he also managed to fix it before I did, so many thanks and much kudos mate 🙂

Config Store problem

Under some circumstances, fields in the Config Store list were not editable because they no longer appeared on the list edit form (EditForm.aspx). So instead of having 4 editable fields, only the ‘Config name’ field shows in the form:


I’ve not fully worked out the pattern, but I think the problem may only appear if you provision the list on a server which has the October or December Cumulative Update installed – either that or it’s a difference between Windows 2003 and Windows 2008 environments (which would be even more bizarre). Either way, it seems something changed in the way the provisioning XML was handled somewhere. This is why the problem was undetected in the earlier releases.

I had seen this problem before – but only when the list was moved using Content Deployment (e.g. using the Content Deployment Wizard) – the original ‘source’ list was always fine. We managed to work around this by writing some code which ‘re-added’ the fields to the list from the content type, since they were always actually present on the content type and the data was still corrected stored. Having to run this code every time we deployed the list was an irritation rather than critical, but something I wanted to get to the bottom of – however, on finding some folks were running into this in ‘normal’ use meant that it became a bigger issue.

The cause

I always knew the problem would be down to a mistake in the provisioning XML, but since I’d looked for it on previous occasions I knew it was something I was seeing but not seeing. In my case, Richard spotted that I was using the wrong value in my FieldRef elements under the ContentType element – I was mistakenly thinking that the ‘Name’ attribute needed to match up with the ”StaticName’ attribute given to the field; the documentation says this attribute contains the internal name of the field. So my FieldRefs looked like this:

<ContentType ID="0x0100E3438B2389F84cc3965600BC16BF32E7" Name="Config item" 
Group="Config Store content types" Description="Represents an item in the config store." Version="0">
<FieldRef ID="{33F5C8B4-A6BB-41a4-AB24-69F2152974C5}" Name="ConfigCategory" Required="TRUE" />
<FieldRef ID="{BD413479-48AB-41f5-8040-918F32EBBCC5}" Name="ConfigValue" Required="TRUE" />
<FieldRef ID="{84D42C64-D0BD-4c76-8ED3-0A9E0D261111}" Name="ConfigItemDescription" />

..to match up with fields which looked like this:

<Field ID="{33F5C8B4-A6BB-41a4-AB24-69F2152974C5}"
Name="Config category"
DisplayName="Config category"


The CORRECTED version looks like this (note the change in value for the Name attribute of FieldRefs):

<ContentType ID="0x0100E3438B2389F84cc3965600BC16BF32E7" Name="Config item"
Group="Config Store content types" Description="Represents an item in the config store." Version="0">
<FieldRef ID="{33F5C8B4-A6BB-41a4-AB24-69F2152974C5}" Name="Config category" Required="TRUE" />
<FieldRef ID="{BD413479-48AB-41f5-8040-918F32EBBCC5}" Name="Config value" Required="TRUE" />
<FieldRef ID="{84D42C64-D0BD-4c76-8ED3-0A9E0D261111}" Name="Config item description" />

So, the main learning I got from this is to remember that the ‘Name’ of the FieldRef attribute needs to match the ‘Name’ of the Field attribute – that simple. Why did it work before? No idea unfortunately.

However, I also picked up a few more things I didn’t know about, partly from Richard (this guy needs a blog!) and partly from some other reading/experimenting..

Some handy things to know about list provisioning

  • To make a field mandatory on a list, the ‘Required’ attribute must be ‘TRUE’. Not ‘True’ or ‘true’ – this is one of the cases where the provisioning framework is pernickety about that 6-choice boolean 😉
  • FieldRefs need an ID and Name as a minimum (which must match the values in the ‘Field’ declaration), but you can override certain other things here like the DisplayName – this mirrors what is possible in the UI.
  • You don’t have to include the list .aspx files (DispForm.aspx, EditForm.aspx and NewForm.aspx) in your Feature if you use the ‘SetupPath’ attribute in the ‘Form’ element in schema.xml (assuming you don’t need to associate custom list forms).
  • You can use the ‘ContentTypeRef’ element to associate your content type with the list (specify just content type ID), rather than using the ‘ContentType’ element which needs to redeclare all the FieldRefs.
  • It’s safe to remove all the default ‘system’ fields from the ‘Fields’ section of schema.xml

Going further than these tips, the best thing I found on this is Oskar Austegard’s MOSS: The dreaded schema.xml which shows how you can strip a ton of stuff out of schema.xml. I’ve not tried it yet, but I’m sure that will be my starting point for the next list I provision declaratively. If you’re interested in the nuts and bolts of list provisioning, I highly recommend you read it.

Happy XML’ing..

Slide deck from my deployment talk at Best Practices Conference

Had a great time presenting at the European SharePoint Best Practices Conference last week. I’ve been trying to put my finger on what made it such a good conference and I’m actually not sure, but I notice that other speakers and attendees have also been full of praise, so it’s not just me. The event itself was extremely well-organized with excellent content, and Steve Smith and his team did a great job of looking after us speakers.

Highlights for me on the dev track were sessions from AC, Todd Bleeker, Eric (or "Uncle Eric" as I like to think of him, with his wise words on high-performance coding :-)) and Andrew Woody, but whenever I did stray from developer content I seemed to run into a great session like Mike Watson‘s on SQL Server in relation to SharePoint. Similarly I heard good things about speakers like Dan McPherson doing innovative sessions on the Information Worker track which I was disappointed to miss.

Another highlight was being on the two dev panel sessions we did, and having an interesting debate in one of them with Todd on approaches for provisioning – declarative (Features) vs. programmatic (code/PowerShell etc.). This was probably a good lead-in to my talk the next day, and some folks came up to say they really liked this conversation and that we covered it from angles they hadn’t considered, which was good to hear.

So all in all, a top conference, and fantastic to catch up with so many friends. Here’s the link for my deck:

Slide deck – Approaches and best practices for deploying SharePoint sites through multiple environments (dev, QA, UAT, production)


Command-line support for Content Deployment Wizard now available

I’m pleased to announce I’ve now completed initial development on the next version of the Content Deployment Wizard – this is a beta release for the next few weeks so if you need it "just work", you should continue to use the previous version (1.1), but I’m hoping some people out there are happy to test this beta. The tool has become fairly popular as a ‘handy tool to have in the SharePoint toolbox’, and hopefully this release extends it’s usefulness significantly for some scenarios. If you’re not familiar with the tool, it provides a way to import/export site collections, webs, lists, and files or list items, either between farms or between different sites in the same farm – the Codeplex site has more details. As previously mentioned, the key new additional functionality in this release is:

  • Command-line support
  • Support for saving of import/export settings to a file (in the Windows Forms app) for later re-use
  • An installer

Having command-line support for the Wizard means that it can now be used in an automated way. Some key scenarios I think this might be useful in are:

  • Continuous integration/automated builds – if your site relies on SharePoint content, you can now move ‘real’ data as part of a build process, copying selected content from ‘dev’ to ‘build’ or ‘test’ for example. I often see static data (perhaps from an XML file or Excel spreadsheet) used in this way in nAnt/CruiseControl/MSBuild scripts, but for frequently changing data (config values, lookup lists etc.), this doesn’t work so well as there is always a static file to maintain separately. 
  • Deployment scripts – if you have deployment scripts to ‘bootstrap’ a website on developer machines, again pulling real data from a central ‘repository site’ can help here.
  • As part of a production ‘Content Deployment strategy’ – since out-of-the-box Content Deployment is restricted to deploying a web as the smallest item, the Wizard could be used to deploy selected lists/list items/files

Obviously you might have your own ideas about where it could slot into your processes too.

How it works

  1. First, we select the content to move as we would normally using the Wizard..


  2. ..and select the options we want to use for this export..


  3. On the final screen, the new ‘Save settings..’ button should be used to save your selections to an XML file: 

    This will then give you an XML file which looks like this:

  4. <ExportSettings SiteUrl="http://cob.publish.dev" ExcludeDependencies="False" ExportMethod="ExportAll" 
                    IncludeVersions="LastMajor" IncludeSecurity="None" FileLocation="C:\Exports" 
        <DeploymentObject Id="b0fd667b-5b5e-41ba-827e-5d78b9a150ac" Title="Blog" Url="http://cob.publish.dev/Blog" Type="Web" IncludeDescendants="All" />
        <DeploymentObject Id="cfcc048e-c516-43b2-b5bf-3fb37cd561be" Title="http://cob.publish.dev/_catalogs/masterpage/COB.master" Url="_catalogs/masterpage/COB.master" Type="File" IncludeDescendants="None" />
        <DeploymentObject Id="670c1fb3-12f3-418b-b854-751ba80da917" Title="http://cob.publish.dev/_catalogs/masterpage/COBLayoutSimple.aspx" Url="_catalogs/masterpage/COBLayoutSimple.aspx" Type="File" IncludeDescendants="None" />

  5. So we now have an XML ‘Wizard deployment settings file’ which has the IDs of the objects we selected and the export options. We’ll go ahead and show how this can be used at the command-line, but note also these settings can also be loaded into the Wizard UI on future deployments to save having to make the selections again – the key is the ‘Load settings..’ button on the first page (which we didn’t show earlier):


  6. For command-line use of the Wizard a custom STSADM command is used. We pass the settings file in using the -settingsFile switch. To run the export operation we showed above, our command would look like:
    stsadm -o RunWizardExport -settingsFile "C:\DeploymentSettings\ExportBlogSubwebAndTemplates.xml" -quiet

    The -quiet parameter is optional, and suppresses some of the progress messages which are returned during the operation.

  7. For an import operation, we follow the same process – go through the Wizard and select the settings for the import operation, then click ‘Save settings..’ at the end to get the file (N.B. note the ‘Import settings’ screen has been simplified slightly from previous versions):


  8. The command to import looks like this:
    stsadm -o RunWizardImport -settingsFile "C:\DeploymentSettings\ImportBlogSubwebAndTemplates.xml" -quiet

    So that’s both sides of it.

Using it for real

In real use of course, you may be deploying from one SharePoint farm to another. In this case, you also need to deal with copying the .cmp file from the source environment to the target if you’re going across farms – if you have network access between farms (e.g. you’re using it internally for automated builds/CI), a simple XCOPY in your scripts is the recommended way to do this. For production Content Deployment scenarios with no network connectivity, what I’m providing here will need to be supplemented with something else which will deal with the file transport. Clearly something web service based could be the answer.


Using the Wizard at the command-line may prove extremely useful if you need to move any SharePoint content regularly in an automated way. In contrast with other ways you might approach this, the XML definition file allows you to choose any number of webs/lists/list items/files to move in one operation, which may suit your needs better than shipping items around separately.

This is very much a beta release, but as a sidenote I’m expecting the initial issues to mainly be around the installer rather than core code – hence I’m providing a ‘manual’ install procedure which will get you past any such issues (see the readme). Needless to say, all the source code is also there for you on Codeplex if you’re a developer happy to get your hands dirty. I’m hoping a couple of friendly testers will try it out and help me iron out the wrinkles – please submit any issues to the Codeplex site linked to below.

You can download the 2.0 beta release of the Wizard (and source code) from:

Update on next version of Content Deployment Wizard

Generally I only ever talk about SharePoint tools I’m working on once they’re 100% complete and ready for use, but recently I had a conversation with someone at a user group which made me think about a policy change. Regular readers will know the main tool I’m associated with is the SharePoint Content Deployment Wizard which has become fairly popular (over 7000 downloads) – occasionally I’ve mentioned that one goal was to implement a command-line version, since this opens up all sorts of deployment possibilities. However I’ve not talked about this for a while, and just recently I’ve spoken to a couple of people who assumed I dropped this/didn’t have the time to look at it, so here I am to tell you this is not the case!

For anybody that cares, the good news is I’ve actually been working on this since December interspersed with blogging, and am nearly done. The yicky refactoring work is complete, and I got chance to write the custom STSADM command on the front of it on the flight to the MVP summit last week. I need to do more testing first, but I’m hoping to release a beta to Codeplex over the next couple of weeks – if you’re interested in the idea of scripted deployment of specific sites/webs/lists/list items between sites or farms (remember MOSS Content Deployment only does sites/webs and requires HTTP(S) connectivity), I’m hoping some friendly beta testers will help me screw the last bits down. The key aspects of this release are:

  • Command-line support
  • Support for saving of import/export settings to a file (in the Windows Forms app) for later re-use

Shortly after this release, I’m hoping to add support for incremental deployments (so only the content which has actually changed in the sites/webs/lists/you select will be deployed), but that’s not going to make into this next cut unfortunately.

Keep tuned for further updates 🙂

Other stuff

Whilst I’m at it, other things in the pipeline from me include:

Needless to say, there are plenty of other blog articles on my ‘ideas list’ too.

Sidenote – reflecting on 2 years of SharePoint blogging

Bizarrely, I’m into my 3rd year of SharePoint blogging now. I’ve no idea how this happened. Having done some interesting work with SharePoint’s Feature framework, the initial idea was to write 4 or 5 articles I had material for – as a record for myself more than anything – and be done with it. Since then, although I do write the odd ‘easy’ post (like this one), generally my articles seem to take a long time to get completed, but I know they could be better. Occasionally I get reminded of this! So there’s a long way to go for me to become a better blogger, but I’m fully hoping to still be at it in another 2 years time – and I’ll have plenty more to say when the next version of SharePoint approaches 🙂

UK user group meeting in London this Thursday, with Q & A panel

Just a quick note to remind UK-based folks within reach that there is a UK SharePoint user group meeting in London this Thursday. There are two sessions, one of which is an open Q & A for you to bring your trickiest SharePoint questions – I’ll be amongst those on the panel representing the developer side of the house, but the line-up will cover all the bases. Needless to say, if you don’t get chance to ask your question during the main session, there’ll probably be ample opportunity in the pub afterwards. Michael Noel’s session also looks extremely interesting, with a whole host of architecture/infrastructure knowledge condensed into one easily-digestible chunk.

Details from the suguk.org – to sign-up, use the link at the bottom of this post:

Session 1 – Building the Perfect SharePoint Farm: A Walkthrough of Best Practices from the Field – Michael Noel (see books written by Michael)

SharePoint 2007 has proven to be a technology that is remarkably easy to get running out of the box. On the flipside, however, some of the advanced configuration options with SharePoint are notoriously difficult to setup and configure, and a great deal of confusion exists regarding SharePoint best practice design, deployment, disaster recovery, and maintenance. This session covers best practices encompassing the most commonly asked questions regarding SharePoint infrastructure and design, and includes a broad range of critical but often overlooked items to consider when architecting a SharePoint environment. In short, all of the specifics required to build the ‘perfect’ SharePoint farm are presented through discussion of real-world SharePoint designs of all sizes.
• Learn from previous real world deployments and avoid common mistakes.
• Plan a checklist for architecture of SharePoint environments of any size.
• Build the ‘perfect’ SharePoint farm for your organization.

Session 2 – SharePoint Q & A Session

Following the session from last year we thought it would be a good idea to have a session where you can bring your SharePoint problems and hassles to and we can debate them as a group. We’ll have a whiteboard, a laptop, and lots of clever people to discuss your questions and issues – so bring along your best and toughest!

The meeting is hosted at Microsoft in Victoria – arrive 6pm for a 6:30pm start:

Microsoft London (Cardinal Place)
100 Victoria Street
London SW1E 5JL
Tel: 0870 60 10 100

To register, simply reply to this thread leaving your full name – http://suguk.org/forums/thread/16904.aspx

Look forward to your questions 🙂