Recipe for successful use of Content Deployment Wizard

So my tool, the SharePoint Content Deployment Wizard, has been available for some time now, and I’ve been closely monitoring the feedback and issues people have raised. The current version is labelled ‘beta 2’ but I’m happy with the stability of the current codebase, so will probably re-label it as ‘release 1.0’ soon (following some feedback on the psychological aspect of the beta label :-)).

Only a small number of people have raised issues, and any problems have almost exclusively been related to the underlying Microsoft code used by the tool rather than the Wizard itself. I should probably be happy about this, but in reality if some people get errors from the tool it doesn’t really matter why it happens. The good news is that it seems Microsoft are finally getting some issues with the Content Deployment API sorted at their end. This is a key point in my list of guidance I’d give to anybody running into any errors from the Wizard. Note that the first two apply to use of standard Content Deployment using Central Admin also:

Tip 1 – Service Pack 1 and hotfixes matter

Service Pack 1 fixed many issues with Content Deployment. Unfortunately it also broke some things which had previously been fixed with pre-SP1 hotfixes. It took me a while to realize this, but it’s definitely the case. Probably the most common issue in this area is the ‘Violation of Primary Key’ error. There are reports of being able to work around this by modifying versioning settings on certain libraries, but MS have very recently released a hotfix which seems to solve the problem for good on SP1 environments. At the moment this is available by special request only – the KB to ask for is KB950279. This forum thread discusses this, and it worked for us. Interestingly I spoke to Tyler Butler (Program Manager for Content Deployment) at SPC2008, and he indicated Content Deployment in SharePoint is likely to get “significantly more stable in the next 30-60 days”. I’m guessing this hotfix is what he was referring to, or at least part of it.

Tip 2 – always start from an empty site created from STSADM -o createsite on the destination

The official guidance currently states that Content Deployment requires that the target site has been created from the ‘blank’ site template – this is detailed in KB article 923592. However, a better way, detailed by Stefan in the comments below, is to create an empty site using the STSADM -o createsite command. This is not the same as a site created from the blank template, and is the safest way to create sites which will use Content Deployment or the Wizard. What this means is that even if you’re creating a site based on, say, the publishing site template in development, any other environments which you wish to deploy content to should be created in this way. Notably, for publishing sites the publishing Feature should also not be enabled before the first deployment – this will be taken care of for you when the first deployment happens. Otherwise you’ll receive an ‘object already exists’ error.
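As a rough sketch of what this looks like in practice, the snippet here builds the argument list for STSADM -o createsite from Python. The key detail is that no -sitetemplate argument is passed, so the site collection is created without a template applied. The URL, login and e-mail address are made-up placeholders, and the helper name `createsite_args` is purely illustrative.

```python
from typing import List


def createsite_args(url: str, owner_login: str, owner_email: str) -> List[str]:
    """Build the argument list for 'stsadm -o createsite' with no template.

    Leaving out -sitetemplate is the point: the site collection is created
    empty, ready to receive the first deployment.
    """
    return [
        "stsadm", "-o", "createsite",
        "-url", url,
        "-ownerlogin", owner_login,
        "-owneremail", owner_email,
    ]


# Hypothetical destination values, for illustration only.
args = createsite_args(
    "http://staging.example.com/sites/intranet",
    "EXAMPLE\\spadmin",
    "spadmin@example.com",
)
print(" ".join(args))
```

On a SharePoint server the list could be handed to `subprocess.run(args, check=True)`, assuming stsadm is on the PATH. Typing the same command at a prompt works just as well.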

Tip 3 – pay attention to the ‘retain object IDs’ option

Generally the right option here is to select that you do want to retain the object IDs, and this should be done from the very first deployment – the only exception is when moving webs/lists to a different part of the site structure (reparenting). However, it’s important to note that mixing use of Content Deployment or the Wizard with STSADM export/import is likely to cause problems as noted by Stefan in his recommended ‘Content Deployment and Migration API – avoiding common problems’ post.

A more comprehensive write-up of the options available with the Wizard is at ‘Using the SharePoint Content Deployment Wizard’. Also note that development of the tool isn’t finished – in addition to extra functionality such as item-level reparenting and incremental deployment, I hope to refactor the code so that the Wizard is scriptable from the command-line.

And special thanks go to my colleague Nigel Price for working through the hotfix situation, much appreciated 🙂

Using the SharePoint Content Deployment Wizard

So if you’ve read the earlier posts about the tool (Introducing the SharePoint Content Deployment Wizard and When to use the SharePoint Content Deployment Wizard) and figure this is a useful tool, let’s go onto the next level of detail. Generally speaking the Content Deployment Wizard ‘just works’, but if you want to know more about the different options, read on. This post contains reference information and a guide to some deployment scenarios at the end.

Firstly, let’s remind ourselves of some of the fundamental things to remember when moving content using the Content Migration API (the underlying SharePoint API used by the tool):

  • dependencies of selected content (e.g. referenced CSS files, master pages) can be evaluated – in the tool they are automatically included in the export – check ‘Exclude dependencies of selected objects’ to disable this
  • all required content types, columns etc. are automatically included in the export
  • in contrast to STSADM export, it is possible to retain GUIDs during deployment (where objects are not being reparented) – check ‘Retain object IDs and locations’ to enable this
  • no filesystem files (assemblies, SharePoint Solutions/Features etc.) are deployed – these must already be present on the target for the import to succeed
  • the following content does not get captured by the Content Migration API – alerts, audit trail, change log history, recycle-bin items, workflow tasks/state

In particular it’s the 2nd and 3rd points which make the API (and the Wizard) a good way to deploy content in SharePoint.

What can be deployed?

The Content Deployment Wizard allows any content to be selected for export – site collections, webs, lists/document libraries, folders, right down to individual list items and files. Objects in the treeview can be added to the export by right-clicking them, which brings up a context menu. For a web, the menu options are:

  • ‘include all descendents’ – exports the container and anything beneath it
  • ‘exclude descendents’ – exports the container only
  • on webs only, the ‘include content descendents’ option is shown – this will include all immediate content such as lists/libraries, but will exclude all child webs of the web.
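To make the difference between the three menu options concrete, here is a small Python model of a site tree. It is only an illustration of the selection rules described above, not the Wizard's code: the `Node` class and the sample site are invented, and it assumes that ‘include content descendents’ brings along the contents of each list as well as the list itself.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """A simplified SharePoint object: a web, a list/library, or an item."""
    name: str
    kind: str  # "web", "list" or "item"
    children: List["Node"] = field(default_factory=list)


def walk(node: Node, skip_child_webs: bool) -> List[str]:
    """Return the names of everything beneath node, depth-first."""
    names: List[str] = []
    for child in node.children:
        if skip_child_webs and child.kind == "web":
            continue  # leave out the child web and everything under it
        names.append(child.name)
        names.extend(walk(child, skip_child_webs))
    return names


def exported(node: Node, option: str) -> List[str]:
    """Names included in the export for one of the three menu options."""
    if option == "exclude descendents":
        return [node.name]
    if option == "include content descendents":
        return [node.name] + walk(node, skip_child_webs=True)
    if option == "include all descendents":
        return [node.name] + walk(node, skip_child_webs=False)
    raise ValueError(f"unknown option: {option}")


# Invented sample: a web with one library and one child web.
site = Node("Team", "web", [
    Node("Documents", "list", [Node("Doc1.docx", "item")]),
    Node("Projects", "web", [Node("Tasks", "list")]),
])

for option in ("exclude descendents", "include content descendents",
               "include all descendents"):
    print(option, "->", exported(site, option))
```

In this model only ‘include all descendents’ picks up the ‘Projects’ child web and its ‘Tasks’ list.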

Note that on the import, the Wizard will bring in all the contents of the selected .cmp file(s) – there is no option to partially import a package. Hence if different import options are required for different content, the exports should be broken into separate chunks.

Export options

On the export settings screen, numerous options can be applied to exports:

  • ‘Exclude dependencies of selected objects’ – by default the Content Migration API will automatically include dependent objects of whatever you select. This can include CSS files, master pages, images and the like, but also list items which are displayed on a page included in the export. This can be turned off with this checkbox so only the objects you select are exported.
  • ‘Export method’ (options are ‘ExportAll’, ‘ExportChanges’) – for now ExportAll is the option to select; the ability to export changes only will come in a future release
  • ‘Include versions’ (options are ‘LastMajor’, ‘CurrentVersion’, ‘LastMajorAndMinor’, ‘All’) – should be self-explanatory
  • ‘Include security’ (options are ‘None’, ‘WssOnly’, ‘All’) – note that since security is defined at the level of a web, selecting one of the include security options for a smaller object (e.g. list) actually exports security for the entire web. Both ‘WssOnly’ and ‘All’ export SharePoint item-level object permissions, so if you’re using SharePoint groups to manage security for example, both the actual permissions and groups will be carried over, and you can add a different set of users/AD groups on the destination. See Migrating Security Information on MSDN for more details.
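If you keep notes on which settings were used for each export, something like the following Python sketch may help. It is a hypothetical record of the choices on the export settings screen, not anything the Wizard exposes: the class name and the defaults for versions and security are assumptions, while the allowed values are the ones listed above.

```python
from dataclasses import dataclass

# Option values exactly as listed on the export settings screen.
EXPORT_METHODS = ("ExportAll", "ExportChanges")
INCLUDE_VERSIONS = ("LastMajor", "CurrentVersion", "LastMajorAndMinor", "All")
INCLUDE_SECURITY = ("None", "WssOnly", "All")


@dataclass(frozen=True)
class ExportOptions:
    """Hypothetical record of one export's settings (not the Wizard's code)."""
    exclude_dependencies: bool = False   # dependencies are included by default
    export_method: str = "ExportAll"     # the option to select for now
    include_versions: str = "LastMajor"  # assumed default, for illustration
    include_security: str = "None"       # assumed default, for illustration

    def __post_init__(self) -> None:
        # Reject any value that is not offered on the settings screen.
        checks = (
            ("export_method", self.export_method, EXPORT_METHODS),
            ("include_versions", self.include_versions, INCLUDE_VERSIONS),
            ("include_security", self.include_security, INCLUDE_SECURITY),
        )
        for name, value, allowed in checks:
            if value not in allowed:
                raise ValueError(f"{name} must be one of {allowed}, got {value!r}")


# Example: a full-fidelity export carrying SharePoint permissions and groups.
full_copy = ExportOptions(include_versions="All", include_security="WssOnly")
print(full_copy)
```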

Import options

On the import we also have several options, some of which correspond to options selected on the export:

  • ‘Import web URL’ (actually shown on the ‘Bind to site’ screen) – this is used for reparenting operations only. If you are just moving content from source to destination but are not changing the location in the structure, this textbox can remain blank. Alternatively, for operations where a web or list is being imported but the parent web will not be the exact same web on the destination, the URL of the new target web should be entered.

    Note that the later option to ‘Retain object IDs and locations’ should not be selected when reparenting, since we are changing the location in this case.

  • ‘From single file’/‘From multiple files’ options – the Wizard always exports with file compression enabled, so when exporting content over 25MB, the package is split into several files at this threshold. When importing from such an export, select the ‘From multiple files’ option and browse to the folder. In the textbox, enter the ‘base filename’ – this should be the name of the first file without the number e.g. ‘MyExport.cmp’ rather than ‘MyExport1.cmp’.
  • ‘Retain object IDs and locations’ – this setting requires particular consideration. Duplicate GUIDs are not permitted in one database (i.e. SP web application), so the choice often depends on what you are importing. If you are taking a site from development to production, the object GUIDs will not yet exist on the destination, so I check the box to ensure the objects are assigned the same IDs in both environments, and all linkages are preserved. If you are reparenting a list or web, you will leave the box unchecked, so that new GUIDs are assigned and the location can therefore be changed.

    I highly recommend reading the content listed in the ‘Useful links’ section at the end of the article to properly understand this setting.

  • ‘Include security’ – this allows security information in a package to be imported, assuming one of the options to include security was selected on the export
  • ‘Version updates’ – allows control over whether new versions should be added to existing files, or whether the existing version should be replaced etc.
  • ‘User info update’ – allows control over whether ‘last modified’ information should be imported. Often this only makes sense if the same set of users exist in the source and destination
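The ‘base filename’ rule for multi-file imports is easy to get wrong, so here is a minimal Python sketch of it. It simply strips the trailing part number from the first file's name, following the ‘MyExport1.cmp’ example above. The function name is made up, and the sketch assumes the export name itself does not end in a digit (a name like ‘Intranet2008.cmp’ would be cut too far).

```python
import re
from pathlib import PureWindowsPath


def base_filename(first_part: str) -> str:
    """Return the 'base filename' for a split export, given its first file.

    The trailing part number is removed from the name, so 'MyExport1.cmp'
    becomes 'MyExport.cmp'. Any folder in front of the name is dropped.
    """
    name = PureWindowsPath(first_part)
    stem_without_number = re.sub(r"\d+$", "", name.stem)
    return stem_without_number + name.suffix


print(base_filename(r"C:\Exports\MyExport1.cmp"))  # MyExport.cmp
```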

Scenarios quick reference

The following table lists the most common settings for a given deployment task:

Deployment item, with its typical settings:

Entire site collection
  • Site collection should first be created on the destination.
  • When exporting, select ‘include all descendents’.
  • When importing for the first time, ensure ‘retain object IDs and locations’ is checked.
  • Select one of the ‘include security’ options if you wish to deploy object permissions and users
Web
  • When exporting, select ‘include all descendents’.
  • When importing for the first time, ensure ‘retain object IDs and locations’ is checked if web will have same parent as on source.
  • If web will have a different parent, do not check ‘retain object IDs and locations’ and ensure ‘import web URL’ is specified
Document library/list
  • When exporting, select ‘include all descendents’.
  • When importing for the first time, ensure ‘retain object IDs and locations’ is checked if list will have same parent as on source (i.e. not reparenting).
  • On subsequent imports, ensure ‘import web URL’ is specified if not importing to the root web, and do not check ‘retain object IDs and locations’
File/list item
  • Ensure the parent library/list exists on the destination
  • Do not check ‘retain object IDs and locations’ if the item already exists on the destination
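The quick reference above boils down to a couple of rules, which can be written out as a small Python function. This is only a sketch of the guidance in this post: the function and field names are invented, and it does not cover the ‘subsequent imports’ case for lists.

```python
from typing import NamedTuple


class ImportChoice(NamedTuple):
    retain_object_ids: bool      # tick 'retain object IDs and locations'?
    import_web_url_needed: bool  # fill in 'import web URL'?


def suggest_import_choice(kind: str, reparenting: bool = False,
                          exists_on_destination: bool = False) -> ImportChoice:
    """Suggest import settings for 'site collection', 'web', 'list' or 'item'."""
    if kind not in ("site collection", "web", "list", "item"):
        raise ValueError(f"unknown kind: {kind}")
    if reparenting:
        if kind not in ("web", "list"):
            raise ValueError("the Wizard reparents webs and lists only")
        # New parent: let new GUIDs be assigned and say where the object goes.
        return ImportChoice(retain_object_ids=False, import_web_url_needed=True)
    if kind == "item":
        # Files/list items: do not retain IDs if the item is already there.
        return ImportChoice(not exists_on_destination, False)
    # Same location as on the source: keep the GUIDs so linkages survive.
    return ImportChoice(retain_object_ids=True, import_web_url_needed=False)


print(suggest_import_choice("web", reparenting=True))
print(suggest_import_choice("site collection"))
```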

 

So hopefully that’s some useful reference information. On a final note, the next beta version with much improved treeview performance should be ready over the next week or so!

Useful links

When to use the SharePoint Content Deployment Wizard

Following my introduction to the tool last time, today I want to try to help position the tool for people who aren’t sure if it could be useful to them or for what scenarios – if you only take one thing away from my postings on the Content Deployment Wizard it should be this.

I see the ‘value-add’ of the Content Deployment Wizard over existing deployment methods such as STSADM export/content deployment in Central Administration to be:

  • ability to “cherry-pick” content to deploy using a treeview – this is from entire site collection down to individual list item or file. (This is the big one since the standard SharePoint tools do not supply a method to do this)
  • ability to control whether object GUIDs are retained – this is required for scenarios where the destination should be a mirror-image of the source, such as staging/production environments for the same site
  • ability to move certain objects (limited to webs and lists in the initial release) to a new location on the import target, known as ‘reparenting’

I would suggest the tool could well have a place in your SharePoint toolbox, but it’s likely to be something you use every now and then, rather than all the time. The two main scenarios where I use the tool are:

  • at the end of the development phase when I need to move a site from development to staging/production. Here, the tool allows me to be sure that all relationships/linkages between objects will be preserved (so no problems with ListViewWebParts/DataViewWebParts/InfoPath forms for example)
  • any odd occasions where I have a need to move a particular document library/list, or a particular set of files (e.g. master page, page layouts, CSS etc.). This assumes by the way, that the files were not deployed with a feature – I wouldn’t recommend mixing the deployment methods like that.

N.B. It should also be possible to use the tool for ongoing updates to specific files/list items through different environments in a development/test/staging/production situation. An example of this is deployment of just master pages, page layouts and CSS files on a WCM site (meaning all other content authored by the client [e.g. in the ‘Pages’ library] does not get overwritten on the target) but I haven’t had the opportunity to try this on a real project yet.

Some areas where the tool cannot be used (i.e. the tool does not yet support this usage) are:

  • exporting only changes since a certain timestamp (change token) from a site
  • importing individual list items/files to a new location on the target (reparenting)

On the last point, the ‘new location’ would be a document library or list, since these items have to be in such a container – they cannot exist at the root of a web. Currently the tool supports reparenting webs and lists, but not individual list items/files. What is currently possible with individual list items/files is moving selected items from a source to a target where the structure is the same (e.g. move Doc2 and Doc5 from “Team Documents” on the source to “Team Documents” on the target). Usefully, whenever a file is exported/imported using the tool, the associated list item is also deployed, meaning metadata updates to column values are deployable.

Hopefully that might help you understand where the Wizard fits in. If you’re thinking the Wizard could be useful to you from time to time, stay tuned for my next post which will have more detailed ‘usage’ information. 

Change a SharePoint site’s URL

Something you may find yourself tasked with at some point is changing the URL of an existing SharePoint 2007 site. This is a fairly interesting scenario, and it’s fair to say the relationship between SharePoint and IIS makes this more complex than for a standard .NET site. However, there are several possible solutions. The first things many of us would think of as potential approaches would probably be:

  • extending the web application onto another URL
  • using Alternate Access Mappings somehow

Depending on your site both could be valid methods, but as with anything SharePoint-related, there are different things to watch out for with the different approaches. As an example, extending the web application wasn’t the right approach for our scenario for the following reasons:

  • the site shouldn’t actually exist at the old URL, but a redirect was required
  • InfoPath forms don’t seem to deal well with the ‘extended web application’ configuration. (Problem detail – on one URL everything will be fine, but if the two web applications use separate site collections, on the other you’re likely to see security errors when opening forms. This is because the form templates are referenced in the other site collection – a document library can only have one URL to the document template, and publishing a form to a content type stores an absolute link in the content database.)

Additionally some quick tests with Alternate Access Mappings didn’t seem to give the expected results for me, so I decided on another approach since I knew it would work and didn’t have much time for experimentation. So this was my process:

Changing a site’s URL by recreating the site (downtime required)

  1. Stop old IIS site.
  2. Create new web application in SharePoint, bind to new IP address in IIS.
  3. Apply SSL certificate if appropriate.
  4. Create new site collection for this web application using the blank site template.
  5. Export content using SharePoint’s content migration API (I have a tool which does this, which will shortly be on Codeplex) ensuring all security data is exported. Alternatives to this step include STSADM -o backup and STSADM -o export. *
  6. Import content into the new site collection, ensuring to include security.
  7. Amend any absolute URLs in .udcx data connection files used by InfoPath.
  8. Republish any InfoPath forms to the new site.
  9. Configure search:
    1. Ensure new URL is a content source.
    2. Update any crawl rules which use absolute URLs.
    3. Update ‘authoritative pages’ as appropriate.
    4. Start full crawl.
    5. Update scopes.
    6. Go to Site Settings > Site collection administration > Search scopes, add any custom scopes to search dropdown (if using standard search web parts).
    7. Ensure search web parts use relative URLs/do not reference old site URLs.
  10. If a redirect from old URL is required, create new IIS site to implement this:
    1. Create new site in IIS and bind to old IP address.
    2. On ‘Home directory’ tab, specify content should come from ‘A redirection to a URL’ and enter the URL.
  11. Ensure DNS/firewalls are configured appropriately for new URL, remembering to allow appropriate time for DNS propagation.
  12. Perform testing.

* N.B. Between the content migration API and STSADM export, I prefer the former since this allows control over whether object GUIDs are retained (more information in STSADM export, Content Deployment, Content Migration API, Features/Solutions – deployment options compared). STSADM backup/restore is discussed in next section.
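Step 7 (amending absolute URLs in .udcx files) is tedious by hand if there are many data connections, so a small script can help. The Python sketch here swaps the old site address for the new one in the text of a file, ignoring case since host names are often typed inconsistently. The function name and both URLs are invented, the XML fragment is a simplified stand-in for a real .udcx file, and an old address that is a prefix of another host name (e.g. old.example.com and old.example.com.au) would need a stricter pattern.

```python
import re


def rewrite_site_url(udcx_text: str, old_base: str, new_base: str) -> str:
    """Replace every occurrence of old_base with new_base, ignoring case."""
    pattern = re.compile(re.escape(old_base.rstrip("/")), re.IGNORECASE)
    # A function is used as the replacement so that new_base is inserted
    # literally, with no special handling of backslashes.
    return pattern.sub(lambda _match: new_base.rstrip("/"), udcx_text)


# Simplified stand-in for the contents of a .udcx file.
sample = (
    "<udc:ServiceUrl>"
    "HTTP://Old.Example.com/sites/hr/_vti_bin/lists.asmx"
    "</udc:ServiceUrl>"
)
updated = rewrite_site_url(sample, "http://old.example.com/",
                           "https://new.example.com")
print(updated)
```

To apply this to real files, read each .udcx as text, pass it through the function and write the result back, keeping a copy of the originals until the forms have been retested.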

Considerations to this approach

  • When using the content migration API or STSADM export, the following items are not included – alerts, workflows, recycle bin state or site collection properties. These must be migrated/recreated separately.
  • Regression testing is absolutely required since the site is effectively recreated

As a way to improve on the first consideration, another option would be STSADM backup/restore (though I’ve not tried this approach). Notably this method does collect data for the items which the other approaches exclude, however due to the nature of our site, none of these were significant problems.
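For anyone who wants to try the backup/restore route, the commands involved are STSADM -o backup and STSADM -o restore. The Python sketch here just assembles the two argument lists. The URLs and file path are placeholders, the helper names are invented, and since this route is untested for this scenario, treat it as a starting point only.

```python
from typing import List


def backup_args(site_url: str, backup_file: str) -> List[str]:
    """Arguments for backing up one site collection to a file."""
    return ["stsadm", "-o", "backup", "-url", site_url, "-filename", backup_file]


def restore_args(site_url: str, backup_file: str,
                 overwrite: bool = False) -> List[str]:
    """Arguments for restoring that backup at a (possibly different) URL."""
    args = ["stsadm", "-o", "restore", "-url", site_url, "-filename", backup_file]
    if overwrite:
        # Needed when a site collection already exists at the target URL.
        args.append("-overwrite")
    return args


# Placeholder values, for illustration only.
backup = backup_args("http://old.example.com", r"D:\Backups\site.bak")
restore = restore_args("https://new.example.com", r"D:\Backups\site.bak",
                       overwrite=True)
print(" ".join(backup))
print(" ".join(restore))
```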

So this method was successful, and hopefully this information allows folks to see some of the pros and cons without having to spend the time going through it themselves. However, I also note an approach based on Alternate Access Mappings suggested by Faraz Khan. Since this was only published in the few days before this article it was too late for my scenario, though I’d encourage you to take a look. Note that Faraz also points out considerations such as certain links not being updated to new URL without fix-ups, though this doesn’t seem to be a major issue. It does echo my point about there being different things to watch out for with the different approaches, but both methods provide valid techniques for changing a SharePoint site’s URL.