What’s the biggest blocker in migrations?

Journal wrap.

What is journal wrap?  And why do we care about journal wrap in SBS?  Normally we don’t.  We are a single DC minding our own business and don’t need to replicate with ANYONE. 


So us little single DCs go along our merry way and because we don’t talk to another DC we don’t need to have our FRS journal …or database…fully functional.  As long as we’re not sharing information with another DC we can be DEAD WRONG in our databases and it doesn’t matter….until the day we need to migrate.  Then it matters.  So our single DCs are over in the corner chattering away being DEAD WRONG about the state of affairs.

The impact of this on an affected DC is that FRS will not set the SysvolReady registry value to tell the Netlogon service that all is well.  Sysvol will therefore not be shared out, and the DC will not be able to authenticate users fully until the journal wrap condition has been resolved.
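
If you want to check that symptom for yourself, here’s a minimal sketch in Python (my choice of language, nothing in SBS requires it) that looks for the SYSVOL share in the text that `net share` prints. The function names are made up for illustration, and the parsing assumes the share name is the first word on its line.

```python
import subprocess


def sysvol_is_shared(net_share_output):
    """Return True if a share named SYSVOL appears in `net share` output.

    Assumption: the share name is the first whitespace-separated token
    on its line, as in the usual `net share` table layout.
    """
    for line in net_share_output.splitlines():
        parts = line.split()
        if parts and parts[0].upper() == "SYSVOL":
            return True
    return False


def read_net_share():
    """Run `net share` and return its text output (Windows only)."""
    result = subprocess.run(
        ["net", "share"], capture_output=True, text=True, check=True
    )
    return result.stdout
```

On the DC itself you would call `sysvol_is_shared(read_net_share())`. A result of False is only a hint that Netlogon has not shared SYSVOL out; the event log is still the place to confirm a journal wrap.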

Now you’d think that with the inability to get to the sysvol folder our DCs would freak out, but amazingly enough they don’t.  You can be in this journal wrap state on a single DC and never notice, with two exceptions.

The first: if you drill down to the File Replication Service log in Event Viewer you’ll see an error message from NtFrs….

Event Type: Error
Event Source: NtFrs
Event Category: None
Event ID: 13568
Date:  12/25/2007
Time:  2:43:19 PM
User:  N/A
The File Replication Service has detected that the replica set “DOMAIN SYSTEM VOLUME (SYSVOL SHARE)” is in JRNL_WRAP_ERROR.
 Replica set name is    : “DOMAIN SYSTEM VOLUME (SYSVOL SHARE)”
 Replica root path is   : “c:\windows\sysvol\domain”
 Replica root volume is : “\\.\C:”
 A Replica set hits JRNL_WRAP_ERROR when the record that it is trying to read from the NTFS USN journal is not found.  This can occur because of one of the following reasons.
 [1] Volume “\\.\C:” has been formatted.
 [2] The NTFS USN journal on volume “\\.\C:” has been deleted.
 [3] The NTFS USN journal on volume “\\.\C:” has been truncated. Chkdsk can truncate the journal if it finds corrupt entries at the end of the journal.
 [4] File Replication Service was not running on this computer for a long time.
 [5] File Replication Service could not keep up with the rate of Disk IO activity on “\\.\C:”.
 Setting the “Enable Journal Wrap Automatic Restore” registry parameter to 1 will cause the following recovery steps to be taken to automatically recover from this error state.
 [1] At the first poll, which will occur in 5 minutes, this computer will be deleted from the replica set. If you do not want to wait 5 minutes, then run “net stop ntfrs” followed by “net start ntfrs” to restart the File Replication Service.
 [2] At the poll following the deletion this computer will be re-added to the replica set. The re-addition will trigger a full tree sync for the replica set.
WARNING: During the recovery process data in the replica tree may be unavailable. You should reset the registry parameter described above to 0 to prevent automatic recovery from making the data unexpectedly unavailable if this error condition occurs again.
Click on Start, Run and type regedit.
Click down the key path:
Double click on the value name
   “Enable Journal Wrap Automatic Restore”
and update the value.  
[to 1]
If the value name is not present you may add it with the New->DWORD Value function under the Edit Menu item. Type the value name exactly as shown above.

This is the one time you can follow the event error exactly and it will fix your issue (see http://msmvps.com/blogs/bradley/archive/2009/11/27/burflags-and-journal-wrap.aspx for more details).
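
If you’d rather script those steps than click through regedit, here’s a rough Python sketch that builds the equivalent `reg add` and `net stop` / `net start` commands. One loud assumption: the event text as pasted above doesn’t show the key path, so the sketch assumes the usual NtFrs parameters key, `HKLM\SYSTEM\CurrentControlSet\Services\NtFrs\Parameters`. Check that against the event on your own box before running anything. The function names are mine, and `run_plan` only prints unless you tell it otherwise.

```python
import os
import subprocess

# ASSUMPTION: standard NtFrs parameters key. The key path is missing from
# the event text quoted in this post, so verify it on your own server.
NTFRS_PARAMS_KEY = r"HKLM\SYSTEM\CurrentControlSet\Services\NtFrs\Parameters"
VALUE_NAME = "Enable Journal Wrap Automatic Restore"


def set_auto_restore_cmd(enabled):
    """Build the `reg add` command that sets the DWORD to 1 or 0."""
    data = "1" if enabled else "0"
    return ["reg", "add", NTFRS_PARAMS_KEY, "/v", VALUE_NAME,
            "/t", "REG_DWORD", "/d", data, "/f"]


def recovery_plan():
    """Commands for the recovery the event describes: set the value to 1,
    then restart NtFrs instead of waiting 5 minutes for the first poll."""
    return [
        set_auto_restore_cmd(True),
        ["net", "stop", "ntfrs"],
        ["net", "start", "ntfrs"],
    ]


def run_plan(commands, dry_run=True):
    """Print each command; execute it only on Windows when dry_run is False."""
    for cmd in commands:
        print(subprocess.list2cmdline(cmd))
        if not dry_run and os.name == "nt":
            subprocess.run(cmd, check=True)
```

Calling `run_plan(recovery_plan())` just prints the three commands so you can read them first. Once the recovery has finished, the event’s own warning says to put the value back to 0, which is what `set_auto_restore_cmd(False)` builds.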

The second way is to run the SBS Best Practices Analyzer (www.sbsbpa.com), which flags it as well.

But unless you do one of those two things, the SBS box chugs along just fine….. until along comes a second DC: either a temp domain controller (in the case of the www.sbsmigration.com swing-to-a-temp-DC method) or the SBS 2008 you join to the domain via the migration install (in the case of the Microsoft method).  Suddenly that second DC starts talking to the first, and the first DC says “I’m right, you need to follow me!” and the second DC says “Huh?  You are crazy, you have no idea what you are talking about, I’m not going to follow you!” and the two refuse to talk to each other… or to use the proper technical terms, they don’t replicate.

So you go to migrate and you don’t.  It’s that simple.  Fix the journal wrap with that registry key and then the SBS 2003 will allow the replication of its info to the other DC.
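
Before flipping that key, keep in mind the event’s own WARNING that data in the replica tree may be unavailable during recovery. Here’s a small Python sketch that takes a timestamped copy of the replica root (the `c:\windows\sysvol\domain` path from the event) first. The function name and the backup folder naming are hypothetical, and a plain file copy like this is no substitute for a System State backup; it’s just a cheap extra safety net.

```python
import shutil
import time
from pathlib import Path


def backup_replica_root(replica_root, backup_dir):
    """Copy the replica root into a new timestamped folder under backup_dir.

    Returns the path of the copy. Raises FileNotFoundError if the
    replica root does not exist.
    """
    src = Path(replica_root)
    if not src.is_dir():
        raise FileNotFoundError(f"replica root not found: {src}")
    stamp = time.strftime("%Y%m%d-%H%M%S")
    dest = Path(backup_dir) / f"sysvol-domain-{stamp}"
    # copytree creates dest (and any missing parent folders) itself
    shutil.copytree(src, dest)
    return dest
```

On the server that might be `backup_replica_root(r"c:\windows\sysvol\domain", r"d:\backups")`, where `d:\backups` is only an example location on a drive with room to spare.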

So how do they get into journal wrap condition you ask?  While there are many reasons there’s ONE reason in particular that has probably burned us more than any other issue. 

Running out of room on the C drive.

Every SBS 2003 where you ran out of room on the C drive and lost licenses probably has a journal wrap problem.

Every SBS 2003 where someone printed out gobs of color photos and the print spooler exploded and used up all the free room probably has a journal wrap problem.

Every SBS 2003 that had a small C drive and got close to running out of room probably has a journal wrap problem.

So think back to every SBS box that ran out of room on the C drive.

Now go run the Best Practices Analyzer from www.sbsbpa.com on it and fix your journal wrap problem.
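
And since a full C drive is the usual culprit, here’s a minimal Python sketch for keeping an eye on free space. The 10% threshold is a number I picked for illustration, not a Microsoft recommendation, and the function names are hypothetical.

```python
import shutil


def is_low_on_space(free_bytes, total_bytes, min_free_fraction=0.10):
    """Return True when the free fraction is below the threshold."""
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    return free_bytes / total_bytes < min_free_fraction


def drive_is_low(path="C:\\", min_free_fraction=0.10):
    """Check the drive that holds `path` using shutil.disk_usage."""
    usage = shutil.disk_usage(path)
    return is_low_on_space(usage.free, usage.total, min_free_fraction)
```

Run `drive_is_low()` from a scheduled task and you have an early warning before the drive fills up and the journal wrap follows.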


One Response to What’s the biggest blocker in migrations?

  1. Chris Knight says:

    Make sure you’ve got a copy of the SYSVOL folder beforehand, preferably with a System State backup, because Automatic Restore can blow away all your GPOs, which isn’t fun if you’ve invested heavily in GPO configuration. MSKB 315457 can be a useful guide in getting back your SYSVOL contents.

    Thankfully, a Domain Functional Level of Windows Server 2008 means that DFSR can be used for SYSVOL replication instead of the NtFrs piece of junk, although this does require a DFSR migration of SYSVOL.