How to Recover a Journal Wrap Error (JRNL_WRAP_ERROR) and a Corrupted FRS SYSVOL from a Good DC – What option do I use, D4 or D2? What’s the Difference between D4 and D2?

Original: 11/21/2013
Updated 8/30/2014

Errata

Ace here again. I’ working on updating all of my blogs. If you see any inconsistencies, please let email me and let me know.

Prologue

Are you seeing Event ID 13508, 13568, and anything else related to SYSVOL, JRNL_WRAPS, or NTFRS?

Note – I will not address Event ID 2042 or 1864. That’s an issue with replication not working beyond the AD tombstone. If you are seeing them, you’re best bet is to forcedemote the machine, run a metadata cleanup, and re-promote it, and make sure you configure your firewall and/or AV to allow replication traffic or stop using the ISP’s or router as a DNS address, or disable IP routing and WINS Proxy, to prevent this in the future. And while you’re at it bump up your AD tombstone to 180 days,

As for the NTFRS, after talking to numerous folks whether directly assisting a customer, or through the TechNet forums, there seems to be some confusion associated with how to handle Journal Wrap errors, what caused them, and what are the differences between the D2 and D4 options. I’ll try to quell this confusion in this blog, as well as provide an easy step-step and providing an explanation for the steps, to get out of this error. Note: The steps are from Microsoft KB290762. I just thought to further break it down so a layman will understand them.

Reference KB: Using the BurFlags registry key to reinitialize File Replication Service Replica Sets
http://support.microsoft.com/kb/290762

For Windows 2008/2008 R2/2012/2012 R2 with DFSR

Follow this KB to fix it:

How to force an authoritative and non-authoritative synchronization for DFSR-replicated SYSVOL (like “D4/D2” for FRS)
http://support.microsoft.com/kb/2218556

Backing Up and Restoring an FRS-Replicated SYSVOL Folder
http://msdn.microsoft.com/en-us/library/windows/desktop/cc507518(v=vs.85).aspx 

What Caused the Journal Wrap?

First you have to ask yourself, what caused this error on my DC? What did I do to get here? In a nutshell, JRNL_WRAPS are caused by SYSVOL corruption.

The usual culprit can be a number of things:

  • Abrupt shutdown/restart. I don’t usually see this unless there are power issues in the building with not power protection or UPS battery system.
  • Disk errors – corrupted sectors. This is a common issue with a DC on older hardware.
  • AV not configured to exclude SYSVOL, NTDS and the AD processes. This is the typical culprit I’ve seen in many cases.

Ok, So what do I have to do to fix this?

To get yourself out of this quandary, it’s rather simple. Yea, you might say yea, right, this is not so simple, but it really isn’t that hard. It just requires a little understanding of what you have to do, which is all it’s doing is simply copying a good SYSVOL folder and subfolders from a good DC to the bad DC (the one with the errors.

Basically, you first choose which DC is the good DC to be your “source” DC for the SYSVOL folder. Then you you stop the NTFRS service on all DCs. Yes, NTFRS must to be stopped on all DCs to perform this. Then set the registry key on the good DC and the bad DC. That’s it. The process will take care of itself and reset the keys back to default after it’s done.

  • If you only have one DC, such as an SBS server, and SYSVOL  appears ok, or restore just the SYSVOL from a backup. Then just follow the “Specific” steps I’ve outlined below.
  • If more than one DC, but not that many where you can’t shutdown the NTFRS on all of them, such as if you have 40 DCs, pick and choose the best one and set Burflags to D2 on the bad and D4 on the good.
  • If there are numerous DCs, such as a large infrastructure, simply run dcpromo /forcedemote the DC with the error, run a metadata cleanup, then re-promote to a DC back into the domain. If you unplug the DC and run a metadata cleanup, then you will have to rebuild the DC from scratch. The forcedemote switch removes the AD binaries off the machine allowing you to re-promote it.

 

To summarize:

You have two choices as to a restore from a good DC using FRS:

  1. D2 is set on the bad DC: Non-Authoritative restore: Use the D2 option on the DC with the empty SYSVOL folder, or the SYSVOL folder with the incorrect data. This way it will get a copy of the current SYSVOL and other folders from the good DC that you set the BurFlags D4 option on.
  2. D4 is set on the good DC: Authoritative restore: Use the BurFlags D4 option on the DC that has a copy of the current policies and scripts folder (a good, not corrupted folder).

 

The BurFlags option – D4 or D2? What do I use?

The steps refer to changing a registry setting called the BurFlags value. If the BurFlags key does not exist, simply create it. It’s a DWORD key.

More importantly, it references change the BurFlags to one of two options: D4 or D2. Therefore, before going further, I would like to squelch the confusion on what the D2 and D4 settings mean:

D2/D4 – Which is which?

  • D2, also known as a Nonauthoritative mode restore – this gets set on the DC with the bad or corrupted SYSVOL
  • D4, also known as an Authoritative mode restore – use this on the DC with the good copy of SYSVOL.
  • You must shut the NTFRS service down on ALL DCs while you’re doing this until instructed to start it.
  • You’ll probably want to copy the current SYSVOL structure on the good DC to another folder as a backup prior to doing this.

The D2 option on the bad DC will do two things:

  1. Copies the current stuff in the SYSVOL folder and puts it in a folder called “Pre-existing.” That folder is exactly what it says it is, it is your current data. This way if you have to revert back to it, you can use the data in this folder.
  2. Then it replicates (copies) good data from the GOOD DC (D4) to the bad guy (D2).

Once again, simply put:

  • The BurFlags D4 setting is “the Source DC” that you want to copy its good SYSVOL folder from, to the bad DC.
  • The bad DC BurFlags is set to D2, which tells it to pull from the source DC, the one you set D4 on.

 

Here are the steps summarized:

  1. For an Authoritative Restore you must stop the NTFRS services on all of your DCs
  2. In the registry location: HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\NtFrs\Parameters\Backup/Restore\Process
    1. Set the BurFlags setting to HEX “D4” on a known DC that has a good SYSVOL (or at this time restore SYSVOL data from backup then set the Burflag to D4)
    2. Then start NTFRS on this  server.
    3. You may want to rename the old folders with .old extensions prior to restoring good data.
  3. Clean up the folders on all the remaining servers (Policies, Scripts, etc) – renamed them with .old extensions.
  4. Set the BurFlags to D2 on all remaining servers and then start NTFRS.
  5. Wait for FRS to replicate.
  6. Clean up the .old stuff if things look good.
  7. If the “D4” won’t solve the problem try the “D2” value.

 

So circling back, to fix this and make it work, just copy the contents of SYSVOL to another location, then follow the KB, which simply states you must stop the NTFR service on ALL DCs. Then pick a good one to be the “Source DC.”

Of course, as I’ve stated above, if you have a large number of DCs, the best bet is to forcedemote the bad DC, run a metadata cleanup to remove its reference from AD, then re-promote it.

If you have a small number of DCs, and if you have a good DC and a bad DC, on the good DC, you would set the BurFlags to D4, and on the BAD DC you would set the Burflags to D2.

Example run:

In the example below, if you set BurFlags to D4 on a single domain controller and set BurFlags to D2 on all other domain controllers in that domain, you can rebuild the SYSVOL from the D4 DC (the source DC).

I’ve also heard of admins manually copying the SYSVOL folder, then set the BurFlags options as mentioned, which works too. But no, I haven’t tested it. That would be for a lab on another day. 🙂

Authoritative Restore Example

Use the BurFlags D4 option on the DC that has a copy of the current policies and scripts folder (a good, not corrupted folder).

  1. Stop the FRS service on all DCs. To do this to all DCs from one DC, you can download PSEXEC and run “psexec \\otherDC net stop ntfrs” one at a time for each DC.
  2. On a good DC that you want to be the source, run regedit and go to the following key:
    HKEY_LOCAL_MACHINE\System\CurrentControlSet\Services\NtFrs\Parameters\Backup/Restore\Process at Startup
    In the right pane, double-click “BurFlags.” (or Rt-click, Edit DWORD)
       Type D4 and then click OK.
  3. On the bad DC, run regedit and go to the following key:   HKEY_LOCAL_MACHINE\System\CurrentControlSet\Services\NtFrs\Parameters\Backup/Restore\Process at Startup
       In the right pane, double-click “BurFlags.” (or Rt-click, Edit DWORD)
       Type D2 and then click OK.
  4. Quit Registry Editor, and then switch to the Command Prompt (which you still have opened).
  5. On the good DC, start the FRS service, or in a command prompt, type in “net start ntfrs” and hit <enter>
  6. On the bad DC, start the FRS service, or in a command prompt, type in “net start ntfrs” and hit <enter>
  7. On the bad DC, check the Sysvol folder to see if it started populating.
  8. Check for EventID 13565 which shows the process started
  9. Check for EventID 13516, which shows it’s complete
  10. Start FRS on the other DCs.

The following occurs after running the steps above after you start the FRS service (NTFRS):

  • The value for BurFlags registry key returns to 0.
  • Files in the reinitialized FRS folders are moved to a <var>Pre-existing</var> folder.
  • An event 13565 is logged to signal that a nonauthoritative restore is started.
  • The FRS database is rebuilt.
  • The member replicates (copies) the SYSVOL folder from the GOOD DC.
  • The reinitialized computer runs a full replication of the affected replica sets when the relevant replication schedule begins.
  • When the process is complete, an event 13516 is logged to signal that FRS is operational. If the event is not logged, there is a problem with the FRS configuration.
     
    Note: The placement of files in the <var>Pre-existing</var> folder on reinitialized members is a safeguard in FRS designed to prevent accidental data loss. You can copy this stuff back if it didn’t work, but I have not yet seen when this has not worked!

Summary

I hope this helps cleaning up your FRS and SYSVOL replication issues.

Ace Fekay
MVP, MCT, MCSE 2012, MCITP EA & MCTS Windows 2008/R2, Exchange 2013, 2010 EA & 2007, MCSE & MCSA 2003/2000, MCSA Messaging 2003
Microsoft Certified Trainer
Microsoft MVP – Directory Services
Complete List of Technical Blogs: http://www.delawarecountycomputerconsulting.com/technicalblogs.php

This blog is provided AS-IS with no warranties or guarantees and confers no rights.