USN-rollback

April 26th, 2010

1. You discover that the Netlogon service is paused
2. Event ID 2095 / 2103 is logged in the Directory Service event log
3. Inbound and outbound replication are disabled

You’re now in a USN-Rollback state!

(This situation will never happen in a single DC forest)
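The three symptoms listed above can be combined into a simple check. This is only a sketch: the function name and its parameters are hypothetical, and it assumes you have already collected the service state, the event IDs and the replication flags by whatever means you prefer.

```python
# Event IDs named in the symptom list above.
USN_ROLLBACK_EVENT_IDS = {2095, 2103}


def looks_like_usn_rollback(netlogon_paused, directory_service_event_ids,
                            inbound_disabled, outbound_disabled):
    """Return True only when all three symptoms are present at once.

    All inputs are values the caller has gathered beforehand; this
    hypothetical helper does not query the DC itself.
    """
    has_event = bool(USN_ROLLBACK_EVENT_IDS & set(directory_service_event_ids))
    replication_disabled = inbound_disabled and outbound_disabled
    return bool(netlogon_paused and has_event and replication_disabled)
```

A single symptom on its own (for example a paused Netlogon service without the events) can have other causes, which is why the sketch requires all three.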

The most common reason for this state is that the DC was restored with an unsupported method.

For example:

– You clone a DC with, e.g., VMware Converter
– You revert to a snapshot of a DC and start the DC in normal mode without setting the “Database restored from backup” registry key [1][2]
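Why a reverted snapshot is so harmful is easier to see in a toy model. The sketch below is a heavy simplification and not how Active Directory is implemented: the class names, the `simulate` function and the single per-source watermark are all assumptions made for illustration. A partner remembers the highest USN it has received per source invocation ID, so a DC that silently goes back in time reuses USNs the partner believes it has already seen.

```python
import copy
from dataclasses import dataclass, field


@dataclass
class DomainController:
    invocation_id: str
    usn: int = 0
    changes: dict = field(default_factory=dict)  # usn -> change description

    def write(self, description):
        # Every local change gets the next update sequence number.
        self.usn += 1
        self.changes[self.usn] = description


@dataclass
class Partner:
    watermark: dict = field(default_factory=dict)  # invocation ID -> highest USN seen
    received: set = field(default_factory=set)

    def pull(self, source):
        # Only ask for changes above the watermark stored for this invocation ID.
        seen = self.watermark.get(source.invocation_id, 0)
        for usn, description in source.changes.items():
            if usn > seen:
                self.received.add(description)
        self.watermark[source.invocation_id] = max(seen, source.usn)


def simulate(reset_invocation_id):
    dc, partner = DomainController("dc1"), Partner()
    dc.write("a")
    dc.write("b")
    snapshot = copy.deepcopy(dc)      # snapshot taken at USN 2
    dc.write("c")
    dc.write("d")
    partner.pull(dc)                  # partner is now at watermark 4
    dc = copy.deepcopy(snapshot)      # revert: the DC is back at USN 2
    if reset_invocation_id:
        # Stand-in for what a supported restore does.
        dc.invocation_id = "dc1-restored"
    dc.write("e")                     # reuses USN 3
    dc.write("f")                     # reuses USN 4
    partner.pull(dc)
    return partner.received
```

With `simulate(False)` the partner never receives "e" and "f", because their USNs are not above its watermark. With `simulate(True)` the new invocation ID makes the partner treat the DC as a new source, so the changes flow. The toy model re-sends old changes in that case, which the real replication protocol avoids.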

If your hardware is set to do buffered disk writes, an unexpected power failure may also put you in a USN-Rollback state.

The way to get out of this state is to remove the bad DC from your domain:

1. on the bad DC: dcpromo /forceremoval
2. on a healthy DC: Run a Metadata cleanup [3]
3. on a healthy DC: seize the FSMO roles if necessary [4]
4. rebuild the “bad” DC and promote it back

I live in a freezing country (bah!), and we have a saying: “pee in your pants to get warm (for about 5 seconds…)”.

I have seen recommendations on other blogs and forums that offer a “quick” way to recover from a USN-Rollback state.

Here is a “pee in your pants” solution:

– Delete the “DSA Not Writable” registry key
– Enable replication with repadmin
– Reboot
– Fixed? Nah, not really!

Here is what Microsoft says about this in an extension to KB 875495 [5] that for some reason is hidden from the general public:

“Deleting or manually changing the Dsa Not Writable registry entry value puts the rollback domain controller in a permanently unsupported state. Therefore, such changes are not supported. Specifically, modifying the value removes the quarantine behavior added by the USN rollback detection code. The Active Directory partitions on the rollback domain controller will be permanently inconsistent with direct and transitive replication partners in the same Active Directory forest.”

References:

[1] DC’s and VM’s – Avoiding the Do-over
[2] Backup and Restore Considerations for Virtualized Domain Controllers
[3] Metadata cleanup
[4] Seizing the FSMO’s
[5] KB 875495 – How to detect and recover from a USN rollback

  1. March 16th, 2015 at 22:30 | #1
    Eric

    I was considering deleting that key in an effort to stop USN mode on one of my 2008 servers. My situation is slightly different. Rather than demote and then re-promote, I took a system state backup of the USN mode server in its broken, non-replicating state, rebooted into DSRM, and performed a non-authoritative restore from that backup. It being non-authoritative, it replicated with the two healthy servers. I ceased receiving log errors and tested replication and am satisfied with the state of AD. If I reboot this server though, it still pauses ntds. At this point, I think it might be safe to remove that key without damaging my AD services and have a fully repaired server. I’m not an AD guru by any means and your post seemed pretty informative so I thought I’d see what you thought. Thanks in advance.
