How to Troubleshoot Active Directory Replication and Domain Controller Health

In any Windows-based environment, Active Directory (AD) is the core system for managing user identity, access, and login authentication. To keep an AD environment running smoothly, two things matter most: replication and the health of Domain Controllers (DCs).

If replication fails or DCs are not set up properly, it can cause problems like inconsistent data, login issues, broken group policies, and even serious system downtime.

This guide gives you a simple, step-by-step process to find and fix problems related to replication and DC health.

Here are some key points to start troubleshooting with:

1. Check Event Logs First

Why? Event logs are the first place that signs of trouble are recorded, often before users notice.

Where to look:

  • Event Viewer → Windows Logs → System
  • Event Viewer → Applications and Services Logs → Directory Service

Watch for these key Event IDs:

  • 13508 – File Replication Service (FRS) is having trouble replicating from a partner (logged in the File Replication Service log).
  • 1645 – Failed connection between DCs.
  • 1391 – Inbound replication issues.
  • 5788/5789 – DNS registration or name resolution issues.

Tip: Filter the logs by “Error” and “Warning” level to speed up investigation.
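
The filtering described above can also be scripted. The following is a minimal sketch, assuming Python is available on the DC and that the built-in wevtutil tool is used to read the logs. The helper names (build_xpath, query_log) are hypothetical, and the ID list is simply the one from this section.

```python
import subprocess

# Event IDs from the list above (5788 and 5789 are two separate IDs).
WATCHED_EVENT_IDS = [13508, 1645, 1391, 5788, 5789]


def build_xpath(event_ids, levels=(2, 3)):
    """Build an event log XPath filter: Level 2 = Error, Level 3 = Warning."""
    level_part = " or ".join(f"Level={lvl}" for lvl in levels)
    id_part = " or ".join(f"EventID={eid}" for eid in event_ids)
    return f"*[System[({level_part}) and ({id_part})]]"


def query_log(log_name, event_ids, count=20):
    """Return the newest matching events from one log as text (Windows only)."""
    cmd = [
        "wevtutil", "qe", log_name,
        f"/q:{build_xpath(event_ids)}",
        f"/c:{count}",   # limit the number of events returned
        "/rd:true",      # read newest events first
        "/f:text",       # plain-text output
    ]
    return subprocess.run(cmd, capture_output=True, text=True, check=True).stdout
```

For example, query_log("Directory Service", WATCHED_EVENT_IDS) would return matching entries from the Directory Service log. Each ID only appears in the log of the component that writes it, so the same call may need to be repeated for the System log and others.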

2. Verify Network Connectivity

Why? AD replication relies on DC-to-DC communication. If they can’t talk to each other, replication won’t happen.

How to check:

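  • Basic ping:
    ping dc.lab.com
  • Port testing:
    Test-NetConnection -ComputerName dc.lab.com -Port 389
  • Ensure the ports DCs use are open between them: 53 (DNS), 88 (Kerberos), 135 (RPC endpoint mapper), 389 (LDAP), 445 (SMB), 636 (LDAPS), 3268 and 3269 (Global Catalog), plus the dynamic RPC range (49152–65535 by default).
  • Adjust any firewall rules that block DC-to-DC traffic. Turn a firewall off only briefly, as a test, and turn it back on afterward.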
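
To test several ports in one pass, a short script can help. This is a minimal sketch using only Python's standard socket module. It covers the LDAP, Global Catalog, and RPC endpoint mapper ports named above over TCP only, and the function names are hypothetical.

```python
import socket

# TCP ports named in this section. UDP is not covered by this check.
DC_TCP_PORTS = {
    135: "RPC endpoint mapper",
    389: "LDAP",
    636: "LDAPS",
    3268: "Global Catalog",
    3269: "Global Catalog over SSL",
}


def tcp_port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


def check_dc_ports(host, ports=DC_TCP_PORTS):
    """Map each port number to True (reachable) or False (not reachable)."""
    return {port: tcp_port_open(host, port) for port in ports}
```

Calling check_dc_ports("dc.lab.com") from another DC returns a dictionary such as {135: True, 389: True, ...}. A False value only shows that the TCP connection failed from that machine, so the cause could be a firewall, a stopped service, or DNS.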

Related: Verify Network Ports on a Domain Controller (dbtuhub.com)

3. Validate DNS Configuration

Why? DNS is the navigation system for AD. Without it, DCs can’t find each other to replicate or respond to queries.

Checklist:

  • Name resolution:
    nslookup dc.lab.com
  • Verify existence of the _msdcs.yourdomain.com zone.
  • Ensure all DCs register their records and point to the correct DNS servers (not public DNS!).
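
Beyond host names, DCs are located through SRV records, many of them under the _msdcs zone. As a rough sketch, the helper below (a hypothetical name) builds the most common DC locator record names for a domain, along with nslookup commands to query them. It only builds strings and does not query DNS itself.

```python
def dc_locator_srv_names(domain):
    """Return common SRV record names that DCs register for a domain."""
    domain = domain.strip(".").lower()
    return [
        f"_ldap._tcp.dc._msdcs.{domain}",      # all DCs in the domain
        f"_kerberos._tcp.dc._msdcs.{domain}",  # Kerberos KDCs that are DCs
        f"_ldap._tcp.pdc._msdcs.{domain}",     # the PDC Emulator
        f"_ldap._tcp.{domain}",                # LDAP servers for the domain
        f"_kerberos._tcp.{domain}",            # Kerberos KDCs for the domain
    ]


def nslookup_commands(domain):
    """Return one nslookup command per SRV record, ready to paste into a shell."""
    return [f"nslookup -type=SRV {name}" for name in dc_locator_srv_names(domain)]


for command in nslookup_commands("lab.com"):
    print(command)
```

If one of these lookups returns no records, or omits a DC, that DC has probably not registered its records correctly.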

4. Use the Repadmin Tool

Why? Repadmin is the main tool used to check and manage replication. It provides detailed information on both successful and failed replication across the environment.

Common Commands:

  • Summary of replication across DCs:
    repadmin /replsummary
  • Detailed replication info:
    repadmin /showrepl
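  • Force a sync of all partitions, including across sites:
    repadmin /syncall /AdeP
    (/A = all partitions, /d = identify servers by distinguished name, /e = include other sites, /P = push changes outward from this DC)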
  • Check a specific DC:
    repadmin /showrepl DCName
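
For larger environments, repadmin /showrepl * /csv produces output that is easier to filter than the default text. The sketch below assumes the CSV header contains the columns "Number of Failures", "Source DSA", "Destination DSA", "Naming Context", and "Last Failure Status"; check these against your own output. The SAMPLE data is illustrative, not captured from a real DC.

```python
import csv
import io

# Illustrative sample only. Real output comes from: repadmin /showrepl * /csv
SAMPLE = """showrepl_COLUMNS,Destination DSA Site,Destination DSA,Naming Context,Source DSA Site,Source DSA,Transport Type,Number of Failures,Last Failure Time,Last Success Time,Last Failure Status
showrepl_INFO,Default-First-Site-Name,DC1,"DC=lab,DC=com",Default-First-Site-Name,DC2,RPC,0,0,2024-01-01 10:00:00,0
showrepl_INFO,Default-First-Site-Name,DC1,"CN=Configuration,DC=lab,DC=com",Default-First-Site-Name,DC3,RPC,5,2024-01-01 09:00:00,2023-12-30 10:00:00,1722
"""


def failing_links(csv_text):
    """Return one dict per replication link that reports at least one failure."""
    failures = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        count = (row.get("Number of Failures") or "0").strip()
        if count.isdigit() and int(count) > 0:
            failures.append({
                "destination": row.get("Destination DSA"),
                "source": row.get("Source DSA"),
                "naming_context": row.get("Naming Context"),
                "failures": int(count),
                "last_status": row.get("Last Failure Status"),
            })
    return failures


for link in failing_links(SAMPLE):
    print(link)
```

In practice the CSV text would be captured from the command, for example by redirecting it to a file and reading that file, rather than taken from SAMPLE.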

5. Run Dcdiag for a Deep Health Check

Why? Dcdiag performs a broad series of diagnostic tests that cover not just replication but DNS, Netlogon, and general DC health.

Basic usage:

 dcdiag /v

For replication-only:

 dcdiag /test:replications

Use the output to pinpoint problem roles or services failing under the hood.

6. Check Time Synchronization

Why? Kerberos authentication and replication both rely on synchronized clocks. A skew of more than 5 minutes (the default Kerberos tolerance) can cause trust and replication failures.

Check time status:

 w32tm /query /status
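
To compare a DC's clock against another server, w32tm /stripchart /computer:<server> /samples:3 /dataonly prints one offset per sample. The sketch below classifies those offsets against the 5-minute limit. It assumes each data line ends in an offset such as +00.0123456s, and the half-limit warning threshold is an arbitrary choice for illustration.

```python
import re

MAX_SKEW_SECONDS = 5 * 60  # default Kerberos clock skew tolerance

# Matches a signed offset in seconds, such as "+00.0123456s" or "-312.5000000s".
_OFFSET = re.compile(r"([+-]\d+(?:\.\d+)?)s\b")


def parse_offsets(stripchart_output):
    """Extract offsets (in seconds) from w32tm /stripchart /dataonly output."""
    return [float(value) for value in _OFFSET.findall(stripchart_output)]


def skew_status(offset_seconds, limit=MAX_SKEW_SECONDS, warn_ratio=0.5):
    """Classify an offset as OK, WARN (over half the limit), or FAIL (over the limit)."""
    skew = abs(offset_seconds)
    if skew > limit:
        return "FAIL"
    if skew > limit * warn_ratio:
        return "WARN"
    return "OK"
```

For example, skew_status(-312.5) returns "FAIL" because the clock is more than 300 seconds off. The /dataonly switch matters here, since the default stripchart output also prints a delay value that this pattern would pick up.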

Make sure:

  • The PDC Emulator syncs with a reliable NTP source.
  • All other DCs sync time from the domain hierarchy, which leads back to the PDC Emulator.

7. Review Active Directory Sites and Services

Why? Replication topology is defined here. A misconfigured site or link could isolate a DC.

What to verify:

  • Each DC is in the correct site based on subnet.
  • Site links exist and are not disabled.
  • Replication schedule matches your network’s bandwidth capacity.

8. Confirm FSMO Role Availability

Why? FSMO (Flexible Single Master Operations) roles handle key AD tasks. If these roles aren’t functioning, operations like schema updates, password changes, or new domain creation can fail.

Find current FSMO role holders:

 netdom query fsmo
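
The output of that command can be checked automatically. The sketch below assumes the usual English output, where each line holds a role label and a DC name separated by several spaces. The role labels and the sample text are assumptions for illustration, so compare them with what your own DC prints.

```python
import re

# Role labels as typically printed by "netdom query fsmo" (assumed English output).
EXPECTED_ROLES = [
    "Schema master",
    "Domain naming master",
    "PDC",
    "RID pool manager",
    "Infrastructure master",
]


def parse_fsmo(output):
    """Map each recognized role label to the DC name on the same line."""
    holders = {}
    for line in output.splitlines():
        parts = re.split(r"\s{2,}", line.strip())
        if len(parts) == 2 and parts[0] in EXPECTED_ROLES:
            holders[parts[0]] = parts[1]
    return holders


def missing_roles(holders):
    """Return the expected roles that were not found in the parsed output."""
    return [role for role in EXPECTED_ROLES if role not in holders]


# Illustrative sample, not captured from a real DC.
SAMPLE = """Schema master               DC1.lab.com
Domain naming master        DC1.lab.com
PDC                         DC2.lab.com
RID pool manager            DC2.lab.com
Infrastructure master       DC2.lab.com
The command completed successfully.
"""

print(parse_fsmo(SAMPLE))
print(missing_roles(parse_fsmo(SAMPLE)))
```

Once the holders are known, each one can be tested for reachability with the connectivity checks from step 2.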

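If a DC holding a role is offline:

  • If it will be back soon, wait for it. A transfer requires the current role holder to be online.
  • If it is gone for good, seize the role on another DC with NTDSUTIL, and do not bring the old role holder back onto the network afterward.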

9. Detect and Resolve Duplicate GUIDs

Why? Duplicate GUIDs (usually from improperly cloned or restored DCs) can cause replication loops or failures.

Tools:

  • Use repadmin /showrepl for unusual entries.
  • Use ADSI Edit cautiously to check the objectGUID.
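  • Clear lingering objects (stale objects that an out-of-date DC can reintroduce):
    repadmin /removelingeringobjects <DestDC> <SourceDC_GUID> <NamingContext> /ADVISORY_MODE

Explanation:

  • <DestDC>: The name of the destination domain controller to clean up.
  • <SourceDC_GUID>: The DSA object GUID of the source domain controller that is considered authoritative.
  • <NamingContext>: The distinguished name of the directory partition to check, for example DC=lab,DC=com.
  • /ADVISORY_MODE: Optional. If used, it only reports lingering objects without deleting them.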

⚠️ This is advanced. Always back up before making changes to AD objects manually.

10. Check SYSVOL Replication

Why? SYSVOL holds policies and scripts. If it’s not replicating, users won’t get updated group policies.

What to check:

  • If using FRS:
    ntfrsutl ds dc
  • If using DFS-R:
    dfsrdiag ReplicationState

Look for backlogs, journal wrap errors, or service failures.

11. Perform regular AD Health Checks

Why? Regular health checks catch small issues early, before they turn into major problems.

Use PowerShell:

Get-ADReplicationFailure -Scope Domain -Target "lab.com"

Replace lab.com with your own domain name.

Monitor:

  • Replication failures
  • DNS issues
  • Time sync drift
  • Faulty DCs

Automate monitoring with alerts or reports to stay proactive.
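
One way to build such a report is to export the cmdlet's results as JSON (for example by piping Get-ADReplicationFailure to ConvertTo-Json) and summarize them in a script. The sketch below assumes the exported objects carry the properties Server, Partner, FailureCount, and LastError. The function name and the threshold are hypothetical.

```python
import json


def summarize_failures(json_text, threshold=1):
    """Return one alert line per replication partner at or above the threshold."""
    data = json.loads(json_text) if json_text.strip() else []
    if isinstance(data, dict):
        # ConvertTo-Json emits a bare object, not a list, for a single result.
        data = [data]
    alerts = []
    for item in data:
        count = int(item.get("FailureCount") or 0)
        if count >= threshold:
            alerts.append(
                f"{item.get('Server')} <- {item.get('Partner')}: "
                f"{count} failure(s), last error {item.get('LastError')}"
            )
    return alerts
```

The returned lines can be written to a log, sent by email, or passed to whatever alerting system is already in place. An empty list means no partner reached the threshold.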

Final Thoughts

Managing Active Directory isn’t just about checking whether users can log in. The real goal is a system that’s resilient enough to recover from its own issues and keep working properly, even when there are problems.

To recap, work through these steps:

  1. Check Event Logs.
  2. Verify Network Connectivity.
  3. Validate DNS Configurations.
  4. Use Repadmin for Replication Diagnosis.
  5. Run Dcdiag.
  6. Ensure Time is Synchronized.
  7. Review Sites and Services.
  8. Verify FSMO Role Health.
  9. Check for Duplicate GUIDs.
  10. Inspect SYSVOL Replication.
  11. Run Regular Health Checks.
