Today I ran into a weird issue that took me quite a long time to figure out, and I would like to share it with you. It all started when a brand new DAG was deployed. The installation went just fine, but during initial testing I noticed that the RPC Averaged Latency performance counter was really high.
In the figure below you can see how bad it was, even though the server had just a few users, all part of my pre-pilot phase.
I ran a number of tests, as follows:
– First, I ran Get-MailboxDatabaseCopyStatus and everything was golden
– I moved the databases to a single node and noticed that the latency stayed the same
– I restarted the RPC Client Access service and the performance counter didn’t change (as expected)
– I restarted the Information Store and the performance counter went to 0, but as soon as I moved any DB the number would climb again
– I ran Test-ReplicationHealth and it reported no issues at all
– I used the PAL tool to analyse performance and the disks were just fine
– I ran netstat to look at the connections and nothing stood out
– I also ran the Exchange Best Practices Analyzer
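For reference, the checks above can be run roughly like this from the Exchange Management Shell. This is only a sketch: EX01 and DB01 are placeholder names I made up, and the counter path is what I would expect on Exchange 2010, so adjust it to whatever Performance Monitor shows on your server.

```powershell
# Sketch only: EX01 (server) and DB01 (database) are placeholder names.

# Database copy and replication health
Get-MailboxDatabaseCopyStatus -Server EX01
Test-ReplicationHealth -Identity EX01

# Move a database to a single node
Move-ActiveMailboxDatabase DB01 -ActivateOnServer EX01 -Confirm:$false

# Restart RPC Client Access, then the Information Store
Restart-Service MSExchangeRPC
Restart-Service MSExchangeIS

# Sample the latency counter every 5 seconds for one minute
# (counter path assumed for Exchange 2010; verify it in Performance Monitor)
Get-Counter '\MSExchangeIS\RPC Averaged Latency' -SampleInterval 5 -MaxSamples 12

# List current connections
netstat -an
```

Sampling the counter right after each database move makes it easy to see whether the latency climbs back up.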
The environment I was working in has an F5 LTM (Local Traffic Manager), and I noticed that I could reach the Domain Controller over UNC (\\unc) but I couldn’t ping it. I double-checked and both a SNAT and a Virtual Server were in place; the missing piece was the ability to ping the Domain Controllers. On an F5, go to System / Configuration / Local Traffic and change the SNAT Packet Forwarding setting from its default to All Traffic.
After making that change I moved the databases around again and the RPC Averaged Latency dropped back to 0.
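If you want to check for the same symptom (UNC works, ping doesn't), something like the following can be run from the Exchange server. It is a minimal sketch, and the Domain Controller name is a made-up placeholder.

```powershell
# Sketch only: replace with one of your own Domain Controllers.
$dc = 'dc01.contoso.local'

# ICMP check: returns $true or $false instead of throwing when -Quiet is used
$ping = Test-Connection -ComputerName $dc -Count 2 -Quiet

# SMB/UNC check against the SYSVOL share that every Domain Controller exposes
$unc = Test-Path "\\$dc\SYSVOL"

"{0}: ping={1} unc={2}" -f $dc, $ping, $unc
```

In my case the UNC check succeeded while ping failed before the change. After switching SNAT Packet Forwarding to All Traffic, I would expect both to come back True.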