In this post, we will discuss an issue that many people running a remote desktop farm may face. The RDSH (Remote Desktop Session Host) servers may hang quite often, especially during peak business hours. This can happen to any server in the farm, in no particular order.
Although there can be many reasons behind it, the issue is harder to troubleshoot in some cases because Event Viewer might not reveal any definitive clue. A few log entries can be associated with the hung state, but none of them point in the right direction, such as the ones below.
1> The winlogon notification subscriber is taking long time to handle the notification event.
2> A timeout (120000 milliseconds) was reached while waiting for a transaction response from the SessionEnv service.
3> The server did not register with DCOM within the required timeout.
A similar case is discussed in this blog post: Remote Desktop Session Host server hangs/locks up (2008 R2 in vSphere 4.1)
Diagnosing the problem
To find the right direction for a solution, we can run a DCS (Data Collector Set) in Performance Monitor for CPU utilization and memory consumption on any or all of the RDSH servers. The configuration steps are explained in this link: Create a Data Collector Set to Monitor Performance Counters. This way, the next time it happens, we will have a picture of resource utilization in the moments just before the server goes into a hung state.
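Once the Data Collector Set has captured a hang, the log can be searched for the samples that matter. The sketch below assumes the log has been saved as, or converted to, CSV with the timestamp in the first column and one column per counter. The server name RDSH01, the sample rows, and the function name samples_above are made up for illustration.

```python
import csv
import io


def samples_above(csv_text, counter_suffix, threshold):
    """Return (timestamp, value) pairs where the matching counter exceeds threshold."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    # Column 0 is the timestamp; pick the first counter column ending with the suffix.
    col = next(
        i for i, name in enumerate(header)
        if i > 0 and name.endswith(counter_suffix)
    )
    hits = []
    for row in reader:
        try:
            value = float(row[col])
        except (ValueError, IndexError):
            continue  # skip blank or malformed samples
        if value > threshold:
            hits.append((row[0], value))
    return hits


# Hypothetical sample data, only to show the assumed layout.
SAMPLE = r'''"(PDH-CSV 4.0) (UTC)(0)","\\RDSH01\Processor(_Total)\% Processor Time","\\RDSH01\Memory\Available MBytes"
"01/01/2024 10:00:00.000","35.5","4096"
"01/01/2024 10:00:15.000"," ","4000"
"01/01/2024 10:00:30.000","97.2","512"
'''

for timestamp, value in samples_above(SAMPLE, r"\% Processor Time", 90):
    print(timestamp, value)
```

Lowering the threshold, or pointing the suffix at a memory counter, gives a quick view of what the server looked like right before it stopped responding.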
If the RDSH servers are VMware based, we can analyze the VM-level logs and look for the following.
1> GuestMsg: Too many channels opened
2> GuestRpc: Channel 6, unable to send the reset rpc.
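Searching a large VM log for these two messages by hand is tedious, so a small script can count them. This is only a sketch: the two patterns come from the messages above, the channel number is assumed to vary, and the sample lines and the function name count_matches are made up for illustration.

```python
import re

# Patterns based on the two messages quoted above.
PATTERNS = {
    "too_many_channels": re.compile(r"GuestMsg: Too many channels opened"),
    "reset_rpc_failed": re.compile(
        r"GuestRpc: Channel \d+, unable to send the reset rpc"
    ),
}


def count_matches(lines):
    """Count how many lines contain each of the known messages."""
    counts = {name: 0 for name in PATTERNS}
    for line in lines:
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                counts[name] += 1
    return counts


# Hypothetical input; a real VM log line carries extra prefix fields.
SAMPLE_LINES = [
    "GuestMsg: Too many channels opened",
    "GuestRpc: Channel 6, unable to send the reset rpc.",
    "some unrelated line",
    "GuestMsg: Too many channels opened",
]

print(count_matches(SAMPLE_LINES))
```

For a real log, the same function can be fed an open file object, since it only needs something that yields lines.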
The ESXi-level performance logs can also tell us about resource utilization at the time the VM was hung.
Another post in the VMware community might shed some light on the topic: VM of Esxi 6 crash after too many TSE connexion
Here we must discuss the TCP Chimney Offload, Receive Side Scaling (RSS), and Network Direct Memory Access (NetDMA) features that are available for the TCP/IP protocol in Windows Server 2008 onwards.
In many cases, disabling these three features resolves the issue. See: Information about the TCP Chimney Offload, Receive Side Scaling, and Network Direct Memory Access features in Windows Server 2008
1> Receive Side Scaling (RSS).
RSS enables network adapters to distribute the kernel-mode network processing load across multiple processor cores in multi-core computers.
2> Network Direct Memory Access (NetDMA)
NetDMA provides operating system support for direct memory access (DMA) offload. TCP/IP uses NetDMA to relieve the CPU from copying received data into application buffers, reducing CPU load.
3> TCP Chimney Offload
TCP Chimney Offload is a networking technology that helps transfer the workload from the CPU to a network adapter during network data transfer.
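These features are global TCP settings that are changed with netsh from an elevated prompt. The sketch below builds the three commands and, by default, only prints them. It assumes the setting names chimney, rss, and netdma from the Microsoft article referenced above. Not every setting exists on every Windows version, so it is worth checking the current state with "netsh int tcp show global" first. The function names are made up for illustration.

```python
import subprocess

# Global TCP settings covered in this section.
FEATURES = ("chimney", "rss", "netdma")


def build_disable_commands(features=FEATURES):
    """Build one 'netsh int tcp set global <name>=disabled' command per feature."""
    return [
        ["netsh", "int", "tcp", "set", "global", f"{name}=disabled"]
        for name in features
    ]


def apply(commands, dry_run=True):
    """Print the commands, or run them when dry_run is False (needs admin rights)."""
    for cmd in commands:
        if dry_run:
            print(" ".join(cmd))
        else:
            subprocess.run(cmd, check=True)


apply(build_disable_commands())
```

Keeping the dry run as the default makes it easy to review the commands before changing anything on a production RDSH server.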
If this does not resolve the issue either, we can move on to another solution.
Windows System Resource Manager (WSRM)
Windows System Resource Manager (WSRM) can be used to allocate processor and memory resources to applications, users, Remote Desktop Services sessions, and Internet Information Services (IIS) application pools.
With Windows System Resource Manager for the Windows Server® 2012 operating system, you can manage server processor and memory usage with standard or custom resource policies.
Equal per session
Out of all the policies, this is the one that might serve our purpose. When the Equal_Per_Session resource allocation policy is managing the system, resources are allocated on an equal basis for each session connected to the system. This policy is for use with RD Session Host servers.
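The idea behind an equal-per-session split is simple arithmetic: the share depends on the number of sessions, not on how many processes each session runs. The sketch below only illustrates that arithmetic. It is not how WSRM is implemented, and the function name and session IDs are made up.

```python
def equal_per_session_targets(session_ids, total_cpu_percent=100.0):
    """Split the total CPU budget evenly across the connected sessions."""
    sessions = list(session_ids)
    if not sessions:
        return {}
    share = total_cpu_percent / len(sessions)
    return {sid: share for sid in sessions}


# Four hypothetical sessions end up with a 25% target each.
print(equal_per_session_targets([2, 3, 5, 8]))
```

With a policy like this, one session running a runaway process cannot starve the others, which is exactly the situation that tends to hang a busy session host.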
This used to work nicely prior to Windows Server 2012 R2, but disappointingly, Microsoft removed this feature beginning with Windows Server 2012 R2 and left us with no built-in alternative.
There seems to be a workaround for this too. In the blog post WINDOWS SYSTEM RESOURCE MANAGER AND WINDOWS SERVER 2012 R2, the author describes a way to keep using WSRM on Windows Server 2012 R2 and later editions. However, it requires a Windows Server 2012 machine to be present, which is often not possible.
There is also a third-party tool, Process Lasso Server Edition, which claims to perform this task, but I haven't tested it.
So now we head to the final section, which is more of a preventive step.
Microsoft Remote Desktop Services 2012 Management Pack for System Center 2012
The Remote Desktop Services Management Pack helps you manage your computers that are running Remote Desktop Services on Windows Server 2008 R2 by monitoring the health of the Remote Desktop Services role services.
When there is a problem with the availability or performance of one of these components, Microsoft System Center Operations Manager 2007 uses the Remote Desktop Services Management Pack to detect the issue and alert you so that you can diagnose the problem and fix it.
To set this up, this link can be really helpful: Monitoring RDS 2012 with System Center Operations Manager 2012 (Part 1)
I will post more here if I find a more definitive solution to this issue.