Friday, January 15, 2010

Windows Server 2008 Freezes -- Finally Solved!

In our environment at work, users connect to a Citrix farm running on Windows Server 2008 x64. During our testing phase, while we still had a relatively light load of users on the farm, things went pretty smoothly. As we added more and more users to the Citrix environment, different issues cropped up here and there, but none as horribly evil as our servers freezing to the point of becoming completely unresponsive. At that point, every user session on the affected server would lock up, forcing users to lose any unsaved data and start their sessions over. As you can imagine, management did not see this as an enhancement to their productivity.

So, for the last several months we have been troubleshooting this issue. There was no pattern to when servers would freeze. At any given time, any of the four servers we have in production could freeze. There was also no consistent user base on the server that would freeze (the only consistency being that they weren't too happy when it happened). After bringing in several of the consultants who had helped set up this environment initially, we took our case to Citrix. Several log and memory dump files later, they came to the conclusion that Internet Explorer was causing our servers to lock up. Naturally, I then presented all of this information to Microsoft support. Upon further analysis, they discovered that we were experiencing a bug that has been resolved by a hotfix:

Basically, the hotfix resolves an issue that occurs when Server 2008 or Windows Vista is under a heavy load and there are a lot of network share accesses going on. Well, in our case, user profiles live on a network share, Outlook PST files are out on a network share, the users' other file shares are network shares, and the list goes on. Since applying this hotfix (a little over two weeks ago), we have not experienced any freezes. Good news for everyone.

If anyone is interested in the detailed symptoms:

  • User sessions (terminal services/Citrix) would become completely unresponsive
  • The server would become unresponsive even at the console level
  • The server would respond to pings
  • Apparently anything in memory at the time of the freeze would continue to function -- as soon as you tried to access something else, the session would freeze
  • The only workaround when this occurred was to hard reboot the server
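
Because the servers kept answering pings while frozen, a ping-only monitor would not have flagged this condition. Below is a minimal sketch of a different kind of check: it tries to list a directory and reports whether the call returned, failed, or never came back within a timeout. The function name `probe_path`, the five-second default, and the status strings are all hypothetical choices for illustration; this is not something from the hotfix or from Microsoft or Citrix, and whether such a probe would actually have caught these particular freezes is an untested assumption.

```python
import os
import threading


def probe_path(path, timeout_seconds=5.0):
    """Try to list a directory; return 'ok', 'error', or 'hung'.

    'hung' means the listing did not return within timeout_seconds,
    which is the symptom of interest: the call neither succeeds nor fails.
    """
    result = {}

    def worker():
        try:
            os.listdir(path)
            result["status"] = "ok"
        except OSError:
            # Path missing, access denied, network name not found, etc.
            result["status"] = "error"

    # Daemon thread: a listing that never returns will not block
    # the monitoring script from exiting.
    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    thread.join(timeout_seconds)

    if thread.is_alive():
        return "hung"
    return result.get("status", "error")
```

It could be called as `probe_path(r"\\fileserver\profiles")`, where the UNC path is a placeholder for whatever share the sessions depend on. A result of "hung" while the server still answers pings would match the symptoms listed above.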