So you may or may not have seen the horrendous error described in the title of this post. If you have, I feel your pain. If not, go play the lottery cause you are one lucky Exchange admin. That or you just leave well enough alone and haven’t modified Exchange out of it’s default load. Don’t get me wrong, I am a fan of IIS. Heck, it gives us the wondrous platform for OWA and the new 2010 only Exchange Control Panel (ECP). IIS can sometimes be too plentiful in its logs and sometimes downright stingy on what is really happening under the hood. The following scenario is the latter of the two.
Customer just set up a new 2010 virtual lab environment (2003,2007 & the new 2010 boxes) a total of 10 new VM’s were made to simulate a small portion of the planned rollout. The VM’s were made from a template and added to the domain and were under a set of GPO’s that the exchange folks didn’t 100% understand the level of control. When they first tried to log into OWA, they got the dreaded IIS 500.19 Internal Server error. The error specifically called out the failure to add a part of the web.config file. This one in particular was referencing a custom HTTPHeader relating to IE.
Those of you not in the know with IIS 7 or higher, it no longer use the IIS Metabase to hold it’s configuration. It has been fully replaced with a hierarchy of configuration XML files. Go figure, Microsoft using XML formatting?! Who woulda thunk it. Seriously though, when the W3WC services start and the default websites load, it reads these files from the top down. Server wide settings are held in the parent file (C:\Windows\System32\Inetpub\inetsrv) called ApplicationHost.Config. Then from there, it’s possible to override these global settings on a persite basis using downlevel web.config files. These files can be edited directly and since there isn’t a “Save” button really anywhere the changes are immediate. If the site refreshes it’s config, it will be applied immediately. Some settings within the files are loaded at first start of the web sites and kept in memory.
Knowing this relationship of IIS config helped me figure our next steps. Here is what had been done before I was called in..
- Reboot server
- Remove the web.config file
- Remove the one offending line in the /OWA web.config file. Note: This allowed the authentication page to load, but after authentication, the rendering never completed and the site hung.
- Reviewed the windows server application event logs
I had hoped we could find more detail in the error message before I started to mess with any config files (Yes, of course I back them up first silly). So I went and enabled Failed Request Event Tracing or FRET. I specifically wanted to see more details around any error in the 500-550 range as a catch all. Although the FRET output was ultra detailed (shows which modules and components are used in every stage) it didn’t give the additional info I was looking for, so I retargeted the .config files.
Normally with this IIS hierarchy if there is a conflict between the level above and below, the more granular setting will win. This didn’t really make that criteria as it was trying to add the settings from the downlevel file. I (against all better judgement) edited the applicationhost.config file to remove the one line that was trying to be duplicated and after an IISRESET, worked like a charm. Best Practices dictate that all configurations should be done via the IIS GUI unless scripted via Appcmd.exe or PowerShell. Editing the config files directly is a dangerous game. This was purely a test environment and we were under the gun to get it going.
Root cause analysis still needs to be done to determine the source of this issue.
UPDATE: root cause was there was an IIS manual modification of the ApplicationHost.config file in their VM template for some internal application compatibilities. They will now update their Exchange 2010 server build document and remove this extraneous line.