I knew that when I had a few dozen emails this morning regarding SharePoint being unavailable for some users that we had a real problem. Generally I can just chalk it up to users not understanding what they are doing, or locking themselves or others out due to configuration changes that they make on their own sites. But this morning seemed legit.
Our production farm has 2 web front ends. Both were “up” but I found that one of them (let’s say WEB01) was throwing lots of “503 Service Unavailable” errors in the logs. I looked at the server and noticed that the main application pool was stopped. So I started it up and went off to look at some information in the logs. I found this:
Application pool SharePoint – 80 has been disabled. Windows Process Activation Service (WAS) encountered a failure when it started a worker process to serve the application pool.
Of course this meant that the affected application pool would stop immediately as well. A bit of research revealed that this probably meant that the identity of the application pool was somehow compromised or not configured appropriately. I am using a SharePoint-managed account for the identity of this application pool and noticed that our password policy had changed the password this morning. Our policy has the passwords changing every month on the 7th between 2 and 3AM.
To make a long story short — somehow the process either did not complete or did not execute on WEB01, and it had invalid credentials for the application pool’s identity. I found that you can force a password reset for the managed accounts inside of SharePoint Central Administration. So I forced that reset, had SharePoint generate its own password, and everything is now fine.
Have others experienced the issue of the passwords not being configured appropriately during one of these changes? Is this a bug in SharePoint 2010?