
Microsoft: Botched firmware update set off Outlook.com outage

The partial Outlook.com outage that lasted 16 hours on Tuesday and Wednesday morning was caused by a firmware update gone awry that triggered a temperature spike in a Microsoft data center, tripping automatic safeguards that made a large number of servers inaccessible.

Because of the unspecified safeguards, downed servers couldn't fail over on their own, so restoration work had to be done manually, which slowed the process, according to a blog post by Microsoft Outlook.com Vice President Arthur de Haan.


De Haan apologized for the disruption of email access. "Outages are something we take very seriously and invest a significant amount of our time and energy in doing our best to prevent."

His description of what actually happened doesn't detail what software was being updated, what went wrong, what overheated, what safeguards kicked in or how many servers were involved: "On the afternoon of the 12th, in one physical region of one of our datacenters, we performed our regular process of updating the firmware on a core part of our physical plant. This is an update that had been done successfully previously, but failed in this specific instance in an unexpected way. This failure resulted in a rapid and substantial temperature spike in the datacenter. This spike was significant enough before it was mitigated that it caused our safeguards to come in to place for a large number of servers in this part of the datacenter," de Haan's blog says.

"These safeguards prevented access to mailboxes housed on these servers and also prevented any other pieces of our infrastructure to automatically failover and allow continued access. This area of the datacenter houses parts of the Hotmail.com, Outlook.com, and SkyDrive infrastructure, and so some people trying to access those services were impacted."

There was no way to restore the affected infrastructure without human intervention, which he says "added significant time to the restoration."

Microsoft is working on improvements to prevent the same scenario from playing out in the future. "Now that we're through the resolution, we're also hard at work on ensuring this doesn't happen again," he says.

Tim Greene covers Microsoft for Network World and writes the Mostly Microsoft blog. Reach him at [email protected] and follow him on Twitter @Tim_Greene.
