It seems all is not well at Microsoft after yesterday’s outage: Microsoft’s Azure cloud has been up and down globally because of a DNS configuration issue.
The outage started at 1:20 pm yesterday and lasted for more than an hour, affecting Microsoft’s cloud services, including Office 365, OneDrive, Microsoft Teams, Xbox Live, and many others used by Microsoft’s commercial customers. Because of the network connectivity errors in Microsoft Azure, even third-party apps and sites running on Microsoft’s cloud were affected.
Around 2:30 pm, Microsoft started gradually recovering Azure regions one by one. Microsoft has yet to completely troubleshoot this major issue and has warned that it might take some time to get everyone back up and running. This isn’t the first time a DNS outage has affected Azure. In January this year, a few customers’ databases went missing in an incident that affected a number of Azure SQL databases that use custom KeyVault keys for Transparent Data Encryption (TDE).
🛠️Engineers are investigating connectivity issues with Azure Services. More information will be provided as it becomes available. https://t.co/88WEhh7yJT
— Azure Support (@AzureSupport) May 2, 2019
The Azure status page reads, “Customers may experience intermittent connectivity issues with Azure and other Microsoft services (including M365, Dynamics, DevOps, etc).”
Microsoft engineers found that an incorrect name server delegation affected DNS resolution and network connectivity, which in turn affected compute, storage, App Service, AAD, and SQL Database resources. On the Microsoft 365 status page, too, Redmond’s techies have blamed an internal DNS configuration error for the downtime.
During the migration of the DNS system to Azure DNS, some domains for Microsoft services were incorrectly updated. The good news is that no customer DNS records were impacted, and the availability of Azure DNS remained at 100% throughout the incident. Only records for Microsoft services were affected.
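For readers unfamiliar with the failure mode: a delegation is the set of NS records in a parent zone that tells resolvers which name servers are authoritative for a child zone. If those records name the wrong servers, lookups can fail even though the zone data itself is intact. The following is a minimal sketch of how such a mismatch could be detected, not a description of Microsoft's tooling. The function `check_delegation`, the `DelegationReport` type, and all host names are hypothetical, and the NS lists are hard-coded here rather than fetched from live DNS.

```python
from dataclasses import dataclass


def normalize(host: str) -> str:
    """Lower-case a DNS host name and drop the trailing root dot."""
    return host.strip().lower().rstrip(".")


@dataclass(frozen=True)
class DelegationReport:
    # Name servers the zone lists for itself but the parent does not delegate to.
    missing_from_parent: frozenset
    # Name servers the parent delegates to but the zone no longer lists.
    stale_in_parent: frozenset

    @property
    def consistent(self) -> bool:
        return not (self.missing_from_parent or self.stale_in_parent)


def check_delegation(parent_ns, zone_ns) -> DelegationReport:
    """Compare the parent's NS set with the NS set the zone publishes."""
    parent = frozenset(normalize(h) for h in parent_ns)
    zone = frozenset(normalize(h) for h in zone_ns)
    return DelegationReport(
        missing_from_parent=zone - parent,
        stale_in_parent=parent - zone,
    )


# Hypothetical data: the parent still points at the old name servers
# after the zone has been moved to new ones.
parent_ns = ["ns1.old-dns.test.", "ns2.old-dns.test."]
zone_ns = ["NS1.new-dns.test", "ns2.new-dns.test."]

report = check_delegation(parent_ns, zone_ns)
print("consistent:", report.consistent)
print("stale in parent:", sorted(report.stale_in_parent))
print("missing from parent:", sorted(report.missing_from_parent))
```

In practice the two lists would come from querying the parent zone's servers and the zone's own authoritative servers. Running a comparison like this before and after a DNS migration is one plausible way to catch a stale delegation early.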
According to Microsoft, the broken systems have been fixed, the three-hour outage has come to an end, and Azure’s network infrastructure will soon be back to normal.
We've identified and corrected a DNS configuration issue that prevented users from accessing Microsoft 365 services. Further details can be found in the admin center under SP178746, OD178975, and MO178979.
— Microsoft 365 Status (@MSFT365Status) May 2, 2019
Users have reported issues accessing the cloud service and are complaining. A user commented on Hacker News, “The sev1 messages in my inbox currently begs to differ. there’s no issue maybe with the dns at this very moment but the platform is thoroughly fucked up.”
Users are also questioning the reliability of Azure. Another comment reads, “Man… Azure seems to be an order of magnitude worse than AWS and GCP when it comes to reliability.”
To learn more about the status of the situation, check out Microsoft’s post.