At the beginning of this week, Mandrill, a transactional email API for MailChimp users, experienced an outage in which users were able to send but unable to receive emails. The Mandrill team also tweeted that they were seeing ongoing errors with scheduled mail and webhooks and would resolve the issue soon.

Sebastian Lauwers, the VP of Engineering at Dixa, a customer service software company, tweeted that the issue was taking too long to resolve. He also asked why Mandrill needed so long, nearly 23 hours, to sort it out.

Today, a user named GuyPostington posted on Hacker News an email received from Mandrill. The email explains the reason for the outage and how Mandrill is addressing it. Mandrill uses a sharded Postgres setup as one of its main datastores. According to the email, “On Sunday, February 3, at 10:30 pm EST, 1 of our 5 physical Postgres instances saw a significant spike in writes. The spike in writes triggered a Transaction ID Wraparound issue. When this occurs, database activity is completely halted. The database sets itself in read-only mode until offline maintenance (known as vacuuming) can occur.” Mandrill has also shared the same explanation on Twitter.
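For readers unfamiliar with the term, the idea behind transaction ID wraparound can be illustrated with a small sketch. PostgreSQL transaction IDs are 32-bit counters compared in circular (modulo 2^32) fashion, so only about 2^31 transactions can sit between the oldest unfrozen row and the current transaction before older rows would start to look as if they were in the future. The Python below is only an illustration of that arithmetic, not Mandrill's tooling or PostgreSQL's actual code: the function names and the `safety_margin` parameter are hypothetical, the real shutdown threshold depends on the PostgreSQL version, and PostgreSQL's reserved special transaction IDs are ignored here.

```python
# Illustrative sketch of 32-bit transaction ID (XID) arithmetic.
# Names and the safety margin are assumptions made for this example.

XID_SPACE = 2**32       # XIDs are 32-bit counters that wrap around
HALF_SPACE = 2**31      # roughly how far back "the past" can reach


def xid_precedes(a: int, b: int) -> bool:
    """Return True if XID a is logically older than XID b.

    The comparison is circular: the difference is taken modulo 2**32 and
    treated as a signed 32-bit number, so a negative difference means
    that a comes before b.
    """
    diff = (a - b) % XID_SPACE
    return diff >= HALF_SPACE


def xids_until_wraparound(oldest_xid_age: int,
                          safety_margin: int = 1_000_000) -> int:
    """Estimate how many new XIDs can be assigned before the database
    would have to refuse writes.

    oldest_xid_age is the number of transactions since the oldest
    unfrozen XID. safety_margin is a hypothetical reserve; the real
    value varies by PostgreSQL version.
    """
    return max(0, HALF_SPACE - safety_margin - oldest_xid_age)


if __name__ == "__main__":
    # An XID assigned just before the counter wrapped is still older
    # than a small XID assigned just after the wrap.
    print(xid_precedes(2**32 - 10, 5))
    # With two billion transactions since the last full freeze,
    # the remaining headroom is small.
    print(xids_until_wraparound(2_000_000_000))
```

Vacuuming "freezes" old rows so that their transaction IDs no longer count toward this limit, which is why a sudden spike in writes on a database with a large vacuum backlog can exhaust the remaining headroom.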

They further mentioned that because the database is large, the vacuum process takes a significant amount of time and resources, and there’s no clear way to track its progress.

On the recovery timeline, Mandrill writes, “We don’t have an estimated time for when the vacuum process and cleanup work will be complete. While we have a parallel set of tasks going to try to get the database back in working order, these efforts are also slow and difficult with a database of this size. We’re trying everything we can to finish this process as quickly as possible, but this could take several days, or longer.”

The email also states that once the outage is resolved, Mandrill plans to offer refunds to all affected users.

For more details, see Mandrill’s tweet thread.

