Yesterday, Google Cloud servers in the us-east1 region were cut off from the rest of the world after an issue was reported with Cloud Networking and Load Balancing within us-east1.
These issues with Google Cloud Networking and Load Balancing were caused by physical damage to multiple concurrent fiber bundles that serve network paths in us-east1. At 10:25 am PT yesterday, the status was updated to say, “Customers may still observe traffic through Global Load-balancers being directed away from back-ends in us-east1 at this time.”
The status dashboard later reported that mitigation work was underway to address the issue with Google Cloud Networking and Load Balancing in us-east1. The rate of errors was decreasing at the time, but a few users still faced elevated latency.
Around 4:05 pm PT, the status was updated, “The disruptions with Google Cloud Networking and Load Balancing have been root caused to physical damage to multiple concurrent fiber bundles serving network paths in us-east1, and we expect a full resolution within the next 24 hours. In the meantime, we are electively rerouting traffic to ensure that customers’ services will continue to operate reliably until the affected fiber paths are repaired. Some customers may observe elevated latency during this period. We will provide another status update either as the situation warrants or by Wednesday, 2019-07-03 12:00 US/Pacific tomorrow.”
This appears to be the second major outage to hit Google’s services in recent times. Last month, Google Cloud suffered a major outage that took down a number of Google services, including YouTube, G Suite, and Gmail. Google Calendar was also down for nearly three hours around the world last month.
According to a person who works on Google Cloud, the team is experiencing an issue with a subset of the fiber paths that supply the region and is working to resolve it. The team has removed almost all Google.com traffic from the region to prioritize GCP customers.
A Google employee commented on the HackerNews thread, “I work on Google Cloud (but I’m not in SRE, oncall, etc.). As the updates to  say, we’re working to resolve a networking issue. The Region isn’t (and wasn’t) “down”, but obviously network latency spiking up for external connectivity is bad. We are currently experiencing an issue with a subset of the fiber paths that supply the region. We’re working on getting that restored. In the meantime, we’ve removed almost all Google.com traffic out of the Region to prefer GCP customers. That’s why the latency increase is subsiding, as we’re freeing up the fiber paths by shedding our traffic.”
Google Cloud users are anxious about the outage and are waiting for services to be restored to normal.
Not a good day for cloud outages — Google Cloud East is down now: https://t.co/o1Aln9MK8F
— Becky Nagel (@beckynagel) July 2, 2019
This Google Cloud Outage is killing me
— wolff (@SeaWolff) July 2, 2019
Ritiko, a cloud-based EHR company, is also experiencing issues because of the outage, as it hosts its services on Google Cloud.
Ritiko is experiencing an outage as a result of a Google Cloud (@googlecloud ) outage, where we host our services. We're working to migrate our servers to an unaffected region within the us. Customers might experience higher latencies in the interim. Sorry for the outage
— Ritiko LLC (@ritikoL) July 2, 2019
As of now, Google has not confirmed whether the outage is resolved, but it expects a full resolution within the next 24 hours. Check this space for new updates and information.