Google suffered a major outage on Monday this week, going down for over an hour and taking a toll on Google Search and a majority of its other services, including Google Cloud Platform. The outage was apparently the result of a BGP (Border Gateway Protocol) issue: traffic bound for Google's IP addresses was diverted from its normal routes and misdirected through networks in Nigeria, China (via China Telecom), and Russia.
The issue began at 21:13 UTC when MainOne Cable Company, a carrier in Lagos, Nigeria, announced its own autonomous system, AS37282, as the correct path to reach 212 IP prefixes that belong to Google, Ars Technica reported. Shortly after, China Telecom improperly accepted the route and announced it worldwide, leading Transtelecom and other large service providers in Russia to follow the same route.
BGPmon, a networking and security company that monitors the route health of networks, tweeted on Monday that it “appears that Nigerian ISP AS37282 ‘MainOne Cable Company’ leaked many @google prefixes to China Telecom, who then advertised it to AS20485 TRANSTELECOM (Russia). From there on others appear to have picked this up”.
BGPmon also tweeted that the redirection of IP addresses came in five distinct waves totaling 74 minutes:
Customer behind Cogent and NTT experienced the @google outages likely in 5 waves between these times (UTC) 74 minutes total:
21:13 – 21:17 4min
21:18 – 21:21 3min
21:22 – 21:28 6min
21:30 – 21:50 20min
21:51 – 22:32 41min
example ASpath: 174 2914 20485 4809 37282 15169
— BGPmon.net (@bgpmon) November 12, 2018
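The figures in the tweet can be checked with a little arithmetic. The sketch below sums the five waves and reads the example AS path from right to left, since the origin AS sits at the right-hand end of a BGP AS path. Only AS37282 (MainOne) and AS20485 (Transtelecom) are named with their numbers in the reporting above; the other labels are assumptions based on commonly published AS assignments, and the function and variable names are illustrative.

```python
from datetime import datetime

# Wave start and end times (UTC), copied from the BGPmon tweet.
WAVES = [
    ("21:13", "21:17"),
    ("21:18", "21:21"),
    ("21:22", "21:28"),
    ("21:30", "21:50"),
    ("21:51", "22:32"),
]


def wave_minutes(start: str, end: str) -> int:
    """Return the length of one wave in whole minutes."""
    fmt = "%H:%M"
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    return int(delta.total_seconds() // 60)


durations = [wave_minutes(start, end) for start, end in WAVES]
total_minutes = sum(durations)  # 4 + 3 + 6 + 20 + 41 = 74

# Example AS path from the tweet. The origin AS is the rightmost entry.
AS_PATH = "174 2914 20485 4809 37282 15169"

AS_NAMES = {
    37282: "MainOne",        # named in the article
    20485: "Transtelecom",   # named in the article
    15169: "Google",         # assumption: not stated in the article
    4809: "China Telecom",   # assumption: not stated in the article
    2914: "NTT",             # assumption: not stated in the article
    174: "Cogent",           # assumption: not stated in the article
}

hops = [int(asn) for asn in AS_PATH.split()]
origin = hops[-1]
# Reverse the path to follow the announcement outward from the origin.
propagation = [AS_NAMES[asn] for asn in reversed(hops)]

print(durations, total_minutes)
print(" -> ".join(propagation))
```

Read this way, the path matches the sequence BGPmon described: the routes passed from MainOne to China Telecom, then to Transtelecom, and on to the transit providers.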
Another network intelligence company, ThousandEyes, tweeted that a “potential hijack” was underway. According to ThousandEyes, it detected over 180 prefixes affected by this route leak, covering a wide range of Google services.
BREAKING: Potential hijack underway. ThousandEyes detected intermittent availability issues to Google services from some locations. Traffic to certain Google destinations appears to be routed through an ISP in Russia & black-holed at a China Telecom gateway router. pic.twitter.com/Tz7shf7cOy
— ThousandEyes (@thousandeyes) November 12, 2018
This fueled suspicion among many observers, as China Telecom, a Chinese state-owned telecommunications company, had recently come under the spotlight for misrouting Western carrier traffic through mainland China.
On further analysis, however, ThousandEyes concluded that “the origin of this leak was the BGP peering relationship between MainOne, the Nigerian provider, and China Telecom”. MainOne peers with Google via IXPN in Lagos and therefore has direct routes to Google, which leaked into China Telecom. These routes were then propagated from China Telecom via TransTelecom to NTT and other transit ISPs. “We also noticed that this leak was primarily propagated by business-grade transit providers and did not impact consumer ISP networks as much”, reads the ThousandEyes blog.
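Why this counts as a leak can be illustrated with the conventional export rule many networks follow: routes learned from customers may be passed to anyone, while routes learned from peers or providers should be passed only to customers. The sketch below is a simplified model of that convention, not a description of MainOne's or China Telecom's actual policy; the `Rel` enum and `may_export` function are hypothetical names introduced for illustration.

```python
from enum import Enum


class Rel(Enum):
    """Business relationship of a BGP neighbor, seen from our own network."""
    CUSTOMER = "customer"
    PEER = "peer"
    PROVIDER = "provider"


def may_export(learned_from: Rel, export_to: Rel) -> bool:
    """Simplified conventional export rule.

    Routes learned from a customer may be exported to any neighbor.
    Routes learned from a peer or a provider may be exported only to
    customers; sending them anywhere else is a route leak.
    """
    if learned_from is Rel.CUSTOMER:
        return True
    return export_to is Rel.CUSTOMER


# Google's routes were learned over a peering session (via IXPN), so under
# this rule they should not be re-announced to another peer or to an
# upstream provider.
leak_to_peer = not may_export(Rel.PEER, Rel.PEER)
leak_to_provider = not may_export(Rel.PEER, Rel.PROVIDER)
print(leak_to_peer, leak_to_provider)
```

Under this model the announcement is a leak whether China Telecom is treated as a peer or as an upstream of MainOne, since in both cases a peer-learned route is being passed to a non-customer.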
BGPmon further tweeted on November 13, 2018 that, apart from Google, Cloudflare was also affected, as its IP addresses followed the same route as Google’s.
However, Matthew Prince, CEO of Cloudflare, told Ars Technica that this routing issue was just an error and that the chances of it being a malicious hack were low. “If there was something nefarious afoot there would have been a lot more direct, and potentially less disruptive/detectable, ways to reroute traffic. This was a big, ugly screw up. Intentional route leaks we’ve seen to do things like steal cryptocurrency are typically far more targeted,” said Prince.
“We’re aware that a portion of Internet traffic was affected by the incorrect routing of IP addresses, and access to some Google services was impacted. The root cause of the issue was external to Google and there was no compromise of Google services,” a Google representative told Ars Technica.
MainOne also posted an update about the issue on its site, saying that it faced a “technical glitch during a planned network update and access to some of the Google services was impacted. We promptly corrected the situation at our end and are doing all that is necessary to ensure it doesn’t happen again. The error was accidental on our part; we were not aware that any Google services were compromised as a result”.
MainOne further addressed the issue on Twitter saying that the problem occurred due to a misconfiguration in BGP filters:
We have investigated the advertisement of @Google prefixes through one of our upstream partners. This was an error during a planned network upgrade due to a misconfiguration on our BGP filters. The error was corrected within 74mins & processes put in place to avoid reoccurrence
— MainOne (@Mainoneservice) November 13, 2018
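MainOne did not publish the details of the misconfigured filter, but the general idea of an outbound BGP filter can be sketched as an allow-list check: only prefixes that fall within address space the network is entitled to originate get announced. The example below is a hypothetical illustration using reserved documentation prefixes, not MainOne's real configuration; `OWN_PREFIXES`, `allowed_to_announce`, and `filter_announcements` are assumed names.

```python
import ipaddress

# Hypothetical allow-list of address space this network may announce.
# These are reserved documentation ranges, not real MainOne prefixes.
OWN_PREFIXES = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("198.51.100.0/24"),
]


def allowed_to_announce(prefix: str) -> bool:
    """Return True if the prefix lies inside one of our own prefixes."""
    net = ipaddress.ip_network(prefix)
    return any(
        # Compare versions first: subnet_of() rejects mixed IPv4/IPv6.
        net.version == own.version and net.subnet_of(own)
        for own in OWN_PREFIXES
    )


def filter_announcements(candidates):
    """Keep only the candidate prefixes the allow-list permits."""
    return [prefix for prefix in candidates if allowed_to_announce(prefix)]


# 203.0.113.0/24 stands in for a prefix learned from a peer; it is dropped.
candidates = ["192.0.2.0/25", "203.0.113.0/24", "198.51.100.0/24"]
print(filter_announcements(candidates))
```

If a filter like this is loosened or removed by mistake during an upgrade, routes learned from a peer can slip out to other networks, which is consistent with the kind of error MainOne describes.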
The main takeaway from this incident is that doing business on the Internet is still risky: there will be times when it leads to unpredictable and destabilizing events that are not necessarily ‘malicious hacks’.