Cloudflare’s view of the Virgin Media outage in the UK
2023-4-5 02:57:46 Author: blog.cloudflare.com(查看原文) 阅读量:20 收藏

Loading...

Just after midnight (UTC) on April 4, subscribers to UK ISP Virgin Media (AS5089) began experiencing an Internet outage, with subscriber complaints multiplying rapidly on platforms including Twitter and Reddit.

Cloudflare Radar data shows Virgin Media traffic dropping to near-zero around 00:30 UTC, as seen in the figure below. Connectivity showed some signs of recovery around 02:30 UTC, but fell again an hour later. Further nominal recovery was seen around 04:45 UTC, before again experiencing another complete outage between around 05:45-06:45 UTC, after which traffic began to recover, reaching expected levels around 07:30 UTC.

After the initial set of early-morning disruptions, Virgin Media experienced another round of issues in the afternoon. Cloudflare observed instability in traffic from Virgin Media’s network (called an autonomous system in Internet jargon) AS5089 starting around 15:00 UTC, with a significant drop just before 16:00 UTC. However in this case, it did not appear to be a complete outage, with traffic recovering approximately a half hour later.

Virgin Media’s Twitter account acknowledged the early morning disruption several hours after it began, posting responses stating “We’re aware of an issue that is affecting broadband services for Virgin Media customers as well as our contact centres. Our teams are currently working to identify and fix the problem as quickly as possible and we apologise to those customers affected.” Further responses after service restoration noted “We’ve restored broadband services for customers but are closely monitoring the situation as our engineers continue to investigate. We apologise for any inconvenience caused.”

However, the second disruption was acknowledged on Virgin Media’s Twitter account much more rapidly, with a post at 16:25 UTC stating “Unfortunately we have seen a repeat of an earlier issue which is causing intermittent broadband connectivity problems for some Virgin Media customers. We apologise again to those impacted, our teams are continuing to work flat out to find the root cause of the problem and fix it.”

At the time of the outages, www.virginmedia.com, which includes the provider’s status page, was unavailable. As seen in the figure below, a DNS lookup for the hostname resulted in a SERVFAIL error, indicating that the lookup failed to return a response. This is because the authoritative nameservers for virginmedia.com are listed as ns{1-4}.virginmedia.net, and these nameservers are all hosted within Virgin Media’s network (AS5089) and thus are not accessible during the outage.

Although Virgin Media has not publicly released a root cause for the series of disruptions that its network has experienced, looking at BGP activity can be instructive.

BGP is a mechanism to exchange routing information between networks on the Internet. The big routers that make the Internet work have huge, constantly updated lists of the possible routes that can be used to deliver each network packet to its final destination. Without BGP, the Internet routers wouldn't know what to do, and the Internet wouldn't exist.

The Internet is literally a network of networks, or for math fans, a graph, with each individual network a node in it, and the edges representing the interconnections. All of this is bound together by BGP, which allows one network (Virgin Media, for instance) to advertise its presence to other networks that form the Internet. When Virgin Media is not advertising its presence, other networks can’t find its network and it becomes effectively unavailable.

BGP announcements inform a router of changes made to the routing of a prefix (a group of IP addresses) or entirely withdraws the prefix, removing it from the routing table. The figure below shows aggregate BGP announcement activity from AS5089 with spikes that align with the decreases and increases seen in the traffic graph above, suggesting that the underlying cause may in fact be BGP-related, or related to problems with core network infrastructure.

We can drill down further to break out the observed activity between BGP announcements (dark blue) and withdrawals (light blue) seen in the figure below, with key activity coincident with the loss and return of traffic. An initial set of withdrawals are seen just after midnight, effectively removing Virgin Media from the Internet resulting in the initial outage.

A set of announcements occurred just before 03:00 UTC, aligning with the nominal increase in traffic noted above, but those were followed quickly by another set of withdrawals. A similar announcement/withdrawal exchange was observed at 05:00 and 05:30 UTC respectively, before a final set of announcements restored connectivity at 07:00 UTC.

Things remained relatively stable through the morning into the afternoon, before another set of withdrawals presaged the afternoon’s connectivity problems, with a spike of withdrawals at 15:00 UTC, followed by additional withdrawal/announcement exchanges over the next several hours.

Conclusion

Track ongoing traffic trends for Virgin Media on Cloudflare Radar, and follow us on Twitter and Mastodon for regular updates.

We protect entire corporate networks, help customers build Internet-scale applications efficiently, accelerate any website or Internet application, ward off DDoS attacks, keep hackers at bay, and can help you on your journey to Zero Trust.

Visit 1.1.1.1 from any device to get started with our free app that makes your Internet faster and safer.

To learn more about our mission to help build a better Internet, start here. If you're looking for a new career direction, check out our open positions.

Outage Cloudflare Radar Trends Traffic United Kingdom

文章来源: https://blog.cloudflare.com/virgin-media-outage-april-4-2023/
如有侵权请联系:admin#unsafe.sh