Table of Contents

Introduction
Root Cause
Similar Incidents
Best Practices
Other Types of Network Outages

Introduction

Info

The Optus 2023 outage was a major network failure that affected about 10 million Optus customers on November 8, 2023, disrupting phone calls, Internet access and emergency services.

Root Cause

Note

According to Optus, at around 4.05am, the Optus network underwent a routine software upgrade. This upgrade led to changes in routing information from an international peering network. These changes propagated through multiple layers of the Optus network and exceeded the preset safety levels on key routers. Unable to handle the overload, these routers disconnected from the Optus IP Core network to protect themselves.

In effect, this was an incident in which routing updates received from an external party overwhelmed individual routers. The same failure mode can arise inside a network: a simple typo in a "route map" used when redistributing routes between internal networks can similarly overload routers.
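
The protective disconnect described above can be illustrated with a toy model. This is only a sketch under stated assumptions: it treats the "preset safety level" as a per-session maximum-prefix limit, which is one common form such a safeguard takes, and the class name `BgpSession` and the limit of 5 are hypothetical. It does not reproduce Optus's actual configuration.

```python
from dataclasses import dataclass, field


@dataclass
class BgpSession:
    """Toy model of one routing session with a maximum-prefix safety limit."""
    max_prefixes: int                          # hypothetical preset safety level
    prefixes: set = field(default_factory=set)
    established: bool = True

    def receive(self, announced):
        """Accept announced prefixes; disconnect if the limit is exceeded."""
        if not self.established:
            return  # a disconnected session ignores further updates
        self.prefixes.update(announced)
        if len(self.prefixes) > self.max_prefixes:
            # Protective shutdown: drop everything learned and disconnect.
            self.prefixes.clear()
            self.established = False


session = BgpSession(max_prefixes=5)
session.receive({f"10.0.{i}.0/24" for i in range(4)})   # normal load: 4 prefixes
print(session.established)  # True
session.receive({f"10.1.{i}.0/24" for i in range(10)})  # sudden flood of new routes
print(session.established)  # False
```

The point of the sketch is that the safeguard works as designed for a single router, yet if many routers hit the limit at once, the combined effect is a network-wide disconnection.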

Best Practices

Tip

Implement proper routing policies and security mechanisms to filter and verify the routing updates received from external parties.
- Using prefix lists, access lists, or route maps to allow or deny routes based on their attributes, such as origin, prefix length, or AS path.
- Using route authentication and encryption mechanisms, such as TCP MD5 signatures on BGP sessions or IPsec, to verify the identity of the peer and the integrity of the routing updates.
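
As a rough illustration of the filtering idea, the sketch below checks an incoming route against a small policy before accepting it. The function name `accept_route`, the allowed range, and both limits are hypothetical values chosen for the example; a real router would express the same policy in its own prefix-list and route-map syntax.

```python
import ipaddress

# Hypothetical policy values, chosen only for illustration.
ALLOWED_RANGES = [ipaddress.ip_network("198.18.0.0/15")]
MAX_PREFIX_LEN = 24     # reject anything more specific than a /24
MAX_AS_PATH_LEN = 10    # reject unusually long AS paths


def accept_route(prefix, as_path):
    """Return True if the route passes every check in the policy."""
    net = ipaddress.ip_network(prefix)
    if net.prefixlen > MAX_PREFIX_LEN:
        return False
    if len(as_path) > MAX_AS_PATH_LEN:
        return False
    # The prefix must fall inside one of the ranges we expect from this peer.
    return any(
        net.version == allowed.version and net.subnet_of(allowed)
        for allowed in ALLOWED_RANGES
    )


print(accept_route("198.18.5.0/24", [64500, 64501]))  # True
print(accept_route("198.18.5.128/25", [64500]))       # False: too specific
print(accept_route("192.0.2.0/24", [64500]))          # False: outside allowed range
```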
Apply route dampening and penalty mechanisms to suppress unstable or flapping routes that can cause network instability or congestion.
- Using the route flap dampening feature in BGP, which adds a penalty to a route each time it changes state (flaps) and suppresses the route when the accumulated penalty exceeds a suppress threshold. The penalty decays over time, and the route is unsuppressed once the penalty falls below a second, lower reuse threshold.
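
The penalty-and-decay behaviour can be sketched in a few lines. The constants below are illustrative assumptions (vendors ship their own defaults and allow tuning), and `DampenedRoute` is a hypothetical name rather than any router's API. Time is passed in explicitly, in seconds, to keep the example deterministic.

```python
# Illustrative values only; real implementations make these configurable.
PENALTY_PER_FLAP = 1000
SUPPRESS_LIMIT = 2000
REUSE_LIMIT = 750
HALF_LIFE_S = 900  # penalty halves every 15 minutes


class DampenedRoute:
    """Tracks the flap penalty for one route with exponential decay."""

    def __init__(self):
        self.penalty = 0.0
        self.last_update = 0.0
        self.suppressed = False

    def _decay(self, now):
        """Decay the penalty to time `now` and lift suppression if it is low enough."""
        elapsed = now - self.last_update
        self.penalty *= 0.5 ** (elapsed / HALF_LIFE_S)
        self.last_update = now
        if self.suppressed and self.penalty < REUSE_LIMIT:
            self.suppressed = False

    def flap(self, now):
        """Record one state change at time `now`."""
        self._decay(now)
        self.penalty += PENALTY_PER_FLAP
        if self.penalty > SUPPRESS_LIMIT:
            self.suppressed = True

    def is_suppressed(self, now):
        self._decay(now)
        return self.suppressed


route = DampenedRoute()
for t in (0, 10, 20):               # three flaps within 20 seconds
    route.flap(t)
print(route.is_suppressed(30))      # True: penalty is above the suppress limit
print(route.is_suppressed(3620))    # False: an hour later it has decayed below the reuse limit
```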
Monitor the network performance and stability, and detect any anomalies or changes in the routing table or the BGP updates.
- Using network management and monitoring tools, such as SNMP, NetFlow, or Syslog, and setting up alerts or notifications for any unusual events or trends.
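
One simple form of such an alert compares the current size of the routing table with a rolling baseline. The sketch below assumes the prefix counts have already been collected (for example by periodic polling). The class name `PrefixCountMonitor`, the window size, and the 20% tolerance are hypothetical choices for illustration.

```python
from collections import deque
from statistics import mean


class PrefixCountMonitor:
    """Flags prefix counts that deviate sharply from a rolling baseline."""

    def __init__(self, window=12, tolerance=0.2):
        self.history = deque(maxlen=window)   # most recent normal samples
        self.tolerance = tolerance            # allowed fractional deviation

    def observe(self, count):
        """Return True if `count` deviates from the baseline by more than the tolerance."""
        alert = False
        if len(self.history) == self.history.maxlen:
            baseline = mean(self.history)
            alert = abs(count - baseline) > self.tolerance * baseline
        if not alert:
            # Only normal samples feed the baseline, so a spike cannot mask itself.
            self.history.append(count)
        return alert


monitor = PrefixCountMonitor(window=3)
for sample in (1000, 1010, 990, 1005):
    monitor.observe(sample)          # builds the baseline, no alerts
print(monitor.observe(5000))         # True: sudden jump in route count
```

No alert is raised until the window is full, which avoids false alarms at startup at the cost of a short blind period.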

Analysis of Optus 2023 Outage
