Portal Outage Wednesday 4th April 2018 08:43:00


Updated 09:03 UTC The root cause was identified as a conflicting failover condition: a state in which two processes identified different priorities for failover. Failover had already started when a second process determined that failover was no longer necessary. We're continuing to test the failover scripts, and all processes involved, to ensure they operate reliably both independently and in unison. No customer data was affected. The outage began at 08:43 UTC and affected most customers. All customer portals were restored by 08:47 UTC (white-labelled portals take around a minute extra to recover). We apologise for the disruption and will continue testing our failover automation to minimise the risk of recurrence.

Updated 08:49 UTC All customer portals are available and responding as normal. We're still investigating the root cause.

We're investigating errors appearing for customer portals.