Partial internet/Cloudflare outage
Incident Report for UpGuard CyberRisk
Postmortem

Incident Summary

On June 21, 2022 at 06:40 UTC, the UpGuard CyberRisk, Web App & Authentication services experienced a partial outage lasting 46 minutes. We identified the cause as related to the critical incident reported by Cloudflare. Some customers attempting to reach UpGuard CyberRisk saw a 500 error in their browser.

Fault

Cloudflare suffered an outage that affected traffic in 19 of their data centres, which handle a significant proportion of their global traffic. Depending upon your location in the world, you may have been unable to access websites and services that rely on Cloudflare.

Detection

Our alerting systems notified internal channels of the service disruption to UpGuard CyberRisk, Web App & Authentication services.

Impact

Outage: UpGuard CyberRisk, Web App & Authentication services were unavailable for 46 minutes, depending upon your location in the world. Performance within the product was intermittent as a result.

Recovery

Cloudflare implemented a fix at 07:29 UTC and brought all data centres back online by 07:42 UTC. UpGuard CyberRisk, Web App & Authentication services were accessible again.

Timeline

June 21, 2022 at 06:27 UTC: Cloudflare incident flagged
June 21, 2022 at 06:34 UTC: A critical incident was declared by Cloudflare
June 21, 2022 at 06:40 UTC: UpGuard CyberRisk, Web App & Authentication services became inaccessible for some users
June 21, 2022 at 06:48 UTC: An incident response group was formed
June 21, 2022 at 07:14 UTC: Cloudflare identified the issue
June 21, 2022 at 07:26 UTC: UpGuard CyberRisk, Web App & Authentication services became accessible
June 21, 2022 at 07:29 UTC: Fix implemented by Cloudflare
June 21, 2022 at 07:30 UTC: UpGuard monitoring of fix commenced
June 21, 2022 at 07:48 UTC: UpGuard customer communications sent
June 21, 2022 at 08:06 UTC: Incident deemed closed, services restored
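
The durations implied by the timeline above can be checked with a short calculation. The following is a minimal sketch only: the helper names (`utc`, `minutes_between`) and variable names are illustrative assumptions, and the timestamps are copied from the timeline entries.

```python
from datetime import datetime, timezone


def utc(hhmm: str) -> datetime:
    """Build a UTC timestamp on June 21, 2022 from an HH:MM string."""
    hour, minute = map(int, hhmm.split(":"))
    return datetime(2022, 6, 21, hour, minute, tzinfo=timezone.utc)


def minutes_between(start: str, end: str) -> int:
    """Whole minutes elapsed between two HH:MM times on the incident day."""
    return int((utc(end) - utc(start)).total_seconds() // 60)


# Services inaccessible (06:40) until services accessible again (07:26).
customer_impact = minutes_between("06:40", "07:26")

# Services inaccessible (06:40) until the incident response group formed (06:48).
time_to_response = minutes_between("06:40", "06:48")

# Services inaccessible (06:40) until the incident was deemed closed (08:06).
time_to_close = minutes_between("06:40", "08:06")

print(customer_impact, time_to_response, time_to_close)
```

The first value corresponds to the 46-minute outage stated in the summary; the other two are derived from the timeline entries and are not figures reported elsewhere in this postmortem.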

Root Cause

Our authentication service provider experienced high error rates and timeouts across its authentication and management APIs. Depending upon your location in the world, you may have been unable to access UpGuard CyberRisk because our authentication provider was directly affected by the Cloudflare outage.

Posted Jul 06, 2022 - 02:30 UTC

Resolved
UpGuard has been online and stable since Cloudflare implemented a fix.
Posted Jun 21, 2022 - 08:06 UTC
Monitoring
Cloudflare has implemented a fix. We have reports that UpGuard is now available to all users.
We are monitoring for any further issues.
Posted Jun 21, 2022 - 07:29 UTC
Identified
Cloudflare has identified the issue and is working on a fix.
Please track the fix using: https://www.cloudflarestatus.com/incidents/xvs51y9qs9dj
Posted Jun 21, 2022 - 07:14 UTC
Investigating
UpGuard is currently inaccessible for some or most users. We believe this is related to a more widespread outage across the internet and are investigating.
Posted Jun 21, 2022 - 06:40 UTC
This incident affected: UpGuard CyberRisk (Web App, External API, Authentication).