Cascading failures for ROSA and OSD services that depend on Quay and AWS

Incident Report for Red Hat

Resolved

Per AWS's most recent update at 2:48 PM PDT, EC2 instance creation is no longer being throttled, and has returned to normal pre-incident levels. We have seen recovery of clusters affected by this outage, and are resolving this incident. If you are experiencing any issues, please reach out to Support.
Posted Oct 20, 2025 - 22:29 UTC

Update

Per AWS's most recent update at 1:03 PM PDT, multiple AWS services affecting compute and networking continue to improve. However, cluster operations may still be impacted, including cluster creation, image pulls, upgrades, and more. Please see https://health.aws.amazon.com/health/status for detailed and direct updates on these underlying services.
Posted Oct 20, 2025 - 20:36 UTC

Update

Per AWS's most recent update at 11:22 AM PDT, multiple AWS services affecting compute and networking remain down or degraded. This may impact cluster operations, including cluster creation, image pulls, upgrades, and more. Please see https://health.aws.amazon.com/health/status for detailed and direct updates on these underlying services.
Posted Oct 20, 2025 - 18:59 UTC

Update

According to the latest AWS update, mitigation of the EC2 instance launch issue is still in progress and the incident is ongoing. No customer action is required. Red Hat continues to monitor the incident.
Posted Oct 20, 2025 - 13:03 UTC

Update

According to the latest update from AWS, there are ongoing issues with VM launches. We are observing launch errors in the us-east-1 region, which affect ROSA and OSD products. The status has not changed, and no customer action is required.
Posted Oct 20, 2025 - 12:05 UTC

Update

The AWS incident has been updated from Degraded to Impacting. We are still seeing impact to the ROSA and OSD services in the us-east-1 region, mostly related to EC2 instance launches. We are continuing to monitor the incident. No customer action is currently required. We will update this incident within 1 hour.
Posted Oct 20, 2025 - 11:18 UTC

Update

The impact of the incident is currently limited to the AWS us-east-1 region.
Posted Oct 20, 2025 - 09:28 UTC

Monitoring

Due to an ongoing Quay and AWS incident, ROSA and OSD clusters may experience degradation of on-cluster services as well as issues during installation. Red Hat is actively monitoring the situation and will provide updates as they become available.
Posted Oct 20, 2025 - 08:45 UTC
This incident affected: console.redhat.com (OpenShift Cluster Manager).