On February 28, 2017, Amazon Web Services’ Simple Storage Service (S3) experienced an outage in its US-East-1 region, located in Northern Virginia. The outage lasted approximately 5 hours and also disrupted multiple AWS services that depend on S3. At 2pm PST, AWS announced that S3 operations had fully recovered, and all AWS services affected by the S3 outage resumed within a few hours of the resolution. Today, all services are operating normally according to the AWS Service Health Dashboard.
How should I architect my environment to be resilient to region failures?
If your business requires a strict SLA and cannot tolerate any amount of downtime, you may need to consider a multi-region high-availability architecture. With this design, if a service fails in one region, your traffic is diverted to another region without any downtime. Although it involves more AWS services and added complexity, our team has designed and implemented such architectures, which allow you to maintain a very high SLA.
In case you missed it, AWS recently launched the Ohio region, which has advantages for multi-region architectures: it is just a few milliseconds away from Northern Virginia, and traffic between Northern Virginia and Ohio costs the same as traffic between Availability Zones.
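To make the failover idea concrete, here is a minimal Python sketch of the underlying pattern: try the primary region first, and fall back to a secondary region if the request fails. It is an illustration only, not the architecture our team deploys. In practice, failover is usually handled at the DNS or load-balancer layer, with data replicated between regions ahead of time. The names `fetch_with_failover`, `simulated_fetch`, and `AllRegionsFailed` are hypothetical, and the outage is simulated rather than produced by a real AWS call.

```python
from typing import Callable, Dict, Sequence, Tuple


class AllRegionsFailed(Exception):
    """Raised when no region in the list could serve the request."""


def fetch_with_failover(
    key: str,
    regions: Sequence[str],
    fetch: Callable[[str, str], bytes],
) -> Tuple[str, bytes]:
    """Try each region in order and return (region, data) from the first success.

    `fetch` is a caller-supplied function (hypothetical here) that reads `key`
    from the given region and raises an exception on failure.
    """
    errors: Dict[str, Exception] = {}
    for region in regions:
        try:
            return region, fetch(region, key)
        except Exception as exc:  # broad on purpose: any failure triggers failover
            errors[region] = exc
    raise AllRegionsFailed(f"all regions failed for {key!r}: {errors}")


def simulated_fetch(region: str, key: str) -> bytes:
    """Stand-in for a real storage read; pretends us-east-1 is down."""
    if region == "us-east-1":
        raise ConnectionError("simulated outage in us-east-1")
    return f"{key} served from {region}".encode()


# Northern Virginia (us-east-1) is the primary; Ohio (us-east-2) is the fallback.
served_by, body = fetch_with_failover(
    "reports/summary.json", ["us-east-1", "us-east-2"], simulated_fetch
)
print(served_by, body.decode())
```

This approach only helps if the data already exists in the secondary region, so a real deployment would pair it with some form of cross-region replication.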