You’ve probably heard that Amazon AWS had some problems recently. A question on Stack Overflow pointed to a detailed summary of the problem posted on the AWS message board.
Obviously every distributed system is different and every outage is unique, so it is difficult to generalise. Some takeaways I have are:
- Outages happen to even the best guys on the block…so you better plan for yours.
- Building distributed systems is hard…so you need experience and experienced friends.
- Manual changes are a common cause…not said explicitly in the AWS writeup, but strongly implied.
- Outages are often “emergent” phenomena whereby a simple error causes many systems to interact in a way which grows exponentially. The AWS writeup refers to this as a “storm” and I have witnessed similar “storms” in large distributed systems. The degree of coupling and simple aspects like backoff parameters can make the difference between a disturbance that grows exponentially or decays exponentially. Think of the Tacoma Narrows bridge – perhaps the analogy is a stretch, but tuning of a few simple parameters can avoid destructive resonances.
- One of the responses pointed to the Netflix Chaos Monkey as being vindicated by the outage. The “Lean” guys have taught us that if something is difficult (like testing or deployment) then you should do it often until it isn’t difficult any more. Perhaps system failure/resilience is the next frontier for this approach.
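
To make the point about backoff parameters a little more concrete, here is a minimal Python sketch of a retry loop with capped exponential backoff and random jitter. It is not taken from the AWS writeup: the names `backoff_delay` and `call_with_retries`, and the default values for `base`, `factor` and `cap`, are my own assumptions for illustration. The idea is simply that each failed attempt waits longer, and a random amount, so that many clients hitting the same failure do not all retry in lockstep and feed the “storm”.

```python
import random
import time


def backoff_delay(attempt, base=0.5, factor=2.0, cap=60.0):
    """Return a random delay in seconds before retry number `attempt` (0-based).

    The upper bound grows geometrically with each attempt and is capped,
    and the actual delay is drawn uniformly below that bound (jitter).
    All parameter values here are illustrative assumptions.
    """
    ceiling = min(cap, base * factor ** attempt)
    return random.uniform(0.0, ceiling)


def call_with_retries(operation, max_attempts=5, sleep=time.sleep):
    """Call `operation` until it succeeds or `max_attempts` is reached.

    `sleep` is injectable so the loop can be exercised without waiting.
    """
    for attempt in range(max_attempts):
        try:
            return operation()
        except Exception:
            # Give up and re-raise once the final attempt has failed.
            if attempt == max_attempts - 1:
                raise
            sleep(backoff_delay(attempt))
```

With `factor` above 1 the retry load from each client thins out over time; with a fixed short delay and no jitter, every client retries at the same moment and at the same rate, which is roughly the coupled behaviour described above.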