Why Click Frenzy failed and How they could have Fixed it!
So it is becoming obvious that the ClickFrenzy.com.au situation is a bit of a failure. For the uninitiated, this was a big e-commerce push from the big Australian retailers to discount and make some big pre-Christmas sales.
People on Twitter are putting this down to all sorts of things, including the lack of an NBN. But I thought it would be worth clarifying why it is actually failing.
Since clickfrenzy.com.au is offline right now, I can’t do any actual analysis on it, but I can make some educated guesses as to what is going on.
There are a few techniques that IT people use to handle large anticipated loads to their website.
1) Share the load, aka Round-Robin DNS
So when you go to a website like google.com – that just points you to a number on the Internet, like 184.108.40.206. Google has thousands of servers, and it picks one for you to handle your search. Every time you request Google, it is possible that you will get a different server handling your request. This way, if one of the servers is busy, or explodes, or whatever – your search still gets handled and you are fine.
I have sent clickfrenzy.com.au a lot of requests from different servers, and each time I get the IP address 220.127.116.11. So, they aren’t using this technique.
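If you want to repeat that check yourself, here is a minimal Python sketch. The helper names (resolve_ipv4, looks_round_robin) are made up for illustration, and it only reports what your local resolver returns. A single address does not strictly prove there is no sharing going on behind it (one address can front a load balancer), so treat the result as a hint rather than proof.

```python
import socket


def resolve_ipv4(hostname):
    """Return the set of IPv4 addresses the local resolver reports for hostname."""
    infos = socket.getaddrinfo(
        hostname, 80, family=socket.AF_INET, type=socket.SOCK_STREAM
    )
    # Each entry is (family, type, proto, canonname, sockaddr);
    # for IPv4 the sockaddr is an (address, port) pair.
    return {info[4][0] for info in infos}


def looks_round_robin(addresses):
    """Heuristic only: more than one address suggests DNS-level load sharing."""
    return len(addresses) > 1
```

Calling looks_round_robin(resolve_ipv4("clickfrenzy.com.au")) from a few different machines is roughly the experiment described above.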
2) Load balancing, another way to share the load
That is fine though! There is another technique for sharing the load of a large anticipated number of requests! It is called load balancing. This is where all of the requests are sent to one server, whose job is to distribute that load amongst a pool of web servers! All that means is that you have some sort of manager-type looking at who is available to handle your request, e.g. look at and buy a cheap LCD screen, and then give you the right man for the job!
This can also be done on a round-robin basis, or on a “least load” basis, where you look at who is doing the least work and then give the request to them. If only construction workers operated in this fashion, ha ha.
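To make the two strategies concrete, here is a rough Python sketch of both. The class names and the server names are invented for illustration, and a real load balancer does a lot more (health checks, timeouts and so on), so this only shows how the next server gets picked.

```python
import itertools


class RoundRobinBalancer:
    """Hands requests to each server in turn, regardless of how busy it is."""

    def __init__(self, servers):
        self._cycle = itertools.cycle(servers)

    def pick(self):
        return next(self._cycle)


class LeastLoadBalancer:
    """Hands each request to the server with the fewest requests in progress."""

    def __init__(self, servers):
        self.active = {server: 0 for server in servers}

    def pick(self):
        # Ties go to whichever server was listed first.
        server = min(self.active, key=self.active.get)
        self.active[server] += 1
        return server

    def finish(self, server):
        # Call this when the server has finished handling a request.
        self.active[server] -= 1


rr = RoundRobinBalancer(["web1", "web2", "web3"])
rr_order = [rr.pick() for _ in range(4)]  # web1, web2, web3, then web1 again

ll = LeastLoadBalancer(["web1", "web2"])
first = ll.pick()   # web1 (both idle, first listed wins)
second = ll.pick()  # web2 (web1 is now busier)
ll.finish("web2")
third = ll.pick()   # web2 again, since it just freed up
```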
3) Big internet connection
The people mentioning the NBN might have been onto something, but in the wrong direction. It is possible that clickfrenzy.com.au just didn’t have a good enough pipe to the internet to handle all the incoming requests. My guess is that this isn’t the case: no sensible technical person would go into a huge expected-traffic situation like this without an adequate connection. That said, an undersized connection would cause the same symptoms that customers experienced.
CUT THE CRAP CHRIS, WHAT HAPPENED:
Imagine you had a pipe. Now imagine that the pipe was easily wide enough to carry all the water you sent down it, but that you were trying to catch it in a series of buckets at the end, and there weren’t enough buckets. What starts to happen? You start spilling water. When this happens on the internet, you start to see your requests to a website take a LONG TIME and then eventually they TIME OUT. This is why you are not seeing an error page: your connection to clickfrenzy.com.au just sits there for ages, and then does nothing. Either your request never got down the pipe (slow internet connection), or it did get there but none of the web servers/buckets were available to serve it.
The worst thing is, the more it fails, the more people will keep trying and retrying their requests. This puts even more stuff down the pipe, exacerbating the problem.
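Here is a toy Python model of that snowball effect. The numbers are invented, and it assumes every person who misses out simply retries on the next tick, which is a simplification, but it shows how the offered load keeps growing even though the number of new visitors stays flat.

```python
def offered_load_over_time(capacity, new_per_tick, ticks):
    """Return the number of requests hitting the site on each tick.

    Assumption: every request that could not be served is retried
    on the following tick, on top of the new arrivals.
    """
    retrying = 0
    history = []
    for _ in range(ticks):
        offered = retrying + new_per_tick
        served = min(offered, capacity)
        retrying = offered - served
        history.append(offered)
    return history


# Capacity of 100 requests per tick, 150 new visitors per tick.
load = offered_load_over_time(capacity=100, new_per_tick=150, ticks=3)
# load is [150, 200, 250]: the backlog grows by 50 every tick.
```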
WHAT THEY COULD/SHOULD HAVE DONE:
Have you ever tried to purchase tickets on Ticketmaster for a popular event? What happens? Sometimes you’re able to get through and get your tickets. Other times it says – “hey, sorry we are very busy, but lots of people are trying to use the site, please try again soon”. Instead of accepting every request even if they don’t have enough capacity to handle them, Ticketmaster has worked out how much capacity they can handle and started saying NO to the ones they can’t! The great thing about this is that instead of the servers getting overloaded and not being able to help ANYONE, they are able to help a large number of people, and when they are finished, they can help the next lot.
And in this scenario, it is fine if people keep trying and trying. At some point, enough capacity will be available, and the user will get through. I think users mind this less – they know there is a problem, they get told it is busy, and they try again and at some point it works. Much better than no response at all and an eventual #clickfail.
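A bare-bones Python sketch of that "say NO when full" idea is below. The AdmissionGate class and handle function are hypothetical names, the capacity number would have to come from real load testing, and this version is single-threaded for simplicity (a real server would need locking around the counter). The point is only that a request over the limit gets a quick "busy" answer (HTTP status 503, Service Unavailable) instead of hanging until it times out.

```python
class AdmissionGate:
    """Tracks how many requests are in progress and refuses any over capacity."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.in_flight = 0

    def try_enter(self):
        if self.in_flight >= self.capacity:
            return False
        self.in_flight += 1
        return True

    def leave(self):
        self.in_flight -= 1


def handle(gate, work):
    """Run work() if there is room, otherwise answer immediately with a busy page."""
    if not gate.try_enter():
        return 503, "Sorry, we are very busy right now. Please try again soon."
    try:
        return 200, work()
    finally:
        # Free the slot whether work() succeeded or raised.
        gate.leave()


gate = AdmissionGate(capacity=1)
ok_response = handle(gate, lambda: "Here is your cheap LCD screen")

gate.try_enter()  # simulate another request currently being served
busy_response = handle(gate, lambda: "Here is your cheap LCD screen")
gate.leave()      # that other request finishes
retry_response = handle(gate, lambda: "Here is your cheap LCD screen")
```

The retry at the end gets through because a slot has freed up, which is exactly the "try again soon" experience described above.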
If anyone is interested in a more detailed explanation of what happened here, please email me at email@example.com. I must also point out that Bislr handles these kinds of situations for every single site we host, even the free ones. If you would like to sign up for a website and online business platform for promoting your business, please head to http://www.bislr.com/signup – it is free to get started and you will be able to handle as much business as is sent your way. Bislr can handle extremely large loads, and incorporates all of the techniques mentioned above, and more.