While downtime is clearly not acceptable, it's not reasonable to draw conclusions about IB's quality assurance in this way. Looking from the outside, it's just not possible to know the source of the problem. It could lie in so many different areas. Running high availability systems is a lot more difficult than people realise. In particular, automatic failovers are difficult because there can be failure modes not foreseen in the system design. No matter how much testing happens in a test environment, there can always be something different in a production environment that screws things up. As an example, I worked on a system for a large telco that was prepared to spend the money to have 24x7. Everything was duplicated with automatic failover: striped and mirrored disk arrays, hot swappable disks, dual LANs, dual redundant power supplies for everything. Failover was well tested. It went down for a day because one of the disk power supplies caught fire - there was *smoke and flames*. Of course the whole thing had to be shut down. It could not reasonably be said to have been anyone's fault or due to incompetence or negligence. So while IB's customers have every right to demand reliable operation, drawing conclusions about IB's development and operational practices is not really justified because none of us know the facts. I think in the interest of better customer relations, IB should be providing a more detailed account of what went wrong, and what actions are in place to ensure it doesn't happen again. Incidentally, I have had no problems with IB today, though last Friday there were some issues.
I disagree. IF IB throws enough money into making this system truly fault tolerant, they can achieve that. At least make it 99.9999% available. 3 days out of the year is not 99.999%, which I think is quite achievable in large scale high-availability data centers. The fact that their HK operation is running means they could have created clusters of database servers, application servers, and so on that span continents. While it's not easy, when's the last time I actually had problems logging into Citibank's system for hours on end, or a problem that persisted for 3 days? I can't even remember a downtime of more than 1 hour.
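For scale, here is the back-of-the-envelope arithmetic behind those percentages, as a rough Python sketch. The function names are made up for illustration, and it assumes a 365-day year with the service expected to be up around the clock (counting only market hours would give different figures).

```python
HOURS_PER_YEAR = 365 * 24  # assumption: non-leap year, 24x7 service


def downtime_hours_per_year(availability_pct: float) -> float:
    """Hours of downtime per year allowed by a given availability percentage."""
    return HOURS_PER_YEAR * (1 - availability_pct / 100)


def availability_pct(downtime_hours: float) -> float:
    """Availability percentage implied by a given number of down hours per year."""
    return 100 * (1 - downtime_hours / HOURS_PER_YEAR)


if __name__ == "__main__":
    # Downtime budget for the usual "number of nines" targets.
    for pct in (99.9, 99.99, 99.999, 99.9999):
        minutes = downtime_hours_per_year(pct) * 60
        print(f"{pct}% -> {minutes:.2f} minutes of downtime per year")

    # What 3 full days of downtime in a year works out to.
    print(f"3 days down -> {availability_pct(3 * 24):.2f}% availability")
```

Under those assumptions, 99.999% allows a little over 5 minutes of downtime a year and 99.9999% about half a minute, while 3 full days down works out to roughly 99.18%.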
I disagree - the proof is in the pudding and the rest is just lame bull shitting around the issue. My late father was a high end developer, and later a project leader, for fault tolerant systems, so I have more than a passing knowledge of this. Do you fly in a plane and wait for a fault in the computer control system to develop before you fix it? I rest my case. Maria
I can understand a short downtime if a system goes down and there was a problem with the failover process. But these problems have been going on since Thursday. I think it would be appropriate for IB to offer a detailed explanation of what is happening and why it won't happen again.
Hmmm. I'm sure Citi has had downtime of more than an hour in their systems. I used to work in credit card processing for them, and when there was a telco failure, they went down just like everyone else, even with dial backup. And Citi has hundreds of millions, if not billions, to spend. No comparison. Hey, Etrade's had 30+ minute hold times for CS for 5 months. Stock's on a tear. Go figure.
My grandpa built the internet. I am a network engineer and I work with fault tolerant systems as well. While systems do go down, they should not go down to the point where people can't use the core system at all. Build my own system so I don't have to rely on IB? Right. Well, I'm about to do it soon, after Sequoia Capital gives me $100 million to build my own execution platform.
Hmmmm, that is about 99.996% too much.... Same as NASA - they spent how many $$$ developing a pen that could write in space? After the wall came down they asked the Russians how they had solved that problem. "We use a pencil" was the reply. Maria
They used a pencil because they sent the owner of the pen factory to Siberia and didn't have any pens handy.
IB is working fine now (US). Came right up and all my pages are back. Wut's all da fuss? (yeah I know, it sux when they are down. costs people money...lucky for me I only need it Mon-Fri...) good luck to all