Showing posts with label Sun. Show all posts

Tuesday, December 11, 2007

Data Center Meltdown

Denial and the coming “data meltdown” by ZDNet's Michael Krigsman -- Subodh Bapat, Sun Microsystems eco-computing vice president, believes we’ll soon see the first world-class data center meltdown. According to News.com: “You’ll see a massive failure in a year,” Bapat said at a dinner with reporters on Monday. “We are going to see a data center failure of that scale.” “That scale” referred to the problems [...]

Now let's think about that. How can a data center fail?

  1. It can lose power for an extended period. It takes a lot of backup generators to keep a 50 megawatt data center humming.
  2. A virus or worm can shut it down.
  3. A natural disaster can destroy it or force a shutdown. Oftentimes the shutdown is due to a power failure rather than destruction.
  4. It can have its WAN/Internet connection(s) severed. This isn't quite catastrophic unless you're on the other side of the WAN.

Michael has pointed out that Subodh Bapat doesn't name the expected cause of a major data center meltdown; he just says one is coming. That's because it's not really a matter of one cause. There are so many risks that, once you multiply them by the growing number of data centers, a major failure somewhere is bound to happen soon. We just don't know what the precise cause will be.
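To put rough numbers on that "bound to happen" intuition, here is a minimal sketch. It assumes every data center has the same small, independent chance of a major failure in a given year, which is a simplification, and the 0.1% figure is made up purely for illustration. The function name is hypothetical too.

```python
def prob_at_least_one_failure(p_single: float, n_sites: int) -> float:
    """Chance that at least one of n_sites fails in a year.

    Assumes each site fails independently with the same annual
    probability p_single (an illustrative simplification).
    """
    if not 0.0 <= p_single <= 1.0:
        raise ValueError("p_single must be between 0 and 1")
    if n_sites < 0:
        raise ValueError("n_sites must not be negative")
    # P(no site fails) = (1 - p)^n, so P(at least one fails) = 1 - (1 - p)^n.
    return 1.0 - (1.0 - p_single) ** n_sites


if __name__ == "__main__":
    # Hypothetical 0.1% annual risk per site; watch the total climb with n.
    for n in (10, 100, 1000, 10000):
        print(n, round(prob_at_least_one_failure(0.001, n), 3))
```

Even a tiny per-site risk adds up once there are enough sites, and that holds without knowing which of the causes above does the damage.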

Most of the threats involve a geographically localized destruction or disabling of the data center. This means you need off-site recovery, and it probably needs to be fast. That means you probably need more than recovery: you need one or more hot sites that can take over the load of one that fails. This is extremely expensive for a "normal" data center. I can hardly imagine how much it would cost for a 50+ megawatt facility. Basically what we have is too many eggs in one basket, with economies of scale pushing us to keep on putting more eggs in the same basket.

What surprises me is that Subodh Bapat didn't say, "Oh, well, Sun has the solution," considering that Jonathan Schwartz put it forward over a year ago. OK, well, he didn't exactly suggest Blackbox as a means of easily distributing data centers. But think about it. If you're a large corporation, you are probably already geographically distributed. If you expand your data center in cargo-container-sized units across the nation (or world), you are half... err... maybe one quarter of the way there. You still have to figure out how to make them hot sites for each other, or at least recovery sites. But at least the losses at any given site would be minimized.


Monday, July 16, 2007

Sun Ray Thin Clients

Last week I made a comment on Paul Murphy's blog about how the thin-ness of Sun Rays is really up to interpretation. Today he's decided to dedicate an entire blog to explaining why I'm wrong, because he figures if I have an incorrect understanding of Sun Rays, then a lot of people have an incorrect understanding of Sun Rays. He's probably right, although I don't think my understanding is that far off base, and he's been kind enough to let me see a draft copy of his blog so I can get a head start on the response. Here's what I said:

Smart, Thick, Thin, Display

It's all word games. Depending on how you define "processing," there is processing going on. It still has to render graphics, translate keyboard and mouse events, etc. A SunRay is just a compacted Sun workstation of yesteryear without a harddrive and special firmware designed to work solely as an X-Windows server.

The problem is the attempt to make "smart displays" seem more fundamentally different from other similar solutions just muddies the waters. People like me groan because yet another term has been introduced that means almost the same as other terms that will need to be explained to the higher-ups. The higher-ups get confused and either latch onto it or, more likely, have their eyes glaze over.

Anyway, enough with our industry's incredible ability to make sure words are completely meaningless...

The problem with Sun Ray and other similar solutions is that they are really a local optimum based on today's technology and practices for a relatively narrow range of priorities. Change the priorities and the solution is no longer optimum. Introduce distributed computing techniques with the same low administrative overhead and they lose out entirely.

As far as I can tell, the first part is technically accurate. Older Sun Rays used a 100 MHz UltraSPARC II, had 8 MB of RAM, and ran a microkernel. See here and here. Newer ones use an even beefier system-on-a-chip.

So the Sun Ray client is obviously processing something, and actually has a fair amount of processing power. Just because it is not maintaining any application state doesn't mean it's not doing anything. Murph asserts that a Sun Ray is not an X-Terminal, but he'll have to explain the difference to me. He could be right... I don't know. It's been about 7 years since I've used a Sun Ray, but from what I remember it felt just like using Exceed on a PC under Windows, which is quite common at my employer. He did mention this:

Notice that the big practical differences between the Sun Ray and PC all evolve from the simplicity of the device in combination with the inherently multi-user nature of Unix. In contrast the differences between the Sun Ray and X-terminal arise because the X-terminal handles graphics computation and network routing - making it more bandwidth efficient, but marginally less secure.

But the Sun Ray quite clearly has a graphics accelerator and talks over the network, so while there is probably a subtle difference in there that I'm not grasping, it doesn't seem particularly material.

But that's not really the meat of the debate; it's just a technical quibble over what constitutes processing and an operating system. He's diluting the debate by calling Sun Rays "smart displays" instead of "thin clients" and thus drawing a false dichotomy, and I'm doing the same by pointing at internal technical specs that have little to do with actual deployment.

The real debate is: "Where should processing take place?" I'll give you a trite answer: as close to the data as possible. Any computation involves a set of inputs and a set of outputs. It makes no sense to shuttle a million database rows from a database server to an application server or client machine in order to sum up a couple of fields. It makes much more sense to do it where the data is, and then ship the result over the network. Likewise, if you have a few kilobytes of input data and several megabytes or gigabytes of results, it makes sense to do the computation wherever the results are going to be needed.

So this is my first issue with the centralized computing paradigm. Right now I'm typing this blog in Firefox on Linux, and my computer is doing a fair amount of work to facilitate that interaction with Blogger. I've also got a dozen other windows open. Most of the memory and CPU I'm consuming is dedicated to the local machine interacting with me, the local user. Only a couple of pages of text are being exchanged back and forth with Blogger.

So why not let the Sun Ray run Firefox (and an email client, a word processor, etc.)? The new ones have the processing power. They probably would need $100 worth of RAM or so to keep a stripped-down Unix variant in RAM, which could be loaded from the network.
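The "as close to the data as possible" rule from the database example can be sketched in a few lines. This is only an illustration, assuming an in-memory SQLite database stands in for the database server. The table name, column names, and helper functions are hypothetical, and the count of rows handed back to the caller stands in for network traffic.

```python
import sqlite3


def build_demo_db(n_rows: int) -> sqlite3.Connection:
    """Create a throwaway in-memory table with n_rows made-up amounts."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, amount INTEGER)")
    conn.executemany(
        "INSERT INTO orders (amount) VALUES (?)",
        ((i % 100,) for i in range(n_rows)),
    )
    conn.commit()
    return conn


def sum_at_client(conn: sqlite3.Connection) -> tuple[int, int]:
    """Pull every row to the caller, then add them up there."""
    rows = conn.execute("SELECT amount FROM orders").fetchall()
    return sum(amount for (amount,) in rows), len(rows)


def sum_at_database(conn: sqlite3.Connection) -> tuple[int, int]:
    """Let the database do the sum and hand back a single row."""
    rows = conn.execute("SELECT SUM(amount) FROM orders").fetchall()
    return rows[0][0], len(rows)


if __name__ == "__main__":
    conn = build_demo_db(1_000_000)
    # Each call returns (total, rows handed to the caller).
    print("client side:  ", sum_at_client(conn))
    print("database side:", sum_at_database(conn))
```

Both calls return the same total, but the first moves every row to the caller and the second moves one. The reverse case, small inputs and huge outputs, argues just as strongly for computing where the results are consumed.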
Intelligent configuration could make the client smart about whether to run an app locally, on a server, or on an idle workstation down the hall.

Murph gives seven reasons:

  1. Portability. Murph asserts that with Sun Rays you gain portability, because you can halt a session in one place and immediately resume it in another. I don't doubt that is true, but I don't see any technical reason why the same could not be accomplished with a distributed architecture. All that happens is your terminal becomes the processing server for a remote application. Remember, in Unix, there isn't a fundamental difference between a client and a server. I'm not going to address the laptop debate right now. Murph has made some very good arguments against laptops in the past, based on the security risk of them being stolen despite strong encryption. I think he underestimates the value of laptops and is probably wrong, but there are a substantial number of people who could live with a "portable terminal" because their homes and hotels have sufficient bandwidth.
  2. Reliability. This is where the distributed model really shines. In my experience, networks are generally one of the less reliable portions of the computing environment, especially WANs and my own internet connection. A pure thin-client solution simply stops working when the network goes down. In the past, Murph has asserted that everyone needs network connectivity to work, so this doesn't matter. But in my opinion most professionals can continue working for several hours, possibly at reduced productivity, when disconnected from the network. That buys time for IT to fix the network before the business starts bleeding money in lost productivity. Keeping processing local, along with caching common apps and documents, increases the effective reliability of the system.
  3. Flexibility. Murph lists nothing that cannot be done with a locally-processing workstation.
  4. Security. Don't use x86 workstations, especially running Windows. The security gains are from a more secure operating system on a processor architecture designed for security and reliability. Eliminating permanent storage from the client does buy some security, because there is then no way to walk out the door with all the data, but distributed processing doesn't preclude centralized permanent storage. There are, of course, substantial advantages to having local storage, like being able to make a laptop that can be used in an entirely disconnected fashion. But I think that's a separate debate.
  5. Processing power. There's nothing about a distributed computing model that says you can't install compute servers. Heck, this is done all the time with Windows (both to Windows servers and more commonly to Unix servers). Murph's example of a high-performance email server has nothing to do with the thin-client architecture, and everything to do with properly architecting your mail server.
  6. Cost. There aren't significant cost savings in terms of hardware when switching to Sun Rays. Hardware is cheap, and you can throw out a lot of pieces in the common PC to reduce the cost. In fact, I bet Sun Rays cost more because of the servers. I don't doubt that when effectively administered they cost less to keep running than a Windows solution, but that's mostly because of Unix. I'll admit that it is probably cheaper to administer Sun Rays than my distributed model, because I think mine will require greater skill and discipline (meaning higher paid admins), so in the absence of detailed numbers I'll say it's a wash.
  7. User freedom. This is partially a consequence of using Unix instead of Windows, and mostly a consequence of changing culture.

So as I said before, Sun Rays, and centralized computing in general, represent a kind of local optimum for a given solution and today's practices. But I don't think they make a solid generalized approach.
Distributed computing can be done successfully, with all the advantages of Murph's Sun Ray architecture, using today's technology; it just isn't common.

Now I've ignored the elephant in the room: much essential software only runs on Windows, and the minute you introduce Windows into the mix (local or centralized), you start compromising many of the advantages outlined above. Of course, what good is a computing environment if it won't run the desired software? Consequently, I think it will be a long time before anything like this flies in most enterprise environments.
