Monday, January 21, 2008

I hate Apple and HP

A couple months ago I bought a new MacBook. For the most part I've loved it, but last night I ran into a problem.

When I bought my Mac, Apple was offering a $100 rebate on the purchase of a new printer to go with it. The sales guy pointed out that there are a number of printers that cost around $100, so the printer would be essentially free. I chose an HP C4280 All-in-One. Look at that! If it wasn't for sales tax I would have made five cents off of the purchase. Well, you get what you pay for.

As a printer it's worked fine. I didn't even need to install any drivers. I plugged it in and it just worked. Of course, that's what I expect from Apple. But last night my wife wanted to scan a document and make it into a PDF. I figured, "Ok, that should be easy." Boy was I wrong.

So I launch Image Capture on my Mac to scan the document. It tells me that I don't have any attached devices. Hmmm. I printed a few minutes ago. Why don't I have any attached devices? So maybe I'm using the wrong application. There's a "scan" button on the printer, so I press that, hoping that the right application will magically open up (see what Apple has done to me!). The printer thinks for a minute, and then tells me that it's not connected through USB. Well, of course it is, because I just printed over USB. I decide to do some Googling.

It turns out that while the printer drivers come pre-installed with Leopard, the scanner drivers do not. It's a 192 MB download full of crapware. I hate Apple for making me think that I didn't need to install drivers, and then consuming a chunk of my Sunday evening installing drivers. They set an expectation and then disappointed. It would have been much better to just make me install all the drivers up front.

But why did I say it was full of crapware? Well, let's see. I scanned the document as a PDF with text (so it has to do OCR) using "HP Scan Pro." That worked. Kind of. I did get a decent-looking PDF with the text correctly recognized. I also got a completely locked up HP Scan Pro application, and I mean completely locked up. I tried to log out of my Mac, figuring that would clean up the crashed process. Nope! It sat there for a minute, then complained that an application wouldn't exit and asked if it should quit it forcefully. I of course said yes, and then it just sat there for a few minutes longer. I got the same result from trying to shut down. At least when you tell Windows that it can violently kill processes, it violently kills them. Apparently Mac OS X is too polite, or at least has more patience than I do.

That's another reason to hate Apple. It was worse than Windows, and using a product purchased from the Apple Store no less.

Fortunately I'm a Unix guy and I know how to violently kill processes.

su
ps -ef | grep HP
kill -9 pid1    (pidX is the process id of an HP process)
kill -9 pid2

Until they are all dead. That worked. Of course in the process of doing this I discovered that there are a couple HP processes running as root, which disturbs me to no end.

What I'd like to ask Steve Jobs is: How many Mac users would know to drop down to a terminal, su to root, and violently kill the processes? I just can't see your average non-techie Mac user doing that. Apple should really do a better job screening the products it sells with new computers.

Wednesday, January 02, 2008

Slightly Less Double (Im)precision

Anonymous writes:

You may want to try evaluating the polynomial using a different form. For example, given the polynomial:

A*x^3 + B*x^2 + C*x + D

one is often tempted to just enter it "as is". However it can also be expressed this way:

((A*x + B)*x + C)*x + D

Generally speaking polynomials often work better using floating point this alternate way. Some programming languages/systems know this and convert the expression before evaluating it.

...and he is absolutely right. It makes the code a heck of a lot more efficient, too, because it eliminates the calls to Math.pow. That being said, it does not completely fix the problem. The polynomial line is a lot smoother, and the fitting algorithm yields a lower degree polynomial for the same mean error, but I still think the results are too fuzzy compared to the higher precision floating point. Here's a graph to show the difference:
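To make the difference concrete, here's a minimal sketch of both evaluation strategies in Scala (the helper names and the lowest-degree-first coefficient ordering are just for illustration, not from my fitting code):

// Naive evaluation: one Math.pow call per term
def evalNaive(coeffs: Array[Double], x: Double): Double =
  coeffs.zipWithIndex.map { case (c, i) => c * Math.pow(x, i) }.sum

// Horner's form: ((...(a_n*x + a_(n-1))*x + ...)*x + a_1)*x + a_0
def evalHorner(coeffs: Array[Double], x: Double): Double =
  coeffs.foldRight(0.0)((c, acc) => acc * x + c)

Besides eliminating the Math.pow calls, the Horner form does just one multiply and one add per coefficient, which is also why it tends to behave better in floating point.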

Compared to the previous result:

Further improvement could probably be obtained by taking a close look at the QR Decomposition algorithm used to do the least-squares fitting.

In my opinion, the problem here is not so much that double-precision floating point numbers are bad. They are not. For many applications, especially with carefully crafted algorithms, they are great. They are certainly much higher performance than their higher-precision object-oriented kin. I'll warn you: don't do high precision matrix operations with your MacBook in your lap - it gets really hot, and it takes several orders of magnitude longer than using doubles. The double-precision version finishes in a blink. The problem is that, as an abstraction for real numbers, doubles are extremely leaky. Of course, this could be extended to any fixed-precision, floating-point representation of numbers, depending on the application.

Basically, I think in most applications doubles represent a premature optimization. Higher-precision numbers should be used by default, and then the precision reduced in order to improve performance, rather than doubles being used by default and then higher-precision numbers being considered if the programmer realizes that he has a problem due to lack of precision.

Of course, the problem is I don't know how precise is enough, because it depends entirely on the application. I'm tempted to say that, whenever possible, exact representations should be used. I've done a little research into it, and I'll do some more. There are tons of papers on the subject, but everything I've encountered so far seems to require intimate knowledge of floating point, the algorithm using it, and possibly the data being fed into the algorithm. That could help with library functions, such as matrix decomposition and solving, which could automatically scale up the precision of their internal operations in order to meet the expected resulting precision of the calling code, but that would still be leaky for "user-implemented" algorithms. What I really want is something that will work in the general case with reasonable performance characteristics, which can then be tuned for specific applications by reducing precision.

Open Source, Cost, and Enterprise Product Adoption

This is a relatively common topic, and today it was raised on TSS, as a result of a blog post by Geva Perry:

Are developers and architects becoming more influential in infrastructure software purchase decisions in large organizations?

...with an implied "as a result of open source software." It's an interesting question, and I think it can be generalized to: What effect do license costs have on the acquisition of enterprise software?

In considering this question, it is important to remember that:

  1. In large organizations, money often comes in many colors, such as: expense, capital, and internal labor
  2. The level of authority an individual has depends on both the amount and the color of the money involved
  3. Certain colors of money are easier to obtain than others, and sometimes it varies dependent on the amount
  4. Accounting rules, both standard and self-imposed, affect what can and cannot be purchased with a given color of money

In a nutshell, the budgeting process for large organizations can be extremely idiosyncratic. Not only do the official processes vary, but individual personalities and budget cycles can have profound effects.

So the first effect the cost of a piece of enterprise software has is to limit the various buckets of money that can be used to pay for it. However, this can be a very complex process. Let's assume any given piece of software typically has the following types of cost:

  1. One-time license cost (both for the application and support infrastructure, such as the OS and DBMS)
  2. Recurring maintenance and support cost
  3. Hardware cost (e.g. a server)
  4. Internal labor for development and deployment
  5. External labor for development and deployment

The lower the costs involved, the less approval is required. Driving down license costs pushes the initial acquisition decision closer to the users and/or developers. This is a big win for open source applications. It's probably a bigger win for application vendors. For example, most enterprise applications require a DBMS, such as Oracle. Oracle is not famous for being cheap. So let's say your potential customer can readily obtain $100k to spend on software licenses. If you are a software application company, do you want that money as revenue, or do you want 2/3 of it to go to Oracle and IBM?

I'll give you a hint. You want a department to be able to deploy your software without cutting a check to the big boys, but you also want to be able to say "Yes, we support your enterprise standards" to the big-wigs in the IT department who think that if there isn't a major conference for a piece of software, then it shouldn't be on the network. That way your product can be approved, and running it on Oracle can be deferred until "usage levels warrant it."

Hardware costs are even more interesting. At my employer, equipment that costs $5k or more is "capital," and less than that is expense. Capital is generally easier to obtain if (1) you know you need it a year in advance, or (2) it's the end of the year and there's still money laying around. It is impossible to obtain at the beginning of the year, when everyone thinks that they will actually spend their allocation, unless of course it was approved last year. Conversely, expense money is much more plentiful at the beginning of the year, when managers are still dreaming of sticking to their budgets, and becomes more and more difficult to obtain as reality sets in. So what's the point? Well, you want your product to require a small enough amount of hardware so that a first or second line manager can purchase it on expense without obtaining approval, but also have a recommended configuration that costs enough to be purchased on capital.

This is interesting because big-iron Unix guys will often lament about how their systems have a much lower TCO than x86 Wintel or Lintel systems, so all the arguments about x86 systems being "cheaper" are bunk. What they are ignoring is that it is much easier to spend $5k on twenty small things (plus setup labor on each of those twenty items) than it is to spend $50k on one big item, because the $50k either has to be approved a year in advance or it has to wait until someone else's project falls apart so they can't spend the $50k. The "total" in TCO is completely ignored, because very few people think about the few hours that each of those servers requires to set up.

Now, about labor costs. Managers generally have at least some discretion over how their direct reports spend their time. If you actually think about it in terms of the fully burdened hourly cost of an employee, managers often have significantly more "budget" under their control through their ability to direct how time is spent than they do for purchasing licenses and services. Again, this is a big win for open source.

The bottom line is that the best way to get your foot in the door is to have the lowest marginal cost of deployment as possible. I'll offer as evidence the countless wikis that have popped up around my workplace, very few of which are even officially approved.

Of course, this makes you wonder why the marginal cost of deploying a piece of enterprise software tends to be so high. Why aren't more vendors practically giving their software away for small deployments? Well, many do, such as SQL Server Express and Oracle XE. But there are still more that don't. The problem is that it's really hard to get the total cost of initial deployment down below the point where the bureaucracy kicks in, and once it kicks in, it helps to be more expensive.

Yes, that's right, I said more expensive.

You see, these days one of the great ways to make your career in IT is to be a good negotiator. The combination of off-the-shelf software and outsourcing has shifted IT expenses from being dominated by internal labor to being dominated by procurement contracts. However, you don't build an illustrious career by negotiating $10k contracts. Likewise, once you pass a relatively small threshold someone decides that the project needs a "real" project manager, instead of just an "interested" manager or a technical lead, and project managers are unfortunately measured more by the size of their projects than by the value that they deliver. (Yes, that's right, screwing up a multi-million dollar ERP implementation is better for your career than successfully deploying some departmental application under budget and on schedule.)

In other words, once the signatures of additional people are required, you have to have something big enough for those people to consider it worth their time. So, if you are a software vendor, or an internal employee planning a deployment project, then you either need to go really small and viral or really big. Medium size projects are simply too hard to push through.

And, in my opinion, that's really a shame, because medium sized projects tend to have the best value proposition. Small ones involve too little of the organization to have a significant impact, and large ones become too unwieldy to meet their objectives. Projects need to be large enough to do things right in terms of technology and to engage future users, but small enough to have readily apparent benefits and incremental deliveries that provide real value (not simply "look, it's a login screen!").

Maybe it's different in other organizations, but somehow I doubt it. However, I'd be really interested in knowing what others' experiences are.

Monday, December 31, 2007

Double (Im)precision

Most computer users have, at one time or another, received an odd result from a spreadsheet or other program that performs calculations. Most programmers know that this is because of the impedance mismatch between the most common way for computers to "think" about numbers (base 2) and the way most people and businesses think about numbers (base 10). This issue receives a fair amount of attention, probably because we've all (we being programmers) had to explain to users why a piece of software can't seem to do arithmetic properly. If you're scratching your head right now, or want to know more, I suggest reading this FAQ about decimal arithmetic.

However, as I've long known but rarely contemplated, there is another form of floating point error that can cause significant problems. Standard floating point numbers only store so many digits of precision. The result is that if you add a very large number to a very small number, the very small number simply vanishes. Let me demonstrate:

scala> 10000000000000000.0 + 1.0
res62: Double = 1.0E16
scala> 

The 1.0 simply vanished, because standard double-precision floating point numbers don't have enough digits to store the 1.0 part of such a large number. Doubles are an approximation, and that's fine, because oftentimes we're only approximating things anyway, and the roughly 15 decimal digits of precision provided by a double are plenty, right?
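To convince yourself that this really is just a matter of precision, you can redo the same sum with an arbitrary-precision type. Here's a minimal sketch using Scala's BigDecimal (standing in for the JScience types I use below, just because it's in the standard library; the 30-digit precision is an arbitrary choice):

import java.math.MathContext

val mc  = new MathContext(30)   // 30 significant decimal digits
val big = BigDecimal("10000000000000000", mc) + BigDecimal("1", mc)
// big == 10000000000000001

This time the 1.0 survives, because 17 significant digits fit comfortably within the 30 we asked for.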

Well, actually, no. It depends on what you're doing with them. A month or so ago my father convinced me that instead of spending my free time playing with AI problems and studying fringe programming languages, I should do something useful like predict stock price movements. Of course I can use AI and fringe programming languages to do this...

The first thing I decided to do was to smooth out stock price history and make it continuous by fitting a polynomial. So I wrote a quick polynomial class, grabbed Jama (a Java matrix library), downloaded some data, wrote a function to search for the appropriate degree, and this is what I got (supposedly with about 4% error):

Hmmm...that looks kind of funny. The blue line is the actual price. The red line is the polynomial, which has a degree of 44. That's a rather large degree, but certainly not enough to generate all those squiggles. Those squiggles are an artifact of double precision numbers not being precise enough to be used in calculating a degree-44 polynomial. They don't work that well for the matrix calculations that produce the polynomial, either.
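For the curious, the fitting step itself is only a few lines with Jama. Here's a rough sketch of what the naive version looks like (the function and variable names are mine for illustration, not lifted from my actual code):

import Jama.Matrix

// Least-squares fit of a degree-d polynomial to the (x, y) data.
// Jama's solve() falls back to QR decomposition for overdetermined systems.
def fitPolynomial(xs: Array[Double], ys: Array[Double], degree: Int): Array[Double] = {
  // Vandermonde matrix: row i is [1, x_i, x_i^2, ..., x_i^degree]
  val a = new Matrix(Array.tabulate(xs.length, degree + 1)((i, j) => Math.pow(xs(i), j)))
  val b = new Matrix(ys.map(Array(_)))
  val coeffs = a.solve(b)                  // minimizes ||A*c - y||
  Array.tabulate(degree + 1)(j => coeffs.get(j, 0))
}

With a degree of 44, both building that Vandermonde matrix and solving the resulting system in double precision are exactly where the accuracy gets chewed up.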

I think that red squiggly line is a very good illustration of what happens when you don't pay enough attention to how your software works under-the-hood. Anyway, here's what the results look like using floating point numbers (courtesy of JScience) with 80 digits of precision (keeping the error at about 4%).

Looks a lot better, doesn't it? One interesting thing is that the resulting polynomial has a much, much lower degree than what was produced by the same algorithm using double precision numbers. With double precision numbers, 4% error was about as good as it could get with this dataset. However, using the higher-precision numbers it can go much lower. Below is a graph at about 2% error:

At this point you might be wondering why anyone would use double precision numbers for this type of thing, or perhaps thinking I'm a moron for trying. I certainly felt a little stupid. But using doubles is relatively common. Matlab and everyone's favorite Excel use doubles. Many of the other toolkits and libraries that I found do as well.

Overall, I'm either missing something (because the use of doubles is so common), or this is very frightening. I'm not an expert in this area. I know that doubles use much, much less memory and computational resources. They are also "just there" in pretty much all programming environments, so they are the default. Making arbitrary precision floating points work right wasn't a cakewalk, either. I spent the better part of a day tracing through my code, only to discover a bug in JScience. Also, once you have everything working, you have to figure out what precision to use. The default of 20 digits of precision in JScience wasn't noticeably better than regular doubles.

Tuesday, December 11, 2007

Data Center Meltdown

Denial and the coming “data meltdown” by ZDNet's Michael Krigsman -- Subodh Bapat, Sun Microsystems eco-computing vice president, believes we’ll soon see the first world-class data center meltdown. According to News.com: “You’ll see a massive failure in a year,” Bapat said at a dinner with reporters on Monday. “We are going to see a data center failure of that scale.” “That scale” referred to the problems [...]

Now let's think about that. How can a datacenter fail?

  1. It can lose power for an extended period. It takes a lot of backup generators to keep a 50 megawatt datacenter humming.
  2. A virus or worm can shut it down.
  3. A natural disaster can destroy it or force a shutdown. Often times this is due to a power failure rather than destruction.
  4. It can have its WAN/Internet connection(s) severed. This isn't quite catastrophic unless you're on the other side of the WAN.

Michael has pointed out that Subodh Bapat doesn't point to the expected cause of a major data center meltdown; he just says one is coming. That's because it's not really a matter of one cause. There are so many risks, and when you multiply them by the growing number of data centers, what you come up with is that there's bound to be a major failure soon. We just don't know what the precise cause will be.

Most of the threats involve a geographically localized destruction or disabling of the data center. This means you need off-site recovery, and it probably needs to be fast. That means you probably need more than recovery; you need one or more hot sites that can take over the load of one that fails. This is extremely expensive for a "normal" data center. I can hardly imagine how much it would cost for a 50+ megawatt facility. Basically what we have is too many eggs in one basket, with economies of scale pushing us to keep putting more eggs in the same basket.

What surprises me is that Subodh Bapat didn't say, oh, well, Sun has the solution, considering that Jonathan Schwartz put it forward over a year ago. Ok, well, he didn't exactly suggest Blackbox as a means of easily distributing data centers. But think about it. If you're a large corporation you are probably already geographically distributed. If you expand your data center in cargo-container sized units across the nation (or world), you are half..err..maybe one quarter of the way there. You still have to figure out how to make them be hot sites for each other, or at least recovery sites. But at least losses at any given site would be minimized.
