Wednesday, January 12, 2011

Types of Project Failure

I saw this question on StackExchange and it made me think about all the projects I've been on that have failed in various ways, and at times been both declared a success and in my opinion been a failure at the same time (this happens to me a lot). Michael Krigsman posted a list of six different types a while back, but they are somewhat IT centric and are more root causes than types (not that identifying root causes isn't a worthwhile exercise - it is). In fact, if you Google for "types of project failures," the hits in general are IT centric. This may not seem surprising, but I think the topic deserves more general treatment. I'm going to focus on product development and/or deployment type projects, although I suspect much can be generalized to other areas.

Here's my list:

  1. Failure to meet one or more readily measurable objectives (cost, schedule, testable requirements, etc)
  2. Failure to deliver value commensurate with the cost of the project
  3. Unnecessary collateral damage done to individuals or business relationships

I think these are important because in my experience they are the types of failure that are traded "under the table." Most good project managers will watch their measurable objectives very closely, and wield them as a weapon as failure approaches. They can do this because objective measures rarely sufficiently reflect "true value" to the sponsor. They simply can't, because value is very hard to quantify, and may take months, years, or even decades to measure. By focusing on the objective measures, the PM can give the sponsor the opportunity to choose how he wants to fail (or "redefine success," depending on how you see it) without excessive blame falling on the project team.

Subjective Objective Failure

The objective measures of failure are what receive the most attention, because, well, they're tangible and people can actually act upon them. My definition of failure here is probably a bit looser than normal. I believe that every project involves an element of risk. If you allocate the budget and schedule for a project such that it has a 50% chance of being met - which may be a perfectly reasonable thing to do - and the project comes in over budget but within a reasonably expected margin relative to the uncertainty, then I think the project is still a success. The same goes for requirements. There's going to be some churn, because no set of requirements is ever complete. There's going to be some unmet requirements. There are always some requirements that don't make sense, are completely extraneous, or are aggressive beyond what is genuinely expected. The customer may harp on some unmet requirement, and the PM may counter with some scope increase that was delivered, but ultimately one can tell if stakeholders feel a product meets its requirements or not. It's a subjective measure, but it's a relatively direct derivation from objective measures.
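To put a rough number on that intuition, here's a small thought experiment in Scala. It's only a sketch with made-up numbers (a $1M estimate with roughly 30% lognormal uncertainty), not a model of any real project: budget at the median of the cost distribution and about half of your well-run projects will come in over budget, with the typical overrun size falling straight out of the spread you assumed.

```scala
import scala.util.Random

object BudgetRisk extends App {
  val rng = new Random(42)

  // Hypothetical model: the true cost of a project is lognormally
  // distributed around a $1M estimate with roughly 30% uncertainty.
  val estimate = 1000000.0
  val sigma    = 0.3
  val costs    = Seq.fill(100000)(estimate * math.exp(sigma * rng.nextGaussian()))

  // Budgeting at the median means a ~50% chance of coming in at or under budget.
  val budget        = estimate
  val overruns      = costs.filter(_ > budget).map(_ / budget).sorted
  val fractionOver  = overruns.size.toDouble / costs.size
  val medianOverrun = overruns(overruns.size / 2)

  println(f"Fraction of projects over budget: $fractionOver%.2f")
  println(f"Median overrun factor (when over budget): $medianOverrun%.2f")
}
```

With those assumed numbers the typical overrun lands somewhere around 20%, which is exactly the kind of result I'd still call a success.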

Value Failure

Failure to deliver sufficient value is usually what the customer really cares about. One common situation I've seen is where requirements critical to the system's ConOps, especially untestable ones or ones that are identified late in the game, are abandoned in the name of meeting more tangible measures. In a more rational world such projects would either be canceled or have success redefined to include sufficient resources to meet the ConOps. But the world is not rational. Sometimes the resources can't be increased because they are not available, but the sponsor or other team members choose to be optimistic that they will become available soon enough to preserve the system's value proposition, and thus the team should soldier on. This isn't just about budget and schedule - I think one of the more common problems I've seen is the allocation of critical personnel. Other times it would be politically inconvenient to terminate a project: because of the loss of face, because it would require pointing out an inconvenient truth about some en vogue technology or process (e.g. one advocated by a politically influential group or person), because idle workers cost just as much when they're idle as when they're doing something that might have value, or just because of a general obsession with sunk costs.

This is where the "value commensurate with the cost" comes in. Every project has an opportunity cost in addition to a direct cost. Why won't critical personnel be reallocated even though there's sufficient budget? Because they're working on something more important. A naive definition of value failure would be a negative ROI, or an ROI below the cost of capital, but it's often much more complicated.

It's also important to note here that I'm not talking about promised ROI or other promised operational benefits. Often times projects are sold on the premise that they will yield outlandish gains. Sometimes people believe them. Often times they don't. But regardless, it's usually perfectly possible to deliver significant value without meeting these promises. This would be a case of accepting objective failure because the value proposition is still there, it's just not as sweet as it was once believed to be.

Collateral Damage

Collateral damage receives a fair amount of press. We've all heard of, and most of us have at one time or another worked on, a Death March project. In fact, when I first thought about failure due to collateral damage, I thought any project causing significant collateral damage would also fall under at least one of the other two categories. It would be more a consequence of the others than a type unto itself. But then I thought about it, and realized I had experienced a couple of projects where I feel the damage done to people was unacceptable, despite the fact that in terms of traditional measures and value delivered they were successful. There are a couple of ways this can happen. One is that there are some intermediate setbacks from which the project ultimately recovers, but one or more people are made into scapegoats. Another way arises from interactions among team members that are allowed to grow toxic but never reach the point where they boil over. In my experience a common cause is an "extreme" performer, either someone who is really good or really bad, with a very difficult personality, particularly when the two are combined on a project with weak management.

Now, you might be wondering whether collateral damage is ever necessary. It's actually fairly simple. Consider a large project that involves a large subcontract. The project may honestly discover that the value contributed by the subcontract does not justify its cost, or that the deliverables of the subcontract are entirely unnecessary. In this case the subcontract should be terminated, which in turn is likely to lead to a souring of the business relationship between the contractor and the subcontractor, along with potentially significant layoffs at the subcontractor. Often times a business must take action, and there will be people or other businesses who lose out through no fault of their own.

Back to Trading Among Failure Types

A Project Manager, along with project sponsors and other key stakeholders, may actively choose which type of failure, or balance among them, is most desirable. Sometimes this "failure" can even really be success, so long as significant value is delivered. Some of the possible trades include:

  1. Failing to meet budget and schedule in order to ensure value.
  2. Sacrificing value in order to meet budget and schedule...
  3. ...potentially to avoid the collateral damage that would be inflicted in the case of an overt failure
  4. Inflicting collateral damage through practices such as continuous overtime in order to ensure value or completion on target
  5. Accepting reduced value or increased budget/schedule in order to avoid the collateral damage of the political fallout for not using a favored technology or process

Ultimately some of these trades are inevitable. Personally, I strongly prefer focusing on value. Do what it takes while the value proposition remains solid and the project doesn't resemble a money pit. Kill it when the value proposition disappears or it clearly becomes an infinite resource sink. But of course I know this is rather naive, and sometimes the political fallout, which I tend to willfully ignore in my quest for truth, justice, and working technology, has an unacceptable if intangible impact. One of my most distinct memories is having a long conversation with a middle manager about a runaway project, and at the end being told something like "Erik, I agree with you. I know you're right. We're wasting time and money. But the money isn't worth the political mess it would create if I cut it off or forcefully reorganized the project." I appreciated the honesty, but it completely floored me, because it meant not only that politics could trump money, but that politics could trump my time, which was being wasted. Now I see the wisdom in it, and simply try to avoid such traps when I see them and escape them as quickly as possible when I trip over them.

Saturday, January 16, 2010

Changing Tastes

When I was a kid, I hated onions, green peppers, and mushrooms. I used to tell people I was allergic to mushrooms so they wouldn't try to make me eat them. I hated any sort of chunky sauce or really textured meat. I think I wanted everything to have either the consistency of a chicken nugget or ketchup. My parents used to tell me that when I was older my tastes would change. That I liked crappy food, disliked good food, and eventually I would realize it. They were right.

So kids like chicken nuggets and ketchup. Wow, huge revelation. What does this have to do with technology? I'm on the steering committee for an internal conference on software engineering that my employer is holding. I'm the most junior person on the committee, and most of the members are managers who have more managers reporting to them. Our technical program committee (separate, more technical people on it, but all very senior) just finished abstract selection and we've been discussing topics and candidates for a panel discussion. During this process I have been absolutely shocked by how my tastes differ from those of my colleagues.

I've noticed that within the selected presentations, topics on process, architecture, and management are over-represented. On the other side, many of the abstracts that I thought were very good and deeply technical fell below the line. I can't quite say there was a bias towards "high level" topics, because I think they were over-represented in the submissions. Given the diversity of technology that's almost inevitable. A guy doing embedded signal processing and a guy doing enterprise systems would most likely submit very different technical abstracts, but ones on management or process could be identical. It's almost inevitable that topics that are common across specialties will have more submissions.

There's a similar story with the panel discussion. I wanted a narrower technical topic, preferably one that is a little controversial so panelists and maybe even the audience can engage in debate. My colleagues were more concerned with who is going to be on the panel than what they would talk about, and keeping the topic broad enough to give the panelists freedom.

What's clear is that my colleagues have different tastes in presentation content than I do. I think they are genuinely trying to compose the best conference they can, and using their own experiences and preferences as a guide. I think their choices have been reasonable and well intentioned. I just disagree with many of them. If I had been the TPC chair, I would have explicitly biased the selection criteria towards deeper, technical topics. Those are the topics I would attend, even if they are outside my area of expertise. I would use my preferences as a guide. But that leaves me wondering, in another five or ten years are my tastes going to change? My tastes have certainly changed over the past decade, so I have no reason to believe they won't change over the next. Will I prefer process over technology and architecture over implementation? Will I stop thinking "show me the code!" and "show me the scalability benchmarks!" when I see a bunch of boxes with lines between them? I don't think so, but only time will tell. When I was a kid, I would have never believed that I would ever willingly eat raw fish, much less enjoy it, but today I do.


Tuesday, January 05, 2010

Your Company's App

Tim Bray just posted a blog about how Enterprise IT is doing it wrong. I can't really argue with that. He goes on to explain that Enterprise IT needs to learn from those companies building Web 2.0, because they deliver more functionality in less time and for a whole lot less money. This is where his argument breaks down. The problem is that the types of Enterprise Systems he's talking about aren't Twitter; they're your company's app.

I work at a pretty big, stodgy, conservative company. I'm pretty sure, as far as things like ERP and PLM are concerned, my employer is exactly the type of company Tim is talking about, and like I said - he's probably right about the problem. But based on my own experiences and observations at my one lonely company, I think he's wrong in his suggestions. Comparing ERP and Facebook is like comparing apples and Apple Jacks.

The reason is that, in terms of deploying Enterprise 2.0 applications, I think my employer has done fairly well. We have...

...and probably more stuff that I'm not aware of yet. Some of the above were bought, some were built, some were cobbled together with stuff from a variety of sources. I think most of them were built and deployed, at least initially, for costs about as competitive as possible with Web 2.0 startups and in reasonable timeframes. Of course, this means intranet scale at a Fortune 100 company (or in some cases a subset of it), not successful internet scale.

The problems the above face are much like the problems your typical startup faces: attracting users, keeping investors (sponsors) happy, dealing with the occasional onslaught of visitors after some publicity. But these problems are very different from the problems a traditional Enterprise Application faces. There is no SOX compliance to worry about. There are no entrenched stakeholders. There are no legacy systems, or if there are they aren't that important. If only 10% of the potential user community actually uses the application, it's a glowing success. Negligible adoption can be accepted for extended periods while the culture adjusts, because the carrying cost of such applications is low.

But enterprise systems need to deal with SOX. They have more entrenched stakeholders than you can count. These days there's always at least one legacy system, and often several due to disparate business units and acquisitions. If these systems fail, business stops, people don't get paid, or the financials are wrong. If only 90% of your buyers use your ERP system to process purchase orders, it's an abject failure and possibly endangers the company.

A year or two ago someone "discovered" that a very common, important record in one of our internal systems had 44 (or something like that) required fields, and decided that this was ridiculous. A team was formed to streamline the processes associated with this record by reducing the number of fields. A detailed process audit was conducted. It turned out that every single one of them played a critical role in a downstream process. All the fields remained, and some old timers smiled knowingly.

As many humorless commenters pointed out on Eric Burke's blog, your company's app is your company's app for a reason, and often it's a good reason. These applications aren't complicated because of the technology. Internet applications are complex due to extrinsic forces - the sheer number of users and quantity of data that can deluge the application at any given moment. Enterprise systems tend to be the opposite. Their complexity is intrinsic, due to the complexity of the diverse processes they support. The technology complexity (most of which is accidental) is secondary. Tim's suggestions provide a means of addressing technological complexity, and of building green field non-business-critical applications in the face of uncertain requirements. They don't provide a means for dealing with critical systems laden with stakeholders, politics, and legacy.

I think the solution, or at least part of the solution, to the problem with enterprise systems lies in removing much of the functionality from them. There are things that must be right all of the time, but most of business (and engineering, etc) exists on a much fuzzier plane. The problem comes from coupling the precise (e.g. general ledger) with the imprecise (e.g. CM on an early stage design), and thus subjecting the imprecise to overly burdensome controls and restrictions. Only after this separation has been carefully implemented can functionality evolve in an agile fashion.
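To make that separation a little more concrete, here's a minimal sketch in Scala. Every name in it is hypothetical, and the real boundary would sit between systems rather than between two classes, but the shape is the point: the fuzzy side stays free to change, the precise side keeps its invariants, and the only coupling is an explicit, validated hand-off.

```scala
// Imprecise side: early-stage configuration management data that is allowed
// to be incomplete and to change shape as the design evolves.
final case class DraftDesignChange(
  description: String,
  estimatedCost: Option[BigDecimal]  // may simply not be known yet
)

// Precise side: a ledger posting that must always satisfy its controls.
final case class LedgerEntry private (account: String, amount: BigDecimal)

object LedgerEntry {
  // The only way across the boundary: an explicit promotion that either
  // meets the precise side's rules or is rejected, without burdening the
  // draft side with those rules up front.
  def fromDraft(draft: DraftDesignChange, account: String): Either[String, LedgerEntry] =
    draft.estimatedCost match {
      case Some(cost) if cost > BigDecimal(0) => Right(new LedgerEntry(account, cost))
      case _                                  => Left("draft is not yet precise enough to post")
    }
}
```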

Tuesday, April 21, 2009

McKinsey and Cloud Computing

McKinsey has created a tempest-in-a-teapot by denouncing the economics behind both in-the-cloud clouds such as Amazon EC2 and behind-the-firewall clouds for large enterprises. At a high level I think their analysis is actually pretty good, but the conclusions are misleading due to a semantic twist. They use Amazon EC2 as a model, and their conclusions go something like this:

  1. Amazon EC2 virtual CPU cycles are more expensive than real, in-house CPU cycles
  2. You waste 90% of those in-house CPU cycles
  3. You'll waste almost as many of those virtual cloud CPU cycles, only they cost more, so they are a bad deal
  4. You stand a decent shot at saving some of those real CPU cycles through virtualization, so you should aggressively virtualize your datacenter
  5. You're too inept to deliver a flexible cloud behind-the-firewall, so don't even try

I'll let you ponder which of the above statements is misleading while I address some related topics.

The goals of cloud computing are as old as computing itself.  They are:

  1. Reduce the time it takes to deploy a new application
  2. Reduce the marginal cost of deploying a new application over "standard" methods
  3. Reduce the marginal increase to recurring costs caused by deploying a new application over "standard" methods

Back in the days of yore, when programmers were real men, the solution to this was time sharing. Computers were expensive and therefore should be run at as high a utilization as possible. While making people stand in line and wait to run their batch jobs was a pleasing ego trip for the data center operators, the machines still wasted CPU time while performing slow I/O operations, and waiting in line generally made users unhappy. Thus time sharing was born, and in a quite real sense the first cloud computing environments, because in many cases a large institution would purchase and host the infrastructure and then lease it out to smaller institutions or individuals.

The problem here is that the marginal cost equations end up looking like a stair-step function.  If you had a new application, and your enterprise / institution had excess mainframe capacity, then the marginal cost of letting you run your application was near zero.  But if there was no spare capacity - meaning the mainframe was being efficiently utilized - then the marginal cost was high because either someone else had to be booted off or you needed an additional mainframe.

Now fast-forward a couple decades to the PC revolution. Somewhere along the way the cost curves for computers and people crossed, so it became appropriate to let the computer sit idle waiting for input from a user rather than having a user sit idle while waiting for a computer. Now you could have lots of computers with lots of applications running on each one (although initially it was one application at a time, but still, the computer could run any number of them). This smoothed out the non-recurring marginal cost curve, but as PCs proliferated it drove up recurring costs through sheer volume.

Unfortunately this had problems. Many applications didn't work well without centralized backends, and some users still needed more compute power than could be reasonably mustered on the desktop. So the new PCs were connected to mainframes, minicomputers, and eventually servers. Thus client-server computing was born, along with increasingly confusing IT economics. PCs were cheap, and constantly becoming cheaper, but backend hardware remained expensive. The marginal non-recurring cost became completely dependent on the nature of the application, and recurring costs simply began to climb with no end in sight.

Now fast forward a little more. Microsoft releases a "server" operating system that runs on souped-up PCs and convinces a whole bunch of bean counters that they can solve their remaining marginal non-recurring cost problems with Wintel servers that don't cost much more than PCs. No more expensive servers. No more having to divide the cost of a single piece of hardware across several projects. Now if you want to add an application you can just add an inexpensive new Wintel server. By this time the recurring cost equation had already become a jumbled mess, and the number of servers was still dwarfed by the PC on every desk, so there was no holding back the ever-increasing recurring costs. This problem was then further exacerbated by Linux giving the Unix holdouts access to the same cheap hardware.

Thus began the era of one or more physical servers per application, which is where we are today, with McKinsey's suggestion for addressing it: virtualization behind the firewall. The problem with this suggestion is that, for a large enterprise, it isn't really that different from the in-the-cloud solution that they denounce as uneconomical. One way is outsourcing a virtualized infrastructure to Amazon or similar, and the other is outsourcing it to their existing IT provider (ok, not all large enterprises outsource their IT, but a whole lot do).

Virtualization, in the cloud or otherwise, isn't the solution because it doesn't address the root cause of the problem - proliferation of (virtual) servers and the various pieces of infrastructure software that run on them, such as web servers and databases.  Hardware is cheap.  Software is often expensive.  System administrators are always expensive.  Virtualization attacks the most minor portion of the equation.

Virtualization is the right concept applied to the wrong level of the application stack.  Applications need to be protected from one another, but if they are built in anything resembling a reasonable way (that's a big caveat, because many aren't) then they don't need the full protections of running in a separate OS instance.  There's even a long standing commercially viable market for such a thing: shared web hosting.

It may not be very enterprisey, but shared web site/application hosting can easily be had for about $5 per month.  The cost quickly goes up as you add capabilities, but still - companies are making money by charging arbitrary people $5 per month to let them run arbitrary code on servers shared by countless other customers running arbitrary code.  How many enterprise IT organizations can offer a similar service at even an order-of-magnitude greater cost?

Not many, if any.  Yet do we see suggestions pointing out that Apache, IIS, Oracle, SQL Server, and countless other pieces of infrastructure can relatively easily be configured to let several applications share compute resources and expensive software licenses?  Nope.  They suggest you take your current mess, and virtualize it behind the firewall instead of virtualizing it outside the firewall.

Friday, February 08, 2008

Linguistic Success

There's a new favorite pastime on the fringes of the Scala community. That pastime is blogging about aspects of the Scala language that will prevent it from being a "success" unless they are quickly addressed. "Success" is roughly defined as "widespread commercial use." A good metric might be: "How hard is it to find a job programming Scala?" The premise of the criticism usually revolves around one or more complicated, terse, or "foreign" (relative to the author) constructs that are common in Scala, or at least favorites among frequent posters, and how these constructs will prevent "average" programmers from understanding Scala and thereby prevent commercial adoption. A recent favorite is symbolic function names (/: and :\ for folds).
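For readers who haven't run into them, here's roughly what those symbolic names look like next to their spelled-out equivalents (Scala 2.x; the symbolic forms were eventually deprecated in favor of the spelled-out ones):

```scala
val xs = List(1, 2, 3, 4)

// The symbolic forms the critics object to:
val sumLeft  = (0 /: xs)(_ + _)   // fold left:  ((((0 + 1) + 2) + 3) + 4)
val sumRight = (xs :\ 0)(_ + _)   // fold right: (1 + (2 + (3 + (4 + 0))))

// The spelled-out equivalents most newcomers find more approachable:
val sumLeft2  = xs.foldLeft(0)(_ + _)
val sumRight2 = xs.foldRight(0)(_ + _)
```

Both pairs compute exactly the same sums; the argument is purely over whether /: reads as elegant shorthand or as line noise.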

The logic of this argument seems relatively sound. When choosing a technology for a project, it is very important to consider the availability of potential employees who know that technology. Forcing every new member to scale a potentially steep learning curve is a frightening prospect. I can imagine development managers lying awake at night fearing that their expert in some obscure technology will leave, and it will take months to replace him. It's a legitimate, although I think slightly exaggerated, fear.

That being said, I think it has little to do with the adoption of a new language. The majority of programmers simply do not spend their free time learning new languages. Many, maybe most, won't even use their free time to learn languages that they know have ready pre-existing demand. They learn when they are paid to learn, or when failing to learn will lead to immediate negative consequences for their career. I think the same can be said of most professions. Most people work because they have to, not because they want to.

Consequently, I think expanding beyond a core of enthusiasts is very difficult, if not impossible, simply by attracting people to the language. Right now some leading-edge Java people are taking a look at Scala, because they think it might be the next big thing in Java-land. These people are different than enthusiasts. Enthusiasts will learn a language for the sake of learning it. The leading-edge folks learn it as a high risk investment. If they can get a head start on the next big thing, it will be great for their careers (and businesses). These people constantly think "can I sell using this technology?" and "if I do sell it, will it come back to haunt me?" This is a very pragmatic perspective, and it is the perspective I take when I'm at work.

Confusing, odd-ball language features make the sell a lot harder. Pitching them as features increases personal risk.

But it doesn't matter.

Why? Because the vast majority of developers are not going to learn a new language because they want to, they are going to learn it because they have to. Not to mention that there are countless languages out there, so going after the enthusiasts and leading-edgers (who are mostly lookers anyway) is just fishing in an already over-fished pond.

So how does a language become a success?

Enough people use it for real projects. Let's say a consultant rapidly prototypes an application using the technology, and that application makes it into production. Now maintenance programmers have to learn that technology. Sure, it's complicated, but unlike the guy learning it on free weekends, the maintenance programmers have all-day every-day. It's hard to learn complex concepts an hour or two at a time, but these guys have all day, and the next day. It's hard to memorize something by looking at it a couple hours a week, but spend all day staring at it and it will click. Not to mention that their livelihoods depend on it. Any sort of "cowboy" development team can cause this to happen, and frankly such teams are pretty common.

So maybe one-in-five maintenance programmers actually like the technology, and admire the cowboys, so when they get a chance to do new development, they use it, too.

The same thing can happen with products from startups. Let's say a startup builds a piece of enterprise software using Scala. They sell it to big, conservative companies by emphasizing the Java aspect. They sell customization services, too. And then it's back to the maintenance programmer, who has no choice.

Notice a pattern? The key to language success is making it powerful enough for a couple cowboys to do the work of an entire team in a shorter period of time. Selling fast and cheap is easy. If you have enough fast and cheap, the business people won't care if you are making it out of bubble-gum and duct-tape, because you are giving them what they want.

The key to success is making the reward justify the risk. Judging by what some people using Scala for real-world projects are saying, and my one hands-on experience, I think Scala offers it. It's just a matter of time before it sneaks its way into enterprises, just like Ruby has.

Wednesday, January 02, 2008

Open Source, Cost, and Enterprise Product Adoption

This is a relatively common topic, and today it was raised on TSS, as a result of a blog by Geva Perry:

Are developers and architects becoming more influential in infrastructure software purchase decisions in large organizations?

...with an implied "as a result of open source software." It's an interesting question, and I think it can be generalized to: What effect do license costs have on the acquisition of enterprise software?

In considering this question, it is important to remember that:

  1. In large organizations, money often comes in many colors, such as: expense, capital, and internal labor
  2. The level of authority an individual has depends on both the amount and the color of the money involved
  3. Certain colors of money are easier to obtain than others, and sometimes it varies dependent on the amount
  4. Accounting rules, both standard and self imposed, affect what can and cannot be purchased with a given color of money
In a nutshell, the budgeting process for large organizations can be extremely idiosyncratic. Not only do the official processes vary, but individual personalities and budget cycles can have profound effects.

So the first effect the cost of a piece of enterprise software has is to limit the various buckets of money that can be used to pay for it. However, this can be a very complex process. Let's assume any given piece of software typically has the following types of cost:

  1. One-time license cost (both for the application and support infrastructure, such as the OS and DBMS)
  2. Recurring maintenance and support cost
  3. Hardware cost (e.g. a server)
  4. Internal labor for development and deployment
  5. External labor for development and deployment

The lower the costs involved, the less approval is required. Driving down license costs pushes the initial acquisition decision closer to the users and/or developers. This is a big win for open source applications. It's probably a bigger win for application vendors. For example, most enterprise applications require a DBMS, such as Oracle. Oracle is not famous for being cheap. So let's say your potential customer can readily obtain $100k to spend on software licenses. If you are a software application company, do you want that money as revenue, or do you want 2/3 of it to go to Oracle and IBM?

I'll give you a hint. You want a department to be able to deploy your software without cutting a check to the big boys, but you want to be able to say "Yes, we support your enterprise standards" to the big-wigs in the IT department who think that if there isn't a major conference for a piece of software, then it shouldn't be on the network. That way your product can be approved, and then running it on Oracle can be deferred until "usage levels warrant it."

Hardware costs are even more interesting. At my employer, equipment that costs $5k or more is "capital," and less than that is expense. Capital is generally easier to obtain if (1) you know you need it a year in advance, or (2) it's the end of the year and there's still money laying around. It is impossible to obtain at the beginning of the year, when everyone thinks that they will actually spend their allocation, unless of course it was approved last year. Conversely, expense money is much more plentiful at the beginning of the year, when managers are still dreaming of sticking to their budgets, and becomes more and more difficult to obtain as reality sets in. So what's the point? Well, you want your product to require a small enough amount of hardware so that a first or second line manager can purchase it on expense without obtaining approval, but also have a recommended configuration that costs enough to be purchased on capital.

This is interesting because big-iron Unix guys will often lament about how their systems have such a lower TCO than x86 Wintel or Lintel systems, so all the arguments about x86 systems being "cheaper" are bunk. What they are ignoring is that it is much easier to spend $5k at a time on twenty small things (plus setup labor on each of those twenty items) than it is to spend $50k on one big item, because the $50k either has to be approved a year in advance or it has to wait until someone else's project falls apart so they can't spend their $50k. The "total" in TCO is completely ignored, because very few people think about the few hours that each of those servers requires to set up.

Now, about labor costs. Managers generally have at least some discretion over how their direct reports spend their time. If you actually think about it in terms of the fully burdened hourly cost of an employee, managers often have significantly more "budget" under their control through their ability to direct how time is spent than they do for purchasing licenses and services. Again, this is a big win for open source.

The bottom line is that the best way to get your foot in the door is to have the lowest marginal cost of deployment as possible. I'll offer as evidence the countless wikis that have popped up around my workplace, very few of which are even officially approved.

Of course, this makes you wonder why the marginal cost of deploying a piece of enterprise software tends to be so high. Why aren't more vendors practically giving their software away for small deployments? Well, many do, such as SQL Server Express and Oracle XE. But there's still more that don't. The problem is that it's really hard to get the total cost of initial deployment down to below the point where the bureaucracy kicks in, and once it kicks in, it helps to be more expensive.

Yes, that's right, I said more expensive.

You see, these days one of the great ways to make your career in IT is to be a good negotiator. The combination of off-the-shelf software and outsourcing has shifted IT expenses from being dominated by internal labor to being dominated by procurement contracts. However, you don't build an illustrious career by negotiating $10k contracts. Likewise, once you pass a relatively small threshold someone decides that the project needs a "real" project manager, instead of just an "interested" manager or a technical lead, and project managers are unfortunately measured more by the size of their projects than by the value that they deliver. (Yes, that's right, screwing up a multi-million dollar ERP implementation is better for your career than successfully deploying some departmental application under budget and on schedule.)

In other words, once the signatures of additional people are required, you have to have something big enough for those people to consider it worth their time. So, if you are a software vendor, or an internal employee planning a deployment project, then you either need to go really small and viral or really big. Medium size projects are simply too hard to push through.

And, in my opinion, that's really a shame, because medium sized projects tend to have the best value proposition. Small ones involve too little of the organization to have a significant impact, and large ones become too unwieldy to meet their objectives. Projects need to be large enough to do things right in terms of technology and engage future users, but small enough to have readily apparent benefits and incremental deliveries that provide real value (not simply "look, it's a login screen!").

Maybe it's different in other organizations, but somehow I doubt it. However, I'd be really interested in knowing what others' experiences are.

Tuesday, December 11, 2007

Data Center Meltdown

Denial and the coming “data meltdown” by ZDNet's Michael Krigsman -- Subodh Bapat, Sun Microsystems eco-computing vice president, believes we’ll soon see the first world-class data center meltdown. According to News.com: “You’ll see a massive failure in a year,” Bapat said at a dinner with reporters on Monday. “We are going to see a data center failure of that scale.” “That scale” referred to the problems [...]

Now let's think about that. How can a datacenter fail?

  1. It can lose power for an extended period. It takes a lot of backup generators to keep a 50 megawatt datacenter humming.
  2. A virus or worm can shut it down.
  3. A natural disaster can destroy it or force a shutdown. Often times this is due to a power failure rather than destruction.
  4. It can have its WAN/Internet connection(s) severed. This isn't quite catastrophic unless you're on the other side of the WAN.

Michael has pointed out that Subodh Bapat doesn't point to the expected cause of a major data center meltdown, he just says one is coming. That's because it's not really a matter of one cause. There are so many risks, and you multiply them by the growing number of data centers, and what you come up with is that there's bound to be a major failure soon. We just don't know what the precise cause will be.

Most of the threats involve a geographically localized destruction or disabling of the data center. This means you need off-site recovery, and it probably needs to be fast. That means you probably need more than recovery, you need one or more hot sites that can take over the load of one that fails. This is extremely expensive for a "normal" data center. I can hardly imagine how much it would cost for a 50+ megawatt facility. Basically what we have is too many eggs in one basket, with economies of scale pushing us to keep on putting more eggs in the same basket.

What surprises me is that Subodh Bapat didn't say, oh, well, Sun has the solution, considering that Jonathan Schwartz put it forward over a year ago. Ok, well, he didn't exactly suggest Blackbox as a means of easily distributing data centers. But think about it. If you're a large corporation you are probably already geographically distributed. If you expand your data center in cargo-container sized units across the nation (or world), you are half..err..maybe one quarter of the way there. You still have to figure out how to make them be hot sites for each other, or at least recovery sites. But at least losses at any given site would be minimized.

Sunday, December 09, 2007

Why isn't enterprise software sexy?

Robert Scoble asks: Why isn't enterprise software sexy?

A couple of the Enterprise Irregulars and several commenters respond: Because it's supposed to be reliable and effective, not sexy.

I think the reasons are more fundamental than that. Consider:

  1. Most enterprise software customers have mutual non-disclosure agreements with the software vendors.
  2. Most bloggers have day jobs
  3. Most companies do not want their employees violating the terms of contracts.
  4. Most companies do not want their employees airing the company's dirty laundry in a public forum
  5. Many of the most interesting pieces of information surrounding enterprise software involve dirty laundry.

Personally, I have two ground rules for blogging:

  1. Keep it professional. No personal matters, politics, etc.
  2. Keep my employer, coworkers, and my employer's customers and suppliers out of it.

Let's face it. Most IT and software development projects don't go right. The commercial software that you use doesn't perform half as well as the salesman said it would. Consultants have mixed motives, and more importantly aren't miracle workers. The internal team is often overextended and working outside their core area of expertise. Goals are unclear, completely undefined, or changing on a regular basis. Politics are everywhere on all sides.

This is the reality of business, and talking about it in the general case is OK. People read it, nod their heads, and think "Been there, done that, doing it again right now." But specifics are quite different. Specifics can lead to embarrassment, contract violations, and lost sales.

More people don't blog about enterprise software because it strikes too close to home. I don't think it has anything to do with whether enterprise software is sexy or not.

Update:

It just occurred to me that I probably wasn't very clear here on a few points. What I mean by "enterprise software" is enterprise application software, like SAP, Oracle's various applications, PeopleSoft, etc. I don't mean infrastructure software like DBMSes, application servers, operating systems, etc. There is plenty of good information freely available on infrastructure, and people blog about it all the time.

Also, if you look at a lot of the blogs that are out there for, for example, SAP, they are almost all too high level to be particularly useful. It's pretty easy to find platitudes about how you need the right team, buy-in, executive sponsorship, etc and how those (or a lack of them) caused an implementation to succeed or fail. That's all (for the most part) true but everyone already knows it. But there's not a lot of people (that I know of, please post links in the comments if I am wrong) out there posting technical discussions about implementations. Google for "ABAP programming blog" (ABAP is the programming language for SAP), and then do the same for Scala and Lisp. I bet there are more people earning their living off of ABAP than Scala and Lisp combined. Probably an order of magnitude more. So why aren't there more people writing about it? Ok, so Scala and Lisp are both interesting languages. So do the same for Ada and COBOL.

Update: December 10, 2007: Feedback on enterprise software

Enterprise software does receive feedback - often significant amounts, in forms much better thought out than most blogs. The difference is that the feedback is private between the customer and the software provider. Occasionally it is shared among customers through conferences or other types of "user group" meetings.

If the software supplier will listen (including acting), then this can work fairly well for software that is already deployed. The problem is that there is no solid information available to support purchasing decisions. There are no readily available sources of "tips and tricks" or "common pitfalls" for deployment or customization efforts. For example someone could write:

We've spent the last 3 months trying to make Foomatic run on top of Oracle. All the sales material touts the Oracle support. But then your deployment tanks, the consultants come in, and they ask you why the heck you aren't using SQL Server. The software is developed against SQL Server and then "ported" to Oracle, and the Oracle port never works right.

Fill in your favorite or least favorite infrastructure products there, the name of an enterprise application, and there you have some useful information. The sales material lies. Once you deviate from their default stack you are screwed. That's the type of information that would be useful - before buying the software or before trying to run it on your "enterprise standard." Not from an over-priced consultant.

Monday, September 17, 2007

The Tar Pit

The editor of TSS has decided to run a series discussing The Mythical Man Month, by Frederick Brooks. Hopefully it will produce some good discussion. There are a lot of Agile advocates who hang out on TSS who really could stand to (re)learn some of the lessons of software development. I make a point of rereading it every few years lest I forget the lessons learned by computing's pioneers. The first chapter - The Tar Pit - contains one of my favorite concepts, as illustrated by the graphic below (slightly changed from the original).

Programs

Let me explain. A program is what we all have developed. It's a simple piece of software that is useful to the programmer and/or to some set of users who are directly involved in defining its requirements. Most bespoke departmental applications and a substantial portion of enterprise applications fall into this category. They are sufficiently tested and documented to be useful within their originating context, but once that context is left their usefulness breaks down quickly. In addition, they are not solidly designed to be extensible, and certainly not to be used as components in a larger system. Obviously this is a range, and I've really described a fairly well developed program - one almost bordering on a programming product. That script you wrote yesterday to scan the system log for interesting events, the one that has to be run from your home directory using your user account in order to work, is also just a program.

Programming Products

Moving up the graph, we hit the programming product. In theory, all commercial applications and mature bespoke applications are programming products. In practice this isn't really the case - but we'll pretend, because they are supposed to be and I increased the standard over what Brooks originally described. The big challenge with programming products is that, according to Brooks, they cost three times as much to develop as simple fully-debugged programs, yet they contain the same amount of functionality. This is why it's so hard to get sufficient budget and schedule to do a project right. The difference between a solid piece of software and something just cobbled together is very subtle (you can't tell in a demo), yet the cost difference is quite astounding. Consequently, I think most commercial applications are released well before they hit this stage, and bespoke ones require years to mature or a highly disciplined development process to reach this point.

Programming Systems

Programming systems are programs intended to be reused as parts of larger systems. In modern terms, they are libraries, frameworks, middleware, and other such components that are all the rage in software development. Like programming products, programming systems are thoroughly tested, documented, and most importantly are useful outside of the context in which they were created. And, like programming products, according to Brooks they take three times as long to develop as a regular program. Developing programming systems for bespoke applications or niche use can be a tar pit all its own. For one, many programmers like building libraries and frameworks. The problems are more technically challenging, and there is no strange-minded user to consider. The programmer and his colleagues are the users. Programming systems are relatively common in groups that execute a lot of similar projects and/or that contain programmers who really want to build components.

Programming System Products

Amazingly, programming system products are relatively common - even if there really aren't that many of them. As you've probably guessed, a programming system product has all the traits of both a programming product and a programming system. It is useful to a wide range of users and can be effectively extended and/or embedded for the creation of larger systems. It has complete documentation and is extensively tested. Where are these wondrous things? Well, you are using one right now (unless you printed this). Your operating system is one. It both provides useful user-level functions and a huge amount of infrastructure for creating other programs. MS Office is one as well, because it has a pretty extensive API. Most commercially developed enterprise systems should be programming system products, because:

  1. They provide off-the-shelf functionality for regular users
  2. Customers always customize them
  3. They often must be integrated with other products
  4. Simply integrating their own components would go better with a programming system
The problem is that they are not, because of:

The Tar Pit

Brooks didn't explicitly write this definition of The Tar Pit, but I think he would agree. Put yourself in the position of a development manager at a startup or in a larger company about to launch a new product. On one hand, you want to make the product as good as possible. You know that what you develop today will serve as the base for the company/product line for years to come. It needs to be useful. It needs to be extendable. It needs to be thoroughly tested and documented... It needs to be cheap and delivered yesterday.

The differences between a programming system product and a simple programming product are far more subtle than the differences between a program and a programming product. But the programming system product costs a full NINE TIMES as much to develop as the program with essentially the same "outward functionality" - at least if you are a sales guy or a potential customer sitting in a demo. I think this is the struggle of all engineering teams. If the product is long lived, doing it right will pay major dividends down the line. But it can't be long lived if it is never released. It stands a worse chance if it comes out after the market is flooded with similar products (actually, that's debatable...). The ultimate result is a mish-mash of tightly coupled components that, as individuals, fall into one of the lesser categories but as a whole fall down. There is a documented API, but the documentation isn't really that good and the application code bypasses it all the time. The user documentation is out-of-date. Oh, and the application isn't really that general - hence why all the major customers need the API so they can actually make it work.

Escaping the Tar Pit

Ok, so if you develop a large system you are destined to fall into the tar pit, because cheap-and-now (well, over budget and past schedule) will override right-and-the-next-decade. You need a programming system product, but budget and schedule will never support much more than a program. So how do you escape it?

Accept It

Products that give the outward impression of being far more than they are often sorely lacking in conceptual integrity. If you are building an application - build it right for the users. Remember you can build it three times for the cost of building a programming system product. Maybe by the third time there will be budget and schedule for it. Or maybe, just maybe, you can evolve your current system. But pretending will just make a mess while wasting significant amounts of time and money.

Partition It

Some pieces of your system are probably more important than others. There are places where requirements will be volatile or highly diverse among customers - those are the places where you need a truly extensible system. You should also be able to reuse strong abstractions that run through your system. The code associated with those abstractions should be top-notch and well documented. Other pieces just need to be great for the users, while a few that remain need to be convenient for yourself or administrators to extend.

Open Source It

Find yourself developing yet another web framework because what exists just isn't right? Open source it. This isn't really my idea. It's what David Pollak of CircleShare is doing with the lift web framework for Scala (actually, I'm guessing at David's reasoning, I could be wrong). The infrastructure for your application is essential, but it isn't your application. It is what Jeff Bezos refers to as muck. You have to get it right, but it's distracting you from your core mission. So why not recruit others with similar needs to help you for free? That way you don't have to completely give up control but also don't have to do it alone.

Theoretically the same could be done for applications. Many large customers of software companies have significant software development expertise - sometimes more than the software companies. I think it would be entirely feasible for a consortium of such companies to develop applications that would better serve them than commercial alternatives. But I have yet to convince someone of that...

Tuesday, August 14, 2007

Business Engagement and IT Project Failure

CIO.com recently ran an article by the CIO of GE Fanuc regarding "functional engagement" (i.e. "the business" or more commonly "the users") and project failure. Michael Krigsman later posted a more succinct summary on his ZDNet blog. Here's an even shorter version:

  1. Make sure a single business-side person has significant capability, responsibility, and authority for project success.
  2. Don't short-circuit important parts of the project
  3. Make sure that person understands the "laws of IT" and defends them to his peers, subordinates, and superiors
#1 is just plain common sense. #3 amounts to establishing a scapegoat for the inevitable failure caused by short-circuiting appropriate systems engineering or architectural activities. Notice that there is neither a definition of "failure" nor a definition of "success." I think it can be inferred that he's defining "failure" as "exceeding project or maintenance budget and/or schedule." Not once does he mention delivering real innovation to the business or even real value - just avoiding causing massive amounts of pain. Consider the following statement:
Enterprise platforms like Siebel, Oracle and SAP are not intended to be heavily customized. When going from a niche, custom application to Siebel, you need a strong functional leader to push back on every “Yeah, but” statement. They must start at zero and make the business justify every customization. Saying “no” to customization is bitter medicine for your business partners. They will make contorted faces and whine ad nauseum. But it is for their own good.
Let's think for a moment. Your customer (internal or external) wants to spend millions of dollars implementing a CRM (or ERP, PLM, etc.) package. Earlier, it went to the trouble of building one from scratch. That probably means it considers CRM to be an extremely important part of its business, and it expects to derive a competitive advantage from having a shiny new system. The best way to be competitive is to copy what everyone else is doing, right? Also, repeatedly receiving "no" as an answer will really make your customer want to take an active role in the implementation, right? Hmmm....somehow I don't think so. Introducing a strong impedance mismatch between the organization and the software it uses is failure. Standing in the way of innovation is failure. Mr. Durbin wants you to fail.

There is actually a simple, albeit occasionally painful, solution to this problem: don't buy an off-the-shelf application that neither meets the business's requirements nor can be cost-effectively extended to meet them.

First, you have to understand your business processes and requirements. I mean really understand them, not just understand the official process documentation that no one follows. All those winces and "yes buts" are requirements and process details that absolutely must be understood and addressed.

Second, you do a trade study. This is a key part of systems engineering. Replacing enterprise systems is expensive, often prohibitively expensive, so do a thorough analysis. Take some of your key users to training sessions and write down every wince and vaguely answered question, because those are all either issues that will stand in the way of delivering value or will require customization.

Finally, keep your pen in your pocket. Don't be tempted to write checks, buy licenses, and sign statements-of-work before you really understand the product. Just ignore those promises of discounts if you get a big order in before the end of the quarter. The next quarter isn't that far away, and the end of the fiscal year may even be approaching. The discounts will reappear then. Instead, make sure the vendor really understands the requirements and business process, and structure any contracts in a way that they must be met or the vendor will face significant penalties. It's amazing what comes out of the woodwork when you do this. All of a sudden that 3x cost overrun from the original projection is sitting right there in front of your eyes in the form of a quote, all before you've sunk any real money into the project.

The purpose of engaging "the business" in an IT project is not to create a personal shield for all the deficiencies in the system you are deploying. It is to make sure you identify those deficiencies early enough in the project cycle to avoid cost and schedule overruns.


Friday, August 03, 2007

Minimal Software Development Process

It's not uncommon for me to find myself in debates regarding what constitutes a sufficient software development process. On one side there are the Agile folks, who argue for code-and-fix with user representatives determining what should be fixed. Of course they have lots of rules to regulate the code-and-fix - some quite sensible, some quite dubious - to make it appeal to closet process engineers. On the other side you have old-school developers who believe in such rarities as requirements signed in blood and requiring human sacrifice to alter. Ok, so perhaps I'm exaggerating a little bit. But that's how it often feels. So I'm going to propose my own process, one that is every bit as simple as what the Agile crowd pushes while being more all-encompassing than what the traditionalists demand. Here it is:

  1. Define the problem
  2. Define the solution
  3. Apply the solution
  4. Validate the result
Now you are probably thinking that this is nothing novel. I'm just using weird words for:
  1. Develop requirements
  2. Design and Code to the Requirements
  3. uhh....Test?
  4. Customer Acceptance Testing
Wrong! And that's true even if you were more creative than me and pulled out RUP's phases or something similar. It's an understandable mistake, though, because I rarely see steps 1 and 4 actually completed. Let me explain.

Define the Problem

This is where most projects get in trouble. Their problem definitions look kind of like this:
  • We need to roll out a corporate-standard ERP system.
  • We need a web-based foo tracking database accessible to the entire corporation.
  • We need an automated workflow for the bar process.
I could go on and on. Sometimes these are followed by phrases like "will save X million dollars" or "will reduce cycle time by X%." Really, the "problem" is that someone said a competitor was saving money by applying an automated workflow to the bar process in order to track foos in the ERP system with a web-based frontend, and the company at hand doesn't have one - so that's the problem. Anyway, these statements are often masking genuine problems such as:
  • My old college roommate is an ERP salesman and wants a new boat.
  • We have a ton of foo, but no one really knows where it is or whether it's being used. So we keep buying more foo, even though we probably have enough. The problem is that when we do find some foo, the people who have it always claim they need all of it, even though they often clearly aren't using it. We have some data, but it's massive and impossible to interpret. We need a way to find unused foo and prove that it is unused, so that we can transfer it to where it is needed.
  • Some cowboys in department X keep ignoring the bar process. They think they are heroes because it saves time and money upfront, but really they just end up creating costly, time-consuming problems down the line (for me). I need a way to force them to follow the bar process, but it can't be too hard, otherwise they'll convince upper management to let them ignore it.
So why is the difference important? Don't you just end up with the first set of statements as your project charter anyway? No. A couple of months ago I faced a situation similar to the second item. A consultant had (unfortunately) convinced a couple of senior managers that they wanted a big, fancy database integrated with half of our internal systems and requiring a whole bunch of data maintenance. Fortunately the consultant also directed them to me. They had tons of data, and had spent countless hours fiddling with spreadsheets trying to turn it into actionable information. Having failed, they decided they needed a huge database that would cost hundreds of thousands of dollars to develop and then require staff to keep it up to date. They also needed something in about a couple of weeks.

So I poked and prodded until I finally understood what they needed to know, what data they had, and what they needed to decide based on that data. Then I wrote a few hundred lines of Python to analyze the data and make pretty graphs, along with a couple dozen lines of VBA to stick the outputs of the Python program into a PowerPoint presentation (a minimal sketch of that kind of analysis appears at the end of this post). They were thrilled with the result. Hundreds of thousands of data points were transformed into actionable charts that even the most impatient executive could correctly interpret. This took me about two weeks of effort. Their original requirements would have taken a couple of man-years to implement, and the result would not have solved their problem. Traditionalists would have wasted the time to implement the requirements (which were actually fairly well developed), or at least a portion of them. Agilists would have fiddled around for a while and achieved the same result.

Now, I'll admit that on the majority of projects it's the other way around: understanding the problem makes the cost of the solution grow by an order of magnitude rather than shrink. My guess is that only 1 in 4 projects can actually be simplified by understanding the problem, and 2 in 4 become significantly more complex. But solid estimates that can be tied to solid business cases are extremely important. Delivering a cheap solution that doesn't deliver value is a waste of money. In my experience, development teams assume that "the customer" or "the business" already understands the problem and is defining requirements that will solve it, when in reality the problem is usually vaguely understood at best, and describing requirements is every bit as much a design activity as coding is.

Define the Solution

This is where most of the traditional software engineering activities occur. You have a customer (or marketing) who has given you some high-level requirements defining the general shape of the system, and then you go about gathering more detailed requirements, followed by architecture, design, code, and test. Or maybe you are agile, so you do all of those activities at once and only bother writing down code (in the form of application code and test code). Either way, knowing the problem really helps. Some people will probably object to lumping all of these activities under one heading because they take so much time. I would agree, but they are rarely done entirely sequentially. Prototyping is every bit as valid a method for eliciting requirements as interviews. Sometimes it is a lot more effective. Also, there are strong feedback loops among all of the elements. So really, they are all done pretty much at the same time. It just happens that requirements kick off the process and testing finishes it up.
Others would object because "the completed system is the solution." Well, no. It's not. You don't really know if you have a solution until after you've deployed and run the system long enough for the business to adjust itself.

Apply the Solution

This is just another way of saying "deploy," plus all the other things you have to do, like training. If you think of it at a really high level (too high for effective engineering), the organization is the problem, and you apply the software to the organization to see if the organization gets better.

Validate the Result

This is where you measure the impact of the deployed software on the problem, to see if the problem has indeed been solved. I don't think anyone will disagree that a piece of software can meet all of its requirements and still fail to make a positive impact. So you need to measure the impact of the deployed system. In practice, success is declared as soon as a piece of software that meets "enough" of its requirements is rolled out. This puts the organization in a very bad position, because if the software subsequently fails to deliver value, then the declaration of success and those who made it are called into question. In most cases the organization will just end up either ignoring the system or limping along until enough time has passed for the system to be acknowledged as a problem.
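As promised above, here is a minimal sketch of the kind of throwaway analysis script I described for the foo problem. This is not the original code; the file name, column names, and 180-day threshold are hypothetical, and the VBA-to-PowerPoint step is left out:

    import pandas as pd
    import matplotlib
    matplotlib.use("Agg")  # render to files; no display needed
    import matplotlib.pyplot as plt

    # Hypothetical export: one row per foo, with the owning department
    # and the last date anyone actually used it.
    df = pd.read_csv("foo_inventory.csv", parse_dates=["last_used"])

    # Call a foo "unused" if nobody has touched it in the last 180 days.
    cutoff = pd.Timestamp.today() - pd.Timedelta(days=180)
    df["unused"] = df["last_used"] < cutoff

    # One chart an impatient executive can read correctly:
    # count of unused foo per department, worst offenders first.
    summary = df.groupby("department")["unused"].sum().sort_values(ascending=False)
    summary.plot(kind="bar", title="Foo unused for 180+ days, by department")
    plt.ylabel("Count of unused foo")
    plt.tight_layout()
    plt.savefig("unused_foo.png")  # this image gets dropped into the PowerPoint deck

The point isn't the code, it's that two weeks of understanding the actual decision to be made beat a couple of man-years of building the database that was originally asked for.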


Monday, July 16, 2007

Sun Ray Thin Clients

Last week I made a comment on Paul Murphy's blog about how the thin-ness of Sun Rays is really up to interpretation. Today he's decided to dedicate an entire post to explaining why I'm wrong, because he figures that if I have an incorrect understanding of Sun Rays, then a lot of people have an incorrect understanding of Sun Rays. He's probably right, although I don't think my understanding is that far off base, and he's been kind enough to let me see a draft copy of the post so I can get a head start on my response. Here's what I said:

Smart, Thick, Thin, Display

It's all word games. Depending on how you define "processing," there is processing going on. It still has to render graphics, translate keyboard and mouse events, etc. A SunRay is just a compacted Sun workstation of yesteryear without a harddrive and special firmware designed to work solely as an X-Windows server.

The problem is the attempt to make "smart displays" seem more fundamentally different from other similar solutions just muddies the waters. People like me groan because yet another term has been introduced that means almost the same as other terms that will need to be explained to the higher-ups. The higher-ups get confused and either latch onto it or, more likely, have their eyes glaze over.

Anyway, enough with our industry's incredible ability to make sure words are completely meaningless...

The problem with Sun Ray and other similar solutions is that they are really a local optimum based on today's technology and practices for a relatively narrow range of priorities. Change the priorities and the solution is no longer optimum. Introduce distributed computing techniques with the same low administrative overhead and they lose out entirely.

As far as I can tell, the first part is technically accurate. Older Sun Rays ran a 100MHz UltraSPARC II, had 8MB of RAM, and ran a microkernel. See here and here. Newer ones use an even beefier system-on-a-chip.

So the Sun Ray client is obviously processing something, and actually has a fair amount of processing power. Just because it is not maintaining any application state doesn't mean it's not doing anything. Murph asserts that a Sun Ray is not an X-terminal, but he'll have to explain the difference to me. He could be right...I don't know. It's been about 7 years since I've used a Sun Ray, but from what I remember it felt just like using Exceed on a PC under Windows, which is quite common at my employer. He did mention this:
Notice that the big practical differences between the Sun Ray and PC all evolve from the simplicity of the device in combination with the inherently multi-user nature of Unix. In contrast the differences between the Sun Ray and X-terminal arise because the X-terminal handles graphics computation and network routing -making it more bandwidth efficient, but marginally less secure.
But the Sun Ray quite clearly has a graphics accelerator and talks over the network, so while there is probably a subtle difference in there that I'm not grasping, it doesn't seem particularly material. That's not really the meat of the debate anyway - it's just a technical quibble over what constitutes processing and an operating system. He's diluting the debate by calling Sun Rays "smart displays" instead of "thin clients" and thus drawing a false dichotomy, and I'm doing the same by pointing at internal technical specs that have little to do with actual deployment.

The real debate is: "Where should processing take place?" I'll give you a trite answer - as close to the data as possible. Any computation involves a set of inputs and a set of outputs. It makes no sense to shuttle a million database rows from a database server to an application server or client machine in order to sum up a couple of fields; it makes much more sense to do the sum where the data is and then ship the result over the network (see the sketch at the end of this post). Likewise, if you have a few kilobytes of input data and several megabytes or gigabytes of results, it makes sense to do the computation wherever the results are going to be needed. So this is my first issue with the centralized computing paradigm. Right now I'm typing this post in Firefox on Linux, and my computer is doing a fair amount of work to facilitate that interaction with Blogger. I've also got a dozen other windows open. Most of the memory and CPU I'm consuming is dedicated to the local machine interacting with me, the local user. Only a couple of pages of text are being exchanged back and forth with Blogger. So why not let the Sun Ray run Firefox (and an email client, a word processor, etc.)? The new ones have the processing power. They would probably need $100 worth of RAM or so to keep a stripped-down Unix variant in memory, which could be loaded from the network. Intelligent configuration could make the client smart about whether to run an app locally, on a server, or on an idle workstation down the hall.

Murph gives seven reasons:

  1. Portability. Murph asserts that with Sun Rays you gain portability, because you can halt a session in one place and immediately resume it in another. I don't doubt that is true, but I don't see any technical reason why the same could not be accomplished with a distributed architecture. All that happens is that your terminal becomes the processing server for a remote application. Remember, in Unix there isn't a fundamental difference between a client and a server. I'm not going to address the laptop debate right now. Murph has made some very good arguments against laptops in the past, based on the security risk of them being stolen despite strong encryption. I think he underestimates the value of laptops and is probably wrong, but there are a substantial number of people who could live with a "portable terminal" because their homes and hotels have sufficient bandwidth.
  2. Reliability. This is where the distributed model really shines. In my experience, networks are generally one of the less reliable portions of the computing environment, especially WANs and my own internet connection. A pure thin-client solution simply stops working when the network goes down. In the past, Murph has asserted that everyone needs network connectivity to work, so this doesn't matter. But in my opinion most professionals can continue working for several hours, possibly at reduced productivity, when disconnected from the network. That buys time for IT to fix the network before the business starts bleeding money in lost productivity. Keeping processing local, along with caching common apps and documents, increases the effective reliability of the system.
  3. Flexibility. Murph lists nothing that cannot be done with a locally-processing workstation.
  4. Security. Don't use x86 workstations, especially running Windows. The security gains come from a more secure operating system on a processor architecture designed for security and reliability. Eliminating permanent storage from the client does buy some security, because there is then no way to walk out the door with all the data, but distributed processing doesn't preclude centralized permanent storage. There are, of course, substantial advantages to having local storage, like being able to make a laptop that can be used in an entirely disconnected fashion. But I think that's a separate debate.
  5. Processing power. There's nothing about a distributed computing model that says you can't install compute servers. Heck, this is done all the time with Windows (both to Windows servers and, more commonly, to Unix servers). Murph's example of a high-performance email server has nothing to do with the thin-client architecture, and everything to do with properly architecting your mail server.
  6. Cost. There aren't significant cost savings in terms of hardware when switching to Sun Rays. Hardware is cheap, and you can throw out a lot of the pieces in a common PC to reduce its cost. In fact, I bet Sun Rays cost more once you count the servers. I don't doubt that when effectively administered they cost less to keep running than a Windows solution, but that's mostly because of Unix. I'll admit that Sun Rays are probably cheaper to administer than my distributed model, because I think the latter requires greater skill and discipline (meaning higher-paid admins), so in the absence of detailed numbers I'll call it a wash.
  7. User freedom. This is partially a consequence of using Unix instead of Windows, and mostly a consequence of changing culture.

So, as I said before, Sun Rays, and centralized computing in general, represent a kind of local optimum for a given set of priorities and today's practices. But I don't think they make a solid generalized approach. Distributed computing can be done successfully, with all the advantages of Murph's Sun Ray architecture, using today's technology - it just isn't common. Now, I've ignored the elephant in the room: much essential software only runs on Windows, and the minute you introduce Windows into the mix (local or centralized), you start compromising many of the advantages outlined above. Of course, what good is a computing environment if it won't run the desired software? Consequently, I think it will be a long time before anything like this flies in most enterprise environments.
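To make the "as close to the data as possible" rule from earlier in this post concrete, here is a minimal, self-contained sketch using Python's built-in sqlite3 module and a toy orders table (the table and the numbers are made up for illustration):

    import sqlite3

    # In-memory toy database with a hypothetical orders table.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, amount REAL)")
    conn.executemany("INSERT INTO orders (amount) VALUES (?)",
                     [(float(i),) for i in range(1, 1001)])

    # Wasteful: ship every row to the client just to add up one column.
    total_client_side = sum(amount for (amount,) in
                            conn.execute("SELECT amount FROM orders"))

    # Better: do the arithmetic where the data lives and ship back one number.
    (total_db_side,) = conn.execute("SELECT SUM(amount) FROM orders").fetchone()

    print(total_client_side, total_db_side)  # same answer, very different network cost

With an in-memory database the difference is invisible, but across a WAN the first version moves a million rows to produce one number, which is exactly the kind of mismatch between where the data lives and where the computation runs that I'm arguing against.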


Wednesday, June 13, 2007

Software Engineering and IT Professionalism

A few weeks ago I was reading the chapter on Philip Greenspun and ArsDigita in Founders at Work and some of his comments really rang true for me. On a number of occasions I've been in debates with advocates of both agile and traditional software development methods, as well as people from other IT-related disciplines. In my opinion, the argument usually boils down to this question:

How important is it that an IT Professional take responsibility for ensuring his work delivers value to his customer/client/employer in a cost-effective manner?
No one would argue with the idea that it is important to deliver value. No one would argue that it is important to be cost effective. So why would this be such a contentious topic? Well, I think it boils down to an old adage that predates information technology by an awfully long time...
The customer is always right.
The glaring problem with this is that he is not always right. Not by a long shot. If he were, he would most certainly not be hiring you to apply your brain to create something for him. He would simply apply his omnipotence to quickly creating the perfect software, running on the perfect operating system, on failure-proof hardware. Ok, yes, we all know that. The customer has hired us for our technology expertise. He knows what he needs, and we know how to create it. Right??? Nope. He doesn't. And no one is surprised by that. No self-respecting text on requirements development starts with "The customer/user knows exactly what he needs. All the analyst needs to do is take detailed notes, and the customer/user will precisely, consistently, and completely describe the application that will meet his needs." It's all about how users don't know what they want, how they have conflicting desires, how they tell you what they think you want to hear instead of what they really mean, and so on. I doubt there's any disagreement so far. So what did Mr. Greenspun say?
We had this idea that programmers could be professionals, like doctors or lawyers, and, to that end, we wanted the programmers to be real engineers - to sit down face to face with the customer, find out what was needed, come up with some suggestions or changes based on the programmer's experience with similar services, and then take a lot of responsibility for making it happen.
So what does this mean? Here's my take:
  1. A professional will tell a customer when he has reason to believe that the customer needs something different from what he is asking for.
  2. A professional will inform his customer when he believes his services will not deliver value in a cost effective manner.
  3. A professional will speak in the customer's language, or at least close enough so that the customer can understand.
  4. A professional will take the time to understand the customer's need, not just the customer's request.
  5. A professional will express "engineering trades" in terms that the customer can understand.
  6. A professional will make commitments in terms of cost and schedule, and take responsibility for meeting them.
Most "IT Professionals" I know regularly practice at least a couple of these. Very few practice all. Most don't have much trouble trying to make helpful suggestions, and are very conscious of cost and schedule. They try their "best" to communicate effectively, but usually are too concerned with their own specialty than learning the basics of their customers'. Very rarely will they state that their services are not really required, especially in consulting or contracting situations. The end result is exchanges kind of like this:
Customer: We would like our customers to be able to order replacement parts and look up technical information securely over the internet. Under no circumstances do we want unauthorized people ordering parts or viewing information. Do you have anything for that?

IT Guy: We have this piece of off-the-shelf software called WebSphere. It can be used to create an integration hub among enterprise applications. It can also be used to host portals, and can be combined with X and Y to implement federated security using two-factor authentication and strong encryption.
So let me translate into something the customer would actually understand:
IT Guy: We know the technology required to do what you want, and have some of the infrastructure already in place. However, we will have to build the website, which we call a "portal," along with the software to connect it to Z, which is our internal system that handles orders. Fully automating these transactions will cost an arm and a leg, and the data regarding who is authorized to order parts and access information will have to be extremely well maintained to prevent inappropriate transactions. The last thing you want is for the factory to build something that the customer won't pay for, because the person who ordered it shouldn't have been able to. Do you think it would be better if we enabled customers to communicate with internal representatives in a highly structured way? Perhaps if that worked well we could add more automation later on. This would be both safer and much less expensive. What do you think?

Customer: Oh my God! Of course I want a real person to be involved.
Now the IT Guy just passed up a resume-building project in favor of something rather unglamorous. I can think of many other situations where customers have asked for applications dependent on data that didn't exist, badly flawed mathematics, grossly unrealistic assumptions about intended users, and highly exaggerated availability needs stated by customers who didn't want their application to seem unimportant. Faced with this problem, many IT providers hide the techie from the customer. Instead of placing someone who can't communicate in front of the customer, they place someone with no technical skills in front of the customer and have that person communicate with the techies. The problem is that this is just miscommunication via proxy. In other words, it is worse.

The deeper problem is that professionals are inherently at least partly generalists, while the IT industry has a nasty habit of trying to create line workers. A lawyer can be disbarred for pursuing a frivolous lawsuit (ok, the bar for frivolous is pretty low, but it can still happen). A doctor can be sued for malpractice and lose his license for ordering inappropriate treatments. Entire accounting firms can be destroyed for allowing financial numbers to be misrepresented. These are professionals. They are responsible for their clients, even if it means acting against their clients' immediate wishes. But a programmer who hacks together a barely functional piece of software is rewarded for being efficient. An architect who designs a system full of unnecessary components is respected for his experience. A system administrator who bypasses proper security to make something work is heralded as a hero. Massive cost and schedule overruns while under-delivering functionality are considered perfectly normal. IT people are actively encouraged to be like 15-minute oil change technicians: suggest a bunch of unnecessary work, while failing to properly do the basics like cleaning the windshield or vacuuming the floor mats.

So more often than not, we are not true professionals. Philip Greenspun was right. IT people tend to avoid professionalism, and it is exactly what we need. Innovation in IT can only be achieved by effectively applying technology in novel ways to business problems. All the easy problems have mostly been solved. The only way future innovation can be attained is through a close, effective working relationship with customers. But you can see how that isn't happening. More and more work is being sent offshore. Job postings focus on very specific technical skills while ignoring the business context. New methodologies make developers slaves to what customers think they want rather than providers of what customers need. New "innovations" are mostly ways of delivering the same old services - or often diminished services - in a less expensive way. Maybe the tide will turn. I'm sure it will eventually. Progress must progress. Let's hope we can speed it up.
