Sunday, July 13, 2008

Love, Hate, and Type Inference

Ever since I started using Scala, I've had somewhat of a love-hate relationship with type inference.  On the local level, type inference can make code much more concise, easier to read, and easier to refactor.  It makes code easier to read because in cases where the type of a variable is obvious, type annotations just add noise that distracts from the meaning of the code.  It makes code easier to refactor because changes to the type of something in one place in a program can often ripple throughout the rest of the program through a simple recompile, assuming the new type supports the methods being used on the original type.  This leads to something akin to structural typing without the associated overhead.
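
As a concrete (and contrived) example of the local case, here's a small sketch of my own, not from any library.  The local types are obvious from the right-hand side, so annotations would add nothing, and if parseEntries were changed to return some other collection, bigEntries would recompile untouched as long as the new type still supported filter.

  // Contrived example: inferred local types that read fine without annotations.
  def parseEntries(lines: List[String]): List[(String, Int)] =
    lines.map { line =>
      val fields = line.split(",")        // Array[String], obvious from split
      val name = fields(0).trim           // String
      val count = fields(1).trim.toInt    // Int
      (name, count)
    }

  // The result type of bigEntries is inferred from parseEntries and filter,
  // so a change to parseEntries ripples through on a simple recompile.
  def bigEntries(lines: List[String]) =
    parseEntries(lines).filter(_._2 > 100)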

However, at a more macro level I think type inference can make code much more difficult to read.  For example, consider the following code from the Scalax library (full source here):

//  Scalax - The Scala Community Library
//  Copyright (c) 2005-8 The Scalax Project. All rights reserved
abstract class InputStreamResource[+I <: InputStream] extends CloseableResource[I] {
    def buffered : InputStreamResource[BufferedInputStream] =
        new InputStreamResource[BufferedInputStream] with Wrapper {
            type Handle = BufferedInputStream
            override def wrap(is : SelfHandle) = new BufferedInputStream(is)
            
            override def buffered = this
        }

    def slurp() = for (is <- this) yield StreamHelp.slurp(is)
    
    /* Obtains a Reader using the system default charset. */
    def reader =
        new ReaderResource[Reader] with Wrapper {
            type Handle = Reader
            // XXX: should be UTF-8 by default instead of OS default
            // practically, here in Russia I never used default charset
            override def wrap(is : SelfHandle) = new InputStreamReader(is)
        }
    
    def gunzip =
        new InputStreamResource[GZIPInputStream] with Wrapper {
            type Handle = GZIPInputStream
            override def wrap(is : SelfHandle) = new GZIPInputStream(is)
        }
    
    /** Obtains a Reader using the supplied charset. */
    def reader(charset : String) = {
        // Do this lookup before opening the file, since it might fail.
        val cs = Charset.forName(charset)
        new ReaderResource[Reader] with Wrapper {
            type Handle = Reader
            override def wrap(is : SelfHandle) = new InputStreamReader(is, cs)
        }
    }
    
    def lines = reader.lines
    
    def lines(charset : String) = reader(charset).lines
    
    def readLines() = reader.readLines()
    
    def readLine() = reader.readLine()
    
    def pumpTo[O <: OutputStream](osr : OutputStreamResource[O]) {
        // Note InputStream should be opened before OutputStream
        for (is <- this; os <- osr) StreamHelp.pump(is, os)
    }
}

This is an example of where type inference makes code easier to refactor yet more difficult to read.  If one were to change the return type of the reader methods, or the subsequent lines and/or readLines methods on that type, then the return types of these methods would automatically change on recompile.  However, now try to figure out the return types of the lines and readLines methods.  In order to do that, you need to know what the return type of the reader method is, and the structure of that type.  Figuring out the return type of reader is reasonably straightforward, as it is defined in the same class.  However, the base class for the return type is not, so in order to figure it out you need to trace through several class definitions, potentially in other source files.  I doubt this is a big deal for those people who are intimately familiar with the code, but I pity the new guy who has to sort it out when he's trying to create a new subclass of this abstract class.  Of course, there's always the API docs, so users of the code, and the poor new guy, do have a place to turn.
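
To illustrate the trade-off in miniature (a generic sketch of my own, not taken from Scalax), both methods below compile identically, but only the annotated one tells the reader what it returns without chasing down another definition:

  import java.io.{BufferedReader, FileReader}

  class LogSource(path: String) {
    private def open() = new BufferedReader(new FileReader(path))

    // Inferred: to learn what this returns you have to go read open() and readLine().
    def firstLine = open().readLine()

    // Annotated: the contract is visible right at the definition site.
    def firstLineAnnotated: String = open().readLine()
  }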

So that's Scala, which supports a relatively limited form of type inference.  Now, let's consider a snippet of OCaml, a language with much more complete type inference, which I stole from Mauricio Fernandez:

   (* Copyright (C) 2008 Mauricio Fernandez  http//eigenclass.org
      Solution to the coding challenge at
      http://beust.com/weblog/archives/000491.html *)
   open Printf
   module S = Set.Make(struct type t = int let compare = (-) end)
   let (--) set elm = S.remove elm set
  
   let permutations f zero digits len =
     let base = S.cardinal digits in
     let rec aux n digits = function
         0 -> f n
       | len -> S.iter (fun s -> aux (n * base + s) (digits -- s) (len-1)) digits
     in S.iter (fun s -> aux s (digits -- s) (len - 1)) (digits -- zero)
 
   let () =
     let max = 10_000_000_000 in
     let digits = List.fold_right S.add [0; 1; 2; 3; 4; 5; 6; 7; 8; 9] S.empty in
  
     let count = ref 0 and prev = ref 0 and maxj = ref 0 and base = ref 0 in
     let report () = printf "Found %d numbers under %d.\n" !count max;
                     printf "Max jump: %d (%d -- %d).\n" !maxj !base (!base + !maxj)
   
     in try
       for i = 1 to 10 do
         permutations
           (fun num ->
              if num >= max then raise Exit;
              (* printf "%d\n" num; *)
              incr count;
              let jump = num - !prev in
                if jump > !maxj then (maxj := jump; base := !prev);
                prev := num)
           0 digits i
       done;
       report ()
     with Exit -> report ()

It's not entirely fair that I am picking on this OCaml code, because I don't know the language.  Also, this is a short, self-contained program that I personally would be likely to write in Python without any type annotations at all.  So in a way, for short programs like this, I think full type inference is great.  You get all of the protection of static typing with none of the hassle.  That being said, I think it would be extremely difficult to approach a large code base that is this devoid of type annotations.

All this adds up to my love-hate relationship with type inference.  I love it when I'm writing my own code, and I hate it (aside from local variables) when I am reading other people's code.  Over the past months I've decreased my use of it in Scala, preferring instead to explicitly specify types wherever they would not be entirely obvious from looking at the code, even though all my Scala code is hobby-work so no one else has to read it.  Of course, being hobby work, I often go days, weeks, or even months without looking at a chunk of code, so I often need help remembering what I wrote.  Overall I would say that excessive reliance on type inference obfuscates code, but there are plenty of ways for programmers to obfuscate their code, so it is a style issue as opposed to a language design issue.  I also think it will be interesting to see how many dynamic language people migrate over to languages with strong type inference as these languages gain more attention.  As you can see from the OCaml code above, there is no extra "typing burden" placed on the programmer.  If the program is correctly typed, it will compile and run quickly.  If not, it won't.  In theory this should satisfy the "type annotations are too much work/add too much noise to my program" camp, but not the "I don't want the compiler telling me what I can and can't do" camp.  My guess is that more people fall into the latter camp than the former, even the ones that claim they don't, but only time will tell.

Update: I just had a thought. Try thinking of type inference like using pronouns in natural languages. Imagine talking to a person who rarely uses pronouns. It's slow, cumbersome, and occasionally makes you think the speaker thinks you are an idiot who can't remember what he said 30 seconds ago. Likewise, when someone speaks entirely in pronouns, especially when they use a pronoun to refer to something that isn't part of the immediate conversation, it leads to utter confusion. Context is key, and the compiler can usually keep a much more distant context in its working memory than a programmer can. Type annotations provide context. Type inference assumes context.

Update 2: Here's a link to some OCaml API docs, where, just like in Scala, you can clearly see the inferred types.

Saturday, June 28, 2008

Cedric's Code Challenge

Cedric Beust issued an interesting little challenge this morning:

Here is an interesting coding challenge: write a counter function that counts from 1 to max but only returns numbers whose digits don't repeat.

For example, part of the output would be:

  • 8, 9, 10, 12 (11 is not valid)
  • 98, 102, 103 (99, 100 and 101 are not valid)
  • 5432, 5436, 5437 (5433, 5434 and 5435 are not valid)

Also:

  • Display the biggest jump (in the sequences above, it's 4: 98 -> 102)
  • Display the total count of numbers
  • Give these two values for max=10000

He welcomed brute-force solutions, but really the challenge here is in coming up with something more efficient and elegant. There are basically three general approaches:

  1. Run through all the numbers from 0 to n, test each for no repeating digits, and track the above statistics while you do it. This is the brute force method.

  2. Permute the digits in a fashion that generates numbers sequentially and track the statistics. Alternatively you could generate them in any order, sort them, and then calculate the statistics.

  3. Derive a heuristic that proves a given sequence of numbers will all contain repeated digits and can therefore be skipped.

I think #2 is probably the ideal approach, but I didn't think of it until I was mostly done coding #3.

Finding Repeated Digits

The first step in solving this problem, no matter what the approach, is to come up with an algorithm to detect repeated digits. Commenters on Cedric's blog came up with a number of ways to do this, most of which centered around converting the integer into a string and then finding repeated characters. This is a rather frighteningly inefficient approach. There is certainly no need to convert the number into a string in order to know its digits. A much simpler approach is to allocate a ten-element array of booleans initialized to false, and generate the digits from lowest to highest by repeatedly taking the number modulo ten. The first time you encounter a digit, you flip its associated array element to true. The second time, you exit because you have detected a repeat. The array is essentially serving as a thrifty man's map. Here it is in Scala:

  def containsRepeatedDigit(i: Long): Boolean = {
    val digits = new Array[Boolean](10) // elements default to false
    def f(i: Long): Boolean = {
      if (i == 0L) false // all digits have been processed
      else {
        val d = (i % 10L).asInstanceOf[Int]
        if (digits(d)) true
        else {
          digits(d) = true
          f(i / 10L)
        }
      }
    }
    if (i < 11L) false else f(i)
  }

The Heuristic

Consider the number 2201. It has repeating digits. The question is: What's the next number without repeating digits? It is 2301. You could calculate it using brute force, but you'd end up scanning an extra 99 numbers. Notice that the repetition is in the upper digits. This means that you cannot get a number with non-repeating digits until the second digit (counting from the left) changes. Now consider the number 2200. In this case changes need to occur in both the lower digits and the upper digits; however, addressing the upper digits allows us to skip a much larger section of the search space. Finally, consider 22200. In this case, you still want the second digit. However, you are searching from the right, so algorithms that detect the first repeat won't work. Here's Scala code to find the appropriate digit. Notice that it looks very similar to the repeated digit test above.

  def max(array: Array[Int]): Int = {
    def f(idx: Int, m: Int): Int = {
      if (idx == array.length) m
      else if (array(idx) > m) f(idx + 1, array(idx))
      else f(idx + 1, m)
    }
    f(1, array(0))
  }

  def repeatedDigit(i: Long): Int = {
    val prevIdx = new Array[Int](10)
    val recentIdx = new Array[Int](10)
    def f(i: Long, idx: Int) {
      if (i > 0) {
        val d = (i % 10L).asInstanceOf[Int]
        if (recentIdx(d) > 0) prevIdx(d) = recentIdx(d)
        recentIdx(d) = idx
        f(i / 10L, idx + 1)
      }
    }
    f(i, 1)
    Math.max(max(prevIdx), 0)
  } 

Now that we have an algorithm for finding the highest digit that needs to be changed, we need one that will take that information and turn it into the next possible number containing no repeating digits. This simply requires basic arithmetic.

  def nextPossibleNonRepeating(i: Long): Long = 
        nextPossibleNonRepeating(i, repeatedDigit(i))

  def nextPossibleNonRepeating(i: Long, idx: Int): Long = {
    if (idx == 0) i + 1L
    else {
      val p = Math.pow(10.0, (idx - 1).asInstanceOf[Double]).asInstanceOf[Long]
      val r = i % p
      val d = p - r
      i + d
    }
  }

Given this, it is easy to generate a sequence:

  def nextNonRepeating(i: Long): Long = nextNonRepeating(i, repeatedDigit(i))
  def nextNonRepeating(i: Long, idx: Int): Long = {
    val p = nextPossibleNonRepeating(i, idx)
    val d = repeatedDigit(p)
    if (d == 0) p else nextNonRepeating(p, d)
  }

Solution

Once this is all done, the solution is pretty straightforward. It takes the general form of the function used to generate the next number with non-repeating digits, only it has to keep track of a bunch of extra information.

  def printNonRepeatingReport(start: Long, stop: Long, last: Long, gapLow: Long,
                              gapHigh: Long, cnt: Long): Unit = {
    if (start > stop) {
      println("max: " + last)
      println("max gap: " + (gapHigh - gapLow) + " between " + 
              gapLow + " and " + gapHigh)
      println("count: " + cnt)
    } else {
      val d = repeatedDigit(start)
      if (d == 0L) {
        //println(start)
        val gap = start - last
        val (gl, gh) = if (gap > (gapHigh - gapLow)) (last, start) 
                       else (gapLow, gapHigh)
        printNonRepeatingReport(start + 1L, stop, start, gl, gh, cnt + 1L)
      } else {
        printNonRepeatingReport(nextPossibleNonRepeating(start, d), stop, last, 
                                gapLow, gapHigh, cnt)
      }
    }
  }

Results

I'm not going to list all the numbers here, just the statistics for 1-10,000:

  • max: 9,876
  • max gap: 105 between 1,098 and 1,203
  • count: 5,274

Of course I haven't checked it against a brute-force solution or the other posted solutions, so I owe a beer/coffee/tea, depending on your persuasion, to anyone who can point out a bug and provide the solution.
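
For anyone who wants to collect that beverage, here is the kind of brute-force check I would run for small ranges. It is just a quick sketch reusing the containsRepeatedDigit function from above; the expected values in the comment are the statistics reported earlier.

  // Brute-force verification sketch for modest ranges such as 1 to 10,000.
  // Returns (count, gapLow, gapHigh, largest) for cross-checking the report.
  def bruteForceStats(max: Long): (Long, Long, Long, Long) = {
    var count = 0L
    var prev = 0L
    var gapLow = 0L
    var gapHigh = 0L
    var i = 1L
    while (i <= max) {
      if (!containsRepeatedDigit(i)) {
        if (i - prev > gapHigh - gapLow) { gapLow = prev; gapHigh = i }
        prev = i
        count += 1L
      }
      i += 1L
    }
    (count, gapLow, gapHigh, prev)
  }

  // println(bruteForceStats(10000L))  // expect (5274, 1098, 1203, 9876)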

Just for kicks, here are the results for 1 to 10,000,000,000:

  • max: 9,876,543,210
  • max gap: 104,691,357 between 1,098,765,432 and 1,203,456,789
  • count: 8,877,690

Which took 1 minute 23 seconds on my MacBook. Try that with a brute-force approach.

Monday, June 23, 2008

Google AppEngine

There's a lot of buzz out there about how Google AppEngine is a game changer. My question is: Which game? AppEngine reduces the cost of hosting a web application down to zero. Zero is a pretty profound number. You don't even need to provide a credit card number in case they might want to charge you some day. All you need is a mobile phone number capable of receiving text messages. This means that not only is there no up-front cost, but also that there is no risk that you will suddenly incur huge fees for exceeding transfer quotas when you are lucky enough to be Slashdotted. Your application will just be temporarily unavailable. Hey, if that service level is good enough for Twitter, it's good enough for you, right?

The Open Source Game

Starting an open source project has been free for years. Many project hosting communities exist, providing projects with source code management, issue tracking, wikis, mailing lists, other features, and even a little visibility within their directories. Web hosting is not exactly expensive, especially for light-weight scripting languages like PHP, Perl, and Python, but it still costs something and therefore requires a sponsor. Often for a fledgling project the sponsor is also the founder, who also happens to be the lead programmer, help desk, and promoter. There are some who do an amazing job of all this, but for the most part I think project founders choose to wait.

Note: If I am wrong about the availability of good, free hosting for open source web applications, please provide links to them in the comments section. I would love to be proven wrong on this.

The Startup Game

If Paul Graham were to comment on AppEngine, it would probably be that it eliminates one of the last remaining excuses anyone may have for not launching a web startup. If you have an idea, some time, and can hack Python – you can launch a web application with AppEngine. You don't need money, infrastructure, long-term commitment, or any of those other things that may scare a person away from a startup. With AdSense, you can even monetize your application with hardly any effort.

Of course there are limitations. For one thing, AppEngine is very limited in what it can do. Pretty much any idea with even moderate bandwidth, storage, or computation requirements that cannot be delegated to another web application (e.g. embedding YouTube functionality) is out. So, unless you plan on building a business based on mashing together existing applications, then AppEngine probably is not your free lunch. That being said, I fully expect that Google will gradually add APIs for all of its applications to AppEngine, thereby providing a very powerful base for new-but-thin ideas.

The R&D Game

Just for a moment, let's say there was a company that required all of its employees to spend 20% of their time working on a project other than their primary job. This is a company that is betting that it cannot predict which idea is going to be the next big thing, so it must try lots and lots of things and see what sticks. As this company grows, providing infrastructure for those projects would become a real challenge. Especially infrastructure that allows them to be tested in the wild as opposed to simply internally, and allows the projects to instantly scale if they just happen to turn out to be a success. This company would also want to ensure their employees weren't slowed down by having to deal with muck. Such a company would need an infrastructure that could scale to support many, many applications without burdening infrastructure teams. Such a company would need AppEngine.

Now extend the idea further. Let's say it doesn't really matter whether the next big thing was developed by an employee or not. What matters is that the next big idea is bound to the company, for example by using the company standard authentication mechanism or, more importantly, the company standard monetization mechanism.

Ok, so we all know what company I'm talking about. AppEngine allows Google to outsource some of their R&D at very low cost, and given that most successful AppEngine developers will probably choose AdSense to monetize their creations, Google stands to profit regardless of whether those developers ever pay any hosting fees. In cases where creators do pay hosting fees, the great Google gets paid twice.

The Recruitment Game

Distinguishing among potential recruits is very challenging. The accomplished academic is almost certainly smart, but may fall apart when asked to work with a significant team on something important to the company rather than a research interest. The industry veteran may have great corporate experience, but political skills could be masking shallow or outdated technical skills. The bottom line is that recruiting is hard because in most cases you never see a direct sampling of an individual's work. At best you can see what a team he was on produced and take an educated guess as to his contribution. Open source projects can provide more information, but for most programmers there is no real motivation to participate in such projects.

AppEngine provides more motivation to the programmer, because he can more readily show his creation to other people without incurring any cost, and there is always a chance that he will make some money. There are probably a lot of talented “almost founders” out there who would start a company, but perhaps lack some of the personality traits necessary to do so, or find themselves in a situation where they need a steady income stream that simply isn't available to the founder of an early-stage startup. These individuals, and others, will make great pickings for Google.

Conclusion

Long term, in order for Google to grow, it has to attract more people to spend more time on sites displaying AdSense advertisements. Over the past few years Google has come out with countless new online services, most of which are still in beta, and none of which has yielded anything close to the monetization potential of their search business. AppEngine allows them to vicariously service the long tail without continuously expanding their already highly diverse R&D efforts. As a nice added bonus it will provide the opportunity to acquire future high-value startups before they are even real startups by simply hiring the developers and maybe tossing some stock options their way. On the flip side, I don't think AppEngine is going to have much effect on mainstream web application hosting. The API is too constrained, and for a company with resources the ties to Google are most likely too tight. So I predict AppEngine will be more of a muted success for Google. The infrastructure built for it will be useful internally, it will help them get more sites addressing eclectic needs up more quickly without burdening employees, and it will provide a proving ground for future hires and possibly very early stage acquisitions. This could all add up to AppEngine being a very significant success, but it also means the success may be out of the public eye.

Wednesday, June 18, 2008

What does I/O bound really mean?

There's a lot of old folk wisdom out there that justifies the use of slow languages and runtimes on the basis that the impact on performance doesn't matter. Here's a sampling:

  • Most applications are I/O bound so the performance of the programming language doesn't matter

  • The database does all the heavy lifting, so performance of the application code doesn't matter

  • Computer time is cheap compared to programmer time

Like most folk wisdom, these all have an element of truth. They are often brought up in discussions about the merits of strongly-typed compiled languages versus dynamic languages, and natively compiled languages versus virtual machine based languages versus interpreted languages. I could write about any one of them, and many more, but today I'll address I/O because of the “work” I've been doing on the WideFinder 2 project.

WideFinder 2

The original WideFinder project was, at least on first inspection, quite clearly I/O bound (well, if you had an input file smaller than your main memory...) The WideFinder 2 is a little better because it is a little more computationally complex, but not by much. The benchmark is really simple: process a big (42 GB) web server log file and report some basic statistics. In order to perform the benchmark, the program must parse the file, store information from each line on several maps, and finally extract some statistics out of them.

Tim benchmarked sequential block input on the test system at just shy of 150 MB/s. This means that the I/O alone required to process the 42 GB file should take about 5 minutes (42 GB ÷ 150 MB/s ≈ 280 seconds), so if the benchmark truly is I/O bound then the program shouldn't take much more than 5 minutes to run. Consequently, if we are to judge based on Tim's reference implementation of the WF2 in Ruby, then it quite clearly isn't I/O bound.

I/O Takes CPU, too

If you look closely at the Bonnie benchmark results you'll notice that doing raw block I/O – meaning just reading files into a buffer – consumes almost 80% of a single core of the CPU. That means that I/O alone comes pretty close to being bound by the CPU as well as being bound by the disk. In fact, experimentation uncovered that placing the system under high load reduces maximum I/O throughput. In order to achieve maximum throughput, you actually have to ensure that you keep one core free to manage the I/O.

Application I/O is more than I/O

Another key factor is that what most application programmers think of as I/O involves a lot more than shuttling bits from disk into memory. For example, Java and other Unicode-based platforms have to decode the character encoding of the file into the native character encoding of the runtime. In the case of the JVM, this not only requires that every byte be processed, but also frequently doubles the memory consumption of each character and requires the data be copied into a separate buffer. Furthermore, application code usually deals with files line-by-line in the form of some standard (often immutable) string object, thereby requiring yet another pass over the data. So when your code is simply iterating over lines in a file, it isn't really just doing I/O. A fair amount of CPU/memory bound work is required to deliver those nice, neat Unicode strings.
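
As a rough illustration (my own sketch with a hypothetical file name, not code from any WF2 entry), here are the layers hiding behind an innocent "read the file line by line" loop on the JVM; each layer touches every byte before the application ever sees a String:

  import java.io.{BufferedInputStream, BufferedReader, FileInputStream, InputStreamReader}

  val fis = new FileInputStream("access.log")    // raw bytes from the OS
  val bis = new BufferedInputStream(fis)         // block reads into a byte buffer
  val isr = new InputStreamReader(bis, "UTF-8")  // decodes every byte into 16-bit chars
  val br  = new BufferedReader(isr)              // copies chars again to assemble lines

  var line = br.readLine()
  while (line != null) {
    // by this point the data has been buffered, decoded, and copied several times
    line = br.readLine()
  }
  br.close()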

Large datasets change the rules

Memory management isn't free, and even with an industrial strength garbage collector it isn't always simple. A common practice with immutable objects is to have them share some internal state in order to avoid excess data copying. For example, when you take a substring of a Java string, the two string objects share an underlying character array. Often this saves memory, and it changes generating a substring of a string from a linear time and space operation (with respect to string length) to a constant time and space operation. Unfortunately, if the substring is much longer lived than the original string, and especially if it is much smaller, you end up with something that feels an awful lot like a memory leak (I think it could be debated whether it is a leak or not).
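
Here's a contrived sketch of the effect (mine, not WF2 code; the sharing behavior is the one described above, which is how the JVM's String behaved at the time):

  import scala.collection.mutable.ListBuffer

  object SubstringRetention {
    def main(args: Array[String]): Unit = {
      val keys = new ListBuffer[String]
      var i = 0
      while (i < 1000) {
        val line = ("x" * 1000000) + i    // stands in for a very long log line
        // With a sharing substring implementation, this 8-character key keeps the
        // entire megabyte-sized character buffer of line reachable.
        keys += line.substring(0, 8)
        // Forcing a copy lets the big buffer be collected instead:
        // keys += new String(line.substring(0, 8).toCharArray)
        i += 1
      }
      println(keys.size + " keys retained")
    }
  }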

Even if you're not carrying around a lot of excess baggage in the form of substrings, processing gigabytes of data consumes a lot of memory. In the case of WF2, a little bit of pretty much every line needs to be stored for the calculations at the end. The vast majority of my “debugging” time for WF2 was spent figuring out what JVM parameters would actually allow the program to finish without slowing to a crawl (pre-tuning, there were points where it would spend 90% of its time in GC) or consuming every bit of memory available (at least a few WF2 solutions hog tons of memory).

All this means that when dealing with large files, other parts of the system need to do more work, which takes away time that could be spent on “real” processing or on I/O (which we've already seen can be a CPU hog). Furthermore, I/O routines and routines commonly used alongside heavy I/O (like regular expressions) must be very careful about their memory usage.

So is WF2 really I/O bound?

It depends. Go take a good hard look at the WF2 results page. If you look at Tim Bray's results you would conclude that no, WF2 clearly is not I/O bound. However, if you look at the top you'll see some very impressive numbers that indicate that WF2 indeed is I/O bound (side note: I really need to take a hard look at OCaml). Of course, you could argue that even in the OCaml case it really is CPU bound, because making the task take about as long as the required I/O saturated 8 cores and 32 hardware threads. Scanning down the list would seem to indicate that in most cases I/O does not represent a significant bottleneck, but then it's hard to really tell. The disk may not be the bottleneck, but the I/O routines within the libraries or runtime may be. Consequently, from the application programmer perspective, WF2 is I/O bound.

Redefining “I/O bound” and future impacts

For a long time “I/O bound” primarily referred to hardware or possibly operating system limitations. That was a useful definition, but it is time for it to change. Most software being developed today has a very tall stack of abstractions sitting between it and the hardware. Operating systems schedule I/O and have to split limited resources among many competing processes. Virtual machines sit on top of operating systems, isolating the programmer from the underlying OS and hardware. Libraries and frameworks isolate the programmer from the virtual machine and even from other libraries and frameworks. I/O from the programmer's perspective is, or at least should be if he is working on top of good abstractions, “everything that happens between when my program requests some data and when it receives it.” Consequently, libraries and runtimes should go to great lengths to ensure that being I/O bound stays as close to its original meaning as possible. Prior to multicore systems becoming pervasive that was largely true, but today's I/O libraries fail to take advantage of multiple cores, and consequently force I/O into being a bottleneck when it should not be.

That's why in my WideFinder submissions I've worked to separate parallelized I/O into a library-like module. It quite clearly is not done, but it is relatively successful. I reused the parallel I/O code that I developed for WF1 on WF2 without changing any of the logic. It doesn't provide many features, and it could use a fair amount of optimization and simplification (the line joining logic is an atrocity), but it works.
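
To give a flavor of the idea, here's a heavily simplified sketch of my own (not the actual module, and it punts entirely on the line joining problem): chunked reads farmed out to a thread pool, one file handle per task.

  import java.io.RandomAccessFile
  import java.util.concurrent.{Callable, Executors}

  object ParallelChunkRead {
    def main(args: Array[String]): Unit = {
      val path = args(0)                              // path to the log file
      val chunkSize = 16 * 1024 * 1024                // 16 MB chunks
      val length = { val f = new RandomAccessFile(path, "r"); try f.length finally f.close() }
      val pool = Executors.newFixedThreadPool(Runtime.getRuntime.availableProcessors)

      // One task per chunk; each opens its own handle and reads only its slice.
      val tasks = (0L until length by chunkSize.toLong).map { offset =>
        pool.submit(new Callable[Int] {
          def call(): Int = {
            val raf = new RandomAccessFile(path, "r")
            try {
              val size = math.min(chunkSize.toLong, length - offset).toInt
              val buf = new Array[Byte](size)
              raf.seek(offset)
              raf.readFully(buf)
              size    // a real module would hand buf off for parsing and line joining
            } finally raf.close()
          }
        })
      }

      val total = tasks.map(_.get.toLong).sum         // block until every chunk is read
      pool.shutdown()
      println("read " + total + " bytes in parallel")
    }
  }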

Wednesday, April 09, 2008

Multiprocess versus Multithreaded...

...or why Java infects Unix with the Windows mindset.

Recently Paul Murphy, the king of the Sun zealots, blogged about Java bringing the Windows mentality to Unix, all the while slamming Java. In response, John Carroll, a Microsoft employee, rose to the defense of Sun's self-declared crown jewel. Talk about weird.

The funny thing is they are both right, although Murph's arguments are pretty weak.

A little history

Unix and Windows evolved with very different definitions of what the primary unit of isolation should be. On Windows, it is (or was) the node. Each Windows user (and DOS user before him) occupied exactly one node. The worst that could happen was that the user destroyed his own workspace, so interactive performance reigned supreme over system integrity. You have a node. You have a user. The node does what the user wants as fast as it can. Initially this applied to running a single application at a time, then to allowing several to be open at once but with the one in the foreground receiving primary resources, and finally to allowing several applications to run simultaneously. Multithreading was king because it was lower overhead and focused on making that foreground process more responsive. Threads were optimized, while processes were neglected.

Unix evolved to be fundamentally multiuser, and its primary unit of isolation is the process. Unix systems were intended to be shared, so it was important that one user could not dominate another. Furthermore, a slew of processes (daemons) all ran under the same account while providing services to multiple users, so in order for users to share, processes had to share. Unlike on Windows, one process crashing the entire system was not acceptable, because that would destroy multiple users' data. As a result, processes were designed to represent a strong level of isolation and were heavily optimized to make sure people used them. Threads were largely ignored, or simply treated as processes with a shared heap space, because several cheap processes could simply be chained together to accomplish the same thing in a simpler manner.

The Unix Way

I want you to consider good old-fashioned CGI programs for a moment. Imagine one written in C. First, you may think "Oh my God, running a web application in a non-managed environment. The resource leaks! The memory leaks! The memory consumption of all those processes! Oh, the horror!" Of course, you would be wrong. Repeatedly launching and terminating a Unix process is dirt cheap, especially for a simple program written in C. The OS will cache an image of the executable in memory, which can be shared among invocations. The individual process can leak all the resources it wants, because as soon as it terminates all of its resources will be automatically freed by the OS, no matter how incompetent the programmer. If the process fails to terminate, your friendly neighborhood sysadmin can kill it without hurting any other process.

This method works for producing super-available applications despite incredibly crappy code. I've seen it, both in the form of CGI and in the form of much more sophisticated applications. It works. Users get upset about lost transactions, but the application as a whole almost never goes down.

Enter Java

Java took cheap Unix processes and made them expensive. To compensate, it provided primitives for multithreading. It provided a garbage collector to at least slow memory leaks. It turned all those transient application processes into one big JVM process not only serving all the transactions for a given user, but serving all the transactions for an entire application or even multiple applications. Java made it more difficult to make destructive program errors, but it also made the consequences much more severe. Your friendly neighborhood sysadmin is powerless against a runaway thread or a slow memory leak. All he can do is kill the process, bumping out all of the users, killing all of their sessions.

It's so bad, the process might as well be a node. Unix becomes Windows. The JVM is practically an operating system, but without all of the features of an operating system and a whole lot less mature.

Enter Java Frameworks

This is really what Murph was railing against, although he didn't name it, and he conflated it with the core language by labeling it "Business Java." Frameworks evolved for a myriad of reasons, which are often summarized as "taking care of the plumbing so the developer can focus on the business logic." The "plumbing" is a lot of things, including managing certain resources and generally ensuring the application code executes within a well defined life cycle where it is unlikely to do damage. In other words, instead of giving the user a simple, uniform mechanism like a process to protect the world from his mistakes, he is given dozens of hooks where he can implement little snippets of focused and hopefully bug-free functionality. All this involves a lot of learning above and beyond "the Java you learned in school" (meaning the core language and libraries), putting a cognitive load on the programmer and additional runtime load on the machine.

Multiprocess versus Multithreaded

Most Unixes have evolved efficient threading, and Windows has come a long way in becoming a multiprocess, multiuser environment. Consequently, developers need to be able to intelligently decide when to use multiple processes, when to use multiple threads, and when to use a hybrid approach. For example, Apache httpd has for quite a while now used a hybrid approach. On one hand, on most operating systems threads involve less overhead than processes, so it is more efficient to use multiple threads than multiple processes. On the other hand, multiple processes will ultimately give you better reliability because they can be spawned and killed independently from one another, so making a system that can run for months without stopping doesn't require writing a program that will run for months without stopping.

So how do you choose? My rule of thumb is to look at the amount of shared data or messaging required between concurrent execution paths and balance that against how long the "process" (not OS process) is expected to live. Execution paths with lots of shared data, or that are chatty, will benefit from the lower overhead of threading, and threading allows you to avoid the complexities of shared memory or IPC. Of course, multiprocessing allows you to avoid the complexities of threading APIs, and there are libraries to address both, so the complexity issue could be a wash depending on your previous experience.
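
For what it's worth, here's a tiny JVM-flavored sketch of the two options (mine, in Scala rather than C, and assuming a Unix-like system with echo on the PATH): the thread shares the parent's heap, while the child process lives in its own address space and can be killed independently.

  object ThreadsVsProcesses {
    def main(args: Array[String]): Unit = {
      // Threads: low overhead, shared heap, easy (and therefore risky) data sharing.
      val shared = new java.util.concurrent.atomic.AtomicInteger(0)
      val t = new Thread(new Runnable {
        def run(): Unit = shared.incrementAndGet()    // touches the parent's memory directly
      })
      t.start(); t.join()
      println("shared counter after thread: " + shared.get)

      // Processes: isolated address space, independently killable,
      // communication only through pipes, files, or sockets.
      val child = new ProcessBuilder("echo", "hello from a separate process").start()
      println("child exited with status " + child.waitFor())
    }
  }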

So why is Murph so wrong? Is JC right?

I think Murph wants to divide the world along nice clean lines. System programmers program in C. They don't need the hand-holding of managed runtimes or languages that treat them like impudent children. They do need lots of flexibility and lots of rope. Application programmers, on the other hand, need high-level abstractions that are close to the business domain that they are addressing. They need to be able to rapidly build software and rapidly change it as requirements evolve. They don't need lots of flexibility and should stay away from low-level details. So, in Murph's eyes, the problem with Java is that it doesn't do either particularly well. The managed runtime and object-orientation get in the system programmer's way, while the general-purpose nature of the language and mish-mash of libraries and frameworks just confuse application developers, or rather distract them from their true purpose. System programmers need C. Application developers need 4GLs.

The fatal flaw in Murph's reasoning is that it ignores the in-between. What happens when the systems programmer or 4GL creator fails to provide the right abstraction for the application developer? He's stuck, that's what happens. Software development is as much about creating abstractions as using them. Consequently, application developers need general-purpose languages.
