
Contains...

Fewer dull moments than should normally be expected.


Blogroll:

"Academygirl" • Anders • "Antipope" • Bérubé • Blankenhorn • Blaze • Bond • Boudreaux • Boyd • another Boyd • Brayton • Bruce • Burke • Carroll • Cartwright • Cascio et al • Cheney • Cho • Clark • Christine • Cole • "Cranky Professor" • Cringely • Crispin et al • DeLong • Doctorow et al • Efimova • Ellis • Elsberry • Fafnir, and Giblets, and the Medium Lobster • Farber • Felber • Fisher • Gaiman • Golub • Greenwood • Griffiths • Herasimchuk & Driscoll • Hsieh • Hyde • Jain • John & Belle • Johnson et al • Jones • Jones (again) • A different Jones • dammit another Jones; is nobody named "Smith" anymore? • Kelly • Lambert • Laporte • "Laputan Logic" editor • Laura • Lawley • Leander • Lee • that Leiter fellow • Lessig • Levin • Lindberg • "Little Professor" • Lynch • Manley • McMurray • Michael & Friedrich • Minar • "Mindles H. Dreck" • Mooney • Myers • Nielsen Hayden • O'Connor • Orzel • Osborne • Paquet • "Pedant" • Piquepaille • Pontikos • Postrel • "Radagast" • Rana • Rheingold • Rivka • Rosenhouse • Sadagopan • Salo • Saltman • Scalzi • Scalzi (again) • Shirley • South • Spivack • Sterling • Suber et al • Taylor • Terry • Tozier • Various Ann Arborites • Various crescat editors • Various crooked editors • Various de novo editors • Various other "evolutionists" • Various futurismic editors • Various gene expression editors • Various commentators on the Invisible Adjunct issue • Various kuro5hin editors • Various linguists • Various many2many editors • Various nanotechnologists • various Demos Greenhouse folks • Various o'reilly geeks • Various philosophers • Various slashdotters • Various speculists • various xplanaziners Vielmetti • Wentworth et al. • Wheaton • Wilkins • Woit • Yee • Yglesias • Zúniga

Links:

»» Tozier Consulting ««

»» Corners Bumped Books & Antiques ««

And there may be recent influence by...

Books:

ole red-eyes and the magic beanstalk • Peter Ackroyd being fascinatingly obsessive-compulsive • 100 absolutely great designers • what they think of the devil • Jim Woodring is a freakin' god • how to dress the part • gotta love kooks • Magic, Mystery, and Science: The Occult in Western Civilization • a great statistical computing guide • a must-read (and entertaining) biology volume • classic horror tropes revisited • the only way to read Harry Potter (by listening) • how many people make complicated and influential decisions

And I want these: Sklar's Learning PHP • Collette's Multiobjective Optimization: Principles and Case Studies (Decision Engineering) • Jackson's Marginalia: Readers Writing in Books • Boyer's Religion Explained: The Evolutionary Origins of Religious Thought • Fingerut's Creating and Planting Garden Troughs • James's The Prop Builder's Molding & Casting Handbook • Warner's Making Concrete Garden Ornaments • Shepheard's The Cultivated Wilderness: Or, What is Landscape? • Lotov's Interactive Decision Maps: Approximation and Visualization of Pareto Frontier • Baum's What Is Thought? • Lynch's The Image of the City • Niemeyer's Learning Java, Second Edition • Darwin's Java Cookbook • William's Open Innovation: The New Imperative for Creating and Profiting from Technology • Vandermeer's The Thackery T. Lambshead Pocket Guide to Eccentric and Discredited Diseases • Johnson's Mind Wide Open: Your Brain and the Neuroscience of Everyday Life • Mithen's The Prehistory of the Mind: The Cognitive Origins of Art, Religion and Science • Cude's The Ph.D. Trap Revisited • Jackson's 20th Century Pattern Design: Textile & Wallpaper Pioneers • Gigerenzer's Bounded Rationality: The Adaptive Toolbox • and others...


Creative Commons License
All content on this website (including text, photographs, audio files, and any other original works), unless otherwise noted, is licensed under a Creative Commons License.


2004-08-12

Please go save these century-old Ann Arbor Newspapers

[this was originally posted on 11 August]

[Update 12 August] The address of the house in question is 1709 Cherokee. I drove by earlier today on my way out of town. Stop in tomorrow (Friday) or Saturday and nab some great newspapers!

I need your help. If you’re a local in the area — Ann Arbor, Ypsilanti, Pittsfield Township, Washtenaw County. If you read this, help.

Today I was driving along, minding my own business, when I suddenly found myself reading a hand-written sign posted at an intersection near our house by some diabolical Mephistophelean persons. Right fiends. [Well, extraordinarily nice people, actually…]

It read, in scrawly letters, “HISTORICAL DOCUMENTS & BOOKS SALE Today ->”

Fiends. Of the kind who know your deepest secrets and desires.

OK, so I succumbed and went in. Who knows? I thought: Maybe it’s a sickly, scant garage sale. A stack of mouldering worm-eaten books.

Not so. It’s the estate sale of a renowned historical librarian from the University of Michigan, and what a packrat he had been.

Veritable fiends.

So in I went. Wandered, looking at stacks of ephemera in binders, some nice old books. A smiling and apparently entirely un-fiendish lady with glasses told me there were also two back rooms. And when I saw what was in the one back bedroom, I blanched. Actually, I think I stumbled, froze, and the blood ran right out of my head. There was a rushing noise, I know that.

I called my wife as I ran out of the door and told her to get ready for me to pick her up.

So, at any rate (and as will be recounted at some length elsewhere), we came back immediately with a checkbook and spent two hours looking carefully. We bought some stuff. But only one piece from that back bedroom.

What were those things that made my heart run cold (or is it hot?): Bound Ann Arbor newspapers from the 1860s-90s. The Ann Arbor Democrat, The Michigan Argus, The Courier, The Peninsular Courier & Family Visitant, and other papers from Washtenaw county.

Not only are they the exact same sort Barbara and I have been transcribing on these very pages (from the poorly-photographed microfilms produced by [a local company who produces microfilms] in the 1960s, and offered desultorily by the Ann Arbor District Library); as I found out, these are the very same volumes that were photographed (oftentimes illegibly and very, very badly) by [the local company who microfilms documents] in the 1960s.

The esteemed Departed (may he be garlanded and reap everlasting benefits and substantial perks from the best sort of Paradise available) had saved them from the loading dock back in the 60s and secreted them in his attic for forty years or more. They were supposed to be burned, since the microfilms had been made and were considered passable by the fools at [that local company who produces microfilms] who convinced everybody to destroy all the world’s newspapers.

And here in the back bedroom they were. They are, actually — stacks and stacks of them. Maybe 50 full quarterly volumes. The actual newspapers that record the history of Ann Arbor, Ypsilanti, and parts thereabout.

They do not exist anywhere else, to my knowledge. Maybe a couple of archives here in the state. Not elsewhere.

They’re $40 a pop, if you seem earnest enough and promise the folks not to rip them up to sell on eBay. Actually, promise me. They should be preserved. The microfilms produced by [the local microfilm bad guys] suck. Really.

Please, if you’re a local reader, go get some and save them from the dustbin. I can’t afford to at the moment, for reasons of storage and time and frankly money.

But if you go, and solicit your friends to go, and save them — well, somehow we’ll make it worth your while. Either by buying them at some undisclosed future date, or commending you vociferously, or maybe establishing an archive online, or something. Making it seem (in the long term) like a good idea, in other words.

From what I understood, the sale will be open again on Friday. Or maybe Thursday…. The exact address is unknown to me, and we don’t get the local paper (ironically enough, eh?), but I believe the sale was on Cherokee Rd, near the intersection of Stadium & Packard. If you have an Ann Arbor News, have a look and let me know (please), and get over there and buy, buy, buy. Save it.

And there seems to be an amazing estate sale coming up, by the way. Lots of glass, furniture, stuff like that.

But lots more history in those bound newspaper volumes….

2004-08-11

Fafblog: 1855!

Today my wife bought a copy of Doesticks: What He Says. From 1855. A very nice book we will be keeping around the house for a very long time.

She’s transcribed the charming first chapter of the book, which includes (a) Doesticks dissing Spiritualists back-handedly, (b) Doesticks speaking of himself in the third person by two different names (Philander being the other), (c) dream scenes, (d) “scare quotes” (though it may be just the way they wrote back in the scary scary 1850s), (e) a tone that reminds me how I’ve read that Mortimer Thomson (Doesticks) was a subtle and witty and diabolically effective political commentator in seemingly silly mode, and (f) the extraordinary passage:
But my physician informs me that I have got the “cacoethes scribendi,” which he says is as bad as the small-pox, toothache, and yellow fever. The disease, he says, must have its course—it may end in a malignant biography—result in an infectious broad-sword and blunderbuss, yellow covered novel, or degenerate into a weak form of pseudo-sentimental verse writing, in which latter case he intends to order me a literary tombstone.

Fuckin’ a — it’s Fafblog. Really really really Fafblog. Doesticks is Giblets.

Bow down before Doesticks. Bow down before Doesticks NOOOOOW.

More soon. First we must go bow once more before Doesticks. For Doesticks is/was/will be very interesting indeed.

2004-08-10

Evolving ants for OCR

Barbara has been setting up a system in the office for scanning the pages of public domain books to submit to the Distributed Proofreading/Project Gutenberg system. The workflow is essentially:
  1. create 300-dpi bitmaps of every page
  2. “clean up” the bitmaps for OCR by aligning them and cropping them and getting the thresholds right and applying a threshold function to convert them to one-bit bitmaps
  3. [submit the images to the PG website at this point]
  4. for each page, [somebody needs to] run an OCR algorithm to create a (somewhat) rich text file, indicating italics and bold-face text at least
  5. [the book’s pages are proofread and recompiled and marked up by hand and scripts to create Project Gutenberg submissions by various and diverse hands].
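
The threshold function in step 2 can be sketched in a few lines. This is only an illustration, not the tooling actually used in the office: the `threshold` name, the nested-list image representation, and the cutoff of 128 are all assumptions made for the example.

```python
def threshold(gray, cutoff=128):
    """Convert a grayscale image (rows of 0-255 ints) to a one-bit bitmap.

    Pixels darker than `cutoff` become 1 (ink); everything else becomes 0
    (paper). The cutoff of 128 is an arbitrary illustrative default.
    """
    return [[1 if value < cutoff else 0 for value in row] for row in gray]


# A tiny made-up "scan": light paper with three dark pixels.
page = [
    [250, 240, 30],
    [245, 20, 235],
]
print(threshold(page))
```

Picking the cutoff is the "getting the thresholds right" part, and it will generally differ from scan to scan.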

All you really need to think about today, though, is the OCR.

There is some very nice OCR software out there in the world. High-end commercial stuff. Stuff that does all sorts of fancy stuff with dictionary and context and syntax and fonts and mixed metaheuristics and training. There is damn-all when it comes to Open Source, cheap or free software, by comparison. There is gocr, which is passable, but in general it’s far, far less accurate than something big and unwieldy like Omnipage Pro. Worse, Omnipage Pro costs $400+ and doesn’t run in MacOS X. Liars.

Why the stagnation in OCR? Here’s my hypothesis: early lock-in of neural network methods.

Back in the good old days, OCR was touted as one of the most important success stories of neural network models. Having a computer read a page in a book was soooo kewl by 1980s standards, and after all, it cashed in on the results from all the early neural computing academic papers; the design pattern (an eye) was pretty obvious. A picture is locally processed, and little bits of it are used to generate small pattern fragments (angles, widths), and these are compiled hierarchically into some bigger patterns (loops, stems), and up near the top the neural network spits out a letter (or just maybe a “don’t know” message).

Sure, it works fine. But it’s tapped out. Time for something new.

See, there’s so much munging involved in this workflow, just because of how the neural networks work and the assumptions about letters and words and language that they embody that it’s sometimes as much work to prep a page for OCR as it is to type it in by hand. This is particularly true if there’s any structure to a page: not merely columns, but tables, figure legends, call-outs, footnotes, foreign language quotes, illustrations with letters in them. And noise, foxing, crumples, tears, crayon….

And then there’s the huge and rather difficult but oft-ignored stuff that goes into making a real text file. After all, character recognition simply produces a list of the characters that appear on the page. And their physical locations. How you get from those to a string of words in a file is not always a simple matter, especially if you want your OCR system to use syntactic cues and sense-making to improve its accuracy.
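
To make that gap concrete, here is a deliberately naive sketch of turning recognized characters plus positions into a string. Everything in it is an assumption for illustration: the `(char, x, y)` tuple format, the `line_height` and `space_gap` parameters, and above all the premise that the page is upright, single-column and unskewed, which is exactly the kind of assumption that falls over on real scans.

```python
def assemble_text(glyphs, line_height=10, space_gap=8):
    """Naively order (char, x, y) glyphs into lines of text.

    Assumes an upright, single-column page: glyphs are bucketed into
    lines by y, sorted left to right by x, and a space is inserted
    wherever the horizontal gap between neighbors exceeds `space_gap`.
    """
    lines = {}
    for char, x, y in glyphs:
        lines.setdefault(y // line_height, []).append((x, char))

    out = []
    for key in sorted(lines):
        row = sorted(lines[key])
        text = row[0][1]
        for (prev_x, _), (x, char) in zip(row, row[1:]):
            if x - prev_x > space_gap:
                text += " "
            text += char
        out.append(text)
    return "\n".join(out)


# Made-up glyph positions, listed out of order on purpose.
glyphs = [("i", 6, 2), ("H", 0, 1), ("y", 20, 3), ("o", 26, 2), ("!", 0, 14)]
print(assemble_text(glyphs))
```

Rotate the page, add a second column or a footnote, and every one of those hard-coded guesses is wrong.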

Look, a few weeks back Barbara proofread Robert Hooke. We’re talking about long S characters, and Greek and Latin and weird bumpy distressed metal type and bad scans. The OCR, doubtless trained on a corpus of modern novels, newspaper pages and magazine columns, sucked.

Because, I’m arguing, the design pattern of the neural networks they use embodies too many assumptions about the scans and quality and orientation and stuff. Neural nets assume too much. Or if not the networks themselves, then the people who use them.

So I want to propose a challenge. An exercise. A way out of the box.

Use genetic programming to create a colony of OCR ants.

Say the scanner produces a bitmap file. Who knows what the page orientation is, or if it’s even straight? Not me. Not you. Who knows if the scan was noisy or clean, or gray-scale or color or what the resolution was or how big the letters are in pixels? Not me. Not you. Not your ants. Who knows if there are illustrations on the page, or typographic ornaments, or foxing or squished spiders, or marginalia scrawled along the side? Go ahead — guess.

All you and your ants will know (in the acceptance test system) is that a number of png-format images (of at least 100 dpi, and maybe up to 2400, and at least 1 bit but maybe 24 bits, and maybe with a normalized histogram of values and then again maybe not, and in some arbitrary orientation) will be provided, and that these files are scans of pages bearing text that the ants need to recognize — that is, output. The ants’ task will be to produce a text file containing as few Type I and Type II errors as possible.

Your OCR ants are little virtual software agents, walking around on the bounded (maybe with toroidal wrapping if you must, but I’d advise against it…) 2-dimensional discrete world of the scan bitmap. One pixel is a quantum of space, to them, the smallest increment of motion and distance.

Along with the scan bitmap in which they “physically reside”, your ants may play with:

  • up to five other bitmaps of 24-bit depth and the same size as the scan, called the pheromone sensoria (initialized to all-white),
  • a linear discrete space of infinite length called the text workspace, which contains a character at each location (initialized to NULL), and
  • up to five linear pheromone sensoria that map 1:1 to the cells of the text workspace, and contain long integers (initialized to 0).

Ants walk around in the scan bitmap. That is, they each have a discrete “position” property that refers to the coordinates of a pixel in the bitmap. They start the test at the upper left corner of the bitmap, all stacked on top of one another if need be.

They also have a single indexed cursor position that refers to their “attention” in the text workspace. Their cursors all start at the same location “0” in the text workspace.

As ants walk around the scan, they can examine the color of pixels in their immediate neighborhood (defined as you see fit, but no more than 8 pixels in any direction, which is a lot).

They can also examine the color of their location pixel in the pheromone sensoria, “reading” pheromones left by themselves and other ants that have passed nearby previously (again, the nature and meaning of these pheromones are entirely up to you).

They’re also aware of the characters in the region three positions to either side of their cursor in the text workspace, and the pheromone values (long integers) in the pheromone sensoria associated with those letters.

And finally, they have memory, which is just a list of the last 100 sensory vectors of pixel colors, pheromone levels, and characters.
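
The state each ant carries can be sketched as a small record. This is one possible reading of the description, not a reference design: the `Ant` class, its field names, the `remember` method and the tuple used as a stand-in sensory vector are all hypothetical.

```python
from collections import deque
from dataclasses import dataclass, field

MEMORY_SIZE = 100  # the limit on remembered sensory vectors


@dataclass
class Ant:
    """One OCR ant: a pixel position, a text cursor, and bounded memory."""
    x: int = 0        # column in the scan bitmap (0 = left edge)
    y: int = 0        # row in the scan bitmap (0 = top edge)
    cursor: int = 0   # index into the text workspace
    memory: deque = field(default_factory=lambda: deque(maxlen=MEMORY_SIZE))

    def remember(self, sensory_vector):
        """Store a sensory record; the oldest is dropped once 100 are held."""
        self.memory.append(sensory_vector)


ant = Ant()  # starts at the upper left corner with its cursor at 0
for step in range(150):
    ant.remember(("pixels", "pheromones", "chars", step))  # placeholder record
print(len(ant.memory), ant.memory[0][-1])
```

A `deque` with `maxlen` gives the rolling 100-step window for free; what actually goes into each sensory vector is left open here, as it is in the challenge.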

These are not dumb ants! We’re talking (a) a whole bunch of pixel values in the scan, (b) up to five pheromone values in the bitmap space, (c) seven character values from the text workspace, and (d) up to 35 pheromone measures in the workspace, times 100. Whew! Plenty of room to do anything in the world, yes?

Don’t get cocky. There’s a catch, which you shall see at the end.

Each ant, following its internal program, responds to all these sensory inputs (and its memory of up to 100 sensory records from the past 100 time-steps), and does stuff, including:

  • it sits still or moves one pixel in any of eight directions in the bitmap
  • it may do one of the following: (a) write a letter to the current cursor position in the workspace, (b) swap the current letter with one at either side, or (c) move its cursor one step to the right or left
  • it may excrete pheromones of its own in one or more of the ten pheromone sensoria
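
The movement action in the list above might look like the sketch below. The `MOVES` table and `move` function are hypothetical names, y is assumed to increase downward, and clamping at the edges is a guess: the rules say only that the world is bounded, not what happens when an ant walks into a wall.

```python
# Offsets for "sit still" plus the eight compass directions, as (dx, dy),
# with y increasing downward as in most bitmap formats (an assumption).
MOVES = {
    "STAY": (0, 0),
    "N": (0, -1), "NE": (1, -1), "E": (1, 0), "SE": (1, 1),
    "S": (0, 1), "SW": (-1, 1), "W": (-1, 0), "NW": (-1, -1),
}


def move(x, y, direction, width, height):
    """Return the ant's new (x, y) after one step on a bounded bitmap.

    Steps that would leave the bitmap are clamped to the edge; the
    clamping behavior is assumed, not specified.
    """
    dx, dy = MOVES[direction]
    new_x = min(max(x + dx, 0), width - 1)
    new_y = min(max(y + dy, 0), height - 1)
    return new_x, new_y


print(move(0, 0, "SE", width=5, height=5))  # an ordinary diagonal step
print(move(0, 0, "NW", width=5, height=5))  # clamped at the corner
```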

After they have been “secreted” by the ants, all pheromones diffuse in the sensoria. After each time-step, the amount of a pheromone in any location (whatever the dimensionality of the space) becomes the weighted average of the value in the cell and its neighbors (eight in the bitmap, two in the linear spaces), including a fixed decay term. You may select these terms as you see fit, adding directional biases to the weights used in averaging, setting them to be faster or slower than one another. But in every case, the decay term must be positive, so that all pheromones eventually disappear unless replaced. You may use different constants for weights and decay in each sensorium, but each location within a sensorium must obey the same rules: sum(weight*values)/(sum(weights)+decay)

For example, suppose you choose the weights for one of the 2-D pheromones to be 8 for the current location C and (1,1,2,1,2,2,2,1) for the values in the (N, NE, E, SE, S, SW, W, NW) neighboring cells, and a decay constant of 2.2. The pheromone concentration at every location would be calculated according to (8C+N+NE+2E+SE+2S+2SW+2W+NW)/22.2.
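
The diffusion rule and the worked example can be checked with a short sketch. The `diffuse` function and `OFFSETS` table are hypothetical names, and treating cells beyond the edge of the bitmap as holding zero is an assumption, since the rule does not say what happens at the border.

```python
# Neighbor offsets (dx, dy) in the order N, NE, E, SE, S, SW, W, NW,
# with y increasing downward (an assumption).
OFFSETS = [(0, -1), (1, -1), (1, 0), (1, 1), (0, 1), (-1, 1), (-1, 0), (-1, -1)]


def diffuse(grid, center_weight, neighbor_weights, decay):
    """Apply one diffusion time-step to a 2-D pheromone grid.

    Each cell becomes sum(weight * value) / (sum(weights) + decay).
    Cells beyond the edge are treated as holding 0 (assumed behavior).
    """
    if decay <= 0:
        raise ValueError("decay must be positive")
    height, width = len(grid), len(grid[0])
    denominator = center_weight + sum(neighbor_weights) + decay
    new_grid = []
    for y in range(height):
        row = []
        for x in range(width):
            total = center_weight * grid[y][x]
            for (dx, dy), weight in zip(OFFSETS, neighbor_weights):
                nx, ny = x + dx, y + dy
                if 0 <= nx < width and 0 <= ny < height:
                    total += weight * grid[ny][nx]
            row.append(total / denominator)
        new_grid.append(row)
    return new_grid


# The worked example: weight 8 for C, (1,1,2,1,2,2,2,1) for the neighbors,
# decay 2.2, so the denominator is 8 + 12 + 2.2 = 22.2.
grid = [[0.0, 0.0, 0.0], [0.0, 100.0, 0.0], [0.0, 0.0, 0.0]]
after = diffuse(grid, 8, (1, 1, 2, 1, 2, 2, 2, 1), 2.2)
print(after[1][1])  # the center cell: 8 * 100 / 22.2
print(after[0][1])  # the cell to the north, which sees the center as its S neighbor
```

Because the decay term sits in the denominator, the total amount of pheromone on the grid shrinks every step unless some ant tops it up.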

The astute reader will ask: How do the ants map the bitmap to the text file? I mean, how do they even know which way is up?? Yeah, interesting, huh? Good question. Anybody?

Then somebody else will ask: Hang on, what about using dictionaries and syntax and stuff? Don’t the ants get dictionaries? Hmmm. Well, no. I’m imagining you’ll be designing/discovering/training these ants using some method of supervised learning, using a corpus of scans and text files. If they can master those scans, then they should be able to pick up a few rudiments of English along the way, yes? Sortof implicitly, yes? Anything else?

Somebody scribbling on the back of an envelope will ask: Now see here, the ants can only move in 45-degree directions on the grid. How the heck are they supposed to recognize the text if it’s aligned at, say, 72 degrees? Pheromones? Memory? Be creative.

Oh, yes: I don’t think there should be just one kind of ant. I suspect it might be useful to break the task down into pieces, and have different castes of ant work on it.

OK. Well, there you have it. See you all Monday after break, eh?

The formal acceptance test will be based on a sample of several hundred scanned pages (selected from the English pages in the Project Gutenberg archives) of varying quality, resolution, bit-depth, alignment, clarity and content. The ants’ text file will in each case be compared to the human-made “correct” reading of the scan, and graded in terms of:
  • coverage: How many of the correct letters and words are present? This should be high.
  • flow: In how many places is the order inverted? This should be low.
  • deletion and insertion errors: Both should be low.
  • modesty: For all characters present, two points will be deducted for each wrong character, but only one for each “#” (don’t know) character. This should be high.
  • and parsimony (the catch I mentioned above): The number of operations made by ants should be as low as possible. You may reduce this by reducing the number of ants, or reducing their algorithmic complexity, or making them algorithmically efficient…
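
The modesty deduction from the list above is simple enough to sketch, with one big caveat. The hypothetical `modesty_penalty` function below assumes the ants' output has already been aligned character by character with the correct reading; a real acceptance test would need an alignment step first, and this draft does not yet say how that would work.

```python
def modesty_penalty(output, correct):
    """Points deducted under the 'modesty' rule for pre-aligned strings.

    Assumes `output` and `correct` are already aligned character by
    character. A "#" (don't know) costs 1 point; any other wrong
    character costs 2.
    """
    if len(output) != len(correct):
        raise ValueError("strings must be aligned to the same length")
    penalty = 0
    for got, want in zip(output, correct):
        if got == want:
            continue
        penalty += 1 if got == "#" else 2
    return penalty


# One honest "don't know" (1 point) and one confident mistake (2 points).
print(modesty_penalty("th# qvick fox", "the quick fox"))
```

The asymmetry is the point: an ant that admits ignorance loses less than one that guesses wrong.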

And some others. This is still a draft. Think about it. I’ll clean up the acceptance test in the next few days, with more detail about how the tests will be written and how the ants will have to be submitted. You tell me more about what the acceptance test should be to be useful….

May you live in ironic times

In the local station identification break this afternoon in the NPR program Day To Day — just before their interview with an ACLU spokesman about the “Surveillance Industrial Complex” and the neo-Orwellian nightmare of civilian Homeland Security — we were treated to a commercial from the Michigan State University Certificate Program in… Homeland Security.

2004-08-09

Did you mean “Google Frolin”?

Background: This came up at dinner with Barbara and Mark last night. Both Barbara and I have experienced it; Mark must not look up the same weird shit we do on Google.

So I’m confused.

But Mark is a Highly Respected and Justly Famous Professor, and we mere townies. Thus on the basis of missing street cred I must call upon you, my loyal readership of sixty-some souls, to rally and gather me mountains of evidence that confirm or refute this alleged behavior. So we can make our case in the Court of Public Opinion.

We must prove the existence of GoogleFools!

Well, hang on… don’t leave… it might be fun, at least.

Viz: Sometimes when you search for something at Google, it will suggest an alternative spelling (even if it’s a phrase). This is called, surprisingly enough, the Spell Checker. For example, when I type in a random series of pronounceable letters, say frenta, Google responds with some hits (well I’ll be damned; I guess it really is a word of some sort), but Spell Check also puts this up at the top of the results page: “Did you mean: fernao”

If you click the link in the word “fernao,” you re-search Google and it shows you pages containing “fernao.”

All clear? This is nice, and useful, and a pleasure if you’re a fumblefingered fool like I am. (Though one has to wonder how far off-key I would have to get to swap a letter all the way across the keyboard, and then rearrange four of the remaining five to transform “fernao” into “frenta”….)

Anyway, here is the controversy: Both Barbara and I have encountered situations where the suggested link Spell Checker proffers results in zero hits. That is, Google recommends an alternative spelling which is not in Google. Weird, huh?

Anyway, Mark has never observed this. And goddammit I can’t at the moment make it happen. To date it’s just a now-and-then kind of noteworthy anomaly thing, since I was basically trying to do something with Google when it cropped up, and inevitably got sidetracked with the actual useful stuff.

But now my utility function has changed. I need help. Bring me a GoogleFool.

I will Offer A Reward for the first example of a GoogleFool. Said reward being… ummm, well, lessee… how about a nice Victorian photograph of somebody you don’t know?

Note that once a GoogleFool appears and is logged, like a good Googlewhack it will disappear into the inevitable darkness where ephemera go to die….

Blue Blaze Irregulars, get crackin!

2004-07-30

30 July 2004: Naked Solicitation

Things are moving ahead here on several fronts — including the same diverse points of science, engineering, pure-D ranting and raving, and the arts. Not least among these is the construction of new infrastructure to streamline our sales process for antiquarian and collectible books, photographs, ephemera, zines and the like. For a few days it will remain quiet here, then all hell will break loose (one hopes).

Meanwhile, do take a moment and have a look-see at the items on offer in my eBay Storefront, including numerous cabinet photographs and Cartes de Visite from Michigan, Ohio, Pennsylvania and parts west; rare railroadiana and train technical manuals; some First Day Covers for the philatelist you know and love; antique postcards suitable for reproduction as online novelties; some board games.

Tell you what I’m gonna do: 20% discount on all items you purchase if you mention that you found the items via my blog on or before August 8 2004.

Pretty please.

2004-07-22

For those arriving in search of Erdos Number auction info, and material on the proposed online science collaboration community

Because of the inflow of visitors brought by articles in Science News and elsewhere, this explanatory entry has been “pinned” to the top of the blog temporarily (thus the very funky date, which is a side-effect of the kludge I used).

More recent articles appear below, labeled with their actual dates of publication.

I’m in the middle of a draft that tells the story of the auction, the press coverage, and the reaction of readers and visitors and rumor-hearers and correspondents… but it’s still the middle.

In the meantime, you might not have seen these articles in the press on it:

See more ...

Somebody who once wrote to me has a very serious problem today

…because so far they have sent me 31978 copies of the PE_ZAFI.B virus since about 7am. It can’t hurt me (Macintosh, you know), but apparently it can hurt my host’s mailboxen, and so I’ll be offline until tech support manages to address it in the morning.

The serious problem is not the viral infection my onetime correspondent has. It’s my current attitude towards them. Oh, and also the company that allows their OS to be exploited this way.

I am sending them pain. Thoughts of pain. You there, in Redmond: You… are… experiencing… pain.

2004-07-20

eBay redefines “nonfiction”, greatly simplifying ability to hide books where buyers can’t find them

In a radical move this morning, eBay’s technical support staff have restructured the entire database in order to shift books by authors in the antiquated and misleading “fiction” grouping into the easier-to-search nonfiction category. Now prospective buyers can comfortably scan lists of works by such famed truth-telling authors as L. Frank Baum, Andre Norton, John Grisham and That God Fellow without concerning themselves that the books they’re seeing contain lies or otherwise questionable statements.

Kudos, eBay Database Modeling Team! Always remember: To be distinctive is to invite criticism. Thanks for helping booksellers everywhere avoid criticism.

2004-07-18

Update to my post on Ratchet Auctions

I added pictures, and fixed some words and stuff. Comments welcomed.

How many point mutants do you have?, or Oh please Harlag Elison, don’t sue me!

Back when I was a molecular biologist, we spent a lot of time in the lab making point-mutant libraries of proteins and DNA. So say a DNA sequence reads ACGTCGA. A point mutant of that sequence is any that differs in exactly one position. So in this case, {CCGTCGA, GCGTCGA, TCGTCGA, AAGTCGA, AGGTCGA, ATGTCGA, ACATCGA,…}. A point-mutant library is the full set of all the alternative sequences. If you want to include single-position deletions, then it’s a point-mutant/point-deletion library.
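
For the computationally inclined, the library is easy to enumerate. This is a minimal sketch with hypothetical function names (`point_mutants`, `point_deletions`); the `alphabet` parameter defaults to the four DNA bases but could be any set of symbols.

```python
def point_mutants(sequence, alphabet="ACGT"):
    """Return every sequence differing from `sequence` at exactly one position."""
    mutants = []
    for i, original in enumerate(sequence):
        for letter in alphabet:
            if letter != original:
                mutants.append(sequence[:i] + letter + sequence[i + 1:])
    return mutants


def point_deletions(sequence):
    """Return the distinct sequences made by deleting exactly one position."""
    return sorted({sequence[:i] + sequence[i + 1:] for i in range(len(sequence))})


library = point_mutants("ACGTCGA")
print(len(library))   # 7 positions x 3 alternative bases each
print(library[:7])    # the first few mutants, in the order listed above
```

For a sequence of length n over an alphabet of k symbols, the point-mutant library has n × (k − 1) members, so it grows linearly with length.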

Google is doing something like this when they try to second-guess what you really meant to type. They look at all the “close” variants of what you actually typed, and check to see if any made “more sense.” A recent thread in the eBay forums complaining about how their ISBN database fails to catch simple mistakes led me to check up on a few things.

This affords the opportunity to play. For instance:

Google search for “Harlam Ellison”

Check it out. There are a whole bunch of sites using that spelling. Harlan Ellison is mutating (at least in Spanish) into Harlam Ellison. There are too many of these for it to be a simple typo; rather, the number suggests that a new alternative spelling is arising. Maybe one of them was a typo, and the rest were copies of it. Or maybe a book was printed that way?

And who else might he be? How many other one-character variants of Harlan’s name exist? Of those, how many are clearly simple “original” typos, vs. duplicates of earlier variants (replicates, for lack of a better term)? What proportion of the point mutant library is covered on Google for Harlan? “Harlan Elison” seems pretty popular, for example.

Who has the most?

How many do you have?

How does the number scale with the number of Google hits a name has?

Thoughts on CollectorsBookMarket.com

The venerable AB Bookman magazine died in 1999, but has since been resurrected as something of a specialist alternative to eBay for professional booksellers. It’s in beta, meaning (a) there are some technical issues that need to be worked out, (b) users and aficionados are suggesting design changes, and (c) it’s rough. I admire their perseverance, and entrepreneurial spirit. I wish them the best of luck.

Given that, I don’t think I like it.

Fundamentally, I have to say that I can’t vouch for the principle of the thing. Yes, eBay has indeed alienated professional booksellers and transformed what was the most useful venue for sales into an unusable money pit. But the response that CBM seems to offer (we’ll go and make our own, then) strikes me as rather sentimental and foolish. This is a situation in which an underdog is trying to compete with the proverbial 800-pound gorilla, in what might as well be a mature market. People know how to use eBay, they think of eBay first when they go to buy something, and they like eBay. Like it or not, most of them will happily stay at eBay.

Making another one, a little baby eBay Just For Us which works just like it (or rather like it used to work), isn’t a way to get those customers — repeat after me: Customers Have Money, And You Don’t; Customers Have Money, And You Don’t — to switch over and buy books. All this is is a way to assuage the hurt feelings of booksellers, and that without weirding them out with lots of newfangled innovative changes.

Well, sorry hurt booksellers (and by extension CollectorsBookMarket.com) — there need to be newfangled innovative changes. Time to get unstuck, get different, get smart. Not copycat-smart, but really smart.

Inertia kills.

First off, the power of pure first-year marketing will kill you splat before you know what hit you. Q: What pain are you trying to address with your offering? A: the pain of alienated booksellers who want categories on eBay. Q: How much money do the people feeling that pain have? A: Very little, since they are having trouble selling books on the biggest auction site around. Q: How will you change their experience so that their pain is ameliorated? A: Well, we won’t change it at all; we’ll just make it like eBay used to be. But with books. Q: How will you make a profit from people with little money; that is, how will you get them more money that they will give to you? A: Ummm… we’ll make it just like eBay used to be, but with books?

No no no no no. Dammit, no.

What differentiates CBM.com from eBay? It’s just for books. It has different enigmatic little icons. It uses categories (just like eBay used to do). It’s small, and nobody goes there. Except us sellers.

It can’t be like eBay. That’s boring, boring is bad. Boring is deadly.

Come over here. No, I mean here. I tell you secret. You know Real Problem with selling books at eBay is? Really and truly? Listen — I tell you Free Professional Consulting Secret, is just for you, my friend, because I know you honest person: eBay sucks for selling books because customers never look at the auctions.

What, you think I cheat you? Bah. Tozier not cheat. Listen, I say again so maybe you see what I mean, because you cannot see my finger in air or my eyebrows waggle suggestively: The number of interested prospective customers who visit an auction listing before it closes is too low. Raise that chance — the chance that somebody will come along (even one person) who is willing to buy the item — and profits will increase. Forget making it “like it used to be,” forget categories and browsing and making the experience just like eBay was in its halcyon days of glory (which were a pretty tarnished golden age at their peak, if you ask me). For god’s sake, do something to get more people to look at the books.

Otherwise you go squish. Quick.
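For the arithmetically inclined, here is a rough back-of-the-envelope sketch of that claim. It assumes each visitor independently has the same small chance of being willing to buy, which is a simplification of mine, and the traffic numbers and the function name `chance_of_buyer` are made up for illustration, not measured from any site.

```python
def chance_of_buyer(visitors: int, p_willing: float) -> float:
    """Chance that at least one of `visitors` independent visitors will buy."""
    return 1.0 - (1.0 - p_willing) ** visitors


# Hypothetical numbers, chosen only to show the shape of the curve.
VISITS_PER_DAY = 2   # assumed traffic to a single listing
P_WILLING = 0.01     # assumed chance that any one visitor wants the book

for days in (7, 30, 90):
    visitors = VISITS_PER_DAY * days
    chance = chance_of_buyer(visitors, P_WILLING)
    print(f"{days:>3} days, {visitors:>3} visitors: {chance:.1%} chance of a buyer")
```

Under those assumptions, the only levers are more visitors per day or more days on the shelf. Categories and icons do not appear anywhere in the formula.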

Am I a Mean Person? Well, yes, much of the time I am. But not entirely so, so here are a few positive suggestions and criticisms:

  1. eBay’s Yankee auctions are the wrong, wrong wrong wrongest possible way to sell books in a small startup-site with poor attention and scant listings. Try a ratchet auction or even a drunkard’s auction instead — something that doesn’t end before anybody has looked at the listing. These formats (yes, I invented them; use them dammit) are open-ended sales formats that still find market prices automatically. That is, they’re like eBay auctions in that they get as high as possible a price given the market, but unlike eBay’s sales they don’t cut you off before anybody sees the damned item. At the very least, make the auctions last 30 days, or even longer.

  2. Don’t let people try to make up for lost time by inflating their prices. Booksellers have seen hard times, and they’ve been gritting their teeth for years as the bottom drops out of the market and prices on eBay drop below what they can ask. Well, don’t let them list everything on CBM.com for $40 or more. Christ, I looked at what’s listed there today and I saw practically nothing affordable. There have to be loss leaders. If you don’t know what a loss leader is… well, you shouldn’t be in retail. And again the whole market price formation thing (the basic economics I mentioned above) is vital here. Get sellers to price items to sell now, when there are no buyers. Let them ramp prices up later on when they want to, when there is a real market, when somebody might actually spend some money and keep your heads above the waterline.

    Maybe consider ratchet auctions for that, too. Because like Dutch auctions they find fair prices without all that fiddling around with starting numbers and reserves.

  3. Either go graphic, or lose the little dollar signs and first-edition doo-dads and dingbats that look like somebody concocted them pixel-by-pixel using a crappy Win 95 box. It’s books — you’re selling things to people who can arguably read words. Eschew all page junk like the little icons and the colored outlines and the extra font weights and stuff. Make a nice clean column that says “first edition?” and fill it with “yes/no” if you want to. Make a checkbox that says, “Show only dollar auctions” and let me click it.
  4. Above all else, make it different from eBay. Do something different and noteworthy: let visitors add comments to listings; let sellers create cross-links between their sales; allow linking to off-site sales pages (even eBay); include full-text comparison; allow visitors (no, not just paid subscribers) to use the sold items database as a price guide; make sure every auction listing is given random time as “featured”, not just the expensive ones; freely mingle the store listings with the auction listings, without differentiating; make sure Froogle sees you; make sure those of us who use Macintoshes can use all the functionality that eBay denies us. Something.
  5. Say why you’re better. Trumpet it on the front page, in your advertising copy, in each and every damned listing if you have to. Make sure you can complete the sentence, “You want to buy and sell at CBM.com because_____” without pausing or looking back. Everybody must know.

    So far, I don’t.
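One of the suggestions in item 4, giving every listing random time as “featured”, is easy to sketch. This is only an illustration under my own assumptions: the function name `featured_schedule`, the slot count, and the listing IDs are all hypothetical, and a real site would also have to cope with listings that open and close mid-cycle.

```python
import random


def featured_schedule(listing_ids, slots_per_period, seed=None):
    """Shuffle all listings, then deal them into 'featured' periods.

    Every listing appears exactly once per cycle, regardless of its price.
    """
    rng = random.Random(seed)
    order = list(listing_ids)
    rng.shuffle(order)  # random order, so no listing is favored
    return [
        order[i:i + slots_per_period]
        for i in range(0, len(order), slots_per_period)
    ]


# Hypothetical example: ten listings, three featured slots per period.
for period, featured in enumerate(featured_schedule(range(10), 3, seed=1)):
    print(f"period {period}: featured listings {featured}")
```

Shuffling once and dealing the listings out, rather than drawing at random each period, is a design choice that guarantees nobody gets skipped in a cycle.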

There’s more. I’m tired. As my art teacher used to say, “It cuts the eyes.” This is not a minor thing in online marketing, in representing dense data sources like your site in a way that makes people not get reader’s cramp. That’s something that would take a whole lot of work — will take a whole lot of work. But you’re a beta. So that’s excusable.

Work on the business model, the product design, the real marketing plan now. Get it right, or don’t bother.

Be different. You can’t be bigger. You can’t even be better, right away — at least in the minds of the people who buy books. So get their attention some other noteworthy, amazing, surprising way.

Virginia Postrel’s great article on Operations Research

Back on June 27, Virginia Postrel wrote a very nice piece on the history and practice of Operations Research, “the most influential academic discipline you’ve never heard of.”

It does a good job explaining what I do now professionally. And nailed why I’m interested in going back to grad school. And for that matter, with this bit:

“Some time in the 1970s or 1980s, O.R. was in a sense hijacked by mathematicians who insisted on imposing their view of rigorous mathematics onto the field. This placed much less emphasis on modeling and empirical work,” says Richard C. Larson, a professor of civil and environmental engineering and engineering systems at MIT and for 15 years the codirector of the institute’s Operations Research Center, which recently celebrated its 50th anniversary. “In some OR journals today, the only empirical data are, ‘Date of submission’ and ‘date of acceptance.’”
why I’m increasingly worried about doing so.

[And it’s a good response to a U-M statistics professor’s sotto voce question, “Why would anybody want to be in OR?”]