Latest Article
Wordcount
By Patrick | 07.23.2004 A nifty Flash widget for visualizing word frequency.
The counts come from the British National Corpus.
Which just won't go away.
The "about" page mentions that they're going to start using web data, which sounds more interesting.
Recent Articles
Okay this is a Stretch
By Patrick | 07.06.2004 But it's kind of funny. And he does mention "grep," after all....
Any Day Now, I Will Hear You Say
By Patrick | 07.06.2004 When will we really be able to talk to our PCs? A little column in Wired, with predictions from a guy from IBM, a guy from a company called ScanSoft, and a guy from a university called CMU. Eheh....
Nontransmitting People
By Patrick | 07.05.2004 Generally I stay away from "style," "spelling," and that hideous topic people call "punctuation." I pretty much pay that stuff no heed whatsoever. So sue me. BUT. When it comes to hiding behind untrue technical excuses, my blood starts boiling. When Aly Colón tried to find out why newspapers are so notoriously untrustworthy in their treatment of accent marks, the excuses he got were noxious: "We do not use accent marks because they cause garbled copy in some newspaper computers. (We categorize them as 'nontransmitting symbols.')" Not to put too fine a point on it, but may I remind the world that the guy who invented ASCII just died? May he rest in seven-bit peace, while the rest of us learn to deal with REALITY....
Search not Sort
By Patrick | 07.03.2004 Here's an idea that seems to be coming of age: Search, don't sort. It's the age-old battle enshrined in volume 3 of Knuth... sorting or searching our data? Given that everyone seems to have gigs upon gigs of data on their drives these days (and increasingly in their cell phones, PDAs, etc.), the answer seems to be clear: sorting by hand is impossible. Wired has a story on the topic: "Searching for the Perfect OS." Interestingly enough, as the article points out, there's nothing new about this approach, which is gaining ground in services like Gmail. MIT's David Karger points out that "It's not an exciting new idea. It's something that's been needed for a long time.... I do think it's ridiculous it's taken this long." No kidding. Apple's Spotlight looks to be a step in the right direction, as does the nascent dashboard project for the Gnome desktop....
gung'f arng
By Patrick | 07.02.2004 For the cryptologically and/or pseudoalchemically inclined: The Mystery of the Voynich Manuscript. "New analysis of a famously cryptic medieval document suggests that it contains nothing but gibberish." The Voynich manuscript has a long-standing reputation as perhaps the most mysterious manuscript in the world. Written in a wacky script and filled with odd illustrations of vaguely hemplike plants, it tempts one to wonder whether the author was in fact ... Well anyway. A lot of NLP-style analysis has been applied to the document, which has a distribution of words and letters that seems unlike human language, but too complex to be random. Now, Gordon Rugg at Keele University in England claims to have come up with a possible explanation for how the thing might have been produced. [via Uncle Jazzbeau’s Gallimaufrey]...
About fieldmethods.net
Fieldmethods.net is a site about what happens to human language and computers when you put them in a blender and hit frappé.
Category archives
Administrivia
woof() Yes this Site Has Issues But It Has a Therapist Greetings earthlings
Fun
Wordcount Okay this is a Stretch
In the Media
Nontransmitting People Remember the Phraselator? Verbs are overrated anyway
l10n and i18n
Open Source and Localization
Machine Translation
Forget the Tanks, We Need Machine Translation
NLP Software News
Another Linux Live distro for NLP
On the Web
Search not Sort An Intro to the Natural Language Toolkit Another new site... Blogos, a New Site New blog on Computational Linguistics
Search
Looking for definitions on the web
Speech
Any Day Now, I Will Hear You Say
This weblog is licensed under a Creative Commons License.