Rupert Goodwins

Mixed Signals

Any sufficiently advanced information is indistinguishable from noise

Monday 17 December 2007, 2:05 PM

Computing's un-natural language

Posted by Rupert Goodwins

An enduring theme in computer programming is the uncertain role of natural language. On one side, some claim it's bad to have to learn the intricacies of any particular computer language before you can make machines do what you want. Isn't it the computer's job to understand us? Others argue that without the rigour of having to express your intentions formally, you'll never think about those intentions closely enough to be sure you know what they are.

I can see both sides, but I lean to the latter. There's a lot of truth in the saying that all a computer programmed in plain English will show is that programmers can't speak English. Given the level of logical analysis shown in most online forums, that's just the start of it.

Computing just doesn't map well onto human language. Take a recent posting on Language Log, the world's most consistently interesting linguistic digest for the intrigued layman. In it, founder Mark Liberman mused on the peculiar business that software can live in or on a system, with no particular reason for either. Does Java run in or on a DVD player, for example? I live and work in London, but would never say on (although I do work on, not in, GMT) - and people run in marathons but on the road. He never mentioned that sometimes, software can run under, with, even outside things too.

I suspect that the choice of preposition depends on the background of the speaker - if you've got a good mental image of how software is structured in layers, or how it actually works in practice, then the choice becomes easier. A program is stored on a disk, but runs in RAM - rarely vice-versa - despite both disk and RAM being forms of memory. It'll be interesting to see what happens when universal memory turns up that does both jobs equally well.

But things may be changing. I asked Google how many times certain phrases occurred:

"Running on DOS": 879
"Running in DOS": 9530
"Running under DOS": 25600
"Running with DOS": 5290

"Running on Windows": 472000
"Running in Windows": 289000
"Running under Windows": 263000
"Running with Windows": 27300

"Running on Vista": 82200
"Running in Vista": 11500
"Running under Vista": 20300
"Running with Vista": 33100

"Running on Linux": 329000
"Running in Linux": 122000
"Running under Linux": 77100
"Running with Linux": 916

So, 'on' has become the firm favourite after a very slow start, 'under' is far more popular with Vista than Linux (which reflects an interesting point about the philosophies of the two platforms), yet 'with' seems to be almost unknown with the open source OS - perhaps reflecting the fact that 'with' doesn't reflect any architectural aspect of the relationship between software and its host OS and thus is distrusted by the more technical.
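Raw hit counts are hard to compare across platforms, because the totals differ so widely. As a rough sketch, the figures above can be turned into each preposition's share of a platform's total hits. The names COUNTS, shares and favourite are made up for this illustration; only the numbers come from the post.

```python
# Hit counts copied from the post (Google, December 2007).
COUNTS = {
    "DOS": {"on": 879, "in": 9530, "under": 25600, "with": 5290},
    "Windows": {"on": 472000, "in": 289000, "under": 263000, "with": 27300},
    "Vista": {"on": 82200, "in": 11500, "under": 20300, "with": 33100},
    "Linux": {"on": 329000, "in": 122000, "under": 77100, "with": 916},
}


def shares(counts):
    """Return each preposition's fraction of the total hits for one platform."""
    total = sum(counts.values())
    return {prep: hits / total for prep, hits in counts.items()}


def favourite(counts):
    """Return the preposition with the most hits for one platform."""
    return max(counts, key=counts.get)


if __name__ == "__main__":
    # Print one row per platform, with each preposition's share as a percentage.
    for platform, counts in COUNTS.items():
        row = "  ".join(
            f"{prep}: {share:6.1%}" for prep, share in shares(counts).items()
        )
        print(f"{platform:8} {row}")
```

On these numbers, 'under' is the favourite for DOS and 'on' for the other three, while 'with' takes well under 1% of the Linux hits but more than a fifth of the Vista ones.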

Lots more work to be done!


Comments on this post

kwaite

Although the phrase "Running despite Windows" gets only 2 hits in Google, I think it is a closer approximation to the relationship between program and OS. (Half a smiley)

Posted by kwaite on Dec 17, 2007 7:05 PM

Rupert Goodwins

I didn't check for "Running? With Vista?" either. The option space for exploration just keeps getting bigger...

Posted by Rupert Goodwins on Dec 17, 2007 7:49 PM

1000238123

A minor issue I ran into while programming immediately after reading this post is the issue of plurality and coding. The seasoned programmer learns to love coding standards because you can predict what variable names will look like without having to carefully copy them letter-by-letter from a declaration. The issue I have is with plurality. Collections should be plural, so if I want a list of items, I declare:-

List items; //(i.e. not List item;)

So far so good, just add an s to the singular.
However, should I follow English when I want a List of leaves?:-

List leafs; //or
List leaves;

This comes up surprisingly often when I am working with the abstract datatype Tree (very common). I feel the purist programmer should probably go for leafs. It follows a repeatable pattern for every noun, which makes more sense from the logician's mindset that all computer people should have. On the other hand, it's wrong. Natural language, eh? Just what's a geek to do?

Posted by 1000238123 on Dec 20, 2007 3:38 PM

Rupert Goodwins

That reminds me of a programmer friend who used to build online systems for the Commodore 64 - one of his favourite anecdotes was "The day we invented trees". Always wanted to use that as a title for a short story. One day, I will. And it was great fun to be a programmer back then, especially when you were self-taught: when you actually discover a basic programming technique by working it out as a solution to a problem, it's much more satisfying (if much more inefficient!) than learning it from a book or lecture.

As for pluralities - yes, it's unsolvable. When you start to talk about leaves you're creating a new thing, which doesn't need a programmatic relationship to its singular, so there's no ambiguity; and adding an S to something is reasonably arbitrary anyway. So I think it's better to stick with English usage. If nothing else, it makes things a little easier for those who come after you - although the comments will always make things crystal clear. It's also a chance to show off how well you know English - one iris, two irides, one clitoris, two clitorides.

The same principle - adopt the grammatical conventions of your native language - works for other languages too, which may have more complex and diverse pluralisation rules.

Also incidentally, I was once part of a team of four who were writing a native-mode 386 networked file server. One of my jobs was the co-operative multitasking system which scheduled and dispatched the various filing, networking and support operations - the whole thing was its own OS, network and disk manager. Learned a lot doing that, including discovering a bug in the 386 that, as far as I know, had never been documented until then. Ate a lot of midnight pizza, too...

After playing around with various concepts for the semantics of the multitasker, I decided to nick one of the basic ideas behind the Apollo Guidance Computer, and define task operators and operands as verbs and nouns.

This was partly because I wanted to avoid ambiguity when discussing the operators and operands of the 386 instruction set, and partly because I thought it was a good match to the nicely constrained set of actions and data sets the multitasker would handle - there were no conditional actions or complex control structures, just things to do and things to do them with - but mostly because I always wanted to be an astronaut.

It worked very well in the end, but one of the team (a very good coder) did have some problems with it at first and resisted the idea. In pubular discussions, he finally admitted that he'd never really understood what verbs and nouns actually were - and following a pleasant evening kicking around linguistic concepts, all was fine.
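For anyone who wants the verb-and-noun idea in concrete form, here is a minimal sketch of a run-to-completion dispatcher. It is an illustration only, not the original 386 code and not the Apollo Guidance Computer's actual verb and noun codes. Every name in it (VERBS, Scheduler, post, run, READ, SEND) is hypothetical.

```python
from collections import deque


# Hypothetical verbs: each is a plain function that acts on one noun.
def read_block(noun):
    return f"read {noun}"


def send_packet(noun):
    return f"sent {noun}"


VERBS = {"READ": read_block, "SEND": send_packet}


class Scheduler:
    """Dispatch queued (verb, noun) tasks in order, one at a time."""

    def __init__(self, verbs):
        self.verbs = verbs
        self.queue = deque()
        self.log = []

    def post(self, verb, noun):
        """Queue a task; reject verbs the scheduler does not know."""
        if verb not in self.verbs:
            raise KeyError(f"unknown verb: {verb}")
        self.queue.append((verb, noun))

    def run(self):
        """Run every queued task to completion and return the results."""
        while self.queue:
            verb, noun = self.queue.popleft()
            self.log.append(self.verbs[verb](noun))
        return self.log
```

Posting ("READ", "sector 7") and then ("SEND", "reply") and calling run() handles the two tasks in that order. There are no conditionals or control structures in the task description itself, just a thing to do and a thing to do it with.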

Posted by Rupert Goodwins on Dec 20, 2007 4:08 PM
