Rupert Goodwins

Mixed Signals

Any sufficiently advanced information is indistinguishable from noise

Tuesday 29 April 2008, 12:19 PM

Knuth: multicore engineers 'out of ideas'

Posted by Rupert Goodwins

Donald Knuth (yes, that Donald Knuth) has given a long and interesting interview with the Informit website -- which I hadn't come across before: it's a joint venture between various tech book publishers.

As you'd expect from a septuagenarian who's defined a lot of modern computing concepts, the interview covers acres of ground. From open source - "I believe that open-source programs will begin to be completely dominant" - to how he writes - "My general working style is to write everything first with pencil and paper, sitting beside a big wastebasket. Then I use Emacs to enter the text into my machine" - and his choice of platform - "I currently use Ubuntu Linux, on a standalone laptop -- it has no Internet connection. I occasionally carry flash memory drives between this machine and the Macs that I use for network surfing and graphics; but I trust my family jewels only to Linux".

There's meatier stuff too, especially on his favourite field, literate programming. This is a way of writing programs based on the idea that they should be readable by humans first, and contain a decent explanation of what it is you want the computer to do: Knuth says that this may be the only way to approach very complex tasks, and I think that's a fascinating approach. However, I fear he's been stymied by the cliché that once you teach computers to understand English, you'll find programmers can't speak it.

Of most relevance today, though, are his thoughts on multicore. "To me, it looks more or less like the hardware designers have run out of ideas, and that they’re trying to pass the blame for the future demise of Moore’s Law to the software writers by giving us machines that work faster only on a few key benchmarks! I won’t be surprised at all if the whole multithreading idea turns out to be a flop, worse than the "Titanium" [I think he means Itanic. Ed.] approach that was supposed to be so terrific -- until it turned out that the wished-for compilers were basically impossible to write."

There's more in that vein.

The trouble is, I think he's right. Ever since Intel introduced hyperthreading in 2002, we've been waiting to see the case for generic multiprocessing on the desktop. That there are plenty of special cases, nobody doubts: multiple processors have been a feature of computing for many decades. But in general? They do nothing, and I say that having sat through more IDF demos of hyperthreading, dual, quad, eight and manycore chippery than is allowed under the Geneva Convention. As Knuth says - the problem is palmed off onto the compilers: compilers nobody can write, because the job is fundamentally impossible.

Programming is at heart mathematics, even though most of it is the most inelegant, ugly and innumerate applied maths ever seen on the planet. I have never seen a mathematical treatment that shows that parallelism is theoretically applicable to linear tasks: I suspect no such thing exists. But if the reverse were true - if a computational mathematician could define the class of problems for which parallelism was useful - it would save us all a lot of time down on Bernard Matthews's Wild Goose Farm And Wrong Tree Barkery.

Wouldn't half shake up Intel's marketing message too.
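
The post does not name it, but the usual back-of-envelope bound on what extra cores can buy is Amdahl's law: if a fraction p of a task's running time can be spread over n cores and the rest is strictly linear, the best possible speedup is 1 / ((1 - p) + p / n). A minimal Python sketch follows; the function name and the sample fractions are assumptions chosen purely for illustration, not figures from the interview or the post.

```python
def amdahl_speedup(parallel_fraction: float, cores: int) -> float:
    """Upper bound on speedup under Amdahl's law.

    parallel_fraction is the share of running time that can be split
    across cores; the remainder is treated as strictly linear.
    """
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if cores < 1:
        raise ValueError("cores must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


if __name__ == "__main__":
    # Illustrative fractions only: fully linear, half parallel, mostly parallel.
    for fraction in (0.0, 0.5, 0.9):
        for cores in (2, 4, 8):
            bound = amdahl_speedup(fraction, cores)
            print(f"p={fraction:.1f} cores={cores} bound={bound:.2f}x")
```

Under this bound a purely linear task (p = 0) gains nothing from any number of cores, and a task that is half linear can never run more than twice as fast, however many cores are thrown at it.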


Comments on this post

1000185600

Am I missing the point here?

I appreciate that a linear task cannot fundamentally benefit from parallelism, but how many of our daily tasks are purely linear? We constantly run pseudo-parallel O/S tasks whether we know it or not; providing the contention for shared resources can be mitigated, multicore processors could provide a massive benefit by genuinely isolating and parallelising these tasks. The same paradigm can apply across any number of desktop applications: can you name an application where every user input must fully complete before any other process can start?

Surely the problem does not lie with compilers, the problem lies with application design. As software and enterprise architecture moves towards service orientation good design should enforce encapsulation of processes and therefore enable parallelism. I’ve never written a compiler, but it feels as though we need two stages: a classic compilation of linear services, and a ‘softer’ compilation (for want of a much better word) that assists with the challenges of parallel design – aggregation, contention, deadlocks, race conditions etc.

Fascinated to hear your thoughts.
Simon

Posted by 1000185600 on Apr 29, 2008 2:29 PM

Rupert Goodwins

The problems lie everywhere on the stack. To some extent, we've fixed the problem at top and bottom - our applications and presentation layers are inherently parallel these days, now we've moved on from everything being a dumb terminal at the end of an RS232 line, and by the time you get to the physical layer then it's a dull old matter of moving a bunch of bits from A to B, and who cares how many processes up the line are filling the bandwidth.

But inside applications, it's such a different story. Take word processing: you might think that it's useful to be constantly indexing, spell-checking and searching in the background, working your way through the words as they're input. But even if that came for free, most of the time it doesn't matter in the slightest. It would be remarkably inefficient to launch a spelling checker on a new thread at every key press: you have to wait for a space or punctuation, and at the speed of modern processors you can then do the check on the fly, before the next keystroke. Searching? When I need to search for something, then the difference between it being pre-searched and me selecting it and waiting for the search to complete is negligible; certainly not worth all the work of searching for everything just in case.
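
The word-boundary approach described above can be sketched in a few lines of Python. Everything here is a hypothetical stand-in: the `InlineSpellChecker` class, the tiny `KNOWN_WORDS` set and the boundary rule are assumptions for illustration, not how any real word processor does it.

```python
import string

# Hypothetical stand-in for a real dictionary.
KNOWN_WORDS = {"the", "quick", "brown", "fox"}

# A word is considered finished when whitespace or punctuation arrives.
BOUNDARIES = set(string.whitespace) | set(string.punctuation)


class InlineSpellChecker:
    """Checks each word synchronously, on the typing thread, at its boundary."""

    def __init__(self, known_words):
        self.known_words = known_words
        self.current = []    # letters of the word being typed
        self.misspelt = []   # words that failed the lookup, in typing order

    def key_press(self, ch):
        if ch in BOUNDARIES:
            self._finish_word()
        else:
            self.current.append(ch)

    def _finish_word(self):
        word = "".join(self.current).lower()
        self.current.clear()
        # One set lookup per completed word; no extra thread is involved.
        if word and word not in self.known_words:
            self.misspelt.append(word)


if __name__ == "__main__":
    checker = InlineSpellChecker(KNOWN_WORDS)
    for ch in "The quikc brown fox. ":
        checker.key_press(ch)
    print(checker.misspelt)
```

Because the boundary set includes the apostrophe, a word such as "don't" would be split in two; a real checker would need a smarter tokeniser, which is beside the point being made here.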

Sure, there are document management tasks that really do benefit from parallelism (in particular, I'd imagine that large-scale OCR would benefit enormously, if you could flash the image in), but then we're back in the large-scale-data-munging world where these things are known good.

And yes, _all_ the parallelising compilers I know about work by identifying the linear components and parallelising the rest. Only they don't do it very well - and our experience as human beings doing the same job is that it is incredibly hard to do well, even by experts, and it is absolutely an iterative process. One small tweak can make performance changes of orders of magnitude.

(Intel has made great claims for its Ct compiler, but I've been unable to follow the couple of papers I've downloaded on it. That's going to be a three-pipe problem.)

I haven't written a compiler either, although I have worked in many associated areas (you do when you're an assembler programmer weaving your way through a world of compiler output, trying to work out what the darn things are doing and why).

I do wonder whether a radically different approach may help - something like evolutionary engineering, where an automated modifying process tries many different approaches to re-engineering linear code, runs it speculatively and then picks the best result, iterating as it goes.

Rather temptingly, that in itself would be a very efficient use of multithreading (even as I type this, a whole set of intrinsically interesting caching and resource management issues presents itself to me from some background task at the back of my brain) - and could provide some unexpected new insights into the issues.
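
As a rough illustration of the selection half of that idea, the Python sketch below runs several candidate versions of one routine speculatively, discards any whose answer disagrees with the first, and keeps the quickest. It is an assumption-laden toy: the three `sum_squares_*` functions are hand-written stand-ins for machine-generated rewrites, `pick_best` is a hypothetical helper, and only one round of selection is shown, with no automated modification step.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def sum_squares_loop(n):
    total = 0
    for i in range(n):
        total += i * i
    return total


def sum_squares_builtin(n):
    return sum(i * i for i in range(n))


def sum_squares_formula(n):
    # Closed form for 0^2 + 1^2 + ... + (n - 1)^2.
    return (n - 1) * n * (2 * n - 1) // 6


def timed(variant, arg):
    """Run one candidate and return (elapsed seconds, result)."""
    start = time.perf_counter()
    result = variant(arg)
    return time.perf_counter() - start, result


def pick_best(variants, arg):
    """Run every candidate speculatively and return (winner name, result).

    Threads only show the shape of the idea: CPU-bound work in CPython
    would need separate processes to run truly in parallel.
    """
    with ThreadPoolExecutor(max_workers=len(variants)) as pool:
        futures = {v.__name__: pool.submit(timed, v, arg) for v in variants}
        outcomes = {name: f.result() for name, f in futures.items()}
    # The first variant is treated as the trusted reference implementation.
    reference = outcomes[variants[0].__name__][1]
    valid = {
        name: elapsed
        for name, (elapsed, result) in outcomes.items()
        if result == reference
    }
    return min(valid, key=valid.get), reference


if __name__ == "__main__":
    candidates = [sum_squares_loop, sum_squares_builtin, sum_squares_formula]
    winner, value = pick_best(candidates, 100_000)
    print(winner, value)
```

Which candidate wins depends on the machine and the moment, which is rather the point: the selection has to be measured rather than predicted.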

Posted by Rupert Goodwins on Apr 29, 2008 3:01 PM

1000185600

I tell you now, if you can pick apart the workings of Intel's Ct compiler in 50 minutes someone should erect a statue in your honour!

I appreciate your point about modern applications not making use of parallelism, but is this not a case of dismissing future development based on current capabilities? Taking the word processing example, sure we have responsive and immediate spell checking now* but what about developments around semantics and contextual searching? If each word can potentially change the semantics of the document, and hence create a number of context changes (additions, removals, modifications) we’re gonna need more power, and preferably uncontested power. Once you have a semantic representation this can be levered; a next step is to perform contextual searches to ensure pertinent information from disparate sources is available on demand – another case for powerful and parallel processes.

I’m meandering into conjecture here, but my point is that hardware development is forward looking and application development will be too.

I like the idea of applying evolutionary engineering into the heart of software development, particularly if the process could suggest areas where parallelism is nearly possible as well as identifying places where it already is: a great feedback loop into the design process which could enhance our natural creativity. But what do you think about moving it further up the design process? Domain specific languages have never really fully delivered on their promise, but how about ‘compiling’ designs to identify areas of atomicity and reuse before the code itself is created and compiled? If it’s difficult to identify parallel components once the code is cast, surely it makes sense to guide the development process from inception to compilation.


* Although I disagree that a spell checker would have to be created on a new thread every keystroke – it should be a single instance that monitors the text as it is created and reports any of its rule breaches.
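
A minimal Python sketch of that single-instance design, assuming one long-lived worker thread fed through a queue. The `BackgroundSpellChecker` class, its method names and the word list are hypothetical; the "rule" is reduced to a plain dictionary lookup.

```python
import queue
import threading


class BackgroundSpellChecker:
    """One long-lived worker that checks words as the editor submits them."""

    def __init__(self, known_words):
        self.known_words = known_words
        self.words = queue.Queue()
        self.breaches = []
        # Created once, not once per keystroke.
        self.worker = threading.Thread(target=self._run, daemon=True)
        self.worker.start()

    def submit(self, word):
        """Called by the editor whenever a word is completed."""
        self.words.put(word)

    def close(self):
        """Stop the worker and return every word that broke the rule."""
        self.words.put(None)  # sentinel telling the worker to finish
        self.worker.join()
        return self.breaches

    def _run(self):
        while True:
            word = self.words.get()
            if word is None:
                break
            if word.lower() not in self.known_words:
                self.breaches.append(word)


if __name__ == "__main__":
    checker = BackgroundSpellChecker({"the", "cat", "sat"})
    for word in "The cat szt".split():
        checker.submit(word)
    print(checker.close())
```

The queue keeps the typing thread from ever waiting on the check, at the cost of reports arriving slightly after the word is typed.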

Posted by 1000185600 on Apr 30, 2008 2:39 PM

Rupert Goodwins

I see the problem of parallel computing as a chasm. On this side, we're stuck with lots of old ideas which don't benefit much from parallelism: on the other, there's a whole new world of extracting information from data by understanding context, trying multiple speculative operations at once, pattern recognition, beyond-real-time modelling, and other things that probably qualify as true AI. We can see the other side from here, and we can even begin to map out the hows and wherefores, but we can't actually make the leap.

By making the hardware better and the software more creative, we can narrow the chasm from this side. By thinking better and coming up with really new ideas and theories -- which could well include pushing parallel thinking up the design process so that the machinery provides parallelisation hints during the initial stages -- we can narrow it from the other. So far, though, there's no clear point on either side from which to start building the bridge which will get us over, and the chasm remains too wide to jump. The question is, can the basic engines of semiconductor development keep going on this side long enough to get us across, one way or the other?

The trouble with DSLs is that they're never really smart enough to help.

Updated by Rupert Goodwins on May 1, 2008 8:20 AM

Simon W

I agree with the analogy. I'm also pretty sure we'll get there; I can't think of too many occasions where progress has just 'stopped' due to the challenges!

Thanks for your insight, have enjoyed the discussion.

Updated by Simon W on May 22, 2008 10:37 AM
