Выбрать главу

I agree completely that just looking at the bare code may not be enough. And it’s not a characteristic, I think, of beautiful code, that you should be able to just look at the bare code and see why it’s right. You may need to be told why. But after you have that, now with that viewpoint, that invariant, that understanding of what’s going on, you can see, oh yeah, that’s right.

Seibeclass="underline" Does that put an upper bound on how big a piece of software can be and still be beautiful?

Peyton Jones: I don’t know if it’s a bound on its size. The insights that you need in order to reassure yourself that it’s right, or at least right-ish, are along the lines of being more confident that it’s correct. Any really, really big piece of software is bound to have shortcomings or indeed outright things that you just know are wrong with it. But it’s not economic to fix them at the moment. It’s certainly true of GHC and it’s definitely true of Microsoft’s software.

But what makes big software manageable is having some global invariants or big-picture statements about what it’s supposed to do and what things are supposed to be true. So, to take GHC as another example, having this invariant that each of these intermediate programs should be well typed. That can be checked, actually, at runtime if you like. That’s quite a powerful invariant about what’s going on. So I’m not sure it’s really necessarily to do with size.

Certainly interconnectedness makes big programs eventually crumble under their own weight. Sometimes one of the luxuries that you get from working in research is that you can sometimes take a chunk of code and simply rewrite it in the light of your improved insights into what you were trying to achieve and how you might try to achieve it. We talked about this business of refactoring GHC’s back end. If I was working in a more commercial setting, I might not be able to afford to do that. But I’m hoping that it will make GHC more maintainable and understandable in the long term.

Is there an upper bound on the size? I don’t know. I rather suspect that as long as we can go on building good abstractions we can keep building bridges across the Atlantic. We have software that works—not perfectly, but surprisingly well considering how big it is.

Seibeclass="underline" So the question is, can you build an edifice that’s that large, and works, and is also beautiful.

Peyton Jones: It’s hard for it to maintain its beauty. Bits are often beautiful or at least acceptably non-ugly when they’re first built. In the face of protracted life—maintenance—it’s quite difficult to maintain that. That’s the worst thing about long-lived programs… that they gradually become ugly. So there’s no moment at which they become disfigured but nevertheless after a while they just become crappy.

Seibeclass="underline" And is the only way out to say, “OK, this thing has lived long enough, start over”?

Peyton Jones: I think that eventually you have to reengineer chunks of it. And if you can afford to reengineer a bit as you go along, then—if you don’t do anything for ten years then the result may just be so daunting that you think, “I just have to throw it away and start again.” If you can afford to reengineer a bit as you go along, like the human cells regenerating themselves—that’s what I hope is happening to GHC.

The most depressing thing about life as a programmer, I think, is if you’re faced with a chunk of code that either someone else wrote or, worse still, you wrote yourself but you no longer dare to modify. That’s depressing.

Peter Norvig

Peter Norvig is a broad thinker and a hacker at heart. He once wrote a program to find in Google’s search logs series of three consecutive searches by the same user that, when put together, made a haiku (one of the most memorable: “java ECC / java elliptical curve / playboy faq”).

On his web site Norvig has links to the usual stuff: books and papers he’s written, slides from talks he’s given, and various bits of his code. But there are also links to items he’s had published in McSweeney’s Quarterly Concern, his witty recounting of writing a program to generate the world’s longest palindromic sentence, and his “Gettysburg Powerpoint Presentation,” a send-up of Microsoft’s PowerPoint software, which has been cited by Edward Tufte and which appears on the first page of results if you Google “PowerPoint.”

He is now the Director of Research at Google, after having been the Director of Search Quality. Prior to that he had been the head of the Computational Sciences Division at NASA Ames Research Center and before that, an early employee at the late-’90s Internet startup Junglee. He won the NASA Exceptional Achievement Award in 2001 and is a Fellow of the American Association for Artificial Intelligence and the Association for Computing Machinery.

Between Google, NASA, and Junglee, Norvig has experience with both the “hacker” and “engineer” approaches to building software and talks in this interview about the advantages and disadvantages of each. As a former computer science professor and now an insider at one of the biggest industrial software shops in the world, he also has an interesting perspective on the relation between academic computer science and industrial practice.

Other topics in our conversation included how programming has changed in recent years, why no design technique can make up for not knowing what you’re doing, and why NASA might be better off with less-reliable but cheaper software.

Seibeclass="underline" When did you start programming?

Norvig: In high school. We had a PDP-8, I think it was, at my high school, and there was a class I took—we started in BASIC programming and I went from there.

Seibeclass="underline" What year would that have been?

Norvig: I graduated high school in ’74, so it must have been ’72 or ’73. I remember a couple of things, going back to those early days. I remember the teacher trying to teach shuffling of a deck of cards. Her algorithm was, use a random-number generator to pick two locations and then swap them and keep a bit vector that said, these were swapped, and keep going until they’re all swapped. I remember my reaction being, “That’s stupid. That’s gotta be the stupidest thing in the world. It could take forever because there could be one pair that you never happen to choose.” I didn’t know enough then to say it’s n squared when it could have been order n. But I knew that that was just wrong. Then I was able to come up with, I think, the Knuth algorithm of swapping from 0 to 52 and then from 0 to 51 and so on—an order n algorithm. And I remember the teacher defending her approach. That helped me think, “Well, maybe I have an aptitude for this programming stuff.” It also helped me say, “Maybe teachers don’t really know everything.”

Seibeclass="underline" Was it as soon as she described it that you just said, “Wow, this is wrong”? Or did you play with it for a while and then say, “Gosh it seems like we’re doing a lot of work here”?

Norvig: I think I noticed right away. It’s hard to know what I was really thinking back then but I think right away I noticed there’s a finite possibility that this might not terminate. I’m not sure I knew as much about the expected runtime.

I also remember finding my father’s back issues of Scientific American in the attic and going through them. There was this article by Christopher Strachey on software engineering in which he said that people are going to use these higher-order languages. And he had invented this language that there was never a compiler for—it was a paper language. And he said, “I’m going to write a checkers program using this language.” I remember reading that—it was the first nontrivial program I had ever read because in school we were just learning how to shuffle and so on. I read it again recently and the first thing I noticed was that there’s a bug in it. And it was great because you figure this is Christopher Strachey, he should know what he’s doing, and it’s Scientific American, they’ve got editors and so on—they should probably get those bugs out. But in the prose it says there’s a function makemove which takes a board position and returns a move and then you look in the code and there’s make-move and it takes a board position and an extra parameter. Apparently they wrote the prose first and then they wrote the implementation. And they found out you can’t search infinitely deep so they added an extra parameter which was depth of search and you recurse down to a certain level and then you stop. They had added that in afterwards and hadn’t gone back and fixed the documentation.