Armstrong: Choice of problem, I think. Are you driven by the problems or by the solutions? I tend to favor the people who say, “I’ve got this really interesting problem.” Then you ask, “What was the most fun project you ever wrote; show me the code for this stuff. How would you solve this problem?” I’m not so hung up on what they know about language X or Y. From what I’ve seen of programmers, they’re either good at all languages or good at none. The guy who’s a good C programmer will be good at Erlang—it’s an incredibly good predictor. I have seen exceptions to that but the mental skills necessary to be good at one language seem to convert to other languages.
Seibel: Some companies are famous for using logic puzzles during interviews. Do you ask people that kind of question in interviews?
Armstrong: No. Some very good programmers are kind of slow at that kind of stuff. One of the guys who worked on Erlang, he got a PhD in math, and the only analogy I have of him, it’s like a diamond drill drilling a hole through granite. I remember he had the flu so he took the Erlang listings home. And then he came in and he wrote an atom in an Erlang program and he said, “This will put the emulator into an infinite loop.” He found the initial hash value of this atom was exactly zero and we took something mod something to get the next value which also turned out to be zero. So he reverse engineered the hash algorithm for a pathological case. He didn’t even execute the programs to see if they were going to work; he read the programs. But he didn’t do it quickly. He read them rather slowly. I don’t know how good he would have been at these quick mental things.
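The failure class described here, a probe sequence stuck at zero, can be illustrated with a small sketch. The interview does not give the emulator's actual hash algorithm ("something mod something"), so the multiplicative rehash step, the multiplier, and the table size below are all hypothetical, chosen only to show how a hash value of zero can be a fixed point.

```python
def next_probe(h, table_size, multiplier=31):
    """Hypothetical rehash step: derive the next slot from the current value.

    Any step of the form (h * k) % m maps 0 back to 0.
    """
    return (h * multiplier) % table_size


def find_free_slot(initial_hash, occupied, table_size, max_steps=1000):
    """Probe for a free slot; return None if none is found in max_steps.

    Without the max_steps guard, a fixed point in next_probe would make
    this loop spin forever once it lands on an occupied slot.
    """
    h = initial_hash % table_size
    for _ in range(max_steps):
        if h not in occupied:
            return h
        h = next_probe(h, table_size)
    return None


# Slot 0 is taken and the key hashes to 0: every probe lands on 0 again.
stuck = find_free_slot(0, {0}, 17)

# A non-zero hash moves on to a different slot and terminates.
fine = find_free_slot(3, {0, 3}, 17)
```

In this sketch the step bound turns the would-be infinite loop into a `None` result, which is one common way to guard a probe loop against such pathological inputs.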
Seibel: Are there any other characteristics of good programmers?
Armstrong: I read somewhere, that you have to have a good memory to be a reasonable programmer. I believe that to be true.
Seibel: Bill Gates once claimed that he could still go to a blackboard and write out big chunks of the code to the BASIC that he had written for the Altair, a decade or so after he had originally written it. Do you think you can remember your old code that way?
Armstrong: Yeah. Well, I could reconstruct something. Sometimes I’ve just completely lost some old code and it doesn’t worry me in the slightest. I haven’t got a listing or anything; just type it in again. It would be logically equivalent. Some of the variable names would change and the ordering of the functions in the file would change and the names of the functions would change. But it would be almost isomorphic. Or what I would type in would be an improved version because my brain had worked at it.
Take the pattern matching in the compiler which I wrote ten years ago. I could sit down and type that in. It would be different to the original version but it’d be an improved version if I did it from memory. Because it sort of improves itself while you’re not doing anything. But it’d probably have a pretty similar structure.
I’m not worried about losing code or anything like that. It’s these patterns in your head that you remember. Well, I can’t even say you remember them. You can do it again. It’s not so much remembering. When I say you can remember a program exactly, I don’t think that it’s actually remembering. But you can do it again. If Bill could remember the actual text, I can’t do that. But I can certainly remember the structure for quite a long time.
Seibel: Is Erlang-style message passing a silver bullet for slaying the problem of concurrent programming?
Armstrong: Oh, it’s not. It’s an improvement. It’s a lot better than shared memory programming. I think that’s the one thing Erlang has done—it has actually demonstrated that. When we first did Erlang and we went to conferences and said, “You should copy all your data.” And I think they accepted the arguments over fault tolerance—the reason you copy all your data is to make the system fault tolerant. They said, “It’ll be terribly inefficient if you do that,” and we said, “Yeah, it will but it’ll be fault tolerant.”
The thing that is surprising is that it’s more efficient in certain circumstances. What we did for the reasons of fault tolerance, turned out to be, in many circumstances, just as efficient or even more efficient than sharing.
Then we asked the question, “Why is that?” Because it increased the concurrency. When you’re sharing, you’ve got to lock your data when you access it. And you’ve forgotten about the cost of the locks. And maybe the amount of data you’re copying isn’t that big. If the amount of data you’re copying is pretty small and if you’re doing lots of updates and accesses and lots of locks, suddenly it’s not so bad to copy everything. And then on the multicores, if you’ve got the old sharing model, the locks can stop all the cores. You’ve got a thousand-core CPU and one program does a global lock—all the thousand cores have got to stop.
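The copy-instead-of-lock structure can be sketched in a few lines. This is not Erlang and not how the Erlang runtime is implemented; it is a minimal Python approximation under stated assumptions: the names `send`, `counter_server`, and `run_demo` are hypothetical, one thread owns the state, and every payload is copied on the way in. Python's `queue.Queue` does use a lock internally for the mailbox itself, and CPython threads do not run bytecode in parallel, so this shows only the shape of the design, not its performance.

```python
import copy
import queue
import threading


def send(inbox, op, payload=None, reply_to=None):
    """Copy the payload so sender and receiver never share mutable data."""
    inbox.put((op, copy.deepcopy(payload), reply_to))


def counter_server(inbox):
    """Owns `totals` privately; other threads can only send it messages."""
    totals = {}
    while True:
        op, payload, reply_to = inbox.get()
        if op == "add":
            key, amount = payload
            totals[key] = totals.get(key, 0) + amount
        elif op == "get":
            reply_to.put(dict(totals))  # reply with a copy, not the original
        elif op == "stop":
            return


def run_demo(n_clients=4, n_updates=1000):
    inbox = queue.Queue()
    server = threading.Thread(target=counter_server, args=(inbox,))
    server.start()

    def client(name):
        for _ in range(n_updates):
            send(inbox, "add", (name, 1))

    clients = [
        threading.Thread(target=client, args=(f"c{i}",))
        for i in range(n_clients)
    ]
    for t in clients:
        t.start()
    for t in clients:
        t.join()

    # All "add" messages are already queued, so "get" is handled after them.
    reply = queue.Queue()
    send(inbox, "get", reply_to=reply)
    totals = reply.get()
    send(inbox, "stop")
    server.join()
    return totals
```

The application code never takes a lock around `totals`: updates are serialized by the mailbox, and no client can stall the others by holding a lock on shared data.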
I’m also very skeptical about implicit parallelism. Your programming language can have parallel constructs but if it doesn’t map into hardware that’s parallel, if it’s just being emulated by your programming system, it’s not a benefit. So there are three types of hardware parallelism.
There’s pipeline parallelism, so you make a deeper pipeline in the chip so you can do things in parallel. Well, that’s once and for all when you design the chip. A normal programmer can’t do anything about the instruction-level parallelism.
There’s data parallelism, which is not really parallelism but it has to do with cache behavior. If you want to make a C program go efficiently, if *p is on a 16-byte boundary, if you access *p, then the access to *(p + 1) is free, basically, because the cache line pulls it in. Then you need to worry about how wide the cache lines are—how many bytes do you pull in in one cache transfer? That’s data parallelism, which the programmer can use by being very careful about their structures and knowing exactly how it’s laid out in memory. Messy stuff—you don’t really want to do that.
The other source of real concurrency in the chip are multicores. There’ll be 32 cores by the end of the decade and a million cores by 2019 or whatever. So you have to take the granules of concurrency in your program and map them onto the cores of the computer. Of course that’s quite a heavyweight operation. Starting a computation on a different core and getting the answer back is itself something that takes time. So if you’re just adding two numbers together, it’s just not worth the effort—you’re spending more effort in moving it to another core and doing it and getting the answer back than you are in doing it in place.
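The granularity trade-off can be made concrete with a rough cost model. This is a sketch under simplifying assumptions, not a measurement: every chunk of work pays a fixed `dispatch_cost` to be moved to a core and have its answer returned, cores run chunks in lockstep rounds, and all costs are in the same arbitrary time unit. The function names and the numbers are hypothetical.

```python
def sequential_time(n_items, per_item_cost):
    """Time to do all the work in place on one core."""
    return n_items * per_item_cost


def parallel_time(n_items, per_item_cost, n_cores, dispatch_cost, chunk_size):
    """Estimated time when work is split into chunks spread over cores.

    Each chunk pays dispatch_cost once, then chunk_size * per_item_cost.
    Chunks run in rounds of n_cores at a time (an upper-bound estimate,
    since the last chunk may be smaller than chunk_size).
    """
    n_chunks = -(-n_items // chunk_size)  # ceiling division
    rounds = -(-n_chunks // n_cores)      # ceiling division
    return rounds * (dispatch_cost + chunk_size * per_item_cost)


# 1000 cheap items (cost 1 each), 4 cores, dispatch overhead of 100 units.
in_place = sequential_time(1000, 1)
too_fine = parallel_time(1000, 1, 4, 100, chunk_size=1)
coarse = parallel_time(1000, 1, 4, 100, chunk_size=250)

# A single addition: shipping it to another core costs far more than doing it.
one_add_remote = parallel_time(1, 1, 4, 100, chunk_size=1)
```

With these made-up numbers, one item per dispatch is about 25 times slower than staying on one core, while four large chunks come out roughly three times faster, which is the point about mapping sufficiently large granules onto cores.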
Erlang’s quite well suited there because the programmer has said, “I want a process, I want another process, I want another process.” Then we just put them on the cores. And maybe we should be thinking about actually physically placing them on cores. Probably a process that spawns another process talks to that process. So if we put it on a core that’s physically near, that’s a good place to put it, not on one that’s a long way away. And maybe if we know it’s not going to talk to it a lot maybe we can put it a long way away. And maybe processes that do I/O should be near the edge of the chip—the ones that talk to the I/O processes. As the chips get bigger we’re going to have to think about how getting data to the middle of the chip is going to cost more than getting it to the edge of the chip. Maybe you’ve got two or three servers and a database and maybe you’re going to map this onto the cores so we’ll put the database in the middle of the chip and these ones talk to the client so we’ll put them near the edge of the chip. I don’t know—this is research.
Seibel: You care a lot about the idea of Erlang’s way of doing concurrency. Do you care more about that idea (message-passing, shared-nothing concurrency) or about Erlang the language?
Armstrong: The idea—absolutely. People keep on asking me, “What will happen to Erlang? Will it be a popular language?” I don’t know. I think it’s already been influential. It might end up like Smalltalk. I think Smalltalk’s very, very influential and loved by an enthusiastic band of people but never really very widely adopted. And I think Erlang might be like that. It might need Microsoft to take some of its ideas and put some curly braces here and there and shove it out in the Common Language Runtime to hit a mass market.