
Now in a pure language, if you have a function from string to unit you would never need to call it because you know that it just gives the answer unit. That’s all a function can do, is give you the answer. And you know what the answer is. But of course if it has side effects, it’s very important that you do call it. In a lazy language the trouble is if you say, “f applied to print "hello",” then whether f evaluates its first argument is not apparent to the caller of the function. It’s something to do with the innards of the function. And if you pass it two arguments, f of print "hello" and print "goodbye", then you might print either or both in either order or neither. So somehow, with lazy evaluation, doing input/output by side effect just isn’t feasible. You can’t write sensible, reliable, predictable programs that way. So, we had to put up with that. It was a bit embarrassing really because you couldn’t really do any input/output to speak of. So for a long time we essentially had programs which could just take a string to a string. That was what the whole program did. The input string was the input and the result string was the output and that’s all the program could really ever do.
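To see the problem concretely in modern Haskell terms, here is a rough sketch that simulates a side-effecting argument with Debug.Trace.trace (the functions f and g are invented for illustration): whether the "effect" happens depends entirely on whether the callee evaluates its argument, and the caller cannot tell from the outside.

    import Debug.Trace (trace)

    f :: () -> Int
    f _ = 42            -- never forces its argument: nothing is printed

    g :: () -> Int
    g x = x `seq` 42    -- forces its argument: "hello" is printed

    main :: IO ()
    main = do
      print (f (trace "hello" ()))  -- prints just 42
      print (g (trace "hello" ()))  -- prints "hello" (on stderr), then 42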

You could get a bit clever by making the output string encode some output commands that were interpreted by some outer interpreter. So the output string might say, “Print this on the screen; put that on the disk.” An interpreter could actually do that. So you imagine the functional program is all nice and pure and there’s sort of this evil interpreter that interprets a string of commands. But then, of course, if you read a file, how do you get the input back into the program? Well, that’s not a problem, because you can output a string of commands that are interpreted by the evil interpreter and, using lazy evaluation, it can dump the results back into the input of the program. So the program now takes a stream of responses to a stream of requests. The stream of requests goes to the evil interpreter that does the things to the world. Each request generates a response that’s then fed back to the input. And because evaluation is lazy, the program has emitted a request just in time for its response to come round the loop and be consumed as an input. But it was a bit fragile because if you consumed your response a bit too eagerly, you’d get some kind of deadlock. You’d be asking for the answer to a question you hadn’t yet spat out of your back end.
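Here is a pure miniature of that request/response architecture, with laziness tying the loop. The type and function names are invented for illustration (they are not the actual Dialogue types of early Haskell), and the outside world is simulated by a fixed list of input lines:

    data Request  = PutLine String | ReadLine
    data Response = Done | Line String

    -- The whole program: a lazy stream of responses in, requests out.
    program :: [Response] -> [Request]
    program resps =
      PutLine "What is your name?" :
      ReadLine :
      case resps of
        _ : Line name : _ -> [PutLine ("Hello, " ++ name)]
        _                 -> []

    -- A stand-in for the "evil interpreter": it consumes requests,
    -- produces responses, and feeds them back round the loop.
    run :: [String] -> ([Response] -> [Request]) -> [String]
    run input p = [s | PutLine s <- reqs]
      where
        reqs  = p resps              -- the feedback loop
        resps = go input reqs
        go _  []                 = []
        go ls (PutLine _ : rest) = Done : go ls rest
        go ls (ReadLine  : rest) = case ls of
          l : ls' -> Line l : go ls' rest
          []      -> []

    -- run ["Simon"] program  ==>  ["What is your name?","Hello, Simon"]

Note the fragility: if program pattern-matched on resps before emitting ReadLine, the requests would depend on a response that cannot yet exist, and the loop would deadlock exactly as described.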

The point of this is that laziness drove us into a corner in which we had to think of ways around this I/O problem. I think that that was extremely important. The single most important thing about laziness was it drove us there. But that wasn’t the way it started. Where it started was, laziness is cool; what a great programming idiom.

Seibel: Since you started programming, what’s changed about how you think about programming?

Peyton Jones: I think probably the big changes in how I think about programming have been to do with monads and type systems. Compared to the early 80s, thinking about purely functional programming with relatively simple type systems, now I think about a mixture of purely functional, imperative, and concurrent programming mediated by monads. And the types have become a lot more sophisticated, allowing you to express a much wider range of programs than I think, at that stage, I’d envisaged. You can view both of those as somewhat evolutionary, I suppose.
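For instance, in today’s Haskell the three styles sit side by side, each labelled by its type. A minimal sketch using the standard stm library (the function names here are mine):

    import Control.Concurrent.STM

    pureStep :: Int -> Int                -- no effects at all
    pureStep = (+ 1)

    imperativeStep :: Int -> IO ()        -- arbitrary input/output
    imperativeStep = print

    concurrentStep :: TVar Int -> STM ()  -- transactional shared state only
    concurrentStep v = modifyTVar' v (+ 1)

    main :: IO ()
    main = do
      v <- newTVarIO 0
      atomically (concurrentStep v)
      n <- readTVarIO v
      imperativeStep (pureStep n)         -- prints 2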

Seibel: For instance, since your first abortive attempt at writing a compiler you’ve written lots of compilers. You must have learned some things about how to do that that enable you to do it successfully now.

Peyton Jones: Yes. Well, lots of things. Of course that was a compiler for an imperative language written in an imperative language. Now I’m writing a compiler for a functional language in a functional language. But a big feature of GHC, our compiler for Haskell, is that the intermediate language it uses is itself typed.

Seibel: And is the typing on the intermediate representation just carrying through the typing from the original source?

Peyton Jones: It is, but it’s much more explicit. In the original source, lots of type inference is going on and the source language is carefully crafted so that type inference is possible. In the intermediate language, the type system is much more general, much more expressive because it’s more explicit: every function argument is decorated with its type. There’s no type inference, there’s just type checking for the intermediate language. So it’s an explicitly typed language whereas the source language is implicitly typed.
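A toy version of the contrast (this miniature is invented and far smaller than GHC’s actual Core language): every binder and variable occurrence carries its type, so checking just verifies the annotations instead of inferring anything.

    data Type = TInt | TFun Type Type
      deriving (Eq, Show)

    data Expr
      = Var String Type       -- every occurrence is decorated with its type
      | Lit Int
      | Lam String Type Expr  -- the binder's type is written down
      | App Expr Expr

    -- Type checking only: no inference anywhere.
    typeOf :: Expr -> Maybe Type
    typeOf (Var _ t)   = Just t
    typeOf (Lit _)     = Just TInt
    typeOf (Lam _ t b) = TFun t <$> typeOf b
    typeOf (App f a)   = do
      TFun argT resT <- typeOf f    -- Nothing if f is not a function
      t <- typeOf a
      if t == argT then Just resT else Nothing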

Type inference is based on a carefully chosen set of rules that make sure that it just fits within what the type inference engine can figure out. If you transform the program by a source-to-source transformation, maybe you’ve now moved outside that boundary. Type inference can’t reach it any more. So that’s bad for an optimization. You don’t want optimizations to have to worry about whether you might have just gone out of the boundaries of type inference.
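Higher-rank types are a standard example of that boundary in GHC: given the signature, the program below is easy to check, but no Hindley-Milner-style inference could discover the type on its own.

    {-# LANGUAGE RankNTypes #-}

    -- The signature cannot be inferred; with it, checking is easy.
    applyBoth :: (forall a. a -> a) -> (Int, Bool)
    applyBoth f = (f 1, f True)

    main :: IO ()
    main = print (applyBoth id)   -- (1,True)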

Seibel: So that points out that there are programs that are correct, because they arise from a legitimate source-to-source transformation, but which, if you had written them by hand, the compiler would have rejected: “I’m sorry; I can’t type this.”

Peyton Jones: Right. That’s the nature of static type systems—and why dynamic languages are still interesting and important. There are programs you can write which can’t be typed by a particular type system but which nevertheless don’t “go wrong” at runtime, which is the gold standard—don’t segfault, don’t add integers to characters. They’re just fine.
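A classic instance is a function whose result type depends on its argument value. Written naively it is rejected, even though it never goes wrong at runtime; a richer static system, here a GADT in GHC, can recover it.

    {-# LANGUAGE GADTs #-}

    -- A value-dependent result type: with ordinary data types the two
    -- branches of pick would have different types and be rejected,
    -- although the function never "goes wrong" dynamically.
    data Flag a where
      FInt :: Flag Int
      FStr :: Flag String

    pick :: Flag a -> a
    pick FInt = 42
    pick FStr = "hello"

    main :: IO ()
    main = putStrLn (show (pick FInt) ++ " " ++ pick FStr)  -- 42 hello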

Seibel: So when advocates of dynamic and static typing bicker the dynamic folks say, “Well, there are lots of those programs—static typing gets in the way of writing the program I want to write.” And then the fans of static typing say, “No, they exist but in reality it’s not a problem.” What’s your take on that?

Peyton Jones: It’s partly to do with simple familiarity. It’s very like me saying I’ve not got a visceral feel for writing C++ programs. Or, you don’t miss lazy evaluation because you’ve never had it whereas I’d miss it because I’m going to use it a lot. Maybe dynamic typing is a bit like that. My feeling—for what it’s worth, given that I’m biased culturally—is that large chunks of programs can be perfectly well statically typed, particularly in these very rich type systems. And where it’s possible, it’s very valuable for reasons that have been extensively rehearsed.

But one that is less often rehearsed is maintenance. When you have a blob of code that you wrote three years ago and you want to make a systemic change to it—not just a little tweak to one procedure, but something that is going to have pervasive effects—I find type systems are incredibly helpful.

This happens in our own compiler. I can make a change to GHC, to data representations that pervade the compiler, and can be confident that I’ve found all the places where they’re used. And I’d be very anxious about that in a more dynamic language. I’d be anxious that I’d missed one and shipped a compiler where somebody feeds in some data that I never had and it just falls over something that I hadn’t consistently changed.
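A small illustration of why that works (the data type here is made up): extend a type that pervades the program, and GHC’s exhaustiveness checker flags every pattern match left behind.

    {-# OPTIONS_GHC -Wincomplete-patterns #-}

    -- Suppose this type pervades the compiler...
    data Colour = Red | Green | Blue
    -- ...and we later add a constructor:  | Mixed Double Double Double

    render :: Colour -> String
    render Red   = "red"
    render Green = "green"
    render Blue  = "blue"
    -- After adding Mixed, the compiler warns here, and at every other
    -- match on Colour, so no use site is silently missed.

    main :: IO ()
    main = putStrLn (render Green)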

I suppose static types, for me, also perform part of my explanation of what the program does. It’s a little language in which I can say something, but not too much, about what this program does. People often ask, “What’s the equivalent of UML diagrams for a functional language?” And I think the best answer I’ve ever been able to come up with is, it’s the type system. When an object-oriented programmer might draw some pictures, I’m sitting there writing type signatures. They’re not diagrammatic, to be sure, but because they are a formal language, they form a permanent part of the program text and are statically checked against the code that I write. So they have all sorts of good properties, too. It’s almost an architectural description of part of what your program does.