Now you might hope to do some static checking on this. It might be hard or easy. But even having the property written down in a formal way is a real help. You can test it by generating test data, which is, indeed, just what QuickCheck does.
So rather than trying to write down specifications for all that a program does I think it’s much more productive to write down partial specifications. Perhaps multiple partial specifications. And then check them either by testing or by dynamic checks or by static checks. You never prove that your program is right. You just increase your confidence that it’s right. And I think that’s all that anybody ever does.
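To make that concrete, here is a minimal sketch of the kind of partial specification QuickCheck checks; the reverse property is a standard illustration, not one taken from this conversation.

```haskell
import Test.QuickCheck

-- A partial specification: reversing a list twice gives back the original list.
prop_reverseTwice :: [Int] -> Bool
prop_reverseTwice xs = reverse (reverse xs) == xs

-- QuickCheck generates random test data and checks that the property holds.
main :: IO ()
main = quickCheck prop_reverseTwice
```

Running it doesn't prove the property, but each batch of generated tests that passes increases confidence in it, which is exactly the point being made here.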
Seibeclass="underline" So you define however many properties, covering the things you care about. And then you can choose to confirm that those properties actually hold either statically or dynamically, depending on what’s actually feasible. Because we may not know how to statically check them all?
Peyton Jones: Right. But in a functional setting, you have a better chance. But we’ve still been dragging our feet a bit about demonstrating that. Nevertheless—step one is to write down these properties in the first place.
But I think the big thing is to get away from this monolithic, all-or-nothing story about specification and to say that you can do useful static and dynamic tests on partial specifications. These will increase your confidence in the correctness of your program and that is all you can possibly hope for.
Even the allegedly complete specifications miss out—you know, it has to work in .1 of a second. Or must fit in 10KB of memory. Resource things are often not covered. Or timing things. There’s endless little stuff that means the program might actually not function as desired even though it meets its formal specification. So I think we’re kidding ourselves to say we’ve actually proved the whole thing is completely right. Best thing to do is to acknowledge that and say we’re improving our confidence—that’s what we’re doing. And that can start quite modestly—you might have improved your confidence by 75 percent with only 5 percent of the effort. That’d be good.
Seibeclass="underline" Let’s talk about concurrency a little bit. Guy Steele asked me to ask you: “Is STM going to save the world?”
Peyton Jones: Oh, no. STM is not going to save the world on its own. Concurrency, and parallel programming generally, is a many-faceted beast and I don’t think it will be slain by a single bullet. I’m a diversifist when it comes to concurrency.
It’s tempting to say, “Use one programming paradigm for writing concurrent programs and implement it really well, and that’s it”; people should just learn how to write concurrent programs using that paradigm. But I just don’t believe it. I think for some styles of programming you might want to use message passing. For others you might want to use STM. For others data parallelism is much better. The programmer is going to need to grapple with more than one way to do it.
But if you ask me, is STM better than locks and condition variables? Now you’re comparing like with like. Yes. I think it completely dominates locks and condition variables. So just forget locks and condition variables. For multiple program counters, multiple threads, diddling on shared memory on a shared-memory multicore: STM. But is that the only way to write concurrent programs? Absolutely not.
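As a minimal sketch of what this looks like with GHC’s STM library (the shared counter is an illustrative example, not one from the interview), a transactional update needs no locks or condition variables:

```haskell
import Control.Concurrent.STM

-- The increment runs as a single atomic transaction; concurrent threads
-- calling it can never observe or produce a torn update of the counter.
increment :: TVar Int -> STM ()
increment counter = do
  n <- readTVar counter
  writeTVar counter (n + 1)

main :: IO ()
main = do
  counter <- newTVarIO 0
  atomically (increment counter)
  atomically (increment counter)
  readTVarIO counter >>= print  -- prints 2
```

The read-then-write pair that would be a race under plain shared memory is safe here because `atomically` commits the whole transaction or retries it.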
Seibeclass="underline" A criticism I’ve heard of STM was that when it really gets down to it, optimistic concurrency won’t allow as much concurrency as you hope. I think the claim was you can fairly easily get in these situations where you’re really not making progress.
Peyton Jones: You do have to worry about starvation. My favorite example here is of one big transaction that keeps failing to commit because a little transaction gets in there and commits first. So an example would be a librarian who optimistically starts reorganizing their library. They’ve got two-thirds of the way through and an undergraduate comes along and borrows a book. Well, he commits his transaction successfully because the library reorganization hasn’t committed. The librarian gets to the end and discovers, ah, I saw an inconsistent view of memory because the library changed while I was reorganizing it, so I just have to go back and start again.
Seibeclass="underline" In a locks-and-condition-variables program it would probably go the other way—the librarian would lock the library and nobody could check out books until it’s completely reorganized. So you would probably look at this problem and immediately say, “We can’t lock the library until we’re done,” disallowing checkouts so we have to come up with some hairier locking scheme.
Peyton Jones: Right. Make a little sub-library or something—put the commonly borrowed books out there so undergraduates can borrow them while you lock the main library and reorganize it. So now you’ve got to think of an application-specific strategy and now you’ve got to express it in some way. Well, the same problem arises in both cases—you need an application-specific strategy that lets you reorganize the library without blocking all borrowing. Once you’ve done the hard thinking about how you wish to do it, you’ve got to express that. What is the medium in which to express it? STM is a clear win. It’s just much better than locks and condition variables for expressing concurrent programs.
Seibeclass="underline" What if I don’t even want to allow for the possibility that someone comes in and looks for the 21st most-requested book and gets blocked? In the physical world you can imagine that when someone checks out a book we swap in a proxy for the book that the librarian then reorganizes and whenever a book comes back we put it back wherever the proxy is now. But you are modifying the library which seems, in an STM world, like it would cause the librarian to have to retry his whole transaction.
Peyton Jones: But there’s something that stayed the same—the key on the book is guaranteed not to change somehow, right? So there’s a number of ways you could do this. One is you could say, “What you do when you replace it with a proxy is you don’t modify the library at all”—that’s unchanged. What you do is you modify the book itself. And you don’t modify its key field—you only modify its value field, where it’s currently living. Now the index can be reorganized while the book is somewhere else. That’s cool—and you can express that perfectly naturally.
With STM, at the end the librarian looks through all the memory locations that he has read and sees if they now contain the same values that they did when he read them. So the locations that he has read will include the key field of the book, because that determined where he put it. But he hasn’t read the contents of the book. So he’ll say, “Ah, this book—does this key field still contain 73? Oh, yes it does.”
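One way to picture the arrangement he describes is the following sketch in Haskell STM; the record layout and names here are assumptions for illustration, not his exact design. The key is ordinary immutable data, and only the book’s current location lives in a TVar, so reorganizing the index reads keys but never touches the mutable location.

```haskell
import Control.Concurrent.STM

-- Hypothetical layout: the key never changes, so it is plain immutable data;
-- only the book's current whereabouts is transactional state.
data Book = Book
  { bookKey      :: Int            -- read by the reorganizing librarian
  , bookLocation :: TVar Location  -- written only when the book moves
  }

data Location = OnShelf Int | CheckedOut
  deriving (Show, Eq)

-- Checking a book out writes only its location TVar. The librarian's big
-- reorganization transaction reads keys but never this TVar, so the two
-- transactions do not conflict and neither forces the other to retry.
checkOut :: Book -> STM ()
checkOut book = writeTVar (bookLocation book) CheckedOut
```

Because the reorganization never reads the locations that check-outs write, its validation at commit time still succeeds, which is exactly the point about the key field containing 73.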
But I don’t want to minimize the problem of starvation because it’s a bit insidious. You need good profiling tools that point you at transactions that are failing to commit because they keep getting bumped so that, rather than the program just silently not doing very much, you get some feedback about it. The same is true of a lock-based program. I hate it when those hourglasses appear.
Seibeclass="underline" I guess in locked-based programs we’ve just learned to try hold locks for as short a duration as possible since that will give us the least contention.
Peyton Jones: Right. But, of course, then it’s harder to program. Fine-grained locking is tricky to get right. I think this is one of the huge wins of STM: it gives you the fine granularity of very fine-grained locking along with very simple reasoning principles.
Here’s a reasoning principle that STM gives you that locks absolutely do not. I’ll establish my top-level invariants—I’ve got a bunch of bank accounts, and the total sum of money in all the bank accounts added together is N. Money moves between bank accounts—that’s all. So there’s my invariant. Any transaction assumes that invariant at the beginning and restores it at the end. How do you reason that it does? Look at any one transaction, which says, “Take three out of that one and put three into that one.” Good, invariant maintained. How did I do that reasoning? Purely sequentially. Once I’ve described some top-level invariants, I can reason completely sequentially about each transaction separately.
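A sketch of that reasoning in Haskell STM, with the account representation assumed for illustration: checking that the transfer preserves the total-money invariant only requires reading its body sequentially, because no other thread can observe the intermediate state.

```haskell
import Control.Concurrent.STM

type Account = TVar Int

-- Invariant: the sum of all account balances is constant.
-- The transaction subtracts and adds the same amount, so purely sequential
-- reasoning about this body shows the invariant is restored at the end.
transfer :: Account -> Account -> Int -> STM ()
transfer from to amount = do
  fromBalance <- readTVar from
  writeTVar from (fromBalance - amount)
  toBalance <- readTVar to
  writeTVar to (toBalance + amount)

-- "Take three out of that one and put three into that one," atomically.
moveThree :: Account -> Account -> IO ()
moveThree a b = atomically (transfer a b 3)
```

With locks, the same argument would also have to account for every interleaving another thread could produce between the subtract and the add; here atomicity removes that obligation.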