The human mind affects the body and the outside world by emitting nerve impulses. Therefore a virtual-reality generator can in principle obtain all the information it needs about what the user is doing by intercepting the nerve signals coming from the user’s brain. Those signals, which would have gone to the user’s body, can instead be transmitted to a computer and decoded to determine exactly how the user’s body would have moved. The signals sent back to the brain by the computer can be the same as those that would have been sent by the body if it were in the specified environment. If the specification called for it, the simulated body could also react differently from the real one, for example to enable it to survive in simulations of environments that would kill a real human body, or to simulate malfunctions of the body.
I had better admit here that it is probably too great an idealization to say that the human mind interacts with the outside world only by emitting and receiving nerve impulses. There are chemical messages passing in both directions as well. I am assuming that in principle those messages could also be intercepted and replaced at some point between the brain and the rest of the body. Thus the user would lie motionless, connected to the computer, but having the experience of interacting fully with a simulated world — in effect, living there. Figure 5.2 illustrates what I am envisaging. Incidentally, though such technology lies well in the future, the idea for it is much older than the theory of computation itself. In the early seventeenth century Descartes was already considering the philosophical implications of a sense-manipulating ‘demon’ that was essentially a virtual-reality generator of the type shown in Figure 5.2, with a supernatural mind replacing the computer.
From the foregoing discussion it seems that any virtual-reality generator must have at least three principal components:
a set of sensors (which may be nerve-impulse detectors) to detect what the user is doing,
a set of image generators (which may be nerve-stimulation devices), and
a computer in control.
My account so far has concentrated on the first two of these, the sensors and the image generators. That is because, at the present primitive state of the technology, virtual-reality research is still preoccupied with image generation. But when we look beyond transient technological limitations, we see that image generators merely provide the interface — the ‘connecting cable’ — between the user and the true virtual-reality generator, which is the computer. For it is entirely within the computer that the specified environment is simulated. It is the computer that provides the complex and autonomous ‘kicking back’ that justifies the word ‘reality’ in ‘virtual reality’. The connecting cable contributes nothing to the user’s perceived environment, being from the user’s point of view ‘transparent’, just as we naturally do not perceive our own nerves as being part of our environment. Thus virtual-reality generators of the future would be better described as having only one principal component, a computer, together with some trivial peripheral devices.
FIGURE 5.2. Virtual reality as it might be implemented in the future.
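To make this division of labour concrete, here is a minimal sketch, in Python, of the three components and the control loop that connects them. Every class and method name below is a hypothetical illustration, not an interface prescribed by anything in the text; the point it encodes is only that the sensors and image generators are 'connecting cable', while all the content comes from the computer.

```python
# Hypothetical sketch of the three principal components of a
# virtual-reality generator. Names and interfaces are illustrative only.

class Sensors:
    """Detects what the user is doing (e.g. nerve-impulse detectors)."""
    def read_action(self) -> str:
        raise NotImplementedError  # hardware-specific

class ImageGenerators:
    """Presents the rendering to the user (e.g. nerve-stimulation devices)."""
    def present(self, percept: str) -> None:
        raise NotImplementedError  # hardware-specific

class Computer:
    """The true virtual-reality generator: it runs the environment program."""
    def __init__(self, environment_program):
        self.environment = environment_program  # the 'specification'

    def step(self, action: str) -> str:
        return self.environment(action)  # the environment 'kicks back'

def run(sensors: Sensors, images: ImageGenerators, computer: Computer) -> None:
    # The loop runs for as long as the session lasts. Note that nothing
    # in it originates with the sensors or image generators: they merely
    # carry signals to and from the computer, like transparent cable.
    while True:
        action = sensors.read_action()
        percept = computer.step(action)
        images.present(percept)
```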
I do not want to understate the practical problems involved in intercepting all the nerve signals passing into and out of the human brain, and in cracking the various codes involved. But this is a finite set of problems that we shall have to solve once only. After that, the focus of virtual-reality technology will shift once and for all to the computer, to the problem of programming it to render various environments. What environments we shall be able to render will no longer depend on what sensors and image generators we can build, but on what environments we can specify. ‘Specifying’ an environment will mean supplying a program for the computer, which is the heart of the virtual-reality generator.
Because of the interactive nature of virtual reality, the concept of an accurate rendering is not as straightforward for virtual reality as it is for image generation. As I have said, the accuracy of an image generator is a measure of the closeness of the rendered images to the intended ones. But in virtual reality there are usually no particular images intended: what is intended is a certain environment for the user to experience. Specifying a virtual-reality environment does not mean specifying what the user will experience, but rather specifying how the environment would respond to each of the user’s possible actions. For example, in a simulated tennis game one may specify in advance the appearance of the court, the weather, the demeanour of the audience and how well the opponent should play. But one does not specify how the game will go: that depends on the stream of decisions the user makes during the game. Each set of decisions will result in different responses from the simulated environment, and therefore in a different tennis game.
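The point that a specification is a program, not a script of experiences, can be compressed into a few lines. The following toy environment (all names and responses are hypothetical) fixes in advance how the rendering responds to each possible action, yet determines no particular game: the game that unfolds depends entirely on the stream of actions the user supplies.

```python
# Toy illustration: an environment specification is a program mapping
# the user's actions to responses, not a fixed sequence of images.

def tennis_environment(state: dict, action: str) -> tuple[dict, str]:
    """Respond to one user action; the 'game' is whatever stream of
    actions the user actually chooses to make."""
    if action == "serve":
        state["rally"] = state.get("rally", 0) + 1
        return state, "the opponent returns your serve"
    if action == "look_at_crowd":
        return state, "the audience murmurs politely"
    return state, "play continues"

# Two users making different decisions get two different tennis games
# from the *same* specification:
state: dict = {}
for action in ["serve", "look_at_crowd", "serve"]:
    state, percept = tennis_environment(state, action)
    print(percept)
```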
The number of possible tennis games that can be played in a single environment — that is, rendered by a single program — is very large. Consider a rendering of the Centre Court at Wimbledon from the point of view of a player. Suppose, very conservatively, that in each second of the game the player can move in one of two perceptibly different ways (perceptibly, that is, to the player). Then after two seconds there are four possible games, after three seconds, eight possible games, and so on. After about four minutes the number of possible games that are perceptibly different from one another exceeds the number of atoms in the universe, and it continues to rise exponentially. For a program to render that one environment accurately, it must be capable of responding in any one of those myriad, perceptibly different ways, depending on how the player chooses to behave. If two programs respond in the same way to every possible action by the user, then they render the same environment; if they would respond perceptibly differently to even one possible action, they render different environments.
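The arithmetic behind 'about four minutes' is easy to check. Assuming two perceptibly different choices per second, there are 2^t distinguishable games after t seconds; taking the common rough estimate of 10^80 atoms in the observable universe, a few lines of Python find the crossover point.

```python
# With two perceptibly different choices per second, there are 2**t
# distinguishable games after t seconds. Find when that first exceeds
# the (rough, standard) estimate of 10**80 atoms in the universe.

ATOMS = 10**80

t = 0
games = 1
while games <= ATOMS:
    t += 1
    games *= 2

print(t, t / 60)  # -> 266 seconds, i.e. about four and a half minutes
```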
That remains so even if the user never happens to perform the action that shows up the difference. The environment a program renders (for a given type of user, with a given connecting cable) is a logical property of the program, independent of whether the program is ever executed. A rendered environment is accurate in so far as it would respond in the intended way to every possible action of the user. Thus its accuracy depends not only on experiences which users of it actually have, but also on experiences they do not have, but would have had if they had chosen to behave differently during the rendering. This may sound paradoxical, but as I have said, it is a straightforward consequence of the fact that virtual reality is, like reality itself, interactive.
This gives rise to an important difference between image generation and virtual-reality generation. The accuracy of an image generator’s rendering can in principle be experienced, measured and certified by the user, but the accuracy of a virtual-reality rendering never can be. For example, if you are a music-lover and know a particular piece well enough, you can listen to a performance of it and confirm that it is a perfectly accurate rendering, in principle down to the last note, phrasing, dynamics and all. But if you are a tennis fan who knows Wimbledon’s Centre Court perfectly, you can never confirm that a purported rendering of it is accurate. Even if you are free to explore the rendered Centre Court for however long you like, and to ‘kick’ it in whatever way you like, and even if you have equal access to the real Centre Court for comparison, you cannot ever certify that the program does indeed render the real location. For you can never know what would have happened if only you had explored a little more, or looked over your shoulder at the right moment. Perhaps if you had sat on the rendered umpire’s chair and shouted ‘fault!’, a nuclear submarine would have surfaced through the grass and torpedoed the scoreboard.
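As a footnote to the last two paragraphs, the logic can be sketched as a comparison of two hypothetical environment programs (the names and responses below are mine, borrowing the submarine example above): they agree on every action this user happens to try, and so are indistinguishable in practice, yet they render different environments, because they would respond differently to one untried action.

```python
# Two hypothetical environment programs that agree on every action the
# user actually tries, yet render different environments.

def env_a(action: str) -> str:
    return "play continues"

def env_b(action: str) -> str:
    if action == "shout_fault_from_umpires_chair":
        return "a submarine surfaces and torpedoes the scoreboard"
    return "play continues"

tried = ["serve", "volley", "look_at_crowd"]
print(all(env_a(a) == env_b(a) for a in tried))   # True: no test so far tells them apart
print(env_a("shout_fault_from_umpires_chair") ==
      env_b("shout_fault_from_umpires_chair"))    # False: different environments
```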