Roger Penrose’s book Shadows of the Mind confronts these problems head-on, by giving a detailed assessment of the question: Is the human mind a computer? Can it be simulated by a computer, and can its workings be understood and explained using computation alone? Penrose’s conclusion is a resounding no: the human mind is not a computer, and its workings cannot be explained by computation alone.
This book is particularly interesting to me because, having read many books and listened to many talks/podcasts on consciousness and the mind, I myself have my own strong views on the answers to these questions. My conclusion is opposite to Penrose’s, namely, it is my belief that the human mind is performing a computation. Normally a conclusion such as Penrose’s would not bother me, because many arguments can be wielded that counter such claims. But Penrose’s book is different, because at its core is a solid logical argument, one that cannot just be ignored or countered easily and quickly.
In this post I will introduce Penrose’s logical argument, and then explain how he exploits the current gaps in the unification of general relativity and quantum mechanics to justify that the human mind must utilise a new type of physics – one that cannot be simulated on a computer.
The logical argument
I will dive straight in to the key argument at the centre of this book. Consider a computation that acts on a single natural number, n. This could be something as simple as the computation “n+1”. Or it could be a vastly complicated and elaborate computation, such as the algorithm that generates the graphics of the nth level on Super Mario.
Next, we need a way of labelling all the different computations that could act on n. We say that C(q,n) is the qth computation on natural number n. It is possible to write down every single possible computation that acts on a single natural number in this way. In other words, the complete set of computations labelled C(q,n) represents every single possible computation acting on a single natural number.
The way a computation works is that it takes in an input, then it runs for some period of time, finally producing an output. When it produces the output we say it stops. However, not all algorithms stop. For example, we could have an algorithm that takes n, then adds 1, then loops around and repeats this indefinitely – i.e. it keeps adding 1 forever. We say that this algorithm does not stop. Given this, it might be of interest to know whether a given algorithm C(q,n) stops or not. To do this, we can design another algorithm, A(q,n), whose job it is to signal, by stopping, that the qth computation on natural number n does not stop:
If A(q,n) stops, then C(q,n) does not stop.
This statement is true for any values of q and n. So what about taking q to be equal to n? With this, we get:
If A(n,n) stops, then C(n,n) does not stop. (1)
But notice that, considering only the cases when q is equal to n, the computation A(n,n) is now acting on only a single natural number, n. Because C(q,n) includes every possible computation on a single natural number, it must also include this computation. We can label it C(k,n). Therefore,
A(n,n) = C(k,n).
Now, this is true for any value of n. In particular, it is true for the case when n equals k. Considering this value only, we have:
A(k,k) = C(k,k). (2)
Now consider equation (1) above, but for the specific value k. This reads:
If A(k,k) stops, then C(k,k) does not stop. (3)
Finally, combining equations (2) and (3), we have:
If C(k,k) stops, then C(k,k) does not stop. (4)
What can we conclude from this? Well, C(k,k) cannot stop, because if it did then according to equation (4) C(k,k) would not stop, which would be a contradiction. But A(k,k) (which is the same as C(k,k)) is the computation that is specifically designed to determine whether C(k,k) stops. Therefore, because A(k,k) cannot stop, it is not possible for a computational procedure to determine whether C(k,k) stops. But we know that C(k,k) does not stop. Therefore, we know something that no computational procedure could ever know!
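To make the shape of this argument concrete, here is a minimal toy sketch in Python. It is only an illustration under loud assumptions: the list of four computations, the tester A and the helper stops_within are all hypothetical, and a finite step budget can only suggest, never prove, that something does not stop. Computations are written as generators, where each yield is one step and returning means stopping.

```python
def stops_immediately(n):
    """C(0, n): stops at once for every n."""
    return
    yield  # unreachable; its presence makes this function a generator


def loops_forever(n):
    """C(1, n): keeps adding 1 forever."""
    while True:
        n += 1
        yield


def stops_if_even(n):
    """C(2, n): stops for even n, runs forever for odd n."""
    while n % 2 == 1:
        yield


def A(q, n):
    """Toy sound tester: it stops only when C(q, n) really does not stop."""
    known_not_to_stop = (q == 1) or (q == 2 and n % 2 == 1)
    while not known_not_to_stop:
        yield  # no verdict available: keep running forever


def diagonal(n):
    """A(n, n), viewed as a computation on the single number n."""
    yield from A(n, n)


# The (toy, finite) list of computations; the diagonal must appear in it.
C = [stops_immediately, loops_forever, stops_if_even, diagonal]
K = C.index(diagonal)  # the index k with A(n, n) = C(k, n)


def stops_within(computation, budget=1000):
    """True if the generator finishes within `budget` steps (assumed cutoff)."""
    for _ in range(budget):
        try:
            next(computation)
        except StopIteration:
            return True
    return False


print(K, stops_within(C[K](K)), stops_within(A(K, K)))
```

In this toy, A is sound on everything it gives a verdict for, yet A(K, K) is the very same computation as C(K, K), so it has to stay silent about a computation that we, looking from outside, can see never stops.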
Now comes the really interesting part: we can repeat the argument above, but instead of C(q,n) representing all possible computations, we can instead think of C(q,n) as representing all possible computational procedures available to humankind. We then find:
If C(k,k) stops, then C(k,k) does not stop.
Therefore we know that C(k,k) does not stop.
But C(k,k) cannot determine this, and therefore the collection C(q,n), which represents all possible computational procedures available to humankind, cannot determine that C(k,k) does not stop. But, as said above, we know that C(k,k) does not stop. Therefore, humans cannot be using a computational procedure to determine this, otherwise there will be a contradiction.
I have skipped some of the more subtle steps in the above explanation. The more precise conclusion would say: humans cannot be using a knowably sound algorithm to determine that C(k,k) does not stop (for brevity I will not introduce what knowably or sound mean – the interested reader can check out Penrose’s book!). To very briefly mention some of these subtleties: instead of concluding that humans cannot be using a computational procedure, we could conclude that humans are using an algorithm that is not sound, or one that is sound but not knowably so. But Penrose very convincingly argues that any algorithm we use must be sound, and knowably so, which allows him to conclude that humans cannot be using a computational procedure to ascertain mathematical truth.
But if our brains aren’t working computationally, then what are they doing? Are we using some obscure algorithm that computer scientists haven’t come up with yet, or are our minds acting as a quantum computer? In fact, neither of these options is possible, because what we mean by computation includes any process in known physics, including quantum mechanics. Therefore, the inner workings of the human mind must be beyond current physics! So what could this physics be? Is there really any room for new physics, and even if there is, can it really solve the problem raised above?
Gravity-induced collapse of the quantum wavefunction
Those readers not already familiar with the measurement problem in quantum mechanics should read my earlier blog, as otherwise this section won’t make any sense! In short, to make any sense of the process of collapse of the wavefunction – and therefore to make any sense of quantum mechanics – one must subscribe to one of the many different interpretations of quantum mechanics. To choose between them, you must decide whether you think quantum mechanics is complete/correct or not, and whether you think the quantum wavefunction represents reality, or just our state of knowledge. My own view is that quantum physics is correct, and that the theory of decoherence explains why the quantum state appears to collapse. Then, taking the uncontroversial assumption that the quantum wavefunction represents reality, I (like many others) am led to the conclusion that there are multiple parallel universes!
But different people, with different backgrounds, expertise, prejudices, and life philosophies, can take a different perspective. Physicists have not yet successfully combined general relativity and quantum mechanics into a single unified theory, so in a sense they can’t both, in their current forms, be completely correct. Given this issue, a general relativity researcher’s view could be that general relativity is correct. As with quantum mechanics, general relativity has so far never been proved wrong. But as general relativity and quantum mechanics cannot be unified, one of them, or both of them, must be incorrect and in need of modification. And given the unintuitive conclusion of assuming that quantum mechanics was correct (parallel universes!), we should assume that quantum mechanics needs modification. Many scientists might argue that general relativity is more elegant and beautiful than quantum mechanics, so many researchers would think this is a reasonable stance.
How should we modify quantum mechanics? One of the issues of combining quantum mechanics and general relativity is that it is not clear how superpositions of different gravitational fields can be understood. Roger Penrose argues that this is solved by modifying the theory so that superpositions of gravitational fields are unstable – the larger the field, the more unstable – and therefore “large” objects like humans and cats cannot exist in superposition states.
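For orientation, this instability is usually summarised by a lifetime estimate of the following form. I am writing it in my own notation (the symbols \tau and E_G are labels I have chosen here), so treat it as a heuristic sketch rather than a derivation:

```latex
\tau \sim \frac{\hbar}{E_G}
```

Here \tau is the typical time for which the superposition survives, and E_G is the gravitational self-energy of the difference between the two superposed mass distributions. The larger the object, the larger E_G, and so the shorter the lifetime of the superposition.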
Penrose then argues that there is reason to believe that a theory in which gravity collapses the wavefunction might be non-computational. Therefore, if the human mind utilises such physics, it will be working in a non-computational way.
This line of reasoning then solves many problems in one swoop: the measurement problem is solved because gravity causes collapse of the wavefunction; one of the main problems with unifying quantum mechanics and general relativity is solved because large gravitational fields can no longer be in a superposition; and the problem of what the human mind is doing if it is not using known physics is solved because the human mind might be using the new physics that comes from gravity-induced collapse.
New physics in the brain
A popular view nowadays is that neurons are the basic “computational” element in the brain. A single neuron is a type of cell that either fires (releases an electrical signal) or does not fire, depending on the electrical signals going into the neuron from other neurons connected to it at any given time. While a single neuron is quite limited, the theory is that the 100 billion neurons in the brain, linked together by 100 trillion connections, form a highly complicated device that produces all the fantastic thoughts and intelligent processes that go on inside our heads. This view is backed up in part by the growing evidence that artificial neural networks – which try to replicate the brain structure – can do quite amazing things, such as recognise faces or play the game of Go better than the human champion. Furthermore, it has been shown that single neurons in the brain can react to specific individuals’ faces (e.g. the Jennifer Aniston neuron! https://www.nature.com/news/2005/050620/full/news050620-7.html ).
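The “fires or does not fire” picture can be sketched with the simplest textbook idealisation, a threshold unit. This is a deliberately crude model, not a description of a real neuron; the function name, the weights and the threshold are hypothetical values chosen for illustration.

```python
def neuron_fires(inputs, weights, threshold):
    """Return True if the weighted sum of incoming signals reaches the threshold."""
    total = sum(x * w for x, w in zip(inputs, weights))
    return total >= threshold


# Two incoming connections of equal strength; the unit fires only if both are active.
weights = [1.0, 1.0]
threshold = 2.0
for signals in ([0, 0], [0, 1], [1, 0], [1, 1]):
    print(signals, neuron_fires(signals, weights, threshold))
```

With these assumed values the unit behaves like a logical AND gate, which is the sense in which many such units wired together could carry out a computation.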
But this view of how the mind works is bad news for Penrose’s view because neurons are too large and interact too much with their surroundings to sustain any quantum coherence for a decent amount of time. In other words, a neuron cannot both fire and not fire simultaneously, in a superposition, because it interacts so readily with its surroundings that this quantum superposition is almost instantaneously destroyed (https://arxiv.org/abs/quant-ph/9907009). And in order for gravity-induced collapse to play any significant part in the brain’s workings, a superposition state must be sustained for a reasonable amount of time. So if neurons are the basic elements in the brain that are responsible for our thoughts, then gravity-induced collapse could play no part, and the brain would be entirely computational.
However, it has certainly not been proven that the entire working of the human mind is just a neural network – to some extent this is an assumption held by current science. Indeed, neurons are not the smallest things in the brain, and it is entirely possible that each neuron is itself a miniature processor, comprised of many more basic elements that together do the work (normally a good analogy here would be that each neuron is like a miniature computer, but as the whole point is that the brain is not acting as a computer, this analogy cannot be used!).
Things certainly get quite speculative here – and Penrose himself admits this – but the proposed site for the required non-computational processes is something called a microtubule. I won’t go into what microtubules are, because I don’t understand them myself, but what I do know is that they are small enough and sufficiently ill-understood that it is possible both that they play an important part in how our brain works, and that they can sustain quantum coherence long enough for gravity-induced collapse to play a part in how they process information. Penrose’s theory then is that gravity-induced collapse takes place in the microtubules, which means that the human brain is doing something non-computational, which in turn allows us to escape the grasp of the logical argument given earlier.
Penrose’s view has many significant implications for neuroscience and artificial intelligence. Firstly, we would not be able to understand how the mind works without understanding the new physics introduced above, and as we are very far from understanding this physics, a full understanding of the mind would be a very long way off. This seems to be quite opposing to the current view in neuroscience and consciousness studies. In addition, artificial intelligence that comes even close to human cognition would not be possible without understanding this new physics. Again, this is contrary to the current feeling in the artificial intelligence community, in which most researchers are confident that artificial intelligence will continue its impressive upward trajectory and before long many human-level cognitive tasks will be surpassed by artificially intelligent agents.
Can we escape Penrose’s logical argument?
Because of the stark contrast between Penrose’s view of how the mind works, and the artificial intelligence/neuroscience views, it seems important to try to resolve the issues raised in Penrose’s book. But while Penrose’s view is relatively comprehensive, well-structured, and well thought through, it depends heavily on his logical argument. It seems that if the logical argument falls down, then there is no strong scientific reason to believe that the human mind is acting non-computationally. In turn, there is no need to introduce something radical and speculative such as gravity-induced collapse in microtubules.
For many researchers, the whole theory seems to rest somewhat precariously on this logical argument, and it is highly tempting to just disregard it or ignore it. Unless I’m mistaken, this is potentially what almost the entire artificial intelligence and neuroscience communities are doing! To some extent this is understandable, because there aren’t many people in the world who understand logic and computation enough to thoroughly study Penrose’s argument in order to try and find a flaw in it. Until recently, I too was culpable of strongly believing that the human mind was a computer, without myself trying to find a flaw (or trying to find others who’d found a flaw!) in the logical argument. For such a large chunk of scientists to ignore his argument might turn out to be a risky strategy, and arguably more researchers should invest in studying his points more carefully.
I should mention here that after Penrose’s first book on this topic, The Emperor’s New Mind, in which he introduced a similar but slightly weaker logical argument, many researchers tried to find flaws, but Penrose has seemingly quite successfully batted these all away in Shadows of the Mind with a comprehensive half-chapter in which he carefully and thoroughly addresses 20 such arguments. It’s probably safe to assume that Penrose also received many attempts at counter-arguments after his second book, but I know that Penrose himself has been unconvinced by these, and there is certainly no widely agreed upon and well understood counter-argument that the rest of the scientific community wields in order to reject Penrose’s views.
Having now read the key parts of Shadows of the Mind twice, and having spent some time pondering it, I myself now have many ideas for where I think the logical argument could fall down. However, I am not a logician and I have certainly not spent enough time to confidently say that I thoroughly understand the argument, so while there is a remote chance that my ideas are correct, it’s probably more likely that they are not! So I’ll refrain from presenting them here, but I’m very happy to discuss them with anyone interested.
I will, however, present the core of a counter-argument introduced in a very interesting review of Shadows of the Mind by one of the most important consciousness researchers, David Chalmers (http://journalpsyche.org/files/0xaa25.pdf). The general structure of Penrose’s argument involves taking a number of assumptions, then through the steps presented above he derives a contradiction. In Shadows of the Mind he quite convincingly argues that all of the assumptions are reasonable, except the assumption that the human mind acts computationally. Having derived a contradiction, at least one of the assumptions must be dropped, so Penrose drops the computational one. But Chalmers presents an argument (by McCullough and Löb) that takes all of Penrose’s assumptions, without the one about computation, and then after a number of steps derives a contradiction. Therefore, one of these assumptions is wrong. But this means that one of Penrose’s assumptions – and crucially not the one about computation – is incorrect, and Penrose should drop this one rather than his one about computation. Then there is no reason – at least not based on this logical argument – to think that the human mind is not acting as a computation.
Conclusion
I want to finish on a high, so I will end by saying that this is an exciting and fascinating book to read. It is clear, well-written, entertaining, and thorough. Given the conclusions alone, my first reaction was that they are implausible and outlandish. But following his arguments step-by-step, he only rarely takes radical or outlandish steps. If he is wrong, then he is only subtly wrong, so naturally the reader is led along a path that eventually leads to his conclusions. I still disagree with them, but I had an enlightening and thoroughly enjoyable time reading his book, and trying to work out which subtle points in the logical argument I disagree with, so that my own view – that the human mind is a computer – can be kept intact!
If you are familiar enough with quantum information, I bet you have already heard of at least one meta-theoretical statement — a very important one. Before we come to that, let us give some examples to clarify the above tripartite categorisation of knowledge. Since this blog is about quantum mechanics, that will be our arena. In this context, our knowledge of the double-slit experiment and of its profoundly surprising results is undoubtedly empirical. Building on this and other experiments, physicists have come up with a mathematical theory of physical phenomena at the microscopic level, called quantum mechanics. Quantum mechanics, with its internal structure made of Hilbert spaces, wave functions, Born rules for measurement and collapse, and so on and so forth, is among the most luminous and successful physical theories, and thus belongs to the second layer of knowledge. One of the most striking phenomena whose existence is predicted by quantum mechanics is entanglement, i.e. the fact that distant particles may be left by previous interactions in a state that produces impossibly strong correlations upon local measurements on the two particles. The precise meaning of the words ‘impossibly strong’ was clarified by J.S. Bell in his groundbreaking 1964 paper: those correlations cannot be explained by any local hidden variable model, i.e. they are fundamentally non-local. I want to argue that this deduction — nowadays known as Bell’s theorem — is the prime example of a meta-theoretical statement. To understand why, note that Bell’s theorem has basically nothing to do with quantum mechanics; rather, it concerns its experimental predictions. It is these predictions, and not merely quantum mechanics, that are claimed to be incompatible with a local hidden variable model. 
In other words, any alternative theory that happened to predict the same experimental outcomes with the same probabilities (at least in the context of a Bell test) would run into the same problem, the existence of non-local correlations. Since Bell’s theorem applies to a whole class of theories, i.e. those whose experimental predictions of the outcomes of a Bell test coincide with those of quantum theory, it is a truly meta-theoretical statement. Moreover, its importance lies precisely in this feature. Every person who is familiar with the formalism of quantum mechanics can see that it is intrinsically ‘non-local’, because of the way post-measurement collapses work. But this observation alone is not worth much; what if the true theory of Nature were not quantum mechanics? We cannot draw conclusions about the deepest mysteries of Nature based on our formalism; instead, we need meta-theoretical statements to certify that the bizarre phenomenon we are facing does not admit a more ‘mundane’ explanation in the context of some alternative theory.
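To see what ‘impossibly strong’ means numerically, the following minimal sketch compares the two sides of the CHSH inequality, a later and more experiment-friendly variant of Bell’s original 1964 inequality. The measurement angles and all function names are assumptions chosen for illustration; the angles are a standard choice for the singlet state, not the only one.

```python
import math
from itertools import product

# Pauli matrices X and Z as plain nested lists (real entries suffice here).
X = [[0.0, 1.0], [1.0, 0.0]]
Z = [[1.0, 0.0], [0.0, -1.0]]


def spin(theta):
    """Spin observable along the direction at angle theta in the x-z plane."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c * Z[i][j] + s * X[i][j] for j in range(2)] for i in range(2)]


def kron(a, b):
    """Tensor (Kronecker) product of two 2x2 matrices."""
    return [[a[i // 2][j // 2] * b[i % 2][j % 2] for j in range(4)] for i in range(4)]


# Singlet state (|01> - |10>)/sqrt(2) in the basis |00>, |01>, |10>, |11>.
SINGLET = [0.0, 1 / math.sqrt(2), -1 / math.sqrt(2), 0.0]


def quantum_correlation(theta_a, theta_b):
    """Expectation value of the product of the two local +/-1 outcomes."""
    m = kron(spin(theta_a), spin(theta_b))
    return sum(SINGLET[i] * m[i][j] * SINGLET[j] for i in range(4) for j in range(4))


def chsh(e_ab, e_ab2, e_a2b, e_a2b2):
    """CHSH combination S = E(a,b) - E(a,b') + E(a',b) + E(a',b')."""
    return e_ab - e_ab2 + e_a2b + e_a2b2


# Quantum prediction for an assumed (standard) choice of settings.
a, a2, b, b2 = 0.0, math.pi / 2, math.pi / 4, 3 * math.pi / 4
quantum_s = chsh(quantum_correlation(a, b), quantum_correlation(a, b2),
                 quantum_correlation(a2, b), quantum_correlation(a2, b2))

# Local hidden variables: every outcome is a predetermined +1 or -1.
local_bound = max(abs(chsh(A * B, A * B2, A2 * B, A2 * B2))
                  for A, A2, B, B2 in product([-1, 1], repeat=4))

print(abs(quantum_s), local_bound)
```

Every local hidden variable model is a probabilistic mixture of the deterministic assignments enumerated above, so it cannot exceed the local bound, whereas the quantum prediction does. Note that the comparison uses only predicted correlations, which is exactly why the conclusion carries over to any theory making the same predictions.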
This discussion should convince the reader of the conceptual importance of meta-theoretical results. That being said, how do we study a set of theories that we do not even know? How do we formalise the abstract concept of a ‘physical theory’? An answer to these questions is provided by the machinery of general probabilistic theories, which we now set out to describe.
In quantum information and more generally in information theory it is common to take an operational approach to problems: a concept is defined and characterised by what it allows you to do. Thus, a physical theory is for us simply a set of rules that allow one to deduce a probabilistic prediction of the outcome of an experiment given the detailed description of its preparation. The above definition is quite broad. Let us discuss some of its main features. For a start, (i) ‘probabilistic’, as opposed to ‘deterministic’, refers to the fact that it is conceivable that the theory determines only the probabilities of recording the various outcomes rather than producing a definite prediction of what the outcome will be. This is the case, for instance, for quantum mechanics, which we definitely want to consider as a legitimate theory. A second observation is that (ii) the above definition depends on the set of experiments under examination; the smaller the set, the larger the number of possible theories. This is in general a desirable feature, as long as we want to be able to declare e.g. Newtonian gravity a fully-fledged physical theory, although it can only explain the motion of slow massive bodies and gets into trouble when one brings electromagnetic phenomena or relativistic speeds into the picture. A third remarkable feature of our definition of a physical theory is that (iii) it does not necessarily describe systems that evolve in time. Instead, it takes an ‘input-output’ point of view: an experiment is described as a procedure starting with a preparation of a physical system and ending with the reading of some digits on a screen. Of course, we can decide to track time evolution by designing a whole set of experiments on identically prepared systems ending at different times, but this need not be the case; depending on the context, we can content ourselves with a much more essential description of a more limited set of experiments.
With this premise, we can now study the simplest physical theories of all: those that describe a single system. Such a system (which can be an atom, an electron, but also Schrödinger’s cat, depending on your stance on the issue of animal rights) can be prepared in a variety of ways, labelled by an index ω ∈ Ω, where Ω is the set of all possible preparations (hereafter called ‘states’). Also the measuring apparatus that we employ to measure the system can be set up in different ways; each 𝜆 ∈ Λ will describe a specific outcome corresponding to a particular setting of the apparatus (hereafter called ‘effect’). In this context, a theory is simply a function 𝜇 that takes as inputs a preparation ω and a measurement outcome (together with the description of the setting of the measuring apparatus) 𝜆 and outputs a probability, i.e. a real number between 0 and 1. In mathematical terms, we write 𝜇 : Ω × Λ → [0, 1].
It may be useful to pause for a second now, and describe the familiar example of quantum mechanics in this language. For the sake of simplicity, we consider a single qubit, i.e. a 2-level quantum system. The set of states Ω comprises all 2×2 density matrices and is well known to be representable by a 3-dimensional ball, called the Bloch ball. The shape of the set of effects Λ depends on the actual measurements that are available in our laboratory. In the best-case scenario, we can assume that all so-called positive operator-valued measures (POVMs) are experimentally accessible. When this happens, effects are 2×2 matrices 𝜆 that satisfy 0 ≤ 𝜆 ≤ 1, where 1 stands for the identity matrix, and the inequalities are to be understood in the sense of positive semidefiniteness. Finally, the function 𝜇 encodes the Born rule: 𝜇(ω, 𝜆) ≔ Tr[ω 𝜆].
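The qubit example can be sketched in a few lines. The function names bloch_state and mu, and the two projectors used as effects, are our own illustrative choices; the only ingredients taken from the text are the Bloch-ball parametrisation of states and the Born rule.

```python
def bloch_state(x, y, z):
    """Density matrix (1 + x X + y Y + z Z)/2 for a Bloch vector in the unit ball."""
    if x * x + y * y + z * z > 1 + 1e-12:
        raise ValueError("Bloch vector must lie in the unit ball")
    return [[(1 + z) / 2, (x - 1j * y) / 2],
            [(x + 1j * y) / 2, (1 - z) / 2]]


def mu(state, effect):
    """Born rule mu(omega, lambda) = Tr[omega lambda]."""
    trace = sum(state[i][j] * effect[j][i] for i in range(2) for j in range(2))
    return trace.real  # the trace is real for a Hermitian state and effect


# Two effects: the projectors onto |0> and |1> (they add up to the identity).
P0 = [[1.0, 0.0], [0.0, 0.0]]
P1 = [[0.0, 0.0], [0.0, 1.0]]

north_pole = bloch_state(0.0, 0.0, 1.0)  # the pure state |0>
equator = bloch_state(1.0, 0.0, 0.0)     # the pure state |+>

print(mu(north_pole, P0), mu(equator, P0), mu(equator, P1))
```

The state at the north pole of the Bloch ball triggers the effect P0 with certainty, while a state on the equator triggers P0 and P1 with equal probability.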
In the most general (non-quantum) case, the function 𝜇 as well as the sets Ω and Λ are a priori unconstrained. However, if they are to describe a physical system, it is very reasonable to postulate that they satisfy some further properties, purely on logical grounds. We formalise these properties as axioms below.
Axiom 1. The function 𝜇 separates points of Ω and Λ, namely, if some ω_{1}, ω_{2} ∈ Ω satisfy 𝜇(ω_{1}, 𝜆) ≡ 𝜇(ω_{2}, 𝜆) for all 𝜆 ∈ Λ, then it must be that ω_{1} = ω_{2}. Analogously, if for given 𝜆_{1}, 𝜆_{2} ∈ Λ one has that 𝜇(ω, 𝜆_{1}) ≡ 𝜇(ω, 𝜆_{2}) for all ω ∈ Ω, then 𝜆_{1} = 𝜆_{2}.
In other words: if two states cannot be distinguished by any effect, then to all intents and purposes they are the same state. The same holds for the effects. This assumption is not really fundamental. On the contrary, given any pair of sets Ω and Λ and a function 𝜇 that does not satisfy Axiom 1, we can always define two modified sets Ω’ and Λ’ by identifying the elements that are not separated by 𝜇. The resulting quotient function 𝜇’ then satisfies Axiom 1 on Ω’ and Λ’ by construction.
Axiom 2. There is a ‘trivial effect’ u such that 𝜇(ω, u) ≡ 1 for all ω ∈ Ω. Moreover, for all 𝜆 ∈ Λ there is an ‘opposite effect’ 𝜆’ such that 𝜇(ω, 𝜆’) ≡ 1 − 𝜇(ω, 𝜆) for all ω ∈ Ω .
This axiom is also very natural. The trivial effect can simply be obtained by preparing the measurement apparatus to produce a fixed output. Given an effect 𝜆, setting up the apparatus exactly as described by 𝜆 and accepting all outcomes that do not correspond to 𝜆 yields its logical negation 𝜆’.
Axiom 3. Probabilistic mixtures of states (respectively, effects) are again valid states (respectively, effects). That is, for all states ω_{1}, ω_{2} ∈ Ω and probabilities p ∈ [0,1], there exists a state 𝜏 such that 𝜇(𝜏, 𝜆) = p 𝜇(ω_{1}, 𝜆) + (1 – p) 𝜇(ω_{2}, 𝜆) for all 𝜆 ∈ Λ. Same for the effects.
What Axiom 3 is telling us is that given two preparation procedures ω_{1}, ω_{2}, it should be possible to construct their probabilistic combination. This is a third preparation procedure 𝜏 obtained by flipping a (biased or unbiased) coin, preparing the state according to ω_{1} or ω_{2} depending on the outcome, and then forgetting that same outcome. Clearly, 𝜏 must reproduce probabilistically the behaviour of ω_{1} and ω_{2} when tested on all effects 𝜆.
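As a quick sanity check, the qubit example described earlier satisfies Axioms 2 and 3. In the sketch below we take the trivial effect to be the identity matrix, the opposite effect to be \lambda' = \mathbb{1} - \lambda, and the mixture to be \tau = p\,\omega_1 + (1-p)\,\omega_2; these identifications are the natural ones, but they are spelled out here as our own choices.

```latex
\begin{aligned}
\mu(\omega, u) &= \operatorname{Tr}[\omega\,\mathbb{1}] = 1, \\
\mu(\omega, \lambda') &= \operatorname{Tr}[\omega\,(\mathbb{1} - \lambda)] = 1 - \mu(\omega, \lambda), \\
\mu(\tau, \lambda) &= \operatorname{Tr}\big[(p\,\omega_1 + (1-p)\,\omega_2)\,\lambda\big]
  = p\,\mu(\omega_1, \lambda) + (1-p)\,\mu(\omega_2, \lambda).
\end{aligned}
```

All three lines follow from the linearity of the trace together with \operatorname{Tr}\,\omega = 1. Note also that 0 ≤ \mathbb{1} - \lambda ≤ \mathbb{1} whenever 0 ≤ \lambda ≤ \mathbb{1}, and that \tau is again a density matrix, so both constructions stay inside Λ and Ω.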
The formalism we have developed so far seems to be rather inconvenient, especially when compared with the quantum mechanical example described above. Let us identify its main weaknesses. (a) In quantum theory, states and effects are represented by objects in a linear vector space; this means e.g. that the state 𝜏 of Axiom 3 is simply given by the convex combination p ω_{1} + (1 − p) ω_{2}, which eliminates the need to go through the function 𝜇 to determine it. (b) The quantum Born rule 𝜇(ω, 𝜆) = Tr[ω 𝜆] is bilinear in the state and the effect (meaning that it is linear in each argument when the other is kept fixed). This is a very convenient property especially when long computations are involved.
To fix these inconvenient features of the 𝜇-formalism, a theorem by the German physicist Günther Ludwig comes to our rescue. It states that whenever Axioms 1,2,3 are satisfied, the sets Ω and Λ can be thought of as convex sets in linear vector spaces, so that (a) probabilistic mixtures become convex combinations; and (b) the 𝜇 function (generalised Born rule) is bilinear in the state and the effect. For completeness, we state the theorem formally below.
Theorem (Ludwig’s embedding theorem). If Axioms 1,2,3 are satisfied, there is a vector space V such that:
- Ω ⊆ V, Λ ⊆ V^{*} are convex sets, with 0,u ∈ Λ, and Λ = u − Λ;
- ℝ_{+}∙Ω = C is a cone, ℝ_{+}∙Λ = C^{*} ≔ { 𝜑 ∈ V^{*}: 𝜑(x) ≥ 0 ∀ x ∈ C } ⊆ V^{*} is the corresponding dual cone; the set of states satisfies Ω = {x ∈ C : u(x) = 1};
- the generalised Born rule holds: 𝜇(ω, 𝜆) = 𝜆(ω).
Here, the symbol V^{*} represents the dual of the vector space V, i.e. the space of all linear functionals acting on V. Hereafter, we shall refer to C as the cone of unnormalised states, and to the functional u as the order unit. The dual cone C^{*} lives in the dual space, and can be simply thought of as formed by all linear functionals that are positive on the whole C.
Note. For the sake of readability, we stated the theorem in the simple case where V is finite-dimensional. It should be noted, however, that one of the big conceptual contributions by Ludwig was to formulate and prove an embedding theorem that covers also the infinite-dimensional case, and is (unsurprisingly) much more technically involved.
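To connect the theorem with the familiar case, here is how its ingredients can be read off for a single qubit with all POVMs accessible. The identification of V^{*} with V through the trace pairing is a standard convention that we assume here for convenience.

```latex
\begin{aligned}
V &= \{ x \in \mathbb{C}^{2\times 2} : x = x^\dagger \} \cong \mathbb{R}^4,
  \qquad \varphi(x) = \operatorname{Tr}[\varphi\, x] \ \text{ for } \varphi \in V^* \cong V, \\
C &= \{ x \in V : x \geq 0 \}, \qquad C^* = C, \\
u &= \mathbb{1}, \qquad u(x) = \operatorname{Tr}\, x, \\
\Omega &= \{ x \in C : \operatorname{Tr}\, x = 1 \}, \qquad
\Lambda = \{ \lambda \in V : 0 \leq \lambda \leq \mathbb{1} \} = u - \Lambda, \\
\mu(\omega, \lambda) &= \lambda(\omega) = \operatorname{Tr}[\omega \lambda].
\end{aligned}
```

The cone of positive semidefinite matrices is self-dual under the trace pairing, which is why states and effects of a qubit can be drawn in the same space.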
Ludwig’s embedding theorem provides an elegant solution to the main problems of the 𝜇-formalism as discussed above. It is so important because it provides the foundation for the formalism of general probabilistic theories (GPTs). The theorem was proved long ago, in a series of works published between the 1960s and 1970s. Nowadays, it is implicitly used in practically any paper on GPTs. It is indeed common to start any analysis by simply assuming that the state space is a convex set in a real vector space, and that the effects are linear functionals with values between 0 and 1 on said state space. That this assumption causes no loss of generality is precisely the take-home message of Ludwig’s theorem.
We conclude this post by summarising the formalism of GPTs as it emerges from Ludwig’s theorem (and adding the last bits and pieces). It is useful to start with a picture.
(1) The cone formed by unnormalised states (positive multiples of physical states) is a proper cone C in a real vector space V. Here, ‘proper’ means that it is closed, convex, that it spans the whole space (we want to exclude the ‘flat’ case), and that it does not contain any straight line (opening angle less than 𝜋).
(2) The order unit u belongs to the interior of the dual cone C^{*}, in turn defined as C^{*} = {𝜑 ∈ V^{*}: 𝜑(x) ≥ 0 ∀ x ∈ C} ⊆ V^{*}.
(3) The state space is obtained by slicing C with u at height 1, in formulae Ω = {x ∈ C : u(x) = 1}. Statistical mixtures of states are represented mathematically by convex combinations.
(4) Measurements are represented by (finite) collections of effects (e_{i})_{i∈I} ⊆ Λ that add up to the order unit, i.e. such that Σ_{i∈I} e_{i} = u. The probability of obtaining the outcome i when measuring the state ω is given by the generalised Born rule: p(i|ω) = e_{i}(ω).
In stating (4) we made the implicit assumption that all linear functionals that take on values between 0 and 1 on Ω are legitimate effects. This assumption is commonly known as the ‘no-restriction hypothesis’, and corresponds geometrically to requiring that the effects Λ cover the whole red set on the right side of the above figure. This is not necessarily the case in many conceivable situations. For example, it may happen that our experimental apparatus is subjected to some intrinsic noise that makes sharp measurements impossible. The main justification of the no-restriction hypothesis, beyond its mathematical simplicity and elegance, lies in the fact that it is well-verified at a fundamental level in classical and quantum theory. That is, we believe it possible, at least in principle, to construct a close-to-ideal measurement apparatus that can implement any given sharp quantum measurement. Unless otherwise specified, we will always make the no-restriction hypothesis from now on.
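The ingredients listed above can be sketched for classical probability theory, probably the simplest instance of the formalism. This is our own minimal illustration with hypothetical function names: states are probability vectors (the cone C is the positive orthant and the order unit u is the all-ones functional), effects are vectors with entries between 0 and 1, and the generalised Born rule is a dot product.

```python
def is_state(x, tol=1e-12):
    """A classical state: non-negative entries that sum to 1, i.e. u(x) = 1."""
    return all(v >= -tol for v in x) and abs(sum(x) - 1) <= tol


def is_effect(e, tol=1e-12):
    """Under the no-restriction hypothesis: any vector with entries in [0, 1]."""
    return all(-tol <= v <= 1 + tol for v in e)


def is_measurement(effects, tol=1e-12):
    """A finite collection of effects that add up to the order unit (1, ..., 1)."""
    dim = len(effects[0])
    return all(is_effect(e) for e in effects) and all(
        abs(sum(e[k] for e in effects) - 1) <= tol for k in range(dim))


def probability(effect, state):
    """Generalised Born rule p(i|omega) = e_i(omega), here a dot product."""
    return sum(e * x for e, x in zip(effect, state))


# A biased coin, read out by a noisy (unsharp) two-outcome measurement.
coin = (0.7, 0.3)
noisy_readout = [(0.9, 0.2), (0.1, 0.8)]

print(is_state(coin), is_measurement(noisy_readout))
print([probability(e, coin) for e in noisy_readout])
```

Because the two effects add up to the order unit, the two outcome probabilities add up to 1 for every state, as they must.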
We conclude this post by discussing some examples of GPTs.
This concludes our introduction to the modern formalism of general probabilistic theories. We have seen how this emerges naturally from a very broad definition of the notion of physical theory via Ludwig’s theorem, and we discussed a few (hopefully instructive) examples. For those who would like to know more about the early days of the GPT formalism, and especially delve into some of the mathematical details, I warmly recommend having a look at [1]. The original proof of Ludwig’s theorem can be found in [2] and references therein. I tried to give a self-contained account of it with all the technical details in Chapter 1 of my PhD thesis [3]. Chapter 2 of the same thesis summarises instead the GPT formalism in finite dimension.
[1] A. Hartkämper and H. Neumann. Foundations of Quantum Mechanics and Ordered Linear Spaces: Advanced Study Institute held in Marburg 1973. Springer Berlin Heidelberg, 1974.
[2] G. Ludwig. An Axiomatic Basis for Quantum Mechanics: Derivation of Hilbert space structure, volume 1. Springer-Verlag, 1985.
[3] L. Lami. Non-classical correlations in quantum mechanics and beyond. PhD thesis, Universitat Autònoma de Barcelona, October 2017. Preprint arXiv:1803.02902.
]]>Yet for those doing science with a passion, beautiful equations are perceived as being as touching as a work of art, if not more so, and creativity is a fundamental requirement for reaching beyond the state of the art in our understanding of how the universe works. Similarly, an increasing number of artists resort to analytical methods to explore novel ways to communicate inner expressions of self. And let’s not forget that a line of polymaths, whose combination of scientific and artistic skills has resulted in universally acclaimed masterpieces that made history, can be traced back many centuries: Leonardo da Vinci is perhaps the brightest example.
In contemporary times, despite a somewhat worrisome cultural crisis, I like to think that the community of laypeople who do not actively engage in science or arts as their main profession, but who nevertheless display an interest and a capacity to appreciate advances in both domains (and at their interface), is in fact on the rise.
Quantum science, which is my specialty, seems to be one of the subjects most likely to capture the public’s attention. I was once told that, when popular magazines such as New Scientist or Scientific American put the word “Quantum” on their cover, often accompanied by a catchy drawing, it is almost guaranteed that sales of that particular issue will be higher than average. While this has led to misuse and sometimes abuse of quantum concepts in areas as diverse as ‘healing’ and household cleaning, it has also stimulated scientists and artists to find more creative ways to merge the best of both their worlds to engage in better dissemination.
In a previous post, I spoke about “Entanglement”, the painting which was gifted to me by artist Pam Ott and which has inspired my own research on quantum entanglement in the past decade or more.
More recently, we had another successful cross-fertilisation between maths and visual arts, with a drawing designed by local artist Joseph Hollis (see more of his science-related work here) to accompany our research on Quantum Darwinism, which was selected by the American Physical Society to be featured on the cover of the October 19 issue of the prestigious journal Physical Review Letters – a great honour for us! The University of Nottingham Press Office covered the news with a blog post entitled “Maths inspired art makes front cover”, available to read here. Curiously, when tweeting about it, the Press Office perhaps made a Freudian slip and wrote “Art inspired maths” instead…
While probably unintentional, this hints at how deeply science and arts can mutually influence each other and continue to progress to uncover new horizons.
Our group’s collaboration with Joseph Hollis reached a new level with the very recent production of an illustrated book on the foundations of quantum mechanics. Written by Paul Knott and funded by the Foundational Questions Institute, the book “Our quantum reality” is a gorgeous example of how efficiently (cat-themed) figurative arts can assist science and scientists in reaching out and delivering even complex and mind-boggling messages. You can read more and flip through the entire book at the dedicated webpage: https://illustratedquantum.wordpress.com/
What’s next then?
Science and arts will meet in an even grander love story later this month. On Tuesday November 27^{th}, at the Djanogly Theatre, Lakeside Arts Centre, set in our beautiful University Park Campus (University of Nottingham), we will have the pleasure of hosting a performance of “Entanglement: An entropic tale!”, a fully-fledged opera produced by Roxanne Korda (librettist) and Daniel Blanco Albert (composer) from Birmingham-based company Infinite Opera and inspired by concepts of modern (quantum) physics!
This is the first time I have ever heard of entanglement and entropy being mentioned in a theatrical piece, and I am really excited about the upcoming show (described as the “Romeo and Juliet” of particle physics) and the emotional journey it will take its viewers on! Here is me chatting with Roxanne and Dani at Lakeside Theatre a few weeks ago, in preparation for the event…
I will be introducing the opera with a brief presentation, in which I hope to convey successful instances of breaking the (virtual) boundaries between science and arts, as I am arguing in this post. For those of you coming on the evening, there will also be the chance to ask more questions on this and related topics during a Q&A panel session after the performance. For this we will be joined by Professor Mark Fromhold from the School of Physics and Astronomy (University of Nottingham), who has done his fair share of taking down the apparent wall between science and arts by collaborating with songwriters Lady Maisery on their song Order and Chaos (which mentions entanglement!), by engaging in public outreach at Pint of Science events and more recently at the Royal Society Exhibition on “Quantum sensing the brain”, as you can see here.
If you are in the area and have not booked your tickets yet, please do so! We look forward to welcoming you to an evening which science and arts lovers will hopefully both find intriguing and delightful.
P.S. You might also have the chance to buy a copy of “Our quantum reality” for a very small charge on the occasion of the opera night.
Tickets available here: https://www.lakesidearts.org.uk/music/event/3907/entanglement-an-entropic-tale.html
]]>
Quantum Roundabout 2018 was the fourth iteration of a series of dedicated conferences looking at foundational mathematical topics within quantum physics.
The name of the conference stems from the so-called “Magic Roundabout” on the outskirts of Swindon. This roundabout boasts five mini-roundabouts arranged around a sixth central anticlockwise roundabout. It was decided that such a strange junction could only be quantum in nature, hence the conception of Quantum Roundabout. The junction has since been named the fourth scariest in Britain by a recent poll.
Held at the University of Nottingham every two years, Quantum Roundabout brings together early-career researchers from around the world to collaborate and engage in scientific discussion. The organisation of this conference is a rite of passage for all PhD students under the supervision of Professor Gerardo Adesso, who has overseen the organisation of all four iterations of the conference. Since its conception, attendance at Quantum Roundabout has grown significantly, with the most recent edition boasting over seventy attendees.
A specific motivation behind Quantum Roundabout is to provide a platform for early-career researchers to communicate their research and engage in scientific discussion, with a focus on new and emerging topics within quantum theory.
Each of the three days of the conference focused on a different theme, with a morning and afternoon tutorial led by an expert in the field. This year’s Quantum Roundabout covered the following areas of interest:
Each of the tutorials underpinned the theme of the day, around which participants gave talks based on their current research.
A key goal of the Quantum Roundabout conference is to highlight and promote the role of mathematics in the approach to quantum theory in both its fundamental and applied aspects.
Indeed, mathematics is not just a tool when tackling a physical problem; sometimes it is the motivation behind the research, other times a mathematical theorem could be the ultimate result and, most of the time, the mathematical rigour becomes crucial in the understanding of physical concepts. This has been true in the past and still is today.
Recently, for example, the mathematical structure of resource theories, which has its roots in convex geometry, has found application in the burgeoning field of quantum thermodynamics. The classification of equilibrium states as free states and the energy preserving operations as free operations gives the laws of thermodynamics a mathematical foundation.
Information theory, originated by Shannon’s pioneering work, has provided versatile foundations for a variety of fields, which have attracted an increasing interest in recent years thanks to their potential for technological breakthroughs.
One of the most promising such technologies is quantum metrology, which studies the exploitation of quantum mechanical effects to develop measurement techniques that are higher in sensitivity or resolution than possible using classical systems alone.
As expressed by participants and the invited speakers, Quantum Roundabout 2018 delivered an effective platform for dedicated scientific discussion on emerging topics within the field. This atmosphere of discussion has since led to several collaborations between early-career researchers at international institutions. It is hoped that the continued success of Quantum Roundabout will help fuel future directions in the field of quantum theory and cement the University of Nottingham as a centre for quantum research.
This conference would not have been possible without the financial help of the IMA in addition to our other sponsors (LMS, JPhys.A, IAMP, Uni of Nottingham, IOP, Xanadu and FQXI). The organisers also wish to thank Prof. Gerardo Adesso, Dr Ludovico Lami, Dr Paul Knott and Katie Gill for their advice and support. Benjamin Morris would also like to thank his fellow organisers, Giorgio Nocerino, Buqing Xu and Carmine Napoli for making this year’s Quantum Roundabout a success.
Written by Benjamin Morris
]]>To be more specific, our task is the following:
Given a set of quantum-optics experimental equipment, what is the best way of arranging the apparatus to create a quantum state with certain desirable properties?
What we mean by “desirable” depends on the application in mind. We are interested in designing quantum states of light, which can be useful for a range of applications such as quantum computing, making high-precision measurements, and quantum cryptography. Now, the apparatus we are using can be broken into three categories: states, operations, and measurements. Roughly speaking, we take some states, act on these states with some operations, which modify the states and cause them to interact with one another, then we perform some measurement. This is getting rather complicated, so to simplify things we will use Lego!
The meanings of all the symbols in the white boxes are not so important for the purpose of this blog, but for those already familiar with quantum optics we provide a glossary at the end of this blog that says what all these different symbols mean. If we now think of this in terms of Lego, our question becomes:
Given a collection of Lego pieces, what is the best way to arrange them to produce a construction with certain properties?
Just like in quantum optics, there can be a variety of desired Lego constructions, such as making something strong, something beautiful, or making something that looks like a real-life object such as a police car. And just like with quantum experiments, the usual way to design a Lego construction is to use creativity, prior knowledge, and intuition.
But how could a computer design a new Lego construction? The technique we use is known as a genetic algorithm. Genetic algorithms are designed to mimic natural selection, and they work as follows:
First, the computer makes a collection of completely random constructions. This collection is known as the initial population:
We then assess the various designs, giving them all a score:
Next, we select the best designs, and throw away the ones that aren’t so good:
This leaves us with a smaller subset of the population, which we call the parents:
We will use these parents to create a new population, known as the children. First, we take the very best parents, and copy them without any modification. This produces the elite children:
Second, we breed some of the parents together to produce crossover children:
Third, we make small changes to some of the parents, producing the mutation children:
This leaves us with a new population, which hopefully should be significantly better than the initial population:
We then repeat this process:
Each cycle is known as a generation, and after repeated generations we create better and better individuals. Just like in nature, eventually we end up with individuals that are highly suited for our goals (try and work out what the goal was here!):
And the result is the same for us: after a number of generations, our genetic algorithm produces a range of new quantum experiments that often outperform the human-designed experiments!
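The steps above can be sketched in a few lines of Python. This is only a toy illustration, not the algorithm used in our research: the "design" here is a list of bits, the score simply counts the 1s, and all names and parameter values (`evolve`, `n_parents`, `n_elite`, the mutation rate, and so on) are assumptions made for the example.

```python
import random

def fitness(individual):
    """Toy score: the number of 1s (a stand-in for 'how good is this design')."""
    return sum(individual)

def random_individual(length, rng):
    """A completely random design."""
    return [rng.randint(0, 1) for _ in range(length)]

def crossover(a, b, rng):
    """Breed two parents: the head of one joined to the tail of the other."""
    cut = rng.randint(1, len(a) - 1)
    return a[:cut] + b[cut:]

def mutate(parent, rng, rate=0.05):
    """Make small changes: flip each bit independently with probability `rate`."""
    return [1 - g if rng.random() < rate else g for g in parent]

def evolve(length=30, pop_size=40, n_parents=10, n_elite=2,
           generations=60, seed=0):
    rng = random.Random(seed)
    # Initial population: completely random constructions.
    population = [random_individual(length, rng) for _ in range(pop_size)]
    best_per_generation = []
    for _ in range(generations):
        # Score every design, keep the best as parents, discard the rest.
        ranked = sorted(population, key=fitness, reverse=True)
        parents = ranked[:n_parents]
        best_per_generation.append(fitness(parents[0]))
        # Elite children: the very best parents, copied unchanged.
        children = [p[:] for p in parents[:n_elite]]
        # Each remaining slot is a crossover or a mutation child (50/50).
        while len(children) < pop_size:
            if rng.random() < 0.5:
                a, b = rng.sample(parents, 2)
                children.append(crossover(a, b, rng))
            else:
                children.append(mutate(rng.choice(parents), rng))
        population = children
    best = max(population, key=fitness)
    return best, best_per_generation

if __name__ == "__main__":
    best, history = evolve()
    print("best score per generation:", history)
    print("final best design:", best)
```

Because the elite children are copied unchanged, the best score can never go down from one generation to the next; crossover and mutation are what allow it to go up.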
Acknowledgement: It should be stressed that this post is based on a poster by Rosanna Nichols, who should take most of the credit! A paper based on the genetic algorithm introduced here will be available shortly, but the earlier paper can be found here.
Glossary:
]]>
Superintelligent machines
A superintelligence can be defined as an “agent that possesses intelligence far surpassing that of the brightest and most gifted human minds” (Wiki). Intelligence itself is hard to define, but for our purposes it is fine to use a broad definition that encompasses every possible type of intelligence that you can think of: IQ tests, problem-solving, artistic ability, creativity, game playing, emotional intelligence, and so on. It is already clear that computers can surpass humans in certain, very specific types of intelligence, such as playing chess. There are also a few general things for which computers are vastly superior to humans. Computers are much faster at repeating simple tasks, such as calculations, as can be demonstrated by picking up your calculator and crunching some numbers. I have also found this aspect very useful in my own research, where I used a simple AI to design quantum physics experiments. Furthermore, computers can store and process vast amounts of data – imagine searching on the internet without Google! Note that we normally don’t refer to Google search as “intelligent”, but imagine the reaction when telling the average 19^{th}-century scientist of its abilities.
Despite these successes, humans are still far superior to AI at the majority of tasks. Simply making a cup of tea, especially in an unfamiliar kitchen with unfamiliar equipment, is challenging for an AI, but trivial for us. The difference often comes down to common sense, of which we have plenty, but AI lacks entirely. For this reason, and others, it is at present hard to imagine an AI that could function in the world as well as we can. But is it impossible for us to build a human-level, never mind a superintelligent, AI?
In the short/medium term, there seems to be no reason why development in AI won’t continue its rapid upwards trajectory. AI has already come close to (and sometimes outperformed) humans in tasks that reasonably recently seemed impossible, such as playing Go, competing in quiz games such as Jeopardy!, driving cars, and language translation. And with AI now proving its genuine commercial potential, it is reasonable to assume that funding in AI will continue for the foreseeable future, and probably increase. It is likely that much of this progress will require faster and faster computers, but again in the short/medium term there is no reason why computer power, memory, etc. won’t continue to increase, even if that just means building bigger and bigger supercomputers.
However, it is notoriously difficult to predict the future, and despite the confidence with which many commentators express their opinions, any long-term predictions should only really be treated as educated guesses. Indeed, we might find some major obstacle to developing human-level AI, or Moore’s law might run out. Or other more extreme things might happen such as nuclear war or a unanimous international agreement that we don’t need to develop AI. My personal opinion is that there is no clear reason why human-level intelligence can’t be developed on a computer, and I would guess that this would happen sometime this century. A survey of relevant experts (presented in Nick Bostrom’s excellent book Superintelligence) mirrors this, with 50% of the people surveyed saying that human-level intelligence will be attained by 2040, and 90% saying it will be attained by 2075.
Even so, no one really knows whether and when human-level AI will be developed. For this reason, I think that a much more important question is not will it be developed, but could it be developed. For me there is a clear answer to this: to the best of our knowledge it is not impossible to develop AI with human-level intelligence. Even if some experts think it is impossible (see below), any rational analysis that takes probability and human biases seriously should take an overview of opinions and therefore give the probability of developing human-level intelligence as greater than zero.
If human-level intelligence is developed then, assuming that at this point we still want to develop AI further, we should expect an intelligence explosion, i.e. a rapid increase in the intelligence of the AI system. The reason for this is that an AI with human-level intelligence (in every department) would also possess human-level abilities to research and develop AI. Then, we only need the AI to be slightly better than humans at developing itself in order to create a self-development loop of increasing ability and speed, leading to an explosion in the AI’s capabilities. Thus, a human-level AI would quickly develop into a superintelligent AI.
Before moving on, I should mention that some think that there are fundamental reasons why human-level intelligence cannot be developed. The main criticism that I know of is based on Roger Penrose’s Gödel-like argument. I’m not an expert on this, and will only mention it briefly, but in short Penrose uses a logical argument to argue that the human mind is non-computable. At least in his book Shadows of the Mind, Penrose seems to largely use this argument to rule out AIs being conscious, rather than ruling out AIs being as intelligent as humans (consciousness and intelligence are completely different, and should not be conflated – I will discuss this in a future post). Even so, there are two reasons why I think his arguments aren’t a major problem. Firstly, even if the human mind isn’t computable, Penrose still thinks it can be explained within the laws of physics. Then, it seems reasonable to assume that eventually we will understand the human brain, and when we do we could in principle construct a computer to replicate the non-computable aspects of the brain. Secondly, as far as I know the aforementioned non-computable tasks are very specific, and are not necessary for most relevant tasks we deem “intelligent”. A future AI not possessing these extra non-computable abilities could still function just as well as humans in many arenas, and would vastly surpass humans in many. Truly human-level intelligent AIs might then not be possible, but for all practical purposes and for most tasks, human-level intelligence is still possible.
To conclude this section, at the very least the development of superintelligent AIs might be possible, and to many it is probably possible. As such, we should take the potential dangers associated with them seriously. If you disagree with me so far, please comment and let me know why!
Why would a superintelligent AI take over and exterminate us?
To demonstrate why a superintelligent AI might pose a threat, Nick Bostrom introduced the paperclip-machine argument. Suppose we have a superintelligent AI, and suppose that we give it the simple and seemingly harmless goal of creating as many paperclips as it can. Quite quickly we can see that this will get out of hand: making paperclips requires resources, such as materials and energy, and if the machine’s only goal is to create paperclips, then it will soon devour the resources of our planet, and continue on into outer space, harnessing all available resources and creating paperclips as it goes. If this scenario were to actually happen, humans would soon see that the paperclip machine is causing trouble, and presumably shut it down. However, remember that the AI is superintelligent – it surpasses humans in all types of intelligence, including social intelligence. It will therefore, very quickly, realise that the humans might want to shut it down, and this would clearly foil its paperclip-making plans, and so it would take every measure to make sure it is not turned off by humans. A good strategy might then be to destroy the humans, and put their atoms and molecules to good use.
This scenario is clearly hypothetical – why would we want a superintelligent AI to just create paperclips?! But we run into the same problems if we give the AI other more “beneficial” goals. We could give the AI the goal of making all humans smile. But how do we define a smile? If it is a specific facial shape, then a logical solution for the AI could be to create a metal structure that can be connected to the face, forcing the shape of a smile. Now to give a less silly example, what about giving a superintelligent AI the goal of eliminating all suffering in the world? Again, this can easily be misinterpreted. One solution to eliminate all suffering is to eliminate all humans, which clearly wasn’t the point. In a similar fashion, the goal of maximising human happiness could be achieved by putting humans’ brains in jars, and pumping them full of serotonin!
From this we can conclude that we need to be extremely careful about what we ask the AI to do. And furthermore, we need to be extremely careful that our commands aren’t misinterpreted. So far, this problem, often called the value alignment problem, is far from being solved. It is a revealing exercise to try to think of a goal, or a set of goals, to give a superintelligent AI; and then to find creative ways that these goals could be misinterpreted, resulting in potential disaster. Anyone interested in this (or anyone skeptical of this argument) should read Nick Bostrom’s book Superintelligence.
Despite the challenges, there are some ideas that seem to be heading in the right direction in specifying appropriate goals for superintelligent AIs. One example, which I think comes from Eliezer Yudkowsky, is to give the AI the goal of first trying to discover what humans would decide we want, if we had infinite time and infinite resources to think about it. And then to implement whatever it thinks humans would want. This still has its problems: it is not clear how we could program such a goal into an AI, and even if we did program such a goal, the AI might run forever and never come to a decision; and clearly not all human wishes align, so it might prioritise the creators’ wishes only, which might be devastating for everyone else.
Even if you are not convinced so far, note that everything above assumed that the creators of the AI will be friendly and want the best for humanity. This might of course not happen, for example if less well-intending governments pumped vast sums of money into AI research. A superintelligent AI would clearly be a powerful, probably even unstoppable, weapon.
Worrying about superintelligent AI taking over the world has only been taken seriously recently, and initially only by a tiny proportion of AI researchers. However, now it is becoming much more mainstream, and has been widely endorsed as a serious problem, e.g. see https://futureoflife.org/ai-principles/. Some researchers and commentators are still sceptical, but as with the development of superintelligence itself, there are no fundamental reasons why we should be certain that we can control a superintelligent AI. Because there are no fundamental reasons, we should, as a species, assume that we might develop AI, and that we might not be able to control it.
The only convincing counter-argument to this that I know of is based on the Fermi paradox. From our understanding of human evolution, and also the evolution of the first life on Earth, the probability that an Earth-like planet could produce intelligent life seems to be high enough that we are not the only intelligent life forms in our galaxy. Then, in the entire observable universe there would be trillions of intelligent life forms. The probability that we were the first to emerge would be small. But if other intelligent life forms exist then they would presumably develop AI, and if there are no major hurdles to developing superintelligent AI they would probably have done it already, in which case we would know about it, because the superintelligent AI would utilise the resources of the observable universe to achieve its goals, whether they are beneficial or dangerous. Of course, there are many assumptions here, any of which could be false, but one possibility is that superintelligent AI is not so easy to develop after all. But even if this argument seems convincing, it would be unwise to use it (or any other argument for that matter) to rule out the dangers of superintelligent AI with 100% certainty.
A glorious future?
Everything good in our world has been created, directly or indirectly, by intelligence – namely our own intelligence. What, then, would the world look like if we created a superintelligent AI and did manage to control it? It seems reasonable that such a system could, given enough time, solve all of our problems: disease, starvation and war; incompetent governance; other existential risks such as nuclear war, biotechnology, nanotechnology, and global warming; and even more subtle issues like why many of the well-off people in the world – all the people with sufficient food and shelter and friends – can still be miserable, sometimes to the point of depression or suicide. Presumably, a world in which all these problems are solved would be a wonderful place to live in, and therefore in such a world we should be happy to produce many more humans, maybe even as many as possible, to live rich and happy lives. How many of such humans could exist in the future? Bostrom gives various estimates, depending on different assumptions, that range from 10^{35} to 10^{43}, and even as high as 10^{58} human lives! Given that our successes or failures in AI research over the next hundred years or so might be the critical determinant in whether these lives exist, or whether all humans are wiped out, we might just be at the most important time and place in our universe. I will give Nick Bostrom the final word:
“If we represent all the happiness experienced during one entire such life with a single tear drop of joy, then the happiness of these souls could fill and refill the Earth’s oceans every second, and keep doing so for a hundred billion billion millennia. It is really important that we make sure these truly are tears of joy.”
References: I got most of the inspiration and ideas in this blog from Nick Bostrom’s book Superintelligence; Bostrom and Cirkovic’s Global Catastrophic Risks; and various episodes of Sam Harris’s (fantastic) podcast, in particular those with Max Tegmark, Stuart Russell and David Chalmers https://samharris.org/podcast/.
]]>Quantum mechanics is often presented as being “our most successful theory ever”. Despite 100 years of stringent experimental tests it has never been proved wrong. It has been confirmed to an accuracy of 1 part in 10^{12} and has even now been tested in space-based experiments. It underlies much of modern technology, including pretty much the whole information and computing industry. And it has predictive power in an extreme range of scenarios, from the smallest constituents of our universe to a fraction of a second after the Big Bang. For these reasons, it seems reasonable to assume that quantum mechanics is correct, and does not need to be modified, at least not yet.
We can then ask the question: If quantum mechanics is correct, then what does it tell us about the universe? But before answering this, we should probe whether this is a reasonable question in the first place. There is another option: quantum mechanics could just be a tool that is extremely powerful for predicting the outcomes of experiments, and we could say that its equations don’t directly tell us about the world itself. To explore this option, we can ask what it would mean for other scientific endeavours. Is the theory of dinosaurs only useful as a tool for predicting what kind of fossils we will dig up in the future, or does it tell us about real creatures that existed in our past? When DNA was first discovered, was the theory of its double-helix structure only useful as an explanation of why a certain x-ray diffraction pattern was observed, or does it tell us about a real molecule that exists in ourselves? And are our theories of stars and galaxies only relevant to explain why we detect certain patterns of light in our observatories, or do they tell us about fantastic objects beyond our solar system? Of course in all these cases we assume that our theories tell us about real objects that exist independent of us. In that case, it seems reasonable to assume that quantum mechanics is telling us about the universe in which we live, about its structure and behaviour and our place in it.
What, then, does quantum mechanics tell us about the structure of the universe? To answer this I will refer to the Schrödinger’s cat thought experiment, which I introduced in more detail in two previous posts (here and here). Briefly, a cat is placed in a sealed box containing a radioactive atom and a vial of cyanide. If the atom decays, the cyanide will be released, killing the cat. It is straightforward in quantum experiments to put the atom into a strange state known as a superposition state, in which it has both decayed, and not decayed, at the same time. Now comes the bizarre prediction of quantum mechanics: if the atom is in a superposition, then the cat will also end up in a superposition of being dead and alive.
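In symbols, the prediction can be sketched as follows. The equal weights and the state labels are illustrative assumptions (the thought experiment works for any superposition of the atom), and the arrow stands for the ordinary, unmodified quantum evolution inside the sealed box:

```latex
\frac{1}{\sqrt{2}}\Bigl(|\text{not decayed}\rangle + |\text{decayed}\rangle\Bigr)\otimes|\text{alive}\rangle
\;\longrightarrow\;
\frac{1}{\sqrt{2}}\Bigl(|\text{not decayed}\rangle\otimes|\text{alive}\rangle
+ |\text{decayed}\rangle\otimes|\text{dead}\rangle\Bigr)
```

The left-hand side is the atom in a superposition next to a live cat; the right-hand side is the joint atom-and-cat state in which each outcome for the atom is paired with the matching fate of the cat.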
It is reasonable to wonder whether the cat can really be dead and alive simultaneously. One option here is to assume that the prediction must be incorrect, and therefore that quantum mechanics is incorrect and must be modified in some way. But, as we have seen above, modifying quantum mechanics should not be taken lightly. Before taking such a drastic step, we should first explore whether the situation can be explained without having to modify quantum mechanics. Yet another option is to say that the equations in quantum mechanics aren’t telling us about the actual state of the cat, and are only a tool that tells us what would happen if we observe the cat. But again, it is a drastic departure from how we normally view science, and should only be used if we can’t make sense of the situation otherwise. I discussed these alternative options here and here.
But despite being wildly counterintuitive (I will return to this later), what is the problem with saying that the cat is dead and alive simultaneously? The problem comes because we know that when we open the box we will either see a dead cat or an alive cat. If the cat was truly dead and alive, shouldn’t we expect to see a weird fuzzy cat when we open the box? The answer is no. Remember we are attempting to use unmodified quantum mechanics directly, so we should use it now to get us out of this riddle. What, then, does quantum mechanics predict we will see? The answer is even more obscure than before: quantum mechanics does not predict that you will see a fuzzy cat, but rather it predicts that you will enter a superposition of seeing the alive cat and seeing the dead cat. Again there seems to be an immediate intuitive objection to this. Surely I cannot be in a superposition – surely I would notice this and be able to tell in some way that I am in this obscure state.
Again we can use quantum mechanics to resolve this, but it turns out that it takes a significant amount of work and effort to do this. Nonetheless, the now well-established theory of decoherence, which I explain in a previous post and in this article, provides the answer. To cut a long story short, decoherence shows that the two parts of the superposition cannot interact or interfere with one another, and for all practical purposes (FAPP) they can never know of the other’s existence. We therefore interpret this superposition state as follows: FAPP there are two versions of you, one who has seen a dead cat, and the other who has seen an alive cat. These two versions cannot interact with each other, and will never be able to know of the other’s existence. When you open the box we say that the universe “branches”. From your perspective you either see the cat dead or alive. But there will be another version of you somewhere out there, on another branch of the universe, who sees the other outcome.
Another reasonable objection here could be to question whether it makes sense to interpret both parts of the superposition as being real. But with some straightforward calculations it can be shown that it is perfectly reasonable to do this: the physics is completely the same whether there is just one version of you or two. It can be shown that both parts of the superposition obey the laws of physics as usual, and essentially live in what appear to be normal universes.
Initially there was just one cat. But after the radioactive decay and cyanide release the universe branched into two “sub-universes”, one containing the dead cat and one containing the alive cat. These sub-universes are called branches, but they are also popularly termed “worlds”, hence the theory being called many worlds theory. They are even called parallel universes, in which case the totality of universes is the “multiverse”. But this terminology can be misleading, because there is really only one world, and one universe, which contains a superposition state of a dead and alive cat. Proponents of this theory prefer to call it “Everettian quantum mechanics”, named after the great Hugh Everett, who first realised that quantum mechanics could be interpreted in this way.
The superposition then spreads out in the following way: Before opening the box there was only one version of you. But on opening the box you join the branches of the superposition state. On one branch the cat is dead and you are observing the dead cat, whereas on the other branch the cat is alive and you are observing it as such. If you then talk to your friend they will also join the branches. On one branch they might see you as relieved that the cat was alive, whereas in the other branch they are consoling you or berating you for your unethical choice of experiment. As you and your friend interact with more people, and more objects, the branching spreads. In fact, it often only takes a tiny amount of information to be distributed for branching to happen. A single photon that has interacted with you can fly out of the window hitting a nearby building, at which point the building also splits and joins the two branches. This process carries on – as information spreads more and more things join the branches.
I am unsure whether the exact manner in which the branching spreads, and at what rate, is known, but the basic theory is now pretty advanced in predicting that, despite the obscure superposition state that quantum mechanics predicts we live in, the normal classical reality emerges and we are completely oblivious to the other branches (e.g. see quantum Darwinism). I should also briefly mention that this branching process happens almost continuously all around us. The fundamental constituents of our bodies – the electrons, quarks, and photons – are constantly splitting into superpositions, and often these superpositions lead to branching in a similar way as in the Schrödinger’s cat thought experiment. For example, in chaotic systems or systems at a critical point, small changes in the microscopic initial conditions can lead to vastly different macroscopic final states. Then, if the initial conditions are in a superposition, which they often will be, a branched macroscopic world will emerge from this.
I will now address some of the common criticisms against many worlds theory. I invite readers to comment on this blog with further criticisms, and I will try to address these in a future blog. I will try to keep things un-technical, but near the end things might get a little technical in order to give more complete explanations.
Common criticisms
“Many worlds is far too radical.”
On the contrary, the basic formalism of many worlds theory can be seen as our most conservative option! As explained above, the other options are to assume that quantum mechanics is incorrect, and therefore modify it in some way. Or to assume that the equations in quantum mechanics do not tell us about reality. As I argued above, the first option is radical in the sense that it disregards our most successful theory ever, and tries to change it. Whereas the second option goes against pretty much all of our scientific thinking to date. Many worlds theory, on the other hand, takes quantum mechanics seriously, at face value, and we have seen above that such a theory can indeed explain the apparent paradoxes presented in the Schrödinger’s cat thought experiment. The only catch is that the picture of reality predicted by such a theory is completely counterintuitive…
“Many worlds is unintuitive and ridiculous.”
Should intuition be used to rule out theories? It used to be counterintuitive that the world was round, or that the Earth orbits the sun, or that we share a common ancestor with monkeys. But our intuition seems to have been refined to encompass these. It is still counterintuitive that all objects around us are largely empty space, or that our bodies are filled with billions of microbes that are essential for our survival, or that our planet is flying through space at millions of mph. The last point bears a resemblance to the explanation of Schrödinger’s cat above: wouldn’t we expect to feel the wind rushing past us if we were flying through space? At first sight maybe yes, but after more careful analysis scientists have made perfect sense of this conundrum.
For me, the examples just given make it very clear that our intuition should not be used to rule out theories. Sure, if our theories predict contradictory things, such as 1+1=1, then we should seriously question our assumptions and calculations. But the picture of reality that many worlds predicts is self-consistent and fits perfectly with our observations.
“Many worlds theory unnecessarily invents the existence of almost infinite numbers of worlds.”
Many worlds theory doesn’t invent the existence of other “worlds”. We don’t take the Schrödinger’s cat paradox and say “we can solve this paradox if we invent multiple parallel worlds”. Rather, we take the reasonable assumptions that quantum mechanics is correct and that we can take it literally, and from this we predict the existence of multiple worlds.
“Surely the universe isn’t that big.”
What we thought of as our “universe” used to be much smaller. Ancient Europeans used to believe that the whole world was just Europe, and other cultures thought in similar ways. Then eventually people discovered that the world was actually round and was far bigger than just one continent. Then we discovered that the sun was not just a part of our sky, but rather we were part of a huge solar system. Nowadays, it is assumed that the universe is humongous, maybe infinite, and contains billions of stars. At each stage our picture of the world grew exponentially. Sure, the jump that many worlds theory makes is unimaginably bigger than these other jumps, but it still fits with the pattern that the universe is actually much bigger than we would otherwise expect.
“Many worlds is untestable.”
As discussed above one of the main alternatives to many worlds theory is that we must modify quantum mechanics. This has the obvious perk that it can be tested. Attempts are already being made to see whether gravity collapses the quantum state (thus leaving Schrödinger’s cat either dead or alive, not both). In short, a large mass must be put into a superposition, and it must be sufficiently isolated from the environment so that decoherence doesn’t already lead to the appearance of collapse. Then, because the mass is large, gravity-collapse theories would predict that the mass will spontaneously collapse. Observing this spontaneous collapse would provide strong evidence for this theory.
In many worlds theory, we cannot ever interact with or interfere with other branches of the universe. In this sense, we cannot directly test the existence of these other “worlds”. However, there seem to be double standards in favour of collapse theories, because if the experiments to test collapse theories came out negative, then this would rule them out and provide strong evidence that quantum mechanics is correct as it stands. If this pattern continued, eventually leading to experiments putting massive objects, or conscious objects, or highly complicated objects such as quantum computers into superposition states, then this would also provide strong evidence that quantum mechanics is correct as it stands. On the other hand, and most importantly, if these experiments proved collapse theories to be correct, then this would directly disprove many worlds theory. Many worlds theory is therefore falsifiable.
One other main alternative to many worlds theory, which was introduced above, is to say that quantum mechanics is just a tool for predicting outcomes of experiments, and its equations don’t tell us about reality. This is an alternative philosophy of quantum mechanics, or an alternative interpretation, and in this sense there is no experimental way to distinguish between them. However, in my opinion, if we are eventually able to put massive and maybe even conscious objects into a superposition state, and if repeated experiments prove that decoherence does indeed mean that different branches in many worlds theory cannot ever interact with each other, then these facts would make many worlds theory much more acceptable and plausible. My prediction is that eventually many worlds theory will be the textbook way to interpret quantum mechanics!
Some more-technical objections
The following two issues seem to have been, for a long time, two of the most important technical objections to many worlds theory. However, these seem to have now been solved (at least according to many worlds practitioners!). I won’t go into details, but I’ll try and briefly summarise the objections and their resolutions. However, so that this post doesn’t go on too long I will use a bit of technical jargon at times and assume some more detailed knowledge of quantum mechanics.
Probability in many worlds
Again imagine Schrödinger’s cat just before you open the box. This experiment can be set up so that the amplitudes of the two parts of the superposition (dead and alive cat) are equal. In the usual way quantum mechanics is presented, we say that there would be a 50% chance of seeing the dead cat when the box is opened, and a 50% chance of seeing the alive cat. But in many worlds both outcomes happen. In fact, the Schrödinger equation is deterministic, so many-worlds quantum mechanics is an entirely deterministic theory! So how does the probability come in? In many worlds we say that the probability that “you” end up in the branch with the dead cat is 50%, and similarly for the alive cat. I put “you” in quote marks because there are two versions of you. Just from your perspective you only see one outcome. So in that sense if you see the alive cat, then you will consider this version of you to be the real you! This would be the same for any quantum experiment: the probabilities don’t tell us the probabilities of the different outcomes happening, but rather they tell us the probability that you will end up in a branch with a given outcome.
When the different branches have equal probabilities then the basic idea of probability is somewhat intuitive: initially there was one version of you, but in the end there are two versions of you. Both versions are equally real. Therefore, it makes sense that there will be a 50% chance that the version of you that you experience is the one that sees the alive cat, and so on. However, if the two parts of the superposition have unequal weightings, say 80% alive and 20% dead, then this is no longer intuitive. These probabilities would mean that you have an 80% chance of ending up in the branch with the alive cat. But there are still two versions of you, and they are both real. How can one have a bigger “weight” than the other one? Is one more real than the other?
To resolve this problem, many worlds practitioners (who seemingly largely work at Oxford University!) have provided rigorous proofs to demonstrate that probability does actually make sense in many worlds. They do this in two ways. Firstly, using decision theory, it can be shown that a rational agent should make decisions based on these weightings. For example, if the weightings are 80% and 20%, then a rational agent would be willing to bet £1 at 4/1 odds (four to one against) that they will end up in the 20% branch. Secondly, it can be shown that, if you repeat an experiment many times, in the limit the number of times a certain outcome happens will fit with these weightings. In this sense, you do have an 80% chance of ending up in the branch with an 80% weighting. I acknowledge that this explanation might still feel unsatisfactory, but the maths works!
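Both points can be made concrete with a small Python sketch. It is only an illustration under simple assumptions: the experiment is repeated `n` times independently with an 80% weighting for "alive" each time, and the function names are hypothetical. The first function adds up the total weight of all branches in which the observed fraction of "alive" outcomes is close to 80%. The second gives the winnings per unit staked at which a bet breaks even for an agent who weights outcomes by the branch weighting.

```python
from math import comb

def weight_near(p, n, eps):
    """Total weight of branches whose 'alive' frequency is within eps of p.

    After n repeats there are branches for every sequence of outcomes;
    the branches with k 'alive' results carry weight C(n, k) p^k (1-p)^(n-k).
    """
    total = 0.0
    for k in range(n + 1):
        if abs(k / n - p) <= eps:
            total += comb(n, k) * p**k * (1 - p) ** (n - k)
    return total

def break_even_winnings(p_win):
    """Winnings per unit staked that give zero expected gain at weighting p_win."""
    return (1 - p_win) / p_win

for n in (10, 100, 1000):
    print(n, round(weight_near(0.8, n, 0.05), 4))

print(break_even_winnings(0.2))
```

As `n` grows, almost all of the weight sits in branches whose observed frequencies match the 80/20 weightings, which is the sense in which the weightings behave like probabilities. For the 20% branch the break-even winnings are four units per unit staked.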
In addition, many-worlds practitioners will argue that when properly scrutinised, we are far from fully understanding probability anyway (I won’t go through the arguments here, but e.g. watch this). They then argue that certain unanswered questions in classical probability theory can be answered in many worlds theory. In particular, the Born rule can be derived from some simple assumptions (rather than being merely postulated, as it is usually). I’m sure that many readers will be unsatisfied with this brief answer, but the details are beyond the scope of this blog post! Interested readers are referred to David Wallace’s book “The emergent multiverse”.
The preferred basis problem
Historically a serious technical objection to the many worlds theory was the so-called preferred basis problem. I will only give a brief introduction here – I give a much more detailed (and clearer) explanation in this article. In quantum mechanics, there are different bases in which we can represent quantum states. A qubit has two states, which here I will call one and zero. However, we can write these states in a different basis to get plus = one + zero, and minus = one – zero. The two different bases are equally valid, and equally real, and in the lab it is just as easy to prepare the state one as it is to prepare the state plus. Does this same principle hold in the macroscopic world? The answer is no: we only ever see a dead cat or an alive cat. Mathematically it would be equally valid to write these states in the basis plus = dead + alive, and minus = dead – alive. But there is a big difference between these different bases. We know from the Schrödinger’s cat thought experiment that the plus state exhibits branching in many worlds theory. But the dead state does not – the dead state does not branch into a plus branch and a minus branch, as we might expect from the maths. We say that the dead, alive basis is the “preferred basis”. But why is it preferred, and what mechanism makes one basis preferred over another?
The theory of decoherence now provides rigorous and comprehensive answers to these questions. In short, we can’t just treat the cat as an isolated system, but rather we have to factor in the interaction between the cat and the photons and particles in the box. Once this interaction has been accounted for, it can then be shown that there is one particular basis that remains stable and does not exhibit branching – this is known as the preferred basis. It can then be shown (though not directly) that the preferred basis for the cat is indeed the dead, alive basis. If the cat is prepared in a superposition of these basis states then branching will quickly occur. Whereas if the cat was, for example, alive, then branching would not occur. The same goes for us when we open the box: if the radioactive atom and cyanide were removed from the box, meaning that the cat would always be alive, then we would simply open the box and see the alive cat. No branching would happen. But if the cat is in a superposition then when we open the box branching occurs.
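The change of basis for a single qubit can be written out in a few lines of Python. This is only a sketch with real amplitudes, and it includes the normalisation factor of one over root two that the text leaves out for brevity. The helper names are hypothetical.

```python
from math import sqrt, isclose

# States as amplitude pairs: (amplitude of "one", amplitude of "zero").
one = (1.0, 0.0)
zero = (0.0, 1.0)

s = 1 / sqrt(2)   # normalisation, omitted in the text
plus = (s, s)     # plus  = (one + zero) / sqrt(2)
minus = (s, -s)   # minus = (one - zero) / sqrt(2)

def inner(a, b):
    """Inner product for real amplitude pairs."""
    return sum(x * y for x, y in zip(a, b))

def in_plus_minus_basis(state):
    """Amplitudes of a state along plus and along minus."""
    return (inner(plus, state), inner(minus, state))

print(in_plus_minus_basis(one))
print(in_plus_minus_basis(plus))
```

The state one has equal amplitude along plus and minus, just as plus has equal amplitude along one and zero, so mathematically neither basis is singled out. The preferred basis problem is the question of why this symmetry does not survive for the cat.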
Conclusion
I hope that, at the very least, I have convinced you that many worlds theory is a reasonable and rigorous theory for how we should understand our quantum universe. By taking quantum theory seriously, and literally, we can overcome the paradoxes such as Schrödinger’s cat, and provide a self-consistent picture of our universe. At worst this picture is unintuitive, but at best it paints a fantastically interesting and endlessly fascinating picture of our universe. If you are still not happy with many worlds theory, rest assured that somewhere in the multiverse there is a version of you that is!
Acknowledgements: My general interest and knowledge about many worlds theory comes from David Wallace’s comprehensive book “The emergent multiverse” and Max Tegmark’s popular book “Our mathematical universe”. Specifically for this post, much inspiration and information came from David Wallace’s YouTube talks: https://www.youtube.com/watch?v=2OoRdyn2M9A&t=21s and https://www.youtube.com/watch?v=8turL6Xnf9U&t=2s . I’ve been told that David Deutsch’s books are also excellent!
In the Schrödinger’s cat thought experiment, a cat is placed in a box with a device that contains a radioactive atom and a vial of poison. If the atom decays, then the device is designed to release the poison, thus killing the cat. It is now well known that such an atom can be put into a state in which it has decayed, and not decayed simultaneously – this is known as a superposition state. Now, if this system is studied using the central equation in quantum mechanics, the Schrödinger equation, then the following result will be found: if the atom is in a superposition state, then this will lead to the cat being in a superposition state. The cat will be dead and alive simultaneously! Now suppose you open the box – what will you find? The Schrödinger equation again predicts that, if the cat was in a superposition of being dead and alive, then when you open the box you will also enter into a superposition. You will be in a superposition of seeing the dead cat whilst simultaneously seeing the alive cat.
This clearly does not fit with our experience of the real world. We never see objects in superpositions, and indeed we never seem to experience superpositions ourselves. And while the above experiment is far too challenging to perform using a real cat, conceptually similar experiments have been performed in which an object is put into superposition, and then observed. The result of these experiments fits with our experience and intuition: we never see a superposition state. So what has gone wrong here? Have we misapplied the Schrödinger equation? Is the Schrödinger equation incorrect? The standard resolution, which can be found in most quantum mechanics textbooks, is to introduce the “collapse postulate”: On observation a superposition state collapses, meaning that only one outcome of an observation, or a measurement, is ever observed. I.e. we only ever see the cat as being dead or alive. But the collapse postulate raises as many problems as it solves. What exactly constitutes an observation or measurement? If macroscopic objects are made of quantum particles, what is so special about a measuring device or a conscious human observer to cause collapse? (These questions are together often termed the measurement problem.)
Is the Schrödinger equation sufficient to solve the problem?
Despite how quantum mechanics is often discussed, there is now a widely accepted and carefully studied solution to these problems that utilises the Schrödinger equation alone, and does not have to introduce the troublesome collapse postulate. The solution lies in the theory of decoherence. Elsewhere I give a more thorough introduction to decoherence, in particular in relation to Schrödinger’s cat. But in this post I will try to give a simple and minimal introduction that still captures the main ideas.
Imagine you have a single atom and some cutting-edge experimental equipment capable of putting this atom into a superposition of two locations, A and B. The crucial question here, which is at the heart of decoherence, is: how do you know it is in a superposition? If you directly measure the atom then you will either see it at position A, or position B, but not both. To confirm the superposition a more advanced step needs to be taken: we must do an interference experiment. This involves the idea of constructive and destructive interference of waves, which can be seen by throwing two stones into a pond close to one another. The waves coming from one stone interfere with the waves coming from the other stone. If two peaks meet they reinforce one another creating a larger peak, whereas if a peak and trough meet they cancel each other out. Quantum mechanical objects, such as the atom we are trying to interfere, are described by equations known as wavefunctions. As the name suggests, these particles act like waves, and just like the ripples in the pond they can demonstrate interference. I’m unsure myself how the exact experiment would work to interfere the two parts of an atom that has been put into a superposition, but by measuring the interference between the two wavefunctions the superposition can indeed be confirmed, and this is now an extremely well measured phenomenon in experiments.
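A toy calculation shows why interference can tell a superposition apart from an atom that is simply at A or at B. The Python sketch below assumes an idealised set-up that is not taken from any specific experiment: the two positions are recombined with equal weight after the B part has picked up an adjustable phase, and we ask how likely the atom is to arrive at one particular detector. The function names are hypothetical.

```python
import cmath
import math

def detector_probability(phase):
    """Superposition of A and B: the two amplitudes add before squaring."""
    amp_a = 1 / math.sqrt(2)
    amp_b = cmath.exp(1j * phase) / math.sqrt(2)
    recombined = (amp_a + amp_b) / math.sqrt(2)
    return abs(recombined) ** 2

def mixture_probability(phase):
    """Atom is definitely at A or definitely at B, with 50% chance each."""
    total = 0.0
    for amp_a, amp_b in ((1, 0), (0, 1)):
        recombined = (amp_a + cmath.exp(1j * phase) * amp_b) / math.sqrt(2)
        total += 0.5 * abs(recombined) ** 2
    return total

for phase in (0.0, math.pi / 2, math.pi):
    print(round(detector_probability(phase), 3), round(mixture_probability(phase), 3))
```

For the superposition the detection probability swings between one and zero as the phase changes (constructive and destructive interference), whereas for the "either A or B" case it stays at one half whatever the phase. Seeing the swing is what confirms the superposition.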
Now suppose you are given two atoms, and you prepare the atoms in the following superposition state: both atoms are in position A, in superposition with both atoms in position B. Again, how can we confirm the superposition? If we directly measure the position of the atoms, then we either find both of them in position A, or both in position B (this is known as an entangled state – the position of the first atom is “entangled” with the position of the second, because we always find them together). Again we do not see, and cannot confirm, the superposition in this way, and we must perform an interference experiment. Now comes the crucial point: the interference experiment must be done on both atoms simultaneously, otherwise we will never see an interference pattern. If we just take the first atom, and try and interfere it with itself, then this will not work. (I explain this in more detail here.)
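The same toy model can be extended to two atoms to illustrate this point. In the Python sketch below, which again assumes an idealised equal-weight recombination and uses hypothetical function names, the state is "both at A" in superposition with "both at B", with an adjustable phase between the two parts. Interfering only the first atom gives a result that does not depend on the phase, while interfering both atoms and comparing their outputs does.

```python
import cmath
import math

def entangled_state(phase):
    """(both at A) + e^{i phase} (both at B); 0 stands for A and 1 for B."""
    s = 1 / math.sqrt(2)
    return {(0, 0): s, (1, 1): s * cmath.exp(1j * phase)}

def interfere(state, atom):
    """Recombine the two positions of one atom with equal weight."""
    s = 1 / math.sqrt(2)
    out = {}
    for basis, amplitude in state.items():
        for new in (0, 1):
            sign = -1 if (basis[atom] == 1 and new == 1) else 1
            key = list(basis)
            key[atom] = new
            key = tuple(key)
            out[key] = out.get(key, 0) + sign * s * amplitude
    return out

def single_atom_signal(phase):
    """Interfere atom 1 only; probability it exits at output 0."""
    state = interfere(entangled_state(phase), 0)
    return sum(abs(a) ** 2 for k, a in state.items() if k[0] == 0)

def joint_signal(phase):
    """Interfere both atoms; probability they exit at matching outputs."""
    state = interfere(interfere(entangled_state(phase), 0), 1)
    return sum(abs(a) ** 2 for k, a in state.items() if k[0] == k[1])

for phase in (0.0, math.pi):
    print(round(single_atom_signal(phase), 3), round(joint_signal(phase), 3))
```

The single-atom signal stays at one half for every phase, so it carries no trace of the superposition. Only the joint measurement on both atoms shows an interference pattern.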
We can now return to Schrödinger’s cat. The cat is in a superposition state of being dead and alive, but how can we confirm the superposition? First imagine that the only thing in the box is the cat – it is in a complete vacuum with no air particles or photons or anything. In this case, it is in principle possible to perform an interference experiment with the cat. The dead part of the superposition interferes with the alive part of the superposition, and an interference pattern would be observed, confirming the superposition. This is not practically possible because we would have to interfere every single particle in the cat, and this involves precisely controlling and manipulating every single particle. But according to the laws of physics this is at least in principle possible.
But it is not realistic that the cat could be in a complete vacuum, and no matter how hard we tried there would always be at least a few particles in the box with the cat. These unwanted particles (and photons etc) are often termed the environment, and we assume that we have neither control over them nor access to them. Now again put the cat into a superposition. The cat will inevitably interact with the unwanted particles in the box, and as soon as they interact the cat and unwanted particles will become entangled with one another. Then, if we want to do an interference experiment, we would have to not only interfere all the particles in the cat, but also all the extra particles and photons in the box. We would have to precisely control and manipulate all of these particles, but as stated above we are assuming that we cannot control them and cannot access them. Therefore, in this case it is not even in principle possible to do an interference experiment. We cannot ever confirm that the cat was in a superposition.
Now what happens when we open the box? As soon as we look at the cat we become entangled with it, and enter into the superposition. The cat is dead and we see a dead cat, in superposition with the cat being alive whilst we see an alive cat. But again there will be unwanted particles and photons, and very quickly the cat and ourselves will become entangled with these particles and photons. Again, if we want to confirm that we are in a superposition, we would have to be able to manipulate and control all of these particles and photons, which is clearly not possible. Therefore, again, we cannot ever confirm that we are in a superposition. Furthermore, it is likely that some of the photons that have interacted with you will escape from the room through the window, flying off to space at the speed of light! In this case, seeing that we can’t travel at the speed of light to collect these photons, it is not even in principle possible to confirm the superposition.
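How quickly the environment hides the interference can be estimated with a very crude model. The Python sketch below assumes that each environment particle ends up in one of two states depending on whether the cat is dead or alive, that these two states have some overlap between 0 and 1, and that all particles behave identically and independently. Under those assumptions the size of the interference signal available from the cat alone is the product of the overlaps. The numbers are illustrative, not measured values.

```python
def interference_visibility(overlap, n_particles):
    """Remaining interference signal after n_particles have become entangled.

    overlap = 1 means a particle carries no record of dead versus alive;
    overlap < 1 means it carries a partial record.
    """
    return abs(overlap) ** n_particles

for n in (0, 10, 100, 10_000):
    print(n, interference_visibility(0.99, n))
```

Even when each particle carries only a faint record (an overlap of 0.99), the signal is already tiny after ten thousand particles, and a real box contains vastly more than that. Recovering the interference would require including every one of those particles in the experiment.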
We have now solved the main problems in the Schrödinger’s cat thought experiment. Is the Schrödinger equation wrong? No – we can explain our observations, i.e. that we never see the cat in a superposition, just using the Schrödinger equation. Why do we never see the cat in a superposition? You must do an interference experiment to confirm the superposition, but this is not possible when we factor in the other particles in the box with the cat. The question of “what constitutes a measurement?” has not really been answered yet, but I will address this in a future post in which I defend the many worlds interpretation.
I have not yet fully addressed what happens when you open the box – whether you are really in a superposition, and if so, why you don’t “experience” this superposition. The answer to this really depends on how you interpret quantum mechanics, and this is what I will turn to next.
Many worlds interpretation
The introduction above to Schrödinger’s cat and decoherence has, in a sense, been written in the language of the many worlds interpretation. In the many worlds interpretation we firstly assume that the Schrödinger equation is sufficient in itself to explain paradoxes such as Schrödinger’s cat, and secondly we assume that quantum mechanics is a theory that tells us about real objects in the real world. The first of these points is justified in the above introduction to decoherence, and nowadays this explanation is widely accepted. The second point is the usual way we interpret science – normally we assume that our equations and theorems are telling us something about a real world that exists independent of ourselves.
These two assumptions might seem quite straightforward, but they lead to quite a radical picture of the world in which we live. For example, in the many worlds interpretation we say that the cat is indeed in a superposition of being dead and alive. There is technically just one cat, but it is dead and alive simultaneously. However, we have seen that the two parts of the superposition cannot ever interfere with each other. Interference is the only way of confirming that an object is in a superposition, so the dead and alive cats cannot ever know of each other’s existence. Furthermore, the equations of quantum mechanics are such that the future life of the cat (at least the alive one) does not depend on whether the cat is in a superposition or not. Therefore, for all intents and purposes we can think of this as two cats, one dead and one alive. This is where the idea of “many worlds” comes from. For all intents and purposes there are two worlds, one containing an alive cat and one containing a dead cat.
The same idea holds when you open the box. You split into a superposition of seeing a dead cat and seeing an alive cat. But again the two parts of the superposition cannot ever know of the other’s existence, because they would have to interfere with one another to confirm this, and this isn’t possible. Therefore we can again treat this as being two separate worlds, one in which the cat is dead and you are presumably emotionally and morally scarred by the experience, and another in which the cat is alive and you will be relieved.
This picture of the universe is clearly unintuitive, and often people reject many worlds outright and come up with all kinds of criticisms of this interpretation. In my opinion most of the standard criticisms are either ill-founded or result from a lack of understanding of the basic theory, and in a future post I will try to flesh out many worlds theory and provide straightforward responses to many of the criticisms.
QBism – does the wavefunction represent reality?
Before continuing, an important comment is needed. Just before uploading this post I was in contact with Chris Fuchs – one of the founders and main promoters of QBism. To cut a long story short, he said (politely but firmly) that (referring to my previous post) “you capture none of the flavor of QBism at all in what you write. You present QBism as a kind of lifeless prediction machine (a positivism or instrumentalism), rather than as an attempt to make a deep statement about the character of the world”. He recommended reading https://arxiv.org/abs/1601.04360 and https://arxiv.org/abs/1207.2141. I have decided to keep my description of QBism in this current post unedited, but bear in mind Chris’s comment when you read this! And please comment on this post if you have an opinion about whether/how I misrepresent QBism…
As introduced above, we can represent quantum mechanical objects using an equation known as a wavefunction. The wavefunction tells us everything we know about this object. For example, we could write down the wavefunction for a single particle in an (equal) superposition of two locations. This wavefunction can then be used to predict what we will see if we perform certain measurements. For example, using the wavefunction we can calculate that, assuming the superposition is equal, if we measure the position of the particle then it will be in position A with 50% probability, or position B with 50% probability. Furthermore, we can use the wavefunction to predict what will happen if we perform an interference experiment. In particular, it will tell us the properties of certain outcomes: it will say that if we perform interference experiment X, then outcome Y will happen with probability Z.
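The rule that turns a wavefunction into measurement probabilities (the Born rule) is simple to state: the probability of an outcome is the squared magnitude of its amplitude, divided by the total over all outcomes. A minimal Python sketch follows. The function name and the outcome labels are hypothetical, and the wavefunction is reduced to a short list of amplitudes.

```python
def born_probabilities(amplitudes):
    """Map each outcome's amplitude to its measurement probability."""
    weights = {outcome: abs(a) ** 2 for outcome, a in amplitudes.items()}
    total = sum(weights.values())
    return {outcome: w / total for outcome, w in weights.items()}

# Equal superposition of two positions (a relative phase does not matter here).
print(born_probabilities({"A": 1, "B": 1j}))

# Unequal superposition: amplitudes in the ratio 2 to 1.
print(born_probabilities({"A": 2, "B": 1}))
```

The equal superposition gives 50% for each position. Amplitudes in the ratio 2 to 1 give an 80% and 20% split, because it is the squares of the amplitudes that count.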
Numerous experiments over the years have confirmed that quantum mechanics is extremely good at correctly predicting the outcomes of experiments. But, in a sense, QBism says that this is all that quantum mechanics is good for. It says that we should not interpret the wavefunction as describing a real object, and therefore it is meaningless to ask if the cat is really dead and alive simultaneously. We simply cannot know – all we know is the probability of what will happen if we open the box. More specifically, the wavefunction represents our state of knowledge. It tells us what we know, not what exists. This is similar to Bayesian probability theory, in which probabilities represent our knowledge of the world, not the world itself. For this reason QBism can also be called quantum Bayesianism.
I certainly have some sympathy with QBism. It takes quantum mechanics seriously, and in particular the Schrödinger equation, and does not try to modify the formulae. And it certainly has a strong point: how do we ever really know what exists? The answer is that we observe it, and we perform measurements on it, and we devise clever experiments to perform measurements on the extremes of scale and energy. But until we measure anything, we cannot truly know what it is, and whether it exists. So in this sense QBism is right that quantum mechanics is just a toolbox for predicting experiments.
But is this all quantum mechanics is? Throughout most of human history the goal of science has been to learn more about the world. We do astronomy and astrophysics to learn about stars and galaxies; we smash particles into one another in colliders to learn about what matter is made of; and we do quantum experiments to learn about the weird and wonderful properties of the quantum world. QBism is therefore a radical departure from how we normally treat the scientific endeavour. It is not necessarily the wrong way to interpret quantum mechanics, but QBists should at least acknowledge that it is an extreme philosophical position.
To take this further, imagine the Schrödinger’s cat thought experiment, but with your friend opening the box rather than yourself. QBism is perfectly good at predicting what your friend will see when they open the box. But, presumably, you believe that your friend exists, and you might be interested in what happens to them when they open the box. QBism cannot tell us this – you can write down the wavefunction for your friend, but this is only a tool for calculating what you will see when you interact with your friend. Many worlds, on the other hand, is perfectly well-equipped to ask questions about your friend. The answer may be disturbing – that they in effect split into two versions – but at least it is a consistent and coherent answer. And this idea can be extended: many worlds theory predicts that almost continuously the world – and therefore your friend – splits into almost infinite parts of a vast superposition, which we can think of as parallel universes.
Would my assumption that my friend exists be incorrect? Perhaps. Maybe in the “real” world it is meaningless to ask about the state of things before we interact with them. But my friend certainly does exist in my head – I can imagine them walking towards the box, opening it, and looking inside. We can then call this world the “imaginary” world. Even though it might not exist outside my mind, I am still interested in what my imaginary friend is doing in this imaginary world. Removing yourself from the picture, now imagine a scene familiar to yourself, such as your house, or your pet, or your favourite sports team. I wonder what they are doing right now? Are there near infinite numbers of them, in near infinite parallel universes? Or is it meaningless to ask what they are doing right now, and only meaningful to think about what happens when you interact with them in some way?
My favourite thing about QBism is this: the wavefunction is normally written using the Greek letter psi, which is often pronounced “sigh”. Ontology is the study of the existence of things, whereas epistemology is concerned with knowledge rather than existence. Therefore, a QBist is a psi-epistemist, whereas someone like me, who believes in many worlds and therefore that the wavefunction is real, can be termed a psi-ontologist. It deeply troubles me that I am a psi-ontologist (say this sentence out loud to yourself if you don’t get the joke!).
Collapse theories
Until reasonably recently it was not fully appreciated that the Schrödinger equation alone can lead to the appearance of collapse. Therefore, to explain why we either see the cat as dead or alive a “collapse postulate” was introduced into quantum mechanics. Initially it was just a postulate, and no explanation was given of how collapse takes place, or what causes it. But this introduces many difficult questions: What causes the collapse? It is usually assumed that a measurement causes collapse: but what is a measurement? Often it is said that a “measuring device”, or even a conscious observer, is what causes the collapse. But if macroscopic objects are made of quantum particles, what is so special about a measuring device or a conscious human observer to cause collapse?
Over the years various theories have been introduced to explain collapse with the hope of answering the above questions. Various mechanisms have been proposed: complexity causes collapse – the more complex a system, the more likely it is to collapse; or consciousness itself causes collapse; or gravity causes collapse – the larger the mass, the more likely collapse will occur. These models therefore can explain why Schrödinger’s cat is never seen, or measured, as being in a superposition state.
But now, with the theory of decoherence that I introduced above, we can explain the appearance of collapse without having to add extra postulates into the theory. Collapse theories are therefore unnecessary to explain our observations. So why do they still exist? I have never met anyone who both understands decoherence, and thinks that it is wrong, so collapse would presumably happen in addition to decoherence. And if you are uncomfortable with the conclusion that the cat is in a superposition (many worlds), or that it is meaningless to ask about the state of the cat (QBism), then you can modify quantum mechanics – specifically, modify the Schrödinger equation – so that the state collapses. But for me this seems like a case of changing the science in order to fit our wishes.
This might not be a problem if quantum mechanics was a young and underdeveloped theory. But this is certainly not the case, and the Schrödinger equation itself is responsible for quantum mechanics often being termed “our most successful theory ever”. Do we really want to modify such an equation? Quantum mechanics also works relativistically (i.e. combining it with Einstein’s special relativity), and it has been extended to quantum field theory, which has successfully predicted the Higgs boson. But collapse theories are far from achieving such extensions.
To be fair to gravity-induced-collapse, at some point quantum mechanics, as with any other theory, will be surpassed by some other theory. Quantum mechanics will still be an excellent approximation in many regimes, but in the extremes it will surely break down. But what are these extremes? Potentially the fact that general relativity and quantum mechanics cannot yet fit together gives a clue to this. In this case, might gravity in fact collapse the wavefunction? In my understanding this is at the heart of Roger Penrose’s suggestions to both explain collapse and unify general relativity and quantum mechanics.
For me the main positive to collapse theories is that they are testable. This is especially true for gravity-induced-collapse. If we put bigger and bigger systems into a superposition, while sufficiently isolating them from the environment so that decoherence doesn’t cause the appearance of collapse, then eventually at a certain mass threshold these systems should spontaneously collapse. These experiments should be possible in the relatively near future, and will serve to either confirm this theory, or give extra weight to non-collapse theories such as many worlds.
Consciousness-induced-collapse is in principle testable, but this is far beyond current experiments. To confirm this we would have to put a conscious entity into a superposition. We would have to isolate it sufficiently from the environment so that there is no decoherence, and we would have to be able to control and manipulate every particle in the conscious entity so that we can do an interference experiment. If the consciousness spontaneously collapses, thereby preventing interference, this will be strong evidence that consciousness does induce collapse. The best route to this could be using quantum computers. If we can simulate consciousness on a computer, then we could upload this program to a quantum computer, and subsequently put the consciousness into a superposition. But we don’t even know what consciousness is, and such a test is infeasible for now. In addition, I argue elsewhere that if consciousness did cause collapse then the reality this would lead to would be far more bizarre and absurd than even many worlds theory predicts!
Pilot wave theory
Einstein famously stated that “God does not play dice”. He simply couldn’t believe that a fundamental theory of nature such as quantum mechanics could really be probabilistic. For example, generally in quantum mechanics we would say that on opening the box containing Schrödinger’s cat it would be random whether the cat is observed as dead or alive (with a certain probability of each). In many worlds theory both outcomes may exist, but it is random whether you end up in the part of the superposition with the dead cat or with the alive cat, so in this sense it is still random. In contrast, theories such as general relativity and Newtonian mechanics are deterministic. For example, if you know all of the positions and velocities of the planets in the solar system, then you can predict with certainty where the planets will be at any given time in the future.
To prevent the randomness of quantum mechanics a “deterministic hidden variable theory” was devised (named Bohmian/De Broglie/pilot wave theory). Taking again the example of the cat, in this theory there are additional variables beyond those in the Schrödinger equation. If we knew the values of all these variables, then we would know with certainty whether the cat will be dead or alive when we open the box. However, these variables are “hidden”, meaning they are fundamentally beyond our measurements and observations. We cannot, and will not, ever be able to determine these values, and therefore quantum mechanics will always appear to be random.
For me this is an even worse case than collapse theories of changing the science so that it more closely fits with our intuition. For protagonists of this theory it is so important that nature must not be random that they are willing to invent an underlying deterministic world that we cannot ever even in principle see. But why should nature be deterministic? In addition, the Schrödinger equation itself is deterministic, so in fact many worlds theory is a deterministic theory. We know with certainty that the cat will be dead and alive. The randomness just comes in when you ask “which universe will I end up in?”. But it is still, from the outside, deterministic.
There are some further complications/criticisms to this theory. John Bell famously showed that, if these hidden variables exist, then they must communicate with one another faster than the speed of light. Furthermore, in a recent paper Renato Renner showed that hidden variable models cannot be self-consistent (although this might not necessarily mean that they are wrong?!).
Conclusion
There are many other interpretations of quantum mechanics, and many more seem to be invented year-on-year. My personal view is that quantum physicists need to stop inventing new interpretations, and consolidate the old ones. Indeed, both many worlds and QBism have some features that are unsatisfactory to some and unintuitive to all. But in my understanding there is nothing fundamentally wrong with either of these. Sure, there are small problems that need to be ironed out, but this is the same for any theory. My personal prediction is that in 100 years from now, if we survive existential risks such as nuclear war or artificial intelligence taking over the world, pretty much every quantum physicist will either be a QBist, or believe that we live in a fantastic quantum multiverse!
]]>On the other side of the spectrum, scientists are asking seemingly disparate questions within the field of quantum information. How much entanglement do I need to pass on this message? How can I stop my quantum computer from losing its quantum-ness? What’s the best way I can make a quantum superposition?
Now it may seem that there can be no possible link between the answers to these questions. A steam engine, after all, has little to do with quantum entanglement, and that is mostly true. However, all of these questions are concerned with the same problem: the extractability and conservation of a given resource. On the thermodynamic scale, engineers want to know how heat and work behave. On the quantum scale, theorists and experimentalists alike are interested in defining and conserving the quantum-ness of a given system.
In order to bridge the gap between these fields, we can start by defining a state. Any system, whether it be quantum or classical, can be described via a state. There are many different ways in which to write down a state, but in order to emphasize the specific resources under consideration I will be writing my states in matrix form.
There are many interesting mathematical and physical motivations for writing states in this way, however I will only be employing two properties of its form: (i) the elements of the matrix correspond to probabilities; (ii) it provides a good pictorial description of the system.
Arguably the most important state in the field of thermodynamics is the thermal state. This is the final state of any interacting thermodynamic system. What do I mean by this? For example, if I was to leave a hot cup of tea in a cold room, they would eventually reach the same temperature as they exchanged heat. This final state of the overall system would be a thermal state.
When the thermal state is written in matrix form, the diagonal values of the state are more or less populated depending on their energy, with the lower energy levels more populated than the higher ones.
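To make this concrete, here is a small Python sketch of the diagonal of a thermal state. It assumes the usual Boltzmann weighting, where each population is proportional to exp(-E / (k_B T)); the three energy levels and the choice of units with k_B = 1 are invented for the example.

```python
import math


def thermal_populations(energies, temperature, k_B=1.0):
    """Diagonal entries of a thermal state: p_i proportional to exp(-E_i / (k_B * T))."""
    weights = [math.exp(-e / (k_B * temperature)) for e in energies]
    partition_function = sum(weights)
    return [w / partition_function for w in weights]


# Hypothetical three-level system with energies in arbitrary units.
energies = [0.0, 1.0, 2.0]
populations = thermal_populations(energies, temperature=1.0)
```

The entries of `populations` sum to one and decrease as the energy increases, which is the ordering described above. Raising `temperature` flattens the populations; lowering it piles more weight onto the lowest level.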
Given any isolated state, suppose you attempt to extract work from it and then let the system relax into whatever state it wants to. If the final state of the system is a thermal state, then you know that you have extracted all possible work.
The next obvious question to ask is: what is the opposite of a thermal state, the state from which the most work can be extracted? This state is called a pure state and can be written as:
What makes the pure state so special in the thermodynamic context is that only a single element of the matrix is populated. It’s as if the components that make up the system have all crowded into the highest energy element possible. Work can then be extracted from this state as the other, lower-energy states populate themselves from this one.
Now that we have explored the full range of thermodynamic energy states, we can start to think about quantum resources. The quantum resource we will focus on is called quantum coherence. This is a foundational quantum resource that is responsible for a wide range of quantum effects, such as quantum superposition and multipartite entanglement. So how can we possibly grasp any understanding of this complex quantum feature? Well, if you were wondering what happens when the off-diagonal elements are not zero, that’s quantum coherence!
So, the state in thermodynamics whose resource has been fully extracted is the thermal state. What is its equivalent within the resource theory of quantum coherence? It’s called an incoherent state and is written as:
Any state whose elements are entirely concentrated on the diagonal is an incoherent state; this includes the thermal state and the pure state. Therefore, you know that you have extracted all of your available quantum coherence when you end up in an incoherent state.
So what is the state with the maximum amount of extractable coherence, the analogue to the pure state in thermodynamics? It’s called the maximally coherent state and can be written as:
Crucially for the maximally coherent state, every element is identical. As operations are performed on this state that reduce the number and size of the off-diagonal elements, the coherence of the state is extracted. This can be repeated until all the coherence is extracted, leaving an incoherent state.
So what can we do with all these definitions? Is there some way to bridge the gap between the resources of quantum coherence and extractable work? Well, to some degree this is still an active area of research, and one with which I’m currently engaged. However, we can at least make a start by attempting to classify the states and bridge between the resources.
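The two extremes of the coherence picture can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the coherence is quantified here by summing the magnitudes of the off-diagonal elements (one common choice, not necessarily the measure used in the work described), the dimension is set to 3, and all function names are hypothetical.

```python
def incoherent_state(populations):
    """Density matrix with the given populations on the diagonal and zeros elsewhere."""
    d = len(populations)
    return [[populations[i] if i == j else 0.0 for j in range(d)] for i in range(d)]


def maximally_coherent_state(d):
    """d x d density matrix in which every element equals 1/d."""
    return [[1.0 / d] * d for _ in range(d)]


def l1_coherence(rho):
    """Sum of the magnitudes of all off-diagonal elements."""
    d = len(rho)
    return sum(abs(rho[i][j]) for i in range(d) for j in range(d) if i != j)


def shrink_off_diagonals(rho, factor):
    """Scale every off-diagonal element by `factor` (between 0 and 1); the diagonal is untouched."""
    d = len(rho)
    return [[rho[i][j] if i == j else factor * rho[i][j] for j in range(d)] for i in range(d)]


rho = maximally_coherent_state(3)
coherence_before = l1_coherence(rho)
coherence_half = l1_coherence(shrink_off_diagonals(rho, 0.5))
coherence_after = l1_coherence(shrink_off_diagonals(rho, 0.0))
```

Shrinking the off-diagonal elements step by step lowers this coherence measure, and setting them all to zero leaves a diagonal (incoherent) state whose coherence is exactly zero.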
For example, if we order the states from most to least resourceful we produce the following spectrum of states.
Ordering the states in this fashion prompts us to ask some questions.
It appears that coherent states exist past the boundary of the states that would normally be considered when extracting work from a thermodynamic system. However, recent work suggests that thermodynamic resources can be extracted from the coherence of a state. Does this mean that the full hierarchy of thermodynamic resource states stretches into the quantum realm?
There are several different classifiers that determine where a state appears on this spectrum of extractable resources. For the part of the spectrum considered in thermodynamics we can compare states via a property called majorisation, which determines if one state can be transformed into another without the input of resources. Interestingly, in coherence resource theory, the property of majorisation is used when considering pure to pure state transformations. Could this be because pure states seem to be the boundary states between the two parts of the spectrum?
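For readers who have not met majorisation before, here is a short Python sketch of the textbook test on two lists of populations. I am assuming the plain version of majorisation (sort both lists in decreasing order and compare running totals); thermodynamics with unequal energy levels uses a refined variant that this sketch does not cover. The function name `majorises` is my own.

```python
def majorises(p, q, tol=1e-12):
    """Return True if p majorises q.

    Both lists are sorted in decreasing order; p majorises q when every
    running total of p is at least the matching running total of q, and
    the full totals agree.
    """
    if len(p) != len(q):
        raise ValueError("p and q must have the same length")
    total_p = 0.0
    total_q = 0.0
    for a, b in zip(sorted(p, reverse=True), sorted(q, reverse=True)):
        total_p += a
        total_q += b
        if total_p < total_q - tol:
            return False
    return abs(total_p - total_q) <= tol


single_level = [1.0, 0.0, 0.0]      # all population in one level
partly_mixed = [0.5, 0.3, 0.2]
fully_mixed = [1 / 3, 1 / 3, 1 / 3]
```

With these example lists, `single_level` majorises `partly_mixed`, which in turn majorises `fully_mixed`, but not the other way round. In this sense majorisation orders states from most to least concentrated.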
This is made more interesting when considering that some of my recent work has developed thermodynamic-like relations for the resource of coherence for a pure to pure state transformation. Do thermodynamic relations for coherence resource theory only exist when considering the pure states that exist on the boundary?
It is hoped that the answers to these questions will not only help our fundamental understanding of thermodynamic and quantum theory, but also shed light on the boundary between these two fields (if one exists). So perhaps, as we extend our thermodynamic theories further and further into the quantum realm, it may not be too long until your train is powered by the quantum realm after all.
]]>
Schrödinger’s cat
One of the founders of quantum mechanics, Erwin Schrödinger, proposed the following thought experiment: A cat is placed in a sealed box with a device that contains a radioactive atom and some poison gas. If the radioactive atom decays, then the device is designed so that it detects the decay of the atom and subsequently releases the poison gas into the box, and this tragically kills the cat. Our intuition says that there are two options here. Either the atom decays and the cat is dead, or the atom does not decay and the cat remains alive. But quantum mechanics tells a different story. In quantum mechanics objects can have more than one property simultaneously, and in particular it is possible to put the atom into a state where it has both decayed, and not decayed, at the same time. But it doesn’t stop there: quantum mechanics also predicts that if the atom has both decayed and not decayed, then this leads to the poison being released, and not released, at the same time. In turn, quantum mechanics predicts that the cat will be dead, and alive, simultaneously!
What do you think would happen if you were to open the box and look at the cat? Would you see the cat as being both dead and alive simultaneously? The answer is of course no – a large object such as a cat has never been seen in such a bizarre state. But why not? Quantum mechanics predicts that the cat can be dead and alive, and quantum mechanics has never been proved wrong. There seems to be a paradox here! But we need not fear, because there is a range of different theories that solve this riddle, and I will introduce some of the main theories below. Bear in mind that none of these theories has yet been proved wrong, and so you are free to choose whichever theory you like…
Collapse theories
Perhaps the most intuitive explanation is to say that the description above is not quite correct, and we must add an additional rule that prevents objects such as cats from having multiple properties, such as being dead and alive, simultaneously. In other words, quantum mechanics must be modified slightly, and once this is done it will better fit with our view of reality. So how exactly should we modify quantum mechanics? What should this new rule look like? There are different theories of precisely what this rule is, but they all involve the idea of the quantum state “collapsing”. Using the example above, they say the cat cannot be dead and alive simultaneously, and therefore the state of the cat must “collapse” into being either dead or alive, but not both. Now, however the collapse works, we know from experiments that small objects such as atoms can have multiple properties simultaneously, so the collapse does not happen at this scale. So what are the main differences between cats and atoms that mean that the cat collapses but the atom doesn’t?
Gravity causes collapse. Cats are vastly more massive than atoms. One collapse theory, developed by Roger Penrose and others, exploits this to say that gravity causes collapse. Specifically, the more massive an object is, the more likely it will collapse. This theory says that atoms are small enough so that they can have multiple properties, for example having decayed and not decayed, simultaneously. This is precisely what we see in experiments. However, the cat is so large that, with near certainty, its state will collapse into being either dead or alive.
Complexity causes collapse. The main theory of this sort is known as the Ghirardi–Rimini–Weber theory, and it is actually quite similar to gravity causing collapse. It basically says that the more particles an object is made of, the more likely it will collapse. A cat is made of many many particles, and therefore, again with near certainty, its state will collapse into being either dead or alive.
Consciousness causes collapse. Now, we don’t actually have a universally agreed upon definition of what consciousness is, and so the theory that consciousness causes collapse is far from being precisely formulated. It of course depends on precisely which creatures (or artificial intelligences!) are said to be conscious. Many would agree that a cat is conscious, and this theory would roughly then say that such a conscious creature cannot have multiple properties simultaneously, and therefore its state will collapse into being either dead or alive. However, if you think that a cat is not conscious, or you replace Schrödinger’s cat with Schrödinger’s microbe (or something else you deem to be not conscious), then this theory would predict that whatever is in the box is dead and alive simultaneously. Only when you open the box, and your consciousness interacts with its contents, would the state collapse, and you would be left in either an elated state of seeing an alive creature, or a devastated state of seeing a dead one.
The theory that consciousness causes collapse is perhaps the most compelling, and to some people the most intuitive, explanation of why we never see the cat as being dead and alive simultaneously. However, as I argue here, the picture of reality that this theory predicts is fantastically bizarre and obscure, and far from intuitive!
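Returning to the “complexity causes collapse” idea above, a toy calculation shows why the number of particles matters so much. The sketch below assumes that each particle collapses independently at a fixed tiny rate, so that N particles collapse N times as often. The rate of 1e-16 per second is the value commonly quoted for the Ghirardi–Rimini–Weber theory, and the particle count for a cat is only an order-of-magnitude guess; both are my assumptions rather than numbers from this post.

```python
import math

# Assumed spontaneous collapse rate per particle, in collapses per second.
COLLAPSE_RATE_PER_PARTICLE = 1e-16


def collapse_probability(n_particles, seconds, rate=COLLAPSE_RATE_PER_PARTICLE):
    """Probability that at least one of n_particles collapses within `seconds`.

    Treats collapses as independent random events, so the chance of no
    collapse at all is exp(-n_particles * rate * seconds).
    """
    # expm1 keeps precision when the probability is extremely small.
    return -math.expm1(-n_particles * rate * seconds)


atom = collapse_probability(1, seconds=1.0)
cat = collapse_probability(1e27, seconds=1.0)  # rough guess at the particles in a cat
```

Under these assumptions a single atom is overwhelmingly likely to survive a second without collapsing, while the cat collapses with near certainty.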
Many worlds theory
The collapse theories introduced above all modify quantum mechanics in some way, and by doing this they can explain why we never see a cat that is simultaneously dead and alive. But is it really necessary to modify quantum mechanics? According to many worlds theory the answer to this is no. However, as explained above, unmodified quantum mechanics predicts that the cat is dead and alive, so there is clearly some explaining to do to unify this prediction with our view of reality.
Before opening the box, the cat is dead and alive. Technically there is only one cat, which is simultaneously dead and alive. But the great insight of Hugh Everett, who first proposed this theory, was that we should actually treat it as two cats, one dead and one alive. Can we really do this? To show that we can, some calculations need to be done, in particular using a framework known as decoherence, but this is too technical to introduce now; see here for an introduction to decoherence in the context of many worlds. The important conclusion from these calculations is that the dead and alive cats can never interact with each other: the alive cat cannot see the dead one, and it can’t smell it nor touch it; as far as it is concerned the dead cat need not exist. For this reason, the usual terminology is that there are two “worlds”, one containing a dead cat and one containing a living cat. This is the idea of “many worlds”. Another way to put it is that there are two parallel universes, with one cat occupying each. But whatever terminology you like to use, the important point is that it is completely consistent within quantum mechanics to say that both cats are equally real, and for all intents and purposes they exist isolated from one another.
What then happens when you open the box? The answer is that you split into two versions of “you”, one that sees the alive cat, and one that sees the dead cat. Again these two versions of you can never interact, and have no way of measuring each other’s existence. They are, for all practical purposes, in separate parallel universes.
Many worlds theory in fact predicts that our reality is almost continuously splitting into multiple parallel universes. In each parallel universe there will be a different version of you. There will be almost infinite versions of you, each going about their day oblivious of all the others. This may seem completely far-fetched, but does the fact that something is not at all intuitive mean that it is wrong? It used to be considered absurd that the world is round, or that the universe is vastly larger than our solar system, or that our bodies contain billions of microscopic organisms without which we couldn’t survive.
QBism – what do our quantum mechanical equations really tell us?
Are we looking at all this in completely the wrong way? Imagine the state of the cat before the box is opened. Using quantum mechanics, it is in principle possible to write down an equation representing the state of the cat. What would this equation really tell us? In many worlds theory, and indeed in most ways of thinking about quantum mechanics, this equation tells us what state the cat is in. Specifically, we are assuming that the cat does exist, and that our equation tells us something about it.
But we can take a different perspective of what this equation represents. To see this, we can ask the question: what do we normally use this equation for? The answer is that we use this equation to tell us the probability that, when we open the box, we will see an alive cat. We cannot use the equation to tell us with certainty whether the cat will be alive or dead – it only ever tells us the probability. For example, it will be possible to set up the thought experiment so that there is a 50% chance of seeing an alive cat once the box is opened, and a 50% chance of seeing a dead cat. Now, over 100 years of experiments have shown that quantum mechanics is extremely good at predicting the probabilities of different events happening in experiments. In fact, as quantum mechanics has never been proved wrong, it is so far perfect at predicting probabilities of outcomes to experiments. Therefore, we know the probability of what will happen when we open the box, and repeating the experiment many times would indeed show that half the time the cat was alive, and half the time the cat was dead.
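The “repeat the experiment many times” idea is easy to mimic with a toy simulation. The Python sketch below does not simulate any quantum mechanics: it simply assumes the 50% probability that the equation gives us and draws random outcomes with it. The function names and the fixed random seed are my own choices for the example.

```python
import random


def open_box(rng, p_alive=0.5):
    """One run of the thought experiment: the cat is seen alive with probability p_alive."""
    return "alive" if rng.random() < p_alive else "dead"


def fraction_seen_alive(n_trials, p_alive=0.5, seed=0):
    """Repeat the experiment n_trials times and return the fraction of 'alive' outcomes."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    alive_count = sum(open_box(rng, p_alive) == "alive" for _ in range(n_trials))
    return alive_count / n_trials


fraction_alive = fraction_seen_alive(100_000)
```

Each individual run gives a definite answer, alive or dead, but over many runs the fraction of alive cats settles close to one half. This is all the equation promises: the statistics, not the outcome of any single run.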
But what makes us think we know what is happening inside the box before we open it? One way of looking at quantum mechanics, which is often called “QBism”, is to say that our equations do not directly tell us what happens inside the box before we open it. The equations just tell us the probabilities of different events happening. In particular, our equations don’t directly tell us that the cat does exist, and that it is both dead and alive simultaneously. The same can be said for all other quantum experiments. For example, when we measure a radioactive atom we can use quantum mechanics to calculate the probability that it will decay. And with today’s simple quantum computers we can calculate the probability that, given a certain input, we will measure a certain output. But our equations do not tell us the state of the atom or the quantum computer before the measurement.
This way of thinking about quantum mechanics has similarities to the question: if a tree falls in the woods with no one around, does it still make a noise? If we replace the tree with the cat, and the woods with the box, then the QBism answer is that we cannot know anything about the cat before we open the box! Normally, we think of science as telling us something about a real world independent of us, that still exists regardless of our presence in it. QBism takes a different view: quantum mechanics is just a toolbox for predicting probabilities of events.
To many this will seem like a limited view, or perhaps a pessimistic view of the capabilities of science. But how do we really know what happens before we observe/measure anything? The extreme version of this viewpoint says that we can never truly know anything other than our own conscious thoughts. How do we know we aren’t in the matrix? How do we know that the signals entering our brains aren’t just fed into us? The much more conservative version of this view is that a real world does exist independent of us, but quantum mechanics doesn’t tell us anything about it. Either way, the riddle of Schrödinger’s cat is no longer a problem: Is the cat really dead and alive before we open the box? The answer is that we do not, and cannot, know. It is a meaningless question!
“Shut up and calculate”
Still unsatisfied? Are you not willing to modify a theory that has never been proved wrong? Or believe in almost infinite parallel universes containing almost infinite versions of you? Or is it unsatisfactory to reject the existence of things before we measure them? There are some other ways of looking at quantum mechanics which I haven’t mentioned, such as pilot wave theory or relational quantum mechanics, but in my view each of these has significant overlaps with some of those introduced above. Therefore, if you completely reject all of the above viewpoints, then maybe you are destined to never be satisfied!
But is this really a problem? Quantum mechanics works, and it works extremely well. It is often stated as being “our most successful theory ever”, owing to the extremely precise predictions of quantum mechanics that have been vindicated, and the vast number of successful experiments over the past 100 or so years. One further viewpoint, then, is that we shouldn’t care whether the cat is dead, or alive, or both. Instead of being distracted by parallel universes and bizarre thought experiments, we should focus on using quantum mechanics better. This is particularly relevant at the moment: the “quantum technology revolution” is making great headway towards fulfilling its promise of transforming future technologies. Quantum cryptography is said to make communication 100% secure; quantum metrology promises to make ultra-precise measurements allowing us to investigate previously-inaccessible phenomena; and quantum computers have the potential to exponentially speed up our computations, thereby revolutionising the whole computing industry. Should people like me therefore stop quibbling about philosophical obscurities, and knuckle down to the real business? Indeed, should we shut up and calculate?
]]>