Philosophy of Science

The two leading views of scientific explanation, Salmon’s “bottom-up” view and the Friedman-Kitcher “top-down” view, give prima facie incompatible characterizations of scientific explanation. According to the bottom-up view, we explain a given phenomenon when we uncover the underlying causal mechanisms that are responsible for its occurrence. The top-down view, on the other hand, maintains that we explain a phenomenon by deriving it from the general principles or laws that best unify our knowledge. In this paper, I focus on theoretical explanations in physics, i.e., explanations of physical laws. I first show that, as Salmon suggests (1989, pp. 180-182), it seems promising to treat these two approaches not so much as different views about explanation but rather as descriptions of two distinct types of scientific explanations; there are clear cases of laws that have bottom-up explanations (BUEs) while others receive only top-down explanations (TDEs). I then argue, using explanations of mass-energy equivalence in Special Relativity (SR), that this disparity (why should some laws receive only TDEs after all?) is best understood as a symptom of a deeper distinction, first introduced by Newton, between two levels of physical theory. At one level, there is the collection of general principles and definitions of physical terms, i.e., a theoretical framework, from which one derives general constraints for all physical processes. At a lower level, there are laws that identify and describe specific physical interactions like gravitation and electromagnetism. I use Einstein’s distinction between ‘principle’ and ‘constructive’ theories, which offers a fruitful way to distinguish the two levels of theory, to argue that only lower level theories, i.e., constructive theories, yield BUEs. These explanations, furthermore, invariably rest on higher level laws that receive only TDEs from a principle theory.
Thus, I conclude that Salmon’s challenge to characterize the relationship between BUEs and TDEs can be met by recognizing the close relationship between types of theoretical explanation and the structure of physical theories.

We need not look far to see that the explanations of some laws in physics are best described as BUEs. The ideal gas law, as explained by the Kinetic Theory of Gases (KTG), is such an example. According to Salmon, KTG explains this law because it gives us knowledge of how the causal interactions between a collection of molecules result in the observed correlation between pressure, volume, and temperature. Salmon is very clear that it is not the derivation of the macroscopic gas properties from the molecular hypothesis and Newtonian mechanics that is the explanation. This is because, for Salmon, the derivation only gives us predictive and descriptive knowledge. However, the identification of the gas with a collection of molecules, and the recognition that it is their causal interactions that bring about the observed quantitative relation between the state variables, gives us causal knowledge. Like all BUEs, the explanation of the ideal gas law from KTG yields the kind of knowledge that “opens up the black boxes of nature to reveal their inner workings” (Salmon, 1989, p. 182).

On the other hand, there are many examples where the closest thing we have to an explanation of a law is a TDE. Consider Newton’s derivation of the Law of Conservation of Momentum from the three laws of motion in Principia. According to the Friedman-Kitcher top-down view, derivations are explanatory only if they contribute to the unification of our knowledge. In Friedman’s (1974) original formulation, this means, roughly, that only derivations that reduce the number of facts we have to accept as brute are explanatory. Kitcher (1989) has more recently argued for an account that characterizes the unification achieved by explanatory derivations as the best ‘systematization’ of our knowledge. In either case, it seems both Friedman and Kitcher would agree that Newton’s derivation of the conservation of momentum can be easily shown to be an explanation according to their view; it is, in other words, a TDE. Together, these examples suggest that it may be helpful to distinguish two types of theoretical explanations because the salient features of the explanations of these two laws are quite different.
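The derivation of momentum conservation mentioned above can be sketched for the simplest case. What follows is a minimal reconstruction in modern notation, not Newton’s own presentation. It assumes an isolated system of two bodies; the symbols $\mathbf{p}_1$, $\mathbf{p}_2$ (the momenta) and $\mathbf{F}_{12}$, $\mathbf{F}_{21}$ (the force on body 1 due to body 2, and vice versa) are introduced here only for illustration.

```latex
\begin{align*}
\frac{d\mathbf{p}_1}{dt} &= \mathbf{F}_{12}, \qquad
\frac{d\mathbf{p}_2}{dt} = \mathbf{F}_{21}
  && \text{(second law, no external forces)} \\
\mathbf{F}_{12} &= -\mathbf{F}_{21}
  && \text{(third law)} \\
\frac{d}{dt}\left(\mathbf{p}_1 + \mathbf{p}_2\right)
  &= \mathbf{F}_{12} + \mathbf{F}_{21} = 0
  && \text{(total momentum is constant)}
\end{align*}
```

Nothing in these steps refers to the nature of the force involved, which is why the result holds for any interaction that satisfies the laws of motion.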

There are two questions that seem natural at this stage. First, if it is true that there are both BUEs and TDEs in physics, do all physical laws have both a BUE and a TDE? Salmon (1989, p. 180 f.) suggests that singular events may have both types of explanations and that each explanation may contribute to our understanding of the phenomenon. Is this also the case when the explanandum is a physical law? Second, if some laws receive only one type of explanation, then why do these laws have this type of explanation?

An instructive example for answering both of these questions is the explanation of Einstein’s famous mass-energy equivalence, E = mc². There are two main ways to arrive at this result in SR. First, one can derive the equivalence of mass-energy, as Einstein (1905) does the first time he proves this result, using a thought experiment that appeals to Maxwell’s theory of electromagnetism[1]. Einstein considers a body that emits two pulses of light of equal energy in opposite directions with respect to a given coordinate system. By invoking a variety of physical principles to analyze the act of emission with respect to another inertial coordinate system[2], he arrives at his celebrated result: “If a body gives off the energy L in the form of radiation, its mass diminishes by L/c²” (1905, p. 71)[3]. Prima facie it seems we have a BUE for mass-energy equivalence because we have gained knowledge of the causal process, viz., the emission of electromagnetic energy, that underwrites the law we are trying to explain.
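The structure of this first derivation can be sketched as follows. This is a simplified reconstruction in modern notation rather than Einstein’s own text. It assumes that the second coordinate system moves with speed $v$ relative to the body. The symbols $E_0$, $E_1$ (energy of the body before and after emission in its rest frame), $H_0$, $H_1$ (the same energies in the moving frame), $K_0$, $K_1$ (kinetic energies before and after), $\varphi$ (angle between the direction of emission and the relative velocity), and $\gamma$ are labels chosen here for illustration. Each pulse carries energy $L/2$.

```latex
\begin{align*}
E_0 &= E_1 + L
  && \text{(energy conservation, rest frame)} \\
H_0 &= H_1
  + \tfrac{L}{2}\,\gamma\left(1 - \tfrac{v}{c}\cos\varphi\right)
  + \tfrac{L}{2}\,\gamma\left(1 + \tfrac{v}{c}\cos\varphi\right)
  = H_1 + \gamma L
  && \text{(energy conservation, moving frame)} \\
K_0 - K_1 &= (H_0 - E_0) - (H_1 - E_1) = L\,(\gamma - 1)
  \approx \tfrac{1}{2}\,\frac{L}{c^2}\,v^2
  && (v \ll c) \\
\gamma &= \frac{1}{\sqrt{1 - v^2/c^2}}
\end{align*}
```

Here $H - E$ is taken to equal the kinetic energy of the body up to an additive constant, which cancels in the difference. Since the velocity of the body is unchanged by the emission, comparison with the Newtonian expression $K = \tfrac{1}{2} m v^2$ attributes the loss of kinetic energy to a decrease in mass of $L/c^2$. The second line is the only step that draws on Maxwell’s theory, through the transformation law for the energy of a light pulse.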

The distinctive feature of the second way of deriving the equivalence of mass-energy is that E = mc² is derived directly from general physical principles without appealing to Maxwell’s theory. Historically, this approach appears shortly after Einstein’s first derivation in the work of a variety of physicists, including Einstein, and culminates in the work of Ehlers, Rindler, and Penrose (1965). Typically, it is part of attempts to develop a relativistic dynamics of point particles. Thus, unlike Einstein’s first derivation, these derivations of E = mc² use not just the kinematical framework of SR but also the dynamical quantities of relativistic momentum and relativistic kinetic energy. In Einstein’s (1935) derivation, more than half of the paper is devoted to arriving at the mathematical expressions that define these quantities. With the definitions of these quantities at hand, the derivation of mass-energy equivalence is straightforward. If we grant that this derivation unifies our knowledge, at least insofar as it is obtained from physical principles that can yield a variety of other laws, then it seems we also have a TDE for E = mc². However, to conclude that we have both a BUE and a TDE of mass-energy equivalence would be a mistake. To see why this would be a mistake, we have to recognize that the distinction between the two types of theoretical explanation is really indicative of a deeper distinction in the structure of modern physics.
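Once the dynamical quantities are defined, the final step of a derivation of this second kind can be sketched in a few lines. The route shown is a common textbook one, via the work-energy theorem in one dimension, and is not a reproduction of Einstein’s (1935) argument. It takes the relativistic momentum $p = \gamma m u$ as given; $u$ (the speed of the particle), $m$ (its mass), and $K$ (its kinetic energy) are symbols chosen here for illustration.

```latex
\begin{align*}
p &= \gamma m u, \qquad \gamma = \frac{1}{\sqrt{1 - u^2/c^2}} \\
dp &= m\,\gamma^3\,du, \qquad d\gamma = \gamma^3\,\frac{u}{c^2}\,du \\
K &= \int_0^{u} u'\,dp
   = \int_0^{u} m\,\gamma(u')^3\,u'\,du'
   = mc^2 \int_1^{\gamma} d\gamma'
   = (\gamma - 1)\,mc^2 \\
E &= K + mc^2 = \gamma\,mc^2
\end{align*}
```

The last line records the identification of $mc^2$ as the energy of the body at rest. That identification is the substantive physical claim, and it is what derivations such as Einstein’s (1935) are meant to justify; the integral by itself fixes only the kinetic energy. No step in the sketch appeals to Maxwell’s theory.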

One of Newton’s lasting contributions to physics is a distinction between two levels of physical theory. In the Principia, Newton explicitly separates the framework of principles that govern all possible forces, i.e., his three laws of motion and their consequences (what we call Newtonian mechanics), from the force laws that describe specific physical interactions within this framework, like his law of Universal Gravitation. This kind of distinction is so ingrained into the fabric of modern physics that it tends to escape the scrutiny of philosophers of science. Some philosophers have recognized something like this distinction in the context of SR. For example, Earman points out that “STR is not a theory in the usual sense but is better regarded as a second-level theory, or a theory of theories that constrains first-level theories” (Earman, 1989, p. 155). However, Earman gives no indication that this kind of distinction is pervasive in the history of modern physics nor does he give a detailed analysis of the distinction. Einstein’s (1919) proposal for distinguishing between what he called ‘principle’ and ‘constructive’ theories offers a fruitful way to describe the distinction we have inherited from Newton.

For Einstein, principle and constructive theories are distinguished according to their starting-points. At the origins of a constructive theory we find ‘hypothetical constituents’ whose combined behaviour gives rise to the observed phenomena we wish to explain. These theories “attempt to build up a picture of the more complex phenomena out of the materials of a relatively simple formal scheme from which they start out” (Einstein, 1919, p. 228). Einstein cites philosophers’ perennial favourite, KTG, as a paradigmatic constructive theory because its starting-points are molecules, i.e., hypothetical constituents of matter[4]. One of the simple “formal schemes” used by KTG is to treat molecules as perfectly elastic, rigid spheres of small, but finite, diameter and to analyze their collisions within the framework of Newtonian mechanics. By combining the behaviour of these hypothetical elements according to this formal scheme, one can deduce the familiar gas law.
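The deduction of the gas law can be sketched in outline. The following is a standard textbook simplification, assuming $N$ molecules of mass $m$ in a cubical container of side $\ell$ and volume $V = \ell^3$, with collisions between molecules neglected. The symbols, and the identification of mean kinetic energy with temperature through Boltzmann’s constant $k_B$, are assumptions of this sketch rather than part of Einstein’s discussion.

```latex
\begin{align*}
\Delta p_x &= 2 m v_x
  && \text{(momentum given to a wall per elastic collision)} \\
\Delta t &= \frac{2\ell}{v_x}
  && \text{(time between collisions with that wall)} \\
P &= \frac{1}{\ell^2} \sum_{i=1}^{N} \frac{m v_{x,i}^2}{\ell}
   = \frac{N m \langle v_x^2 \rangle}{V}
   = \frac{N m \langle v^2 \rangle}{3V}
  && \text{(isotropy: } \langle v_x^2 \rangle = \tfrac{1}{3}\langle v^2 \rangle \text{)} \\
PV &= \tfrac{2}{3}\,N \left\langle \tfrac{1}{2} m v^2 \right\rangle
    = N k_B T = nRT
  && \text{(with } \left\langle \tfrac{1}{2} m v^2 \right\rangle = \tfrac{3}{2} k_B T \text{)}
\end{align*}
```

The first line relies on conservation of momentum in each collision with the wall, so even this simple constructive account draws on the principles of Newtonian mechanics.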

At the foundations of a principle theory, on the other hand, one finds “general characteristics of natural processes” (Einstein, 1919, p. 228), i.e., the principles or postulates of the theory. Einstein’s favourite examples of physical principles of this kind are the “universally experienced fact that perpetual motion is impossible,” the Principle of Relativity (Einstein, 1919, p. 228), and Galileo’s law of inertia (Einstein, 1936, p. 307). This requirement that the principles be general features of physical processes precludes raising phenomenological laws, and laws that describe specific physical interactions but which are not obviously phenomenological, to the status of ‘principle’. For example, one cannot treat Maxwell’s laws of electromagnetism as principles, in the way that one can treat Newton’s three laws of motion as principles, because the former apply only to electromagnetic interactions.

Einstein’s distinction between principle and constructive theories is threefold. First, Einstein distinguishes principle and constructive theories by what these theories postulate as their starting points. One can regard this as an ontological distinction: constructive theories postulate the existence of ‘entities’ (with specific properties) while principle theories postulate general physical principles that govern the behaviour of matter[5]. Second, principle and constructive theories are distinguished by how we come to know their starting points[6]. This is an epistemological distinction. The ‘principles’ or ‘postulates’ of a principle theory are empirically discovered[7]. Einstein tells us that “the scientist has to worm these general principles out of nature by perceiving in comprehensive complexes of empirical facts certain general features which permit of precise formulation” (Einstein, 1914, p. 221). On the other hand, the starting points of a constructive theory are ‘free creations of the human mind,’ as Einstein might say, and thus are not empirically discovered. Finally, principle and constructive theories also differ because they play distinct conceptual roles in scientific theorizing. Principle theories establish constraints that the theoretical descriptions of phenomena offered by constructive theories must satisfy. For Einstein, the postulates of a principle theory “give rise to mathematically formulated criteria which the separate processes or the theoretical representation of them have to satisfy” (1919a, p. 228).

Einstein’s insight was to recognize that principle theories constrain the theoretical descriptions of phenomena offered by constructive theories. For example, once we accept Newtonian mechanics, which is a principle theory, and the empirical evidence codified by Kepler’s laws, we have little choice but to conclude that gravitation is an inverse square force. This is evident from the first steps of Newton’s argument for Universal Gravitation in Book III of Principia. Similarly, once we agree to treat molecules as perfectly rigid spheres that obey Newton’s laws, we are forced to accept that if, on average, the kinetic energy of the molecules of a gas increases, the pressure on the walls of its container will increase provided the volume of the container is held constant. It is this relationship of constraint, I want to suggest, more than anything to do with what the starting points of these theories are, or how they are discovered, that isolates the two levels of theory. Principle theories establish general constraints for the theoretical descriptions of all physical processes[8]. A constructive theory, on the other hand, identifies and describes a specific physical interaction, like gravitation, while satisfying the constraints imposed by the principle theory.
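The way Newtonian mechanics and Kepler’s laws jointly constrain the form of the gravitational force can be illustrated in the simplest case. The sketch below assumes a circular orbit of radius $r$ and period $T$, which is a simplification and not Newton’s own argument in Book III. The constant of proportionality $k$ in Kepler’s third law and the centripetal acceleration $a$ are labels introduced here for illustration.

```latex
\begin{align*}
v &= \frac{2\pi r}{T}, \qquad
a = \frac{v^2}{r} = \frac{4\pi^2 r}{T^2}
  && \text{(uniform circular motion)} \\
T^2 &= k\,r^3
  && \text{(Kepler's third law)} \\
a &= \frac{4\pi^2}{k\,r^2}
  \;\;\Longrightarrow\;\;
  F = m a \propto \frac{1}{r^2}
  && \text{(second law)}
\end{align*}
```

Given the framework and the empirical regularity, the dependence of the force on distance is fixed rather than chosen.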

With this description of the distinction between the two types (or levels) of theory we have inherited from Newton, we can understand why we find different types of theoretical explanations in physics. Einstein’s constructive theories are precisely the kind of theories that yield BUEs. Beyond purely descriptive and predictive knowledge, constructive theories can provide causal knowledge because they identify the specific forces and interactions that bring about the observed phenomena. On the other hand, principle theories are restricted to offering TDEs. This is because a principle theory can never offer a detailed description of a chain of events in the way a constructive theory can. From the postulates of a principle theory, one can only derive other laws that are satisfied by all physical processes. If this is correct, then we can begin to understand why the first two examples I cited receive the explanations they do. KTG is a constructive theory and thus offers a BUE of PV = nRT. Newtonian mechanics, on the other hand, is a principle theory that can yield only a TDE for the law of conservation of momentum. How can this illuminate my earlier claim that it is a mistake to regard E = mc² as having both types of explanations?

In his first derivation, Einstein explains the equivalence of mass-energy by appealing to a constructive theory, viz., Maxwell’s theory of electromagnetism. Physicists often remark that Einstein’s original derivation is undesirable precisely because it depends on Maxwell’s theory. Einstein himself expresses this sentiment in the opening paragraph of the article that contains his second derivation:

The question as to the independence [from Maxwell’s theory] of those relations [like E = mc²] is a natural one because the Lorentz transformation, the real basis of the special theory, in itself has nothing to do with the Maxwell theory and because we do not know the extent to which the energy concepts of the Maxwell theory can be maintained in the face of the data of molecular physics (Einstein, 1935, p. 223).

The methodological prescription implicit in these remarks is that whenever one has a result that applies to any physical process whatsoever one should look for a derivation at the level of the principle theory, i.e., one should look for a TDE. As Einstein explains, this is particularly important because a general result like E = mc² should not depend on the correctness of a constructive theory like Maxwell’s theory[9]. If this is correct, then we have good grounds for rejecting Einstein’s first derivation as a good explanation. What Einstein’s second derivation shows is that mass-energy equivalence has nothing to do with Maxwell’s theory. Instead, E = mc² is a direct result of changes to the dynamical quantities that must be made in light of the changes to the structure of spacetime introduced by SR. The remarkable aspect of this is that, although the equivalence of mass-energy is confirmed consistently in experiments with sub-atomic particles, it is not a consequence of a (nuclear) theory of matter, which might provide a causal explanation. Thus, the closest thing we currently have to an explanation of E = mc² is a TDE.

I have argued that the distinction between BUEs and TDEs is a manifestation of a deeper distinction between two levels of physical theory we have inherited from Newton. Principle theories only offer TDEs while constructive theories offer BUEs. Recognizing this allows us to understand the choice of representative examples selected by proponents of either view. For example, it should come as no surprise now that Salmon’s paradigm of a theoretical explanation is KTG’s explanation of the ideal gas law. Furthermore, since the type of explanation one finds for a given law depends crucially on whether this law is part of a constructive theory or a principle theory, we can also see why the respective views of explanation work so well where they do. For example, one of the challenges for the top-down view is to account for the problem of asymmetry. But once we recognize that TDEs only occur at the level of the principle theory, asymmetry is no longer a problem. This is because at this level explanations are symmetric, which should come as no surprise since these explanations are non-causal. If we had a fully axiomatized principle theory, we would have a great deal of freedom in choosing what laws to count as primitives (or principles) of the theory. Though physicists do not work with fully axiomatized theories, there is still some freedom in choosing the ‘primitives’ of a principle theory. For example, in Newtonian mechanics, one could treat the law of conservation of momentum as a principle, i.e., a primitive, and not a derived consequence as in Newton’s original formulation. Where asymmetry is really necessary for an explanation is at the level of a constructive theory where BUEs achieve it by their appeal to causes.

Finally, if I am correct, then we can meet Salmon’s challenge to characterize the relationship between BUEs and TDEs. The fundamental laws of a principle theory are required in order to tell the causal stories that underwrite BUEs. We simply would not be able to account for the pressure of a gas in KTG, for example, without making an appeal to the principle of conservation of momentum. It is not just that we need to appeal to these principles, implicitly or explicitly, to tell our causal stories. We simply cannot carry out the detailed calculations required to derive the gas laws without appealing to fundamental physical principles. So, while we cannot say that BUEs are “constrained” by TDEs, as we say of the two corresponding levels of theory, we can say that BUEs require fundamental laws. And these fundamental laws, in turn, accept only TDEs. In short, our scientific understanding of a law belonging to a constructive theory inevitably rests on the TDEs the fundamental laws receive from principle theories.

[1] Although a number of authors, including Jammer (1961), have argued that this proof does not work, Torretti and Stachel (1982) have more recently shown that it does not contain any errors. Still, as with other proofs of this result, what cannot be established is that all the mass of the body can be converted to energy. This must be accepted as an additional hypothesis in this proof.

[2] Einstein appeals to the law of conservation of momentum, the principle of conservation of energy, a transformation equation for the energy of an electromagnetic pulse, and the existence of a Newtonian limit in his derivation.

[3] Einstein points out that the result is entirely general because, since all types of energy are convertible, it does not matter that the energy emitted by the body is in the form of electromagnetic radiation.

[4] For Einstein, molecules are correctly regarded as hypothetical constituents because when the theory was first postulated we did not yet have sufficient grounds to believe in their existence.

[5] One need not take “matter” to mean physical bodies. One can also include fields, like the electromagnetic field, as is traditionally done in SR.

[6] Einstein is clearly dealing here with a question regarding the ‘context of discovery’ of physical theories. Both principle and constructive theories (or our beliefs in the principles and ‘entities’ they postulate) are justified empirically.

[7] This is, of course, only a necessary and not a sufficient condition. Also, Einstein was well aware that the physical principles that come to be codified as mathematical criteria are arrived at through a process of idealization that may well go beyond our past, current, and future experimental results.

[8] Einstein is fond of saying that the necessary conditions established by a principle theory must be satisfied by the theoretical description of any physical process. This requirement seems a bit too strong. Physicists can, and sometimes do, choose to ignore one or more of the requirements imposed by the principle theory. However, this is not taken lightly because of the weight of evidence accumulated for most principles. It is a delicate philosophical, and scientific, matter to decide the merits of such an exchange. For example, one would be hard pressed to give up a well-entrenched physical principle, like the principle of conservation of energy, merely for the purposes of “saving the phenomena.”

[9] This was particularly pressing in this case because of then recent discoveries regarding the quantization of energy.