[On Freedom and Free Enterprise: Essays in Honor of Ludwig von Mises (1956)]
In an interesting, though apparently neglected, aside, Professor Hayek has remarked that “ … neither aggregates nor averages do act upon one another, and it will never be possible to establish necessary connections of cause and effect between them as we can between individual phenomena, individual prices, etc. I would even go so far as to assert that, from the very nature of economic theory, averages can never form a link in its reasoning.”[1]
Now, any serious doubt concerning the validity of aggregates and averages is a dagger aimed straight at the heart of much current empirical research and statistical analysis in economics. Therefore it deserves close and systematic attention, even if this involves, in the opinion of some dedicated empiricists, an annoying interruption of the “front-line” activity of measurement for the mere purpose of “armchair” discussion of methodological issues.[2] Yet such is our contemporary spirited march on “objective data” that one who begins to suspect that a wrong turn may have been made sometime back almost naturally feels guilty for harboring this traitorous thought, and, if he expresses it at all, must expect to be regarded as a ruminant obstructionist in the company of men of action.
But methodology should need no apology. In the first place, anyone concerned with policy and the “planned economy” should be the least able to deny the need for a “planned economics.” Second, as even a little reflection will show, so great a proportion of our data is presently accumulating in the form of aggregates and averages[3] that it would be, in prudence, uneconomic to ignore a possibility which, if true, would largely vitiate their usefulness.[4] Nor can economics avoid the difficulty by leaning, methodologically, on the physical sciences. On the one hand, it is by no means established that the physical sciences can make more than tentative, hypothetical use of statistical inference and probabilistic reasoning; and, on the other, even if they could, it would not necessarily follow that the kind of problem which is posed by economics is amenable to the same treatment.[5] That the implications of the concept of “law” in the natural sciences rule out applicability to the social sciences has been pointed out by too many[6] to need further discussion here. Our task here is not to discuss any of the broad methodological and even philosophical aspects of economic science, important and interesting as these doubtless are; it is, rather, the relatively narrow one of inquiring into some of the characteristics of averages and aggregates which may escape the attention of research workers in our field, and to attempt to clarify a few of the implications in their use, since this use is so integral a part of empirical research.
It might be well for us to note at the outset of our discussion something frequently pointed out about the statistical method: that it divides into two easily distinguishable, though, as we shall see, not entirely unrelated parts — (1) the description of phenomena, and (2) the drawing of inferences as to meaningful relations among these phenomena.[7] Inasmuch as the second of these aspects necessarily depends upon, and makes detailed use of, the first as its data, it follows that any averages which are an important part of descriptive statistical inquiry must also, eo ipso, enter the second, or inferential, stage. Nor is this the full extent of their involvement in statistical inference. Averages also enter inferential statistical analysis independently of their descriptive value; analysis has come to depend upon them not merely because its data are usually in that form, but also, more significantly, because it appears to be able to draw tighter inferences concerning probability distributions of data when these data are in the form of averages than when they are in the “raw” form of individual observations. The possibility that this seeming initial advantage may ultimately result in error or distortion is precisely the point of our discussion.
Regard for space and for the reader’s patience does not permit an exhaustive examination of the many problems posed by the use of aggregative materials in general in our field; this discussion will therefore restrict itself to some reflections on averages as a special type of aggregate. It seems best to proceed by listing several characteristics of averages and examining their implications.
1. The Average Is a Special Type of Aggregate
It has been noted[8] that an average is merely a certain value of the variable which it measures and is therefore necessarily of the same dimensions as that variable. Thus, if the variable is age, or income, or a percentage, an average of that variable is expressed, respectively, as an age, income, or percentage.[9] In this fact, otherwise so convenient, may lurk some danger. If averages are really theoretical constructs at some remove from reality (as further discussion will attempt to show), then the fact that they are expressed in the same terms as the variable itself may give them an illusory realism and may lead the incautious to confuse shadow with substance.[10] Of course, this danger is very much smaller in those areas where realities are unmistakably in discrete units; no one would be likely to overlook the fact that an average family of, say, 2.73 members is merely a symbol and describes no real family. But where reality is less discrete or the units less shudderingly indivisible, the danger persists. And it is just as serious, where the explanation of causality is concerned, in the case of even a perfectly continuous variable as in that of a clearly discontinuous one — it is merely much less obvious.
In its own way, the average admits of the cumulative addition which is more usually associated with other aggregates. It is commonplace procedure, wherever the number of individual observations is large, to avoid the pedestrian task of adding individual items by calculating from frequency distributions; yet this procedure will be seen necessarily to involve averaging within each class interval on the basis, not of specific and exact information (which may even be no longer available), but of some broad assumptions about the distribution of individual values within the class grouping. These assumptions tend to introduce into our calculations a systematic error which even ingenious mathematical manipulations (like “Sheppard’s corrections,” for example) cannot entirely eliminate.[11] In current economic research, many entities which inspection would show to be themselves averages are then combined into further aggregations or super-averages of which the price-indices are, perhaps, the arch-example. If the process of averaging in any way logically involves a retreat from causally effective specificity, this process of cumulative averaging presents us more and more with a Gordian knot which only the usual drastic surgery, and not mere statistical adjustment, can undo.
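As a minimal illustration of this source of error (the figures and the class width below are invented for the purpose, not drawn from any study cited here), one may compare the mean of a set of raw observations with the mean recomputed from a frequency grouping in which every item must be assumed to lie at its class midpoint:

```python
# Illustrative sketch with invented figures: the mean computed from a frequency
# grouping (each item assumed to sit at its class midpoint) differs
# systematically from the mean of the raw observations.
incomes = [1000, 1200, 1300, 2100, 2200, 2400, 2450, 4800, 4900, 9700]

# Exact mean from the individual observations.
exact_mean = sum(incomes) / len(incomes)          # 3205.0

# Group the same data into classes of width 2,000 and assume, as grouped
# calculation must, that every item lies at the midpoint of its class.
width = 2000
grouped_total = 0
for x in incomes:
    lower = (x // width) * width                  # lower boundary of the class
    grouped_total += lower + width / 2            # midpoint stands in for the item
grouped_mean = grouped_total / len(incomes)       # 3400.0

print(exact_mean, grouped_mean)
```

With these particular figures the grouped calculation overstates the true mean by about six per cent; with another grouping it might understate it, but the error is built into the grouping itself and no general correction can be counted on to remove it entirely.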
2. Averages Are Mental “Constructs”
An average is not an immediate datum of experience but an indirectly apprehended “summary” of the data of perception. In this regard, it seems fair to say that an average is of the form of a proposition,[12] and one whose determinacy may depend, as Marschak points out,[13] on “a priori information” in our possession even before we collect the data on which we base it. The immediately apprehended data of experience are relatively independent of concepts and theory; averages are not, but are, rather, described fact.[14] An average cannot, therefore, be regarded as a simple aggregation of individual observations; it attempts to summarize and thus necessarily sacrifices a certain measure of realism for the sake of numerical accuracy.[15] The desire for this form of “accuracy” is, of course, part of the age-old conviction in economics that, if we can quantify economic phenomena, we can then formulate “laws” applicable to them;[16] but it has not always been recognized that the requirements of quantification and of the formulation of laws may tend to subordinate the basically individual nature of phenomena — that is, to regard them as merely representative illustrations of laws.[17] Perhaps because of the special significance of differences in the social sciences, the suppression of the individuality of things as scientifically unimportant which Max Weber termed “naturalistic monism in economics” appears to have especially important consequences, among which may be the veneration of averages which we are discussing.[18]
In this connection, it is perhaps very important to realize that the average is, in a sense, the denial of the significance of differences and changes.[19] The notion of an average necessarily suppresses, in the dimension averaged, whatever variations or “deviations” there may be among its components; and, even if the fact of the individual differences is not deliberately thrown away, those differences, so long as the average substitutes for the original data in further computations, are rendered entirely indeterminate and thus causally inoperative. When we say that the average income of a group of, say, ten families is $4,000, and go on to use this figure in our explanations of economic results, we are implicitly transferring the causative power of the individual incomes which went into that average to a group of ten fictitious families each of which is presumed to have an income of $4,000. Some, indeed, even appear ready to proceed to draw important inferences as to the “propensity to consume” of this group as contrasted with that of another group whose average income is, say, $5,000. But it is possibly inconsistent to insist on the consequences of a difference between two averages in this respect while leaving out of account the differences within each of them; the latter type may even be the more causally significant of the two. In any event, there may be at least as much economic “force” explained by the fact that, within each group, there may be a very wide range of difference of individual incomes as by the necessarily attenuated differences between averages. If, in the first group in our example, there were 9 families with incomes of $1,000 and 1 with $31,000 (average: $4,000), and, in the second group, 4 with $11,000 each and 6 with $1,000 each (average: $5,000), there would conceivably be much more causative “potential” present than is shown in the comparison of the averages. The “average propensity to consume” is thus possibly one of our crasser abuses of the average.
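The arithmetic of the example is easily checked; the following sketch simply restates the two hypothetical groups of families and sets the difference between their averages beside the dispersion concealed within each:

```python
# Sketch of the example in the text: two groups of ten families whose averages
# differ by $1,000, while the spread hidden inside each group dwarfs that gap.
group_one = [1000] * 9 + [31000]        # nine families at $1,000, one at $31,000
group_two = [11000] * 4 + [1000] * 6    # four at $11,000, six at $1,000

mean_one = sum(group_one) / len(group_one)   # 4000.0
mean_two = sum(group_two) / len(group_two)   # 5000.0

print(mean_two - mean_one)                   # 1000.0  -- the difference of averages
print(max(group_one) - min(group_one))       # 30000   -- range within the first group
print(max(group_two) - min(group_two))       # 10000   -- range within the second group
```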
To the extent that economic action is ultimately dependent for explanation on individual differences,[20] the employment of averages puts us out of reach of such explanation simply by understating these differences. For an average, by its nature, can only minimize, if not entirely eliminate, differences; it can never magnify them. There is thus no possibility of drawing comfort from any compensating effect of large numbers; for the distortion brought into play by the use of averages cannot, ironically, itself be “averaged out.” The least distortive possibility for an average is neither to minimize nor magnify — and this only in the case of identical components (in our example, that of ten families each with an actual income of $4,000 or $5,000) — in the very case, in other words, in which the average loses most, if not all, of its representative usefulness. For the “construct” of average will be seen on reflection to owe its very existence to differences; there would be no need or even usefulness in its calculation or use were it not for such differences. The student of statistics who experiences any surprise whatever in reading that the sum of deviations from the arithmetic mean always equals zero either never before really understood the meaning of average or is momentarily dazzled by a new terminology; the statement is merely tautological — it is true “by construction.”
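The tautology can be displayed in a line, since it follows at once from the definition of the arithmetic mean:

\[
\sum_{i=1}^{n}\bigl(x_i-\bar{x}\bigr)
=\sum_{i=1}^{n}x_i-n\bar{x}
=n\bar{x}-n\bar{x}=0,
\qquad\text{where }\bar{x}=\frac{1}{n}\sum_{i=1}^{n}x_i .
\]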
Why any averages, then? Precisely because the individuality of cases (in the physical as well as the social sciences) has often proved intractable for those intent on the discovery of exact laws to describe and predict events.[21] There is a principle of “safety in numbers” even in science, it appears, and when the unit is recalcitrant to exact ordering, we retreat into consideration of great masses of such units and appear to find regularities in their group behavior to compensate for our frustration vis-a-vis the single unit.[22] And this, to repeat, is not true of the social sciences alone;[23] the reaction of modern physics, for instance, to the Heisenberg principle of uncertainty was the recasting of subatomic hypothesis along probabilistic lines[24] — and only time will tell whether this turns out to be a form of mere temporizing, since the methodological, and even philosophical, implications of this approach have yet to be fully faced.[25] In our science, the individual economic reality has shown itself to be even less docile than the single electron; for, while it is at least possible to posit average behavior for particles which are by hypothesis identical in structure and unchanging in composition over time, human action remains undeniably individual and capriciously changeable.[26] Yet it is the same mass analysis to which we have resort; and it is curious that social scientists — with less reason to be — appear much more comfortable in their adoption of the aegis of “large numbers” than are the physicists.[27]
The average is thus part of our response to the elusiveness of economic reality. And what is the price we pay for the elimination of the troublesome differences? One is that these differences are not really eliminated but merely made indeterminate. By extending our use of averages into “distributions” we appear still to have a hold on the differences; we can express a whole “population” with only two parameters: the mean and the standard deviation. Many assure us that a distribution is entirely determinate if only these two parameters are known; yet it is not often pointed out as clearly as it deserves that, in the first place, these are almost never known with any exactitude, but only within degrees of probability or “confidence limits,” and, secondly, that within any realistic range of empirical practice, the degrees of probability attainable are much lower than those which obtain in other disciplines. It is not of much use for the proponents of the aggregative statistical approach to remind us that, after all, we know nothing inductively with absolute certainty; we may readily admit this and still be unable to order our economic affairs on the basis of the probabilities they offer; we can admit that we can expect night to follow day only with a very high degree of probability and still wish we were just as “uncertain” about market behavior. The inadequacy of current economic statistical inquiry cannot be avoided by simply substituting probability for certainty — where it was true that we were not able to derive laws with any certainty, it is now equally true that we cannot derive them with a sufficiently high degree of probability to be of any practical use. While the difference may be less embarrassing, it is no less real.
Another cost of abandoning research to the frenzied accumulation of averages and other aggregates has been the resulting loss of specificity in our data.[28] An average is indeterminate. Once it is computed, if the component individual items are not retained, it tells no unique story; there are literally an infinite number of constellations of data which might have resulted in this same average figure. It is, therefore, also irreversible. It is impossible to reason back from an average to the original items which formed it; it is freely admitted that there is a “loss of information” involved. But it should be borne constantly in mind that this loss is, on the one hand, irretrievable — we cannot have recourse to averages as we do to logarithms: for ease of computation at the end of which we reconvert to real terms; and, on the other, that the loss may be precisely in the area where we can least afford it — that of particular differentials where economic causality appears to originate.[29] The attractive stability which aggregates, including averages, exhibit in contrast to individual events may thus be purely illusory; this “stability” appears to increase directly with the inclusiveness of totals and may be nothing more than the result of the progressive elimination of significant causative differences. If we average over long enough periods of time, even the business cycle itself will disappear. Therefore, even apart from other shortcomings of averages, there is a point beyond which even their most enthusiastic supporters must beware of going, or risk leaving all meaningfulness behind, regardless of the degree of mathematical sophistication. It is possible that this same phenomenon of loss is significant, though, of course, in minor fashion, in the simplest average; it is undoubtedly so in procedures which compound, out of already complex averages, still larger ones.
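A trivial sketch (the figures are invented) makes the irreversibility plain: once the components have been discarded, the single figure of $4,000 cannot tell us which of these very different constellations produced it.

```python
# Sketch with invented figures: very different constellations of data collapse
# to the same average, and nothing in the average allows us to reason back.
constellations = [
    [4000] * 10,                                   # ten identical incomes
    [1000] * 9 + [31000],                          # extreme concentration at the top
    [500, 1500, 2500, 3500, 4500, 5500, 6500, 7500, 3000, 5000],  # a spread-out case
]

for data in constellations:
    print(sum(data) / len(data))                   # each line prints 4000.0
```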
3. The “Superiority” of the Mean as a Measure of Location
It is common for texts in statistical method to point out that the mean is, for most purposes, the best of the available measures of “central tendency.” This claim of superiority for the mean appears to be based primarily on the often observed phenomenon that it exhibits more stability over a number of samplings than do other measures of location.[30] In practice this stability shows itself in the fact that the means computed from a number of samplings tend to be clustered more closely than is usually the case with either individual observations or with other measures of location, like the median or the mode. Now this proves to be a crucial claim which deserves to be examined closely and critically,[31] since it relates not only to the average as a tool of description, but, even more importantly, to its use in statistical inference.
One might begin by asking whether the stability or clustering involved here inheres in the subject matter described by the mean or is contributed, partly or wholly, by the measure itself. When we say that the mean is a better measure of central location, are we praising it as a more accurate description of the distribution of the actual variable, or as a construct which, by its very composition, tends to manufacture more central tendency than may possibly inhere in the observations of reality as actually made? The illuminating fact that the means of samples drawn from certain populations show more clustering than the single observations themselves appears to be more indicative of the second possibility than of the first. An interesting further aspect of this phenomenon will be discussed in a later section; here we content ourselves with inquiring into what assumptions, if any, would appear to underlie the presumed superiority of the mean in its descriptive aspect.
It is perhaps worthy of note, in this connection, that if we assume a population which is perfectly “normal” in the statistical sense, the purely descriptive superiority of the mean over, say, the mode and the median largely disappears. In such a case, the three measures coincide completely and the mean would offer no descriptive advantage; indeed, since it is somewhat more laborious to compute, quite the reverse would appear to be true. Its sampling (i.e., clustering) superiority would, of course, remain, but this, as we have said, may be extraneously introduced by the very concept of averaging. It is only as we begin to leave “normality” of distribution that the descriptive superiority of the mean asserts itself. Let us see what this implies.
The two salient characteristics of a normal distribution are its symmetry and unimodality. If we consider small departures from normality by introducing some asymmetry into our distribution (but retaining, for the moment, its unimodality), the three measures will cease to coincide. Under this condition, the mode will still describe the most typical value, but will no longer be located at the center of the distribution; the median will no longer fall at the most typical value, but will still indicate the center (though now only of the number of cases and not the center of total value); the mean will no longer lie at the typical class, nor at the numerical center, but will still indicate, so to speak, the “center of gravity” of the distribution (that is, the total value of the distribution divided by the number of cases). Now, it is clear that each of these measures has retained, according to its nature, a different kind of descriptive centrality; therefore, it is logical to assume that the claim of superiority for the mean must be based on the conviction that it retains the kind of centrality which is deemed most important to accurate description, in this case, namely, the centrality of total value. A little reflection will show that this conviction must, in its turn, be based on some notion of the additive nature of the phenomena measured (and we thus return, by another route, to recognition of the mean as one type of aggregate). But, at least in economics, it is no light matter to assume the additive nature of things; there are many who would deny vigorously, and with impressive arguments, such a possibility in any body of material relating to human valuations. Here, certainly, is an issue which should have been definitively settled before we could proceed to settle upon the average as a favored tool of calculative analysis; yet it was not. Until it is, it is at least permissible for some economists to regard the modal value, since it occurs more frequently in actual experience than a theoretically adjusted, virtual value like the average, as more useful for their field. The argument that the mean is representative of the whole distribution (while the mode is not) and can thus enter further algebraic calculation should not deceive us. In the first place, ease of further mathematical treatment is not, by itself, sufficient to justify the average; in the second, the representativeness alluded to may ultimately depend on the unsupported assertion of the additivity of economic phenomena.
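A small numerical sketch (the income figures are invented) shows the three measures parting company in a skewed distribution, each retaining the kind of centrality described above:

```python
# Sketch with invented income figures: in a skewed distribution the three
# measures of location diverge, each preserving a different kind of centrality.
from statistics import mean, median, mode

incomes = [1000, 1000, 1000, 1000, 2000, 2000, 3000, 5000, 9000, 25000]

print(mode(incomes))     # 1000   -- the most typical single value
print(median(incomes))   # 2000.0 -- the middle of the number of cases
print(mean(incomes))     # 5000   -- the "center of gravity" of total value
```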
If we depart from unimodality as well as from symmetry in distribution, the descriptive value of the mean recedes even further from actual cases and becomes more clearly a purely theoretical symbol whose superior applicability to problems of both description and estimation admittedly diminishes.[32] It turns out, therefore, that the area of superiority of the mean is the relatively narrow one determined by distributions which differ only mildly from complete normality. The scope of this paper does not permit any detailed examination of the important corollary which suggests itself: the question as to the extent to which real economic phenomena naturally arrange themselves in the shape of quasi-normal distributions; this consideration alone would take us far afield into such intricate matters as the theory of probability,[33] the nature of causality and even the nature of reality.[34] For our special purpose here it is perhaps sufficient to recognize that much current research appears to be based on the proposition that near-normal distributions accurately describe many important economic realities. We must therefore cope, in the next two sections, with the possibility that some aspects of this seeming regularity in the statistical material we use may perhaps have been inadvertently introduced by ourselves in the very act of adopting averages as a tool of analysis.
4. Some Assumptions about Phenomena Implicit in the Use of Averages
We have seen that the justification of the use of averages may depend to a great extent on the validity of the assumption that the phenomena so treated are de natura usually distributed in a manner more or less approximating the normal curve.[35] This assumption implies, in turn, a number of propositions about the nature of the average and its components; it is therefore perhaps useful to examine each of these briefly to determine whether or not they appear to be valid, especially in the case of economic data.
(a) “Continuous” variables. In any strict sense, a variable cannot actually be perfectly normally distributed if it is of the discontinuous type.[36] As an illustration of this which will again be useful later on, let us consider the well-known convergence of the binomial and normal distributions. The binomial expression often used in the elementary theory of probability as applied to two events is:
(p + q)^n
The expansion of this binomial, as the exponent n is increased, produces coefficients (of p and q and their intermediate terms) which arrange themselves in a symmetrical and unimodal fashion. The resulting histogram — if one were to draw it to aid visualization — while it approaches the normal distribution[37] as the exponent increases, can never actually become identical with the smooth curve of the statistically-perfect normal distribution because the intervals of the variable n do not, in this case, decrease infinitely; in order to arrive at the normal distribution in this instance it is necessary to imagine n as being able to take any value, no matter how small; in other words, to become a continuous variable. The eminent French mathematician, Henri Poincaré, has generalized the demonstration of this by showing that what leads to this distribution is a property possessed by any continuous variable, namely, that its derivatives are limited.[38]
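The convergence spoken of here can be tabulated in a few lines; the sketch below takes the symmetric case p = q = 1/2 and sets the exact binomial probabilities beside the normal curve having the same mean and variance. The comparison illustrates the approach to the bell shape while the variable itself remains discrete.

```python
# Sketch of the convergence: exact binomial probabilities for (p + q)^n with
# p = q = 1/2, set beside the normal curve having the same mean and variance.
# The coefficients trace out the bell shape, but the variable stays discrete.
from math import comb, exp, pi, sqrt

n, p = 10, 0.5
mu, var = n * p, n * p * (1 - p)

for k in range(n + 1):
    binom = comb(n, k) * p**k * (1 - p)**(n - k)                   # exact binomial term
    normal = exp(-(k - mu) ** 2 / (2 * var)) / sqrt(2 * pi * var)  # normal ordinate at k
    print(k, round(binom, 4), round(normal, 4))
```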
Now, how characteristic of economic phenomena is continuous variability? Are actual prices, production, income, market demand, or any of the other important data continuously variable? Not conceivably; therefore, the normal distribution can only apply to them theoretically (i.e., by a species of conceptual interpolation), and this fact should be carefully borne in mind in assessing the validity of any of the instrumentalities of analysis based on the dimensions of the perfectly normal curve — and that considerable part of statistical inference which depends, through the employment of some types of test of significance, for its validity on the “theory of errors” and other probability distributions should perhaps head the list.
(b) Independence. It is an important qualification of the application of the binomial we have been considering to the theory of probability that the events to which it refers be statistically independent; that is, that the occurrence of one have no effect whatever on the probability of occurrence of others.[39] This is clearly the case when we are dealing with the tossing of a perfect coin or the throwing of perfect dice; but one can reasonably wonder about the cogency of applying this sort of independence to economic, or any other social, events. Of how many human actions can we predicate the needed statistical independence, even when sampled at random?[40] The study of the social behavior of the individual is ever bringing to light new interrelationships in the economic responses of the gregarious social animal; we thus appear to be going toward the recognition not of less, but actually of more interconnection among social phenomena.[41]
(c) Mutual exclusiveness. Another requirement for the binomial is that the events p and q must be mutually exclusive; that the occurrence of both p and q together must be impossible. Again, this quality applies much more clearly to coins and dice than to people and their actions. The statistician may imagine he can satisfy this requirement by merely seeing to the form of his proposition: e.g., A either buys or does not buy. But how often is the real case one of buying less, or buying a substitute, which, when reduced to the terms of this proposition, is equivalent to both buy and not-buy? We can easily construct mutually-exclusive semantic categories which satisfy every analytical requirement except the crucial one of corresponding to everyday actualities.
(d) Exhaustiveness. Not only is the probability of p in our binomial exclusive of that of q, it is also necessary that, between them, they be exhaustive of the total probability. In the usual mathematical formulation, the entire range of probability is contained from 0 to 1, and what is required of p and q here (or, in the case of a multinomial, of the whole set of terms) is that they must invariably add exactly to unity.[42] Now it is patently impossible for the social scientist even to conceive of all the possibilities in his subject, much less to compute the probability-weight of each of them. And this certainly not for lack of trying; current economic literature gives eloquent, if inconclusive, evidence of heroic attempts to approach all-inclusiveness by the use of “models,” or systems of simultaneous equations — a method which appears to be able to explain nothing unless it explains everything. One cannot avoid the impression, in this regard, that economists may have been guilty of trying to arrive directly at the equivalent, in their field, of a Unified Field Theory without yet having formulated the component laws of gravitation and of electro-magnetism. We can readily admit that there is, after all, no science like omniscience and yet question the practical value of this approach as an avenue of knowledge for mortal man.[43]
(e) Homogeneity. It is a consequence of the additive implications of the average that the items entering it be homogeneous, or “of the same genus.”[44] This requirement assumes greater importance, indeed insistence, in measure as we are engaged — as is currently frequently the case — in compounding averages without always fully assessing their comparability; for most economic statistics are what R.G.D. Allen has called[45] “mixed bags” of heterogeneous items, whose claim to any homogeneity is either partial or contrived or both. In this regard, the statistical analyst must constantly guard against gross misinterpretation of the scope of his measurements; for a person of “average” income may be average in nothing else and, as we have seen, may be very far from typical even in that. Moreover, where data spanning an appreciable interval of time are concerned, certainty of homogeneity requires checking to exclude the possibility that any of those directly unobservable variations which have been termed “structural changes”[46] have entered to vitiate any real comparability of data. Consequently, the homogeneity requisite for valid quantification of economic data is, or should be, one of the most discouraging obstacles to mathematical analysis in economics.[47] One is never sure, for example, whether the prices (probably the most frequently used numerical quantities in our field) paid by different individuals, or by the same individual at different times, really differ by more or by less than their ratio seems to indicate, since the unit in which they are expressed is itself the object of varying individual appreciation. An average made up of prices with different valuation-meanings would have only a superficial homogeneity and therefore dubious validity.[48] This may very possibly be one reason for the puzzling inability of even the most elaborately devised price-index to furnish us a coefficient which can then be exactly and meaningfully applied to the very same individual data out of which the index itself was computed.
5. The Average Is a “Multiplier”
It is commonly thought that one of the clear advantages of the average is that it summarizes the information of many individual observations into the relatively brief compass of a single representative figure. In one sense this is undoubtedly true; where there were previously a number of items there appears now to be only one — and we have discussed some of the implications of the descriptive power of this “single” figure. Yet a curious fact emerges if we start with a finite number of actual observations and then consider the total number of possible averages which this finite number of items can produce. It is that, for any number of original observations in excess of two, the total number of averages possible: (a) exceeds the number of original items, and (b) rapidly outdistances the latter as these increase in number. Ordinarily, since we tend to regard the number of averageable events to be infinite, or at least indefinite,[49] this aspect of the matter is not apparent and we are likely to go on unquestioningly accepting the average as a distillation or summarization of information. (The concept of infinity, necessarily vague and elusive for us, is a poor frame of reference for our finite minds and experience; a larger finite number is, for instance, not perceptibly any nearer to infinity than a smaller, and the deduction that it is will be found to be based on a comparison of the finite numbers with each other and not with infinity.) Let us therefore, in the following discussion, consider only a finite and definite number of events or observations, say ten, and, in order to avoid any unintended numerical connotations, let us further designate these ten by A, B, C, … J.
Now, how can we determine the number of averages which these ten make possible? Here the mathematical theory of combinations comes readily to our aid;[50] according to this principle, the total number of combinations of n things taken r at a time is:

C(n, r) = n! / [r! (n − r)!]

Now, n things can variously be taken 0, or 1, or 2 … or n at a time, so that simply solving the above formula successively for r = 0, 1, 2, … n and adding the results will give us the total number of possible combinations. In our example, in which n was taken to be 10, the results can most graphically be shown by reference to the famous Pascal triangle:

n = 0:    1
n = 1:    1    1
n = 2:    1    2    1
n = 3:    1    3    3    1
n = 4:    1    4    6    4    1
n = 5:    1    5   10   10    5    1
n = 6:    1    6   15   20   15    6    1
n = 7:    1    7   21   35   35   21    7    1
n = 8:    1    8   28   56   70   56   28    8    1
n = 9:    1    9   36   84  126  126   84   36    9    1
n = 10:   1   10   45  120  210  252  210  120   45   10    1
C(10,0)  C(10,1)  C(10,2)  …  C(10,10)

Entered below the last line (that corresponding to our example of ten items) are the symbols indicating, respectively, 10 things taken 0 at a time (C(10,0)), 1 at a time (C(10,1)), and so on to 10 at a time (C(10,10)). The number directly above each of these symbols and along the line n = 10 indicates the number of different combinations possible in each case.

The first and second diagonal columns along the left side of the triangle will be seen to relate to the cases r = 0 and r = 1 respectively (that is, to things taken 0 and 1 at a time). Since neither of these types of combination can be considered an average, we have, for our purpose, excluded them by drawing a line between them and the rest of the triangle, the latter representing the whole gamut of combinations which can properly be considered as averages. Moreover, since the r = 0 column gives the value of unity throughout and the r = 1 column a value equal in each instance to the corresponding value of n, we can easily adapt the above formula for the total number of combinations so as to give us the total number of averages (Σ A(n, r)):

Σ A(n, r) = Σ C(n, r) − (n + 1)
Tabulating these results as n is increased from 1 to 10:
No. of events (n): | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
Total no. combinations (Σ C(n, r)): | 2 | 4 | 8 | 16 | 32 | 64 | 128 | 256 | 512 | 1024 |
Total no. averages (Σ A(n, r)): | 0 | 1 | 4 | 11 | 26 | 57 | 120 | 247 | 502 | 1013 |
Inspection of the progression of the total number of combinations shows that this figure is a function of n:
Σ C(n, r) = 2^n ;
as is, too, the total possible number of averages:
Σ A(n, r) = 2^n − (n + 1).
It is clear that the total number of averages it is possible to form out of a given number of items increases at a rate only slightly less than the number 2 raised to a power equal to the number of items. A glance at our tabulation shows that even for the relatively small number of components in our example (10) the number of possible averages has already exceeded 1,000. It is therefore not necessary to go beyond our very modest example to see the steeply multiplicative effect of averaging. Not only are averages, therefore, theoretical constructs and not data of experience, but they are also increasingly more numerous than the items of which they are usually presumed to be summaries. Here, perhaps, is loss of information of another sort; so that if, for example, we are given a set of 5 averages (still within the 10 ultimate items of our example) we have, in one sense, much less of the total picture (5/1013) than we have if we are given five actual observations (5/10).
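The counting can be verified by brute enumeration; the following sketch (purely a check, with the ten items lettered A through J as in the text) lists every combination of two or more items and recovers both the figure of 1,013 and the general formula.

```python
# Brute-force check of the counting argument: with ten items A ... J, the
# combinations of two or more items (those that can qualify as averages)
# number 2^n - (n + 1).
from itertools import combinations

items = list("ABCDEFGHIJ")
n = len(items)

averageable = [c for r in range(2, n + 1) for c in combinations(items, r)]

print(len(averageable))        # 1013
print(2 ** n - (n + 1))        # 1013, agreeing with the formula in the text
```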
But it is a central part of much statistical inference that averages tend to cluster much more closely than do single observations themselves.[51] How can we account for this in the face of the multiplicative tendency just discussed? A brief examination of our Pascal triangle will show that the multiplicity of combinations, far from being evenly distributed, is heavily concentrated about the point r = n/2 as a center. This concentration is such that at n = 10 the three middle classes of combinations include well over half of all the possible combinations for that interval. It is, perhaps, especially worth noting that this triangle reports nothing other than the coefficients of expansion of our old friend the binomial
(p + q)^n
as n increases. The relation of the concentration of averages to the approximation of the normal curve appears, in consequence, to become a little clearer. Averaging does not simply multiply cases; it multiplies them according to a principle which progressively approximates the normal curve. It is therefore not a wondrous quality of phenomena that their averages cluster; it is something we may have extraneously introduced by the very act of applying the average to their measurement. For in admitting averages and their greater multiplicity we have also admitted a great deal of concentrated overlapping or repetition. If we take as a crude measure of this overlapping the number of times any same single item (say, A of our original A, B, … J), plus the number of times each double (e.g., AB), plus the number of times each triple (e.g., ABC), and so on, are repeated within the total of combinations shown in our triangle, it can be shown by calculation with which we shall not further impose on the reader, that the greatest degree of overlap lies at the center of the average-size and decreases symmetrically on either side of it. Taking the last row of our triangle (i.e., at n = 10), and restricting ourselves to those combinations which can qualify as averages (i.e., from r = 2 to r = 10), and noting below each the total number of “repetitions” (singles, doubles, etc.) as described above, we have:
Size of average (r): | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
No. of averages: | 45 | 120 | 210 | 252 | 210 | 120 | 45 | 10 | 1 |
Total no. “repetitions”: | 10 | 45 | 120 | 210 | 252 | 210 | 120 | 45 | 10 |
Enough has perhaps been said to show that when it is pointed out by statistical workers that sample means approximate the normal distribution even if the observations themselves are skewed, what they may be saying is, in effect, that the semblance of symmetry can be introduced into non-normal distributions[52] by applying to the latter a device which has a built-in tendency to multiply differentially so as to lend centrality and unimodality to the data.
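The mechanism can be exhibited by a short simulation; the population below is invented and deliberately skewed, and the sample size of 25 is arbitrary. The point is only that the means of repeated samples spread over a far narrower range than the single observations drawn from the same population.

```python
# Simulation sketch (invented, deliberately skewed population): means of
# repeated samples cluster far more tightly than single observations do.
import random

random.seed(0)
population = [1000] * 800 + [5000] * 150 + [50000] * 50   # heavily skewed incomes

single_observations = [random.choice(population) for _ in range(1000)]
sample_means = [sum(random.sample(population, 25)) / 25 for _ in range(1000)]

def spread(xs):
    return max(xs) - min(xs)

print(spread(single_observations))   # very wide: on the order of the full range
print(spread(sample_means))          # much narrower range for the sample means
```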
6. Averages, Aggregates and Public Policy
We have seen that it is characteristic of averages and other aggregates (1) that they tend to suppress individual differences and actual typicalness[53] for the sake of quantification or “summarization,” and (2) that they represent, in economics as elsewhere in science, an attempt to deal with phenomena in the mass. In part, this latter is a reaction to the inability to deal, with any degree of certainty, with individual events and represents a compromise with epistemological difficulties.[54] Being unable to paint in clearly the details of our picture, we appear to have been content to back away from it by adopting the use of mass analysis and, further, to squint at reality through the half-closed lids of probabilistic reasoning.[55] Methods like these will make even a poor painting look good — but only so long as we neither come closer nor open our eyes. Ultimately, however, all will have to be judged in clear light and at close range; whatever we may do to disguise it, economic reality remains distressingly individual and particular. Moreover, it is unfortunately not yet widely enough appreciated — even by some scientists — that to adopt a probabilistic explanation of phenomena is tantamount to the flat denial of causality.
But in part, too, the current resort to aggregates of all kinds is a facet of our hastening approach to central control as an ideal in economic affairs. Bureaucracy requires classification of economic fact into relatively few broad bands of manageable “homogeneity”; it abhors differences because it simply cannot operate in a field of bewildering individual complexity. In a sense, socialism itself can be defined as the political form of central tendency;[56] it uses the concept of average not only as a means of computation but also as an end. In the fully developed ideal socialist state the “average” individual will no longer be a statistical device of the sort discussed here, but an accurate description of every actual individual.[57] This accuracy, however, will not have been attained by the refinement of descriptive method so as to fit actuality better, but actually the reverse. The aggregative approach in economics suits this program very well. The word “average” even etymologically betrays its redistributive reference — in this case specifically the redistribution of losses of cargo in transit.[58] And our contemporary treatment of whole aggregates like “income,” “wages,” “capital,” and the like is implicitly in the same vein. Within each of these aggregates lie innumerable functioning differences which have been merely suppressed by classification.[59] It is one thing to use these aggregates as a rough summary measure of past social and economic outcomes; it is quite another to regard them as causally operative upon one another.[60] Yet this appears to be what we are doing, and in no small measure as a result of the confusion as to the limitations of statistical devices in wide use. Our concern in this section is specifically with the average and with the somewhat desperate claim made by some that it is indispensable for the operation of controls in effecting public policy.[61] But this is a tenuous argument: one should be free to question the desirability of central planning and control — and therefore to point out that we cannot submerge the moral falsity of the assertion that the ends justify the means by the simple expedient of making the latter geometric or harmonic.
Notes
[1] F. A. Hayek, Prices and Production, London, 1935, 2nd ed. rev., pp. 4–5.
[2] Even Marshall comes close to this view in his advice to Schumpeter; (cf. P. A. Samuelson, “Economic Theory and Mathematics — An Appraisal,” Amer. Econ. Rev., vol. xlii, May 1952, no. 2, p. 65).
[3] Cf., e.g., R. A. Gordon, “Business Cycles in the Interwar Period: The Quantitative-Historical Approach,” Amer. Econ. Rev., vol. xxxix, May 1949, No. 3, pp. 51–3. Gordon points out that both the econometric “models” of Tinbergen and the Cowles Commission group and the cycle studies of the National Bureau as well as other forms of statistical approach find it difficult to cope with information which cannot be quantified and expressed in the form of averages.
[4] We cannot, I think, simply accept “aimless floundering” as inevitable for the social sciences because of their “youth” as Miss Wootton appears to do (cf. Testament for Social Science, New York, 1950, p. 71). Indeed the very fact, which this writer rightly deplores, that “many blind alleys are long ones, and … we do not always recognize this till we have gone a very long way off the right track” is evidence of the ultimate economy (much like that of all indirect production) of pausing for methodological issues.
[5] Cf. F. S. C. Northrop, The Logic of the Sciences and the Humanities (New York, 1949), pp. 33, 240–3; P. A. Samuelson, Foundations of Economic Analysis (Cambridge, Mass., 1948), pp. 91, 93, 226, 351–2.
[6] Cf., e.g., M. Weber, The Methodology of the Social Sciences (Glencoe, Ill., 1949), pp. 73–5, 86; J. Marschak, “Probability in the Social Sciences,” in P. F. Lazarsfeld, ed., Mathematical Thinking in the Social Sciences (Glencoe, Ill., 1954), pp. 190, 194; Northrop, op. cit., pp. 212, 243–9, 261, 263; T. G. Connolly and W. Sluckin, An Introduction to Statistics for the Social Sciences (London, 1953), p. 101.
[7] For a discussion of the application of this double aim to physical science generally, cf. P. Duhem, “Representation vs. Explanation in Physical Theory,” in P. P. Wiener, ed., Readings in the Philosophy of Science (New York, 1953), pp. 454ff. Cf. also J. M. Keynes, A Treatise on Probability (London, 1921), pp. 3, 327.
[8] G. U. Yule and M. G. Kendall, Introduction to the Elementary Theory of Statistics (New York, 1950), 14th ed. rev. and enl., p. 112.
[9] It is interesting to note that even σ, the other determining parameter of a distribution besides the mean, is itself not free of the difficulties of averaging. We compute each deviation from the mean in arriving at the variance, but in computing the standard deviation we extract the square root of the average of the squared deviations, thus causing extreme cases to affect the variance and the standard deviation unequally. Cf., e.g., L. Cohen, Statistical Methods for Social Scientists (New York, 1954), p. 46; W. E. Deming and R. T. Birge, On the Statistical Theory of Sampling (Washington, D.C., 1937), p. 147; P. G. Hoel, Introduction to Mathematical Statistics (New York, 1954), 2nd ed., p. 52.
[10] Curiously, Yule and Kendall (op. cit., pp. 113–4), appear to base their claim of easy comprehensibility for the average precisely by ignoring this danger; their example of an average income in this connection admittedly assumes an equalizing redistribution (statistically, of course) of income. Cf. also F. A. Hayek, The Counter-Revolution in Science (Glencoe, Ill., 1952), pp. 36–43.
[11] Cf., e.g., R. G. D. Allen, Statistics for Economists (London, 1949), pp. 86–7; J. F. Kenney, Mathematics of Statistics, Part One (New York, 1947) 2nd. ed., p. 78.
[12] Cf. Northrop, op. cit., pp. 35, 39, 247, 261.
[13] Cf. Marschak, op. cit., pp. 198–9.
[14] Cf. L. Robbins, An Essay on the Nature and Significance of Economic Science (London, 1935), 2nd ed., p. 105; C. V. Langlois and C. Seignobos, Introduction to the Study of History (London, 1898), p. 218; Hayek, The Counter-Revolution in Science, pp. 38–9.
[15] Cf. L. von Mises, Human Action (New Haven, 1949), pp. 347–54; M. J. Maroney, Facts from Figures (Harmondsworth, Middlesex, 1951), p. 43.
[16] Cf., e.g., A. Standen, Science is a Sacred Cow (New York, 1950), p. 82: “If the idols of scientists were piled on top of one another in the manner of a totem pole, the topmost one would be a grinning fetish called Measurement.” Cf. also Hayek, The Counter-Revolution in Science, pp. 50–1.
[17] Cf. Northrop, op. cit., pp. 241 ff., 268; Kenney, op. cit., p. 81; G. J. Stigler, Five Lectures on Economic Problems (New York, 1950), p. 43; R. A. Fisher, The Design of Experiments (London, 1937), 2nd ed., pp. 4, 119; C. E. Weatherburn, A First Course in Mathematical Statistics (Cambridge, 1946), p. 30; Maroney, op. cit., p. 37. Hayek points out (op. cit., p. 214, note 45) that the use of mathematics has no necessary connection to the attempts to measure social phenomena, but may be used merely to represent relationships to which numerical values cannot ever be assigned.
[18] Cf. Weber, op. cit., pp. 73, 75, esp. 86; Standen, op. cit., pp. 204–6.
[19] Cf. Mises, op. cit., pp. 223–4, 410–11.
[20] Cf. R. M. MacIver, Social Causation (Boston, 1942), pp. 27, 65, 377; Kenney, op. cit., p. 84.
[21] Cf. Weber, op. cit., p. 119.
[22] Cf. Fisher, op. cit, pp. 45, 225–6; T. C. Koopmans, “The Econometric Approach to Business Fluctuations,” Amer. Econ. Rev., vol. xxxix, May 1949, no. 3, p. 64; J. A. Schumpeter, “Science and Ideology,” Amer. Econ. Rev., vol. xxxix, March 1949, no. 2, p. 345. See especially Mises, op. cit., pp. 106–17, 396; the distinction here made between “class” and “case” probability appears to apply pertinently to this problem.
[23] Cf. P. A. Samuelson, “Economic Theory and Mathematics — An Appraisal,” Amer. Econ. Rev., vol. xlii, May 1952, no. 2, pp. 61–2.
[24] Cf. Northrop, op. cit., pp. 201–12; M. R. Cohen, Reason and Nature (Glencoe, Ill., 1953), 2nd ed., p. 224; K. Pearson, The Grammar of Science (London, 1937), pp. 128–9; MacIver, op. cit., pp. 54, 60 n.
[25] Northrop, op. cit, pp. 343–7.
[26] Cf. ibid., pp. 245, 248–9, 261–3; Connolly and Sluckin, op. cit, p. 101; P. A. Samuelson, Foundations of Economic Analysis, pp. 21–7; Marschak, op. cit., pp. 190–2.
[27] Cf. ibid., p. 194; Standen, op. cit., pp. 146, 155–6; MacIver, op. cit., p. 263; Wootton, op. cit., pp. 17, 21, 25, 30–1, 34–5.
[28] Cf. Allen, op. cit., p. 17.
[29] Cf. Hayek, “The Use of Knowledge in Society,” Amer. Econ. Rev., vol. xxxv, September 1945, no. 4, pp. 521–4.
[30] Cf. ibid.; also, Hoel, op. cit., pp. 50–1.
[31] In this section is discussed only the descriptive side of this claim; the inferential side will be examined later. Cf. Keynes, op. cit., p. 336; Deming and Birge, op. cit., p. 160.
[32] Cf. Connolly and Sluckin, op. cit., p. 29; L. Cohen, op. cit., pp. 40, 155; Hoel, op. cit., pp. 50–7. One extreme example is the so-called Cauchy distribution whose theoretical moments are infinite and hence where the median becomes a far better measure of location than the mean.
[33] Cf., e.g., A. Eddington, The Philosophy of Physical Science (New York, 1939), p. 61; Marschak, op. cit., pp. 2–3; C. S. Peirce, “The Doctrine of Necessity Examined,” in P. P. Wiener, op. cit., pp. 485–96.
[34] It is virtually impossible to discuss statistical distributions without being led, as most writers are, into probability theory. The works cited here are, of course, no exceptions; cf., e.g., Hoel, op. cit., p. 30; L. Cohen, op. cit., pp. 89–100; Yule and Kendall, op. cit., pp. 207–12, 312, 335–43; Connolly and Sluckin, op. cit., pp. 79, 87–8, 102; Lazarsfeld, op. cit., pp. 9, 168, 188, 423; Deming and Birge, op. cit., pp. 131, 137; Fisher, op. cit., p. 19; Kenney, op. cit., p. 131; Weatherburn, op. cit., pp. 34–5; Northrop, op. cit., pp. 210, 218; Samuelson, Foundations of Economic Analysis, p. 23. Also see especially H. Poincaré, Science and Hypothesis (New York, 1952), ch. XI, pp. 183–210 and Science and Method (New York, 1952), pp. 64–6, 74–90, 87–8, 284–8 (both in English transl. by F. Maitland).
[35] Cf. Connolly and Sluckin, op. cit., pp. 70–1; Yule and Kendall, op. cit., pp. 180, 185, 437; Kenney, op. cit, pp. 114–119; Fisher, op. cit, pp. 40–51; Poincaré, Science and Hypothesis, pp. 206–7.
[36] Cf., e.g., L. Cohen, op. cit., p. 61; Yule and Kendall, op. cit., p. 176; Keynes, op. cit, pp. 48–9.
[37] Cf. Yule and Kendall, op. cit., pp. 171–6; Poincaré, Science and Method, p. 79; L. Cohen, op. cit., pp. 71–2; Northrop, op. cit., p. 207; Connolly and Sluckin, op. cit., p. 69; Maroney, op. cit., pp. 91, 96, 129.
[38] Poincaré, Science and Hypothesis, pp. 193–200; Science and Method, pp. 78–84. Cf. also Weatherburn, op. cit., pp. 34–5; R. von Mises, “Causality and Probability,” in Wiener, op. cit., pp. 501–4.
[39] Cf. L. Cohen, op. cit., pp. 64–5. For a special feature of the Poisson distribution in this connection, see Maroney, op. cit., pp. 97–100.
[40] Statistical independence can also be described as “obedience to the multiplication theorem of probability,” (cf. Weatherburn, op. cit., pp. 26–7, 81); the distinction made by the latter between “statistical” and “functional” independence does not, I believe, necessarily eliminate the difficulty mentioned in this section. Cf. also Keynes, op. cit., p. 54.
[41] Cf. Marschak, op. cit., pp. 202–4; MacIver, op. cit., pp. 93, 300, 309; Fisher, op. cit., pp. 222–3.
[42] That this applies all the way to the limiting case of a perfectly continuous variable is illustrated by the similar equating to unity of the area under the normal curve; (cf., e.g., Maroney, op. cit., p. 113).
[43] Cf. Hayek, “The Use of Knowledge in Society,” Amer. Econ. Rev., vol. xxxv, September 1945, no. 4, p. 521.
[44] Cf. Maroney, op. cit., p. 35.
[45] Op. cit., p. 19.
[46] Cf. T. C. Koopmans, ed., Statistical Inference in Dynamic Economic Models (New York, 1950), p. 266; A. G. Hart, “Model-building and Fiscal Policy,” Amer. Econ. Rev., vol. xxxv, September 1945, no. 4, p. 538; P. A. Samuelson, Foundations of Economic Analysis, pp. 354–5; L. Cohen, op. cit., pp. 131–2; A. Marshall, Principles of Economics (New York, 1925) 8th ed., pp. 36–7.
[47] The instantaneous or timeless character of mathematics has no “passage” or duration and cannot represent, in its equations, the irreversibility of time; (cf. MacIver, op. cit., pp. 66–7). Cf. also Koopmans, op. cit., p. 3; Samuelson, op. cit., p. 4; and L. von Mises, op. cit., p. 56.
[48] Cf. Marschak, op. cit., p. 175; Northrop, op. cit., pp. 33, 239–43.
[49] Cf. Yule and Kendall, op. cit., p. 333. On the resort to probability analysis as a method of dealing with what we are ultimately ignorant of, cf. Poincaré, Science and Hypothesis, pp. 184–5, 189–90, 208–9; and Science and Method, pp. 64–5, 87–90, 284–8. Further, reasoning from probability — and tests of significance based upon it — may have only a permissive force; cf. Connolly and Sluckin, op. cit., pp. 87–8, 102, 153–5; L. Cohen, op. cit., pp. 89–99; Yule and Kendall, op. cit., pp. 207–12, 312, 335, 423, 437; Lazarsfeld, op. cit., pp. 9, 168, 188, 423; Deming and Birge, op. cit., pp. 131, 137 ff; Fisher, op. cit., p. 19; Maroney, op. cit., pp. 219–20.
[50] This is on the supposition that the order of the events is not a factor; (cf., e.g., Hoel, op. cit., p. 293). If the order of events entering the average were germane (as it very possibly could be in economics), we would have to deal not with “combinations,” but with “permutations” — an even more numerous group.
[51] Cf. Yule and Kendall, op. cit., pp. 382-7; 434-7; L. Cohen, op. cit., pp. 87-90; Connolly and Sluckin, op. cit., pp. 28, 81-5, 92-3; Deming and Birge, op. cit, p. 123; Weatherburn, op. cit., pp. 119-25; Allen, op. cit, p. 117; Keynes, op. cit, pp. 337-66.
[52] Cf. Hoel, op. cit., pp. 103-5; Maroney, op. cit., pp. 94, 135-40.
[53] Cf., e.g., Weber, op. cit., pp. 100-1; Fisher, op. cit., pp. 45, 225-6.
[54] Cf. L. von Mises, op. cit., pp. 39, 47, 57, 64, 86, passim.
[55] Cf. Hoel, op. cit., pp. 15, 29-30; Northrop, op. cit., pp. 210 ff. One part of this has been the resort to randomness and the related assumption of the equiprobability of whatever is not known; cf. Fisher, op. cit., pp. 23 ff; Poincaré, Science and Method, pp. 9-10, 66, 74-5, 80-1; Koopmans, op. cit., pp. 2-6; Connolly and Sluckin, op. cit., p. 79; Samuelson, op. cit., p. 23; Keynes, op. cit., pp. 7-15, 21-4, 42-4, 61-4.
[56] Cf. Northrop, op. cit., p. 355.
[57] Cf. K. Marx, Capital (New York, n.d.), Modern Library edition, p. 22; Samuelson, op. cit., p. 223; Hayek, The Counter-Revolution in Science, pp. 53-63; L. von Mises, op. cit., pp. 257, 697-9, 706-11.
[58] Cf. Maroney, op. cit., p. 34.
[59] Cf. A. N. Whitehead, An Introduction to Mathematics (New York, 1948), p. 32 ff; Keynes, op. cit., pp. 328-9. For a very recent and rather extreme example of faith in classification as the road to knowledge in economics, see E. C. Harwood, Reconstruction of Economics (Great Barrington, Mass., 1955), pp. 8-9; Mr. Harwood finds great comfort in the identification of “knowing” with the “naming transaction” as made by John Dewey and A. F. Bentley (Knowing and the Known, Boston, 1949, p. 296), and while he admits that “ … nothing just said enables economists or anyone else to use the word ‘knowledge’ for the purpose of specifying (scientifically naming) anything in particular”, he nevertheless asserts that, as a result of this approach, “ … the economists can at least climb down their various trees of ‘knowledge’ and survey the relatively firm ground of knowing and the known.” [One hastens to add that they had been in the trees for epistemological rather than atavistic reasons.]
[60] Cf., e.g., Samuelson, op. cit., pp. 9, 99, 118, 223-7, 351-2; Connolly and Sluckin, op. cit., pp. 118-35.
[61] For some very optimistic expectations expressed by writers on statistics in this regard, cf. Kenney, op. cit., p. 2. Cf. also Yule and Kendall, op. cit., p. 206.