Probability
1911 Encyclopedia Britannica
(Lat. probabilis, probable or credible), a term which in general implies credibility short of certainty.
The mathematical theory of probabilities deals with certain phenomena which are employed to measure credibility. This measurement is well exemplified by games of chance. If a pack of cards is shuffled and a card dealt, the probability that the card will belong to a particular suit is measured by - we may say, is - the ratio 1:4, or 1/4; there being four suits to any one of which the card might have belonged. So the probability that an ace will be drawn is 1/13, as out of the 52 cards in the pack 4 are aces. So the probability that ace will turn up when a die is thrown is 1/6. The probability that one or other of the two events, ace or deuce, will occur is 1/3. If simultaneously a die is thrown and a card is dealt from a pack which has been shuffled, the probability that the double event will consist of two aces is 1 × 4 divided by 6 × 52, as the total number of double events formed by combining a face of a die with a face of a card is 6 × 52, and out of these 1 × 4 consist of two aces.
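The ratios quoted in this paragraph can be checked by counting cases. The following is a minimal sketch, not part of the original article; the names `PACK`, `DIE` and `probability` are illustrative, and the die's "ace" is taken to be the face 1.

```python
from fractions import Fraction
from itertools import product

SUITS = ["clubs", "diamonds", "hearts", "spades"]
RANKS = ["ace"] + [str(n) for n in range(2, 11)] + ["knave", "queen", "king"]
PACK = [(rank, suit) for rank in RANKS for suit in SUITS]  # 52 cards
DIE = [1, 2, 3, 4, 5, 6]                                   # face 1 is the "ace"


def probability(cases, favourable):
    """Ratio of favourable cases to all cases, the cases being taken as equally possible."""
    cases = list(cases)
    hits = sum(1 for case in cases if favourable(case))
    return Fraction(hits, len(cases))


p_suit = probability(PACK, lambda card: card[1] == "hearts")
p_ace_card = probability(PACK, lambda card: card[0] == "ace")
p_ace_die = probability(DIE, lambda face: face == 1)
p_ace_or_deuce = probability(DIE, lambda face: face in (1, 2))
# Double event: every face of the die combined with every card (6 x 52 cases).
p_two_aces = probability(
    product(DIE, PACK),
    lambda pair: pair[0] == 1 and pair[1][0] == "ace",
)

print(p_suit, p_ace_card, p_ace_die, p_ace_or_deuce, p_two_aces)
```

The double event reduces to 4/312, that is 1/78, which is also the product of the two separate probabilities 1/6 and 1/13.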
The data of probabilities are often prima facie at least of a type different from that which has been described. For example, the probability that a child about to be born will be a boy is about 0.51. This statement is founded solely on the observed fact that about 51% of children born (alive, in European countries) prove to be boys. The probability is not, as in the instance of dice and cards, measured by the proportion between a number of cases favourable to the event and a total number of possible cases. Those instances indeed also admit of the measurement based on observed frequency. Thus the number of times that a die turns up ace is found by observation to be about 16.6% of the number of throws; and similar statements are true of cards and coins.1 The probabilities with which the calculus deals admit generally of being measured by the number of times that the event is found by experience to occur, in proportion to the number of times that it might possibly occur.
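The measurement by observed frequency can be imitated with a simulated die. This is only a sketch under stated assumptions: a pseudo-random generator stands in for a physical die, and the function name, the number of throws and the seed are arbitrary choices made for illustration.

```python
import random


def observed_frequency(throws, seed=0):
    """Proportion of simulated throws on which the ace (face 1) turns up."""
    rng = random.Random(seed)  # fixed seed so that the run is repeatable
    aces = sum(1 for _ in range(throws) if rng.randint(1, 6) == 1)
    return aces / throws


freq = observed_frequency(60_000)
print(f"observed frequency of ace: {freq:.4f} (a priori ratio: {1 / 6:.4f})")
```

With a large number of throws the observed proportion should settle near 1/6, in keeping with the figure of about 16.6% quoted above.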
The idea of a probable or expected number is not confined to the number of times that an event occurs; if the occurrence of the event is associated with a certain amount of money or any other measurable article there will be a probable or expected amount of that article. For instance, if a person throwing dice is to receive two shillings every time that six turns up, he may expect in a hundred throws to win about 2 × 16.6 (about 33.3) shillings. If he is to receive two shillings for every six and one shilling for every ace, his expectation will be 2 × 16.6 + 1 × 16.6 (about 50) shillings. The expectation of lifetime is calculated on this principle. Of 1000 males aged ten say the probable number who will die in their next year is 490, in the following year 397, and so on; if we (roughly) estimate that those who die in the first year will have enjoyed one year of life after ten, those who die in the next year will have enjoyed two years of life, and so on;
1 Cf. note to par. 5 below.
then the total number of years which the 1000 males 2 aged ten may be expected to live is 1 × 1000 + 2 × (1000 - 490) + 3 × (1000 - 490 - 397) + .... Space as well as time may be the subject of expectation. If drops of rain fall in the long run with equal frequency on one point - or rather on one small interval, say of a centimetre or two - on a band of finite length and negligible breadth, the distance which is to be expected between a point of impact in the upper half of the line and a point of impact in the lower half has a definite proportion to the length of the given line.3 Expectation in the general sense may be considered as a kind of average.4 The doctrine of averages and of the deviations therefrom technically called "errors" is distinguished from the other portion of the calculus by the peculiar difficulty of its method. The paths struck out by Laplace and Gauss have hardly yet been completed and made quite secure. The doctrine is also distinguished by the importance of its applications. The theory of errors enables the physicist so to combine discrepant observations as to obtain the best measurement. It may abridge the labour of the statistician by the use of samples.5 It may assist the statistician in testing the validity of inductions.6 It promises to be of special service to him in perfecting the logical method of concomitant variations; especially in investigating the laws of heredity. For instance the correlation between the height of parents and that of children is such that if we take a number of men all of the same height and observe the average height of their adult sons, the deviation of the latter average from the general average of adult males bears a definite proportion - about a half - to the similarly measured deviation of the height common to the fathers. The same kind and amount of correlation between parents and children with respect to many other attributes besides stature has been ascertained by Professor Karl Pearson and his collaborators.7
The kinetics of free molecules (gases) forms another important branch of science which involves the theory of errors.
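The expected winnings of the dice player mentioned earlier in this introduction follow from weighting each prize by the probability 1/6 of its face. The sketch below is illustrative only; `expected_winnings` and the dictionary of prizes are names introduced here, not taken from the article.

```python
from fractions import Fraction


def expected_winnings(throws, prizes):
    """Expected total in shillings; `prizes` maps a face of the die to its payment."""
    per_throw = sum(Fraction(1, 6) * amount for amount in prizes.values())
    return throws * per_throw


six_only = expected_winnings(100, {6: 2})           # two shillings for every six
six_and_ace = expected_winnings(100, {6: 2, 1: 1})  # and one shilling for every ace

print(float(six_only), float(six_and_ace))
```

Exact fractions give 100/3 (about 33.3) shillings in the first case and 50 shillings in the second, agreeing with the rounded figures in the text.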
The description of the subject which has been given will explain the division which it is proposed to adopt. In Part I. probability and expectation will be considered apart from the peculiar difficulties incident to errors or deviation from averages. The first section of the first part will be devoted to a preliminary inquiry into the evidence of the primary data and axioms of the science. Freed from philosophical difficulties the mathematical calculation of probabilities will proceed in the second section. The analogous calculation of expectation will follow in the third section. The contents of the first three sections will be illustrated in the fourth by a class of examples dealing with space measurements - the so-called "local" or "geometrical" probabilities. Part II. is devoted to averages and the deviations therefrom, or more generally that grouping of statistics which may be called a law of frequency. Part II. is divided into two sections distinguished by differences in character and extent between the principal generalizations respecting laws of frequency.
Part I. - Probability and Expectation
Section I. - First Principles
1. As in other mathematical sciences, so in probabilities, or even more so, the philosophical foundations are less clear than the calculations based thereon. On this obscure and controversial topic absolute uniformity is not to be expected. But it is hoped that the following summary in which diverse authoritative judgments are balanced may minimize dissent.
2. (1) How the Measure of Probability is Ascertained. - The first question which arises under this head is: on what evidence are the facts obtained which are employed to measure probability? A very generally accepted view is that which Laplace has thus expressed:
2 It is more usual to speak of the mean expectation, the average number of years per head.
3 Below, par. 88.
4 For more exact definition see below, par. 95.
5 See Bowley's Address to Section F. of the British Association (1906).
6 Edgeworth, " Methods of Statistics," Journal of the Statistical Society (Jubilee volume, 1885).
7 See Biometrika, vol. iii. "Inheritance of Mental Characters."
"The probability of an event is the ratio of the number of cases which favour it to the number of all the possible cases, when nothing leads us to believe that one of these cases ought to occur rather than the others; which renders them, for us, equally possible." 1 Against this view it is urged that merely psychological facts can at best afford a measure of belief, not of credibility. Accordingly, the ground of probability is sought in the observed fact of a class or "series" 2 such that if we take a great many members of the class, or terms of the series, the members thereof which belong to a certain assigned species compared with the total number taken tends to a certain fraction as a limit. Thus the series which consists of heads and tails obtained by tossing up a well-made coin is such that out of a large number of throws the proportion giving heads is nearly half.
3. These views are not so diametrically opposed as may at first appear. On the one hand, those who follow Laplace would of course admit that the presumption afforded by the "number of favourable cases" with respect to the probability of throwing either five or six with a die must be modified in accordance with actual experience such as that below cited 3 respecting particular dice that they turn up five or six rather oftener than once in three times. On the other hand, the series which is regarded as the empirical basis of probability is not a simple matter of fact. There are implied conditions which are not satisfied by the sort of uniformity which ordinarily characterizes scientific laws; which would not be satisfied for instance by the proportionate frequency of any one digit, e.g. 8, in the expansion of any vulgar fraction, though the expression may consist of a circulating decimal with a very long period.4
4.
The type of the series is rather the frequency of the several digits in the expansion of an incommensurable constant such as √2, log 11, π, &c.5 The observed fact that the digits occur with equal frequency is fortified by the absence of a reason why one digit should occur oftener than another.6
5. The most perfect types of probability appear to present the two aspects: proportion of favourable cases given a priori and frequency of occurrence observed a posteriori. When one of these attributes is not manifested it is often legitimate to infer its existence from the presence of the other. Given numerous batches of balls, each batch numbering say 100 and consisting partly of white and partly of black balls; if the percentages of white balls presented by the set of batches averaged, and, as it were, hovered about some particular percentage, e.g. 50, though we knew as an independent datum, or by inspection of the given percentages, that the series was not obtained by simply extracting a hundred balls from a jar containing a mélange of white and black balls, we might still be justified in concluding that the observed phenomenon resulted from a system equivalent to a number of jars of various constitution, compounded in some complicated fashion. So Laplace may be justified in postulating behind frequencies embodied in vital statistics the existence of a "constitution" analogous to games of chance, "possibilities" or favourable cases which might conceivably be "developed" or discussed.7 On the other hand, it is often legitimate to infer from the known proportion of favourable cases a corresponding frequency of occurrence. The cogency of the inference will vary according to the degree of experience.
That one face of a die or a coin will turn up nearly as often as another might be affirmed with perfect confidence of the particular dice which Weldon threw some thousands of times,8 or the coins with which Professor Pearson similarly operated.9 It may be affirmed with much confidence of ordinary coins and dice without specific experience, and generally, where fairplay is presumed, of games of chance. This confidence is based not only on experiments like those tried by Buffon, Jevons and many others,10 but also on a continuous, extensive, almost unconsciously registered experience in pari materia. It is this sort of experience which justifies our expectation that commonly in mathematical tables one digit will occur as often as another, that in a shower about as many drops
1 Laplace, Théorie analytique des probabilités, liv. II. ch. i. No. i., Introduction, Il e Principe.
2 The term employed by Venn in his important Logic of Chance.
3 Below, par. 119.
4 E.g. 1/61, in the expansion of which the digit 8 occurs once in ten times in seemingly random fashion (see Mess. of Maths. 1864, vol. 2, pp. 1 and 39).
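The behaviour of a circulating decimal described in this note can be examined by long division. The sketch below assumes that the fraction intended in the note is 1/61 (the printed figure is damaged, so this reading is a conjecture); the function name is illustrative.

```python
from collections import Counter


def repeating_digits(numerator, denominator):
    """Digits of one full period of numerator/denominator, found by long division.

    Assumes a purely periodic expansion, i.e. a denominator coprime to 10.
    """
    digits = []
    remainder = numerator % denominator
    start = remainder
    while True:
        remainder *= 10
        digits.append(remainder // denominator)
        remainder %= denominator
        if remainder == start:  # the remainders, and hence the digits, now repeat
            return digits


period = repeating_digits(1, 61)
counts = Counter(period)
print(len(period), sorted(counts.items()))
```

On this reading the period has 60 places and each digit, 8 included, fills one place in ten, although the whole sequence is fixed by a simple rule and is in no sense a matter of chance.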
5 The type shows that the phenomena which are the object of probabilities do not constitute a distinct class of things. Occurrences which perfectly conform to laws of nature and are capable of exact prediction yet in certain aspects present the appearance of chance. Cf. Edgeworth, "Law of Error," Cam. Phil. Trans., 1905, p. 128.
6 Cf. Venn, op. cit. ch. v. § 14; and v. Kries on the "Prinzip des mangelnden Grundes" in his Wahrscheinlichkeitsrechnung, ch. i. § 4, et passim.
7 In a passage criticized unfavourably by Dr Venn, Logic of Chance, ch. iv. § 14.
8 Below, par. 115.
9 Chances of Death, i. 44.
10 A summary of such experiments, comprising above 100,000 trials, is given by Professor Karl Pearson in his Chances of Death, i. 48.
will fall on one element of area as upon a neighbouring spot of equal size. Doubtless the presumption must be extended with caution to phenomena with which we are less familiar. For example, is a meteor equally likely to hit one square mile as another of the earth's surface? We seem to descend in the scale of credibility from absolute certainty that alternative events occur with about equal frequency to absolute ignorance whether one occurs more frequently than the other. The empirical basis of probability may appear to become evanescent in a case like the following, which has been discussed by many writers on Probabilities.11 What is the probability of drawing a white ball from a box of which we only know that it contains balls both black and white and none of any other colour? In this case, unlike the case of an urn containing a mixture of white and black balls in equal proportions, we have no reason to expect that if we go on drawing balls from the urn, replacing each ball after it has been drawn, the series so presented will consist of black and white in about equal numbers. But there is ground for believing that in the long course of experiences in pari materia - other urns of similar constitution, other cases in which there is no reason to expect one alternative more than another - an event of one kind will occur about as often as one of another kind. A "cross-series" 12 is thus formed which seems to rest on as extensive if not so definite an empirical basis as the series which we began by considering. Thus the so-called "intellectual probability" 13 which it has been sought to separate from the material probability verified by frequency of occurrence, may still rest on a similar though less obvious ground of experience. This type of probability not verified by specific experience is presented in two particularly important classes.
6. Unverified Probabilities
In applying the theory of errors to the art of measurement it is usual to assume that prior to observation one value of the quantity under measurement is as likely as another. "When the probability is unknown," says Laplace,14 "we may equally suppose it to have any value between zero and unity." The assumption is fundamentally similar whether the quantum is a ratio to be determined by the theorem of Bayes,15 or an absolute quantity to be determined by the more general theory of error. Of this first principle it is well observed by Professor Karl Pearson 16: "There is an element of human experience at the bottom of Laplace's assumption." Professor Pearson quotes with approbation 17 the following account of the matter: "The assumption that any probability-constant about which we know nothing in particular is as likely to have one value as another is grounded upon the rough but solid experience that such constants do as a matter of fact as often have one value as another."
7. It may be objected, no doubt, that one value (of the object under measurement) is often known beforehand not to be as likely as another. The barometric height for instance is not equally likely to be 29 in. or to be 2 in. The reply is that the postulate is only required with respect to a small tract in a certain neighbourhood, some 2 in. above and below 29½ in. in the case of barometric pressure.
8. It is further objected that the assumption in question involves inconsistencies in cases like the following. Suppose observations are made on the length of a pendulum together with the time of its oscillation. As the time is proportional to the square root of the length, it follows that if the values of the length occur with equal frequency those of the time cannot do so; and, inversely, if the proposition is true of the times it cannot be true of the lengths.18 One reply to this objection is afforded by the reply to the former one. For where we are concerned only with a small tract of values it will often happen that both the square and the square root and any ordinary function of a quantity which assumes equivalent values with equal probability will each present an approximately equal distribution of probabilities. 19 It may further be replied that in general the reasoning does not require the a priori probabilities of the different values to be very nearly equal; it suffices that they should not be very unequal; 20 and this much seems to be given by experience.
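The reply concerning a small tract of values can be put in figures. If the length is equally likely to lie anywhere between two limits and the time is proportional to its square root, the density of the time at a given value is proportional to that value; so the density at the top of the tract stands to that at the bottom as the square root of the ratio of the limiting lengths. The sketch below merely evaluates this ratio; the function name and the numerical limits are illustrative assumptions.

```python
import math


def density_ratio_of_times(length_low, length_high):
    """Ratio of the probability density of the time at the upper and lower limits.

    Assumes the length is uniformly distributed on [length_low, length_high]
    and the time is proportional to sqrt(length), so that the density of the
    time at t is proportional to t.
    """
    return math.sqrt(length_high / length_low)


wide = density_ratio_of_times(1.0, 100.0)     # a wide tract of lengths
narrow = density_ratio_of_times(99.0, 101.0)  # a small tract about 100

print(wide, narrow)
```

Over the wide tract the times are ten times as dense at one end as at the other, whereas over the small tract the ratio differs from unity by about one per cent, so that lengths and times may both be treated as approximately equally distributed.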
9. Whenever we can justify Laplace's first principle 21 that "probability is the ratio of the number of favourable cases to the number of all possible cases" no additional difficulty is involved in his second
11 E.g. J. S. Mill, Logic, bk. III., ch. xviii. § 2.
12 Cf. Venn, Logic of Chance, ch. vi. § 24.
13 Boole, Trans. Roy. Soc. (1862), ix. 251.
14 Op. cit. Introduction.
15 Below, par. 130.
16 Grammar of Science, ed. 2, p. 146.
17 From the article by the present writer on the " Philosophy of Chance " in Mind, No. ix., in which some of the views here indicated are stated at greater length than is here possible.
18 Cf. v. Kries, op. cit. ch. ii.
19 On the principle of Taylor's theorem; cf. Edgeworth, Phil. Mag. (1892), xxxiv. 431 seq.
20 Cf. J. S. Mill, in the passage referred to below, par. 13, on the use that may be made of an "antecedent probability," though "it would be impossible to estimate that probability with anything like numerical precision."
21 Op. cit. Introduction.
principle, of which the following may be taken as an equivalent. If we distribute the favourable cases into several groups the probability of the event will be the sum of the probabilities pertaining to each group.1
10. Another important instance of unverified probabilities occurs when it is assumed without specific experience that one phenomenon is independent of another in such wise that the probability of a double event is equal to the probability of the one event multiplied by the probability of the other - as in the instance already given of two aces occurring. The assumption has been verified with respect to "runs" in some games of chance; 2 but it is legitimately applied far beyond those instances. The proposition that very long runs of particular digits, e.g. of 7, may be expected in the development of a constant like π - e.g. a run of six consecutive sevens if the expansion of the constant was carried to a million places of decimals - may be given as an instance in which our conviction greatly transcends specific verification. In the calculation of probable, and improbable, errors,3 it has to be assumed without specific verification that the observations on which the calculation is based are independent of each other in the sense now under consideration. With these explanations we may accept Laplace's third principle: "If the events are independent of each other the probability of their concurrence (l'existence de leur ensemble) is the product of their separate probabilities." 4
11. Interdependent Probabilities. - Among the principles of probabilities it is usual to enunciate, after Laplace, several other propositions.5 But these may here be rapidly passed over as they do not seem to involve any additional philosophical difficulty.
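The expectation of a run of six sevens in a million places, mentioned in par. 10, can be given a rough figure. The sketch below rests on exactly the unverified assumption under discussion, namely that the digits behave as independent and equally likely; the function name and its parameters are illustrative. It counts every place at which such a run could begin, so that a run of seven sevens is counted twice.

```python
from fractions import Fraction


def expected_runs(places, run_length, base=10):
    """Expected number of places at which a run of one assigned digit begins,

    treating the digits as independent and equally likely (an assumption).
    """
    starts = places - run_length + 1  # places where a full run still fits
    return starts * Fraction(1, base) ** run_length


e = expected_runs(1_000_000, 6)
print(float(e))
```

The expected number comes out very nearly one, which is the sense in which such a run "may be expected" in a million places.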
12. It has been shown that when two events are independent of each other the product of their separate probabilities forms the probability of their concurrence. It follows that the probability of the double event divided by the probability of either, say the first, component gives the probability of the other, the second component event. The quotient, we might say, is the probability that when the first event has occurred, the second will occur. The proposition in this form is true also of events which are not independent of one another. Laplace exemplifies the composition of such interdependent probabilities by the instance of three urns, A, B, C, about which it is known that two contain only white balls and one only black balls.6 The probability of drawing a white ball from an assigned urn, say C, is 2/3. The probability that, a white ball having been drawn from C, a ball drawn from B will be white, is 1/2. Therefore the probability of the double event drawing a white ball from C and also from B is 2/3 × 1/2, or 1/3. The question now arises. Supposing we know only the probability of the double event, which probability we will call [BC], and the probability of one of them, say [C] (but not, as in the case instanced, the mechanism of their interdependence); what can we infer about the probability [B] of the other event (an event such as in the above instance drawing a white ball from the urn B) - the separate probability irrespective of what has happened as to the urn C? We cannot in general say that [B] = [BC] divided by [C] but rather that quotient + k, where k is an unknown coefficient which may be either positive or negative. It might, however, be improper to treat k as zero on the ground that it is equally likely (in the long run of similar data) to be positive or negative.
For given values of [BC] and [C], k has not this equiprobable character, since its positive and negative ranges are not in general equal; as appears from considering that [B] cannot be less than [BC], nor greater than unity.7
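Laplace's three urns can be worked through by enumerating the three equally possible arrangements, according to which urn holds the black balls. This is a sketch for illustration only; the variable names are introduced here, and the arrangement of two white urns and one black is taken from the statement of the example in par. 12.

```python
from fractions import Fraction

URNS = ("A", "B", "C")
# One arrangement for each choice of the urn that contains only black balls.
arrangements = [
    {urn: ("black" if urn == black else "white") for urn in URNS}
    for black in URNS
]


def prob(event):
    """Proportion of the equally possible arrangements in which `event` holds."""
    hits = sum(1 for arrangement in arrangements if event(arrangement))
    return Fraction(hits, len(arrangements))


p_c = prob(lambda a: a["C"] == "white")                        # [C]
p_bc = prob(lambda a: a["C"] == "white" and a["B"] == "white")  # [BC]
p_b_given_c = p_bc / p_c  # probability of white from B, white having come from C
p_b = prob(lambda a: a["B"] == "white")                        # [B] taken separately

print(p_c, p_b_given_c, p_bc, p_b, p_b - p_b_given_c)
```

Here [C] is 2/3, the quotient [BC]/[C] is 1/2 and [BC] is 1/3, while [B] taken by itself is 2/3; the separate probability thus differs from the quotient by 1/6, and it lies, as it must, between [BC] and unity.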
13. Probability of Causes and Future Effects. - The first principles which have been established afford an adequate ground for the reasoning which is described as deducing the probability of a cause from an observed event.8 If with the poet 9 we may represent a perfect mixture by the waters of the Po in which the "two Doras" and other tributaries are indiscriminately commingled, there is no great difference in respect of definition and deduction between the probability that a certain particle of water should have emanated from a particular source, or should be discharged through a particular mouth of the river. "This principle," we may say with De Morgan, "of the retrospective or 'inverse' probability is not essentially
1 Bertrand on "Probabilités composées," op. cit. art. 23.
2 In some of the experiences referred to at par. 5.
3 See below, pars. 132, 159.
4 Op. cit. Introduction.
5 There is a good statement of them in Boole's Laws of Thought, ch. xvi. § 7. Cf. De Morgan "Theory of Probabilities" (Encyc. Metrop.), §§ 12 seq.
6 Laplace, op. cit. Introduction, IV e Principe; cf. V e Principe and liv., II. ch. i. § 1.
7 In such a case there seems to be a propriety in expressing the indeterminate element in our data, not as above, but as proposed by Boole in his remarkable Laws of Thought, ch. xvii., ch. xviii., § (cf. Trans. Edin. Roy. Soc., (1857), vol. xxi.; and Trans. Roy. Soc., 1862, vol. ix, vol. clii. pt. i. p. 251); the undetermined constant now representing the probability that if the event C does not occur the event B will. The values of this constant - in the absence of specific data, and where independence is not presumable - are, it should seem, equally distributed between the values 0 and 1. Cf. as to Boole's Calculus, Mind, loc. cit., ix. 230 seq.
8 Laplace's Sixth Principle.
9 Manzoni.
different from the one first stated (Principle I.)." 10 Nor is a new first principle necessarily involved when after ascending from an effect to a cause we descend to a collateral effect.11 It is true that in the investigation of causes it is often necessary to have recourse to the unverified species of probability. An instance has already been given of several approximately equiprobable causes, the several values of a quantity under measurement, from one of which the observed phenomena, the given set of observations, must have, so to speak, emanated. A simpler instance of two alternative causes occurs in the investigation which J. S. Mill 12 has illustrated - whether an event, such as a succession of aces, has been produced by a particular cause, such as loading of the die, or by that mass of "fleeting causes" called chance. It is sufficient for the argument that the "a priori" probabilities of the alternatives should not be very unequal.13
14. (2) Whether Credibility is Measurable. - The domain of probabilities according to some authorities does not extend much, if at all, beyond the objective phenomena which have been described in the preceding paragraphs. The claims of the science to measure the subjective quantity, degree of belief, are disallowed or minimized. Belief, it is objected, depends upon a complex of perceptions and emotions not amenable 14 to calculus. Moreover, belief is not credibility; even if we do believe with more or less confidence in exact conformity with the measure of probability afforded by the calculus, ought we so to believe? In reply it must be admitted that many of the beliefs on which we have to act are not of the kind for which the calculus prescribes. It was absurd of Craig 15 to attempt to evaluate the credibility of the Christian religion by mathematical calculation.
But there seem to be a number of simpler cases of which we may say with De Morgan 16 "that in the universal opinion of those who examine the subject, the state of mind to which a person ought to be able to bring himself" is in accordance with the regulation measure of probability. If in the ordeal to which Portia's suitors were subjected there had been a picture of her not in one only, but in two of the caskets, then - though the judgment of the principal parties might be distorted by emotion - the impartial spectator would normally expect with greater confidence than before that at any particular trial a casket containing the likeness of the lady would be chosen. So the indications of a thermometer may not correspond to the sensations of a fevered patient, but they serve to regulate the temperature of a public library so as to secure the comfort of the majority. This view does not commit us to the quantitative precision of De Morgan that in a case such as above supposed we ought to "look three times as confidently upon the arrival as upon the non-arrival" of the event.17 Two or three roughly distinguished degrees of credibility - very probable, as probable as not, very improbable, practically impossible - suffice for the more important applications of the calculus. Such is the character of the judgments which the calculus enables us to form with respect to the occurrence of a certain difference between the real value of any quantity under measurement and the value assigned to it by the measurement. The confidence that the constants which we have determined are accurate within certain limits is a subjective feeling which cannot be dislodged from an important part of probabilities.18 This sphere of subjective probability is widened by the latest developments of the science 19 so far as they add to the number of constants for which it is important to determine the probable - and improbable - error.
For instance, a measure of the deviation of observations from an average or mean value was required by the older writers only as subordinate to the determination of the mean, but now this "standard deviation" (below, par. 98) is often treated as an entity for which it is important to discover the limits of error.20 Some of the newer methods may also serve to countenance the measurement of subjective quantity, in so far as they successfully apply the calculus to quantities not admitting of a precise unit, such as colour
10 De Morgan, Theory of Probabilities, § 19; cf. Venn, Logic of Chance, ch. vii. § 9; Edgeworth, "On the Probable Errors of Frequency Constants," Journ. Stat. Soc. (1908), p. 653. The essential symmetry of the inverse and the direct methods is shown by an elegant proof which Professor Cook Wilson has given for the received rules of inverse probability (Nature, 1900, Dec. 13).
11 Laplace's Seventh Principle.
12 Logic, book III., ch. xviii. § 6.
13 Cf. above, par. 8; below, par. 46.
14 Cf. Venn, Logic of Chance, p. 126.
15 See the reference to Craig in Todhunter, History ...of Probability.
16 Formal Logic, p. 173.
17 Ibid. Cf. "Theory of Probabilities" (Encyc. Metrop.), note to § 5, "Wherever the term greater or less can be applied there twice, thrice, &c., can be conceived, though not perhaps measured by us."
18 It is well remarked by Professor Irving Fisher (Capital and Income, 1907, ch. xvi.), that Bernoulli's theorem involves a "subjective" element, a "psychological magnitude." The remark is applicable to the general theory of error of which the theorem of Bernoulli is a particular case (see below, pars. 103, 104).
19 In the hands of Professor Karl Pearson, Mr Sheppard and Mr Yule. Cf. par. 149, below.
20 Cf. Edgeworth, Journ. Stat. Soc. (Dec. 1908).
of eye or curliness of hair.1 A closer analogy is supplied by the older writers who boldly handle "moral" or subjective advantage, as will be shown under the next head.
15. (3) Axioms of Expectation. - Expectation so far as it involves probability presents the same philosophical questions. They occur chiefly in connexion with two principles analogous to and deducible from propositions which have been stated with respect to probability.2 (i.) The expectation of the sum of two quantities subject to risk is the sum of the expectations of each. (ii.) The expectation of the product of two quantities subject to risk is the product of the expectations of each; provided that the risks are independent. For example, let one of the fortuitously fluctuating quantities be the winnings of a player at a game in which he takes the amount A if he throws ace with a die (and nothing if he throws another face). Then the expectation of that quantity is (1/6)A; or, in n trials (n being large), the player may expect to win about n(1/6)A. Let the other fortuitously fluctuating quantity be the winnings of a player at a game in which he takes the amount B when an ace of any suit is dealt from an ordinary pack of cards. The expectation of this quantity is (1/13)B; or in n trials the player may expect to win about n(1/13)B. Now suppose a compound trial at which one simultaneously throws a die and deals a card; and let his winning at a compound trial be the sum of the amounts which he would have received for the die and the card respectively at a simple trial. In n such compound trials he may expect to win about n(1/6)A + n(1/13)B, or the expectation of the winning at a compound trial is the sum of the separate expectations. Next suppose the winning at a compound trial to be the product of the two amounts which he would have received for the die and the card if played at a simple trial. It is zero unless the player obtains two aces. It is A × B when this double event occurs. But this double event occurs in the long run only once in 78 times.
Accordingly the expectation of the winning at a compound trial at which the winning is the product of the winnings at two simple trials is the product of the separate expectations. What has been shown for two expectations of the simplest type $\alpha a$, where $\alpha$ is the probability of an event which has been associated with a quantity $a$, may easily be extended to several expectations each of the type $\alpha_1 a_1 + \alpha_2 a_2 + \alpha_3 a_3 + \dots$, where $\alpha_r a_r$ is an expectation of the simplest type, above exemplified, or of the type $\alpha_1 a_1 \times \alpha_2 a_2 \times \alpha_3 a_3 \times \dots$, or a mixture of these types. For by the law which has been exemplified the sum of r expectations can always be reduced to the sum of r - 1, and then the r - 1 to r - 2, and so on; and the like is true of products.
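The two axioms of par. 15 can be verified for the die and the card by running over all 6 × 52 compound trials. The sketch below is illustrative only: the stakes `A = 3` and `B = 2` are arbitrary values chosen here, the first four cards are taken to be the aces, and the helper names are not from the article.

```python
from fractions import Fraction
from itertools import product

A, B = 3, 2         # hypothetical stakes, chosen only for illustration
DIE = range(1, 7)   # face 1 is the ace
CARDS = range(52)   # cards 0..3 stand for the four aces


def die_win(face):
    return A if face == 1 else 0


def card_win(card):
    return B if card < 4 else 0


def expectation(outcomes, value):
    """Average of `value` over equally possible outcomes."""
    outcomes = list(outcomes)
    return Fraction(sum(value(o) for o in outcomes), len(outcomes))


e_die = expectation(DIE, die_win)      # A/6
e_card = expectation(CARDS, card_win)  # B/13
e_sum = expectation(product(DIE, CARDS), lambda o: die_win(o[0]) + card_win(o[1]))
e_prod = expectation(product(DIE, CARDS), lambda o: die_win(o[0]) * card_win(o[1]))

print(e_die, e_card, e_sum, e_prod)
```

The expectation of the sum equals the sum of the separate expectations, and the expectation of the product equals their product, namely AB/78, because the die and the card are independent.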
16. It should be remarked that the proviso as to the independence of the probabilities involved is required only by the second of the two fundamental propositions. It may be dispensed with by the first. Thus in the example of interdependent probabilities given by Laplace 3 - three urns about which it is known that two contain only black balls and one only white - if a person drawing a ball first from C and then from B is to receive x shillings every time he draws a white ball, from one or other of the urns, he may expect if he performs the compound operation n times to receive $n \times 2 \times \tfrac{1}{3}x$ shillings. But the expectation of the product of the number of shillings won by drawing a white ball from C and the number of shillings won by afterwards drawing a white ball from B is not $n(\tfrac{1}{3})^2 x^2$, but zero, since the two urns cannot both be the white one.
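The urn example can be enumerated in the same way. This sketch assumes, as the text implies, that each of the three urns is equally likely to be the white one; the function name and the value x = 3 are illustrative only.

```python
from fractions import Fraction

def urn_expectations(x):
    """Expectation, per compound operation, of the sum and of the product of
    the winnings from urns C and B when exactly one of A, B, C is white."""
    e_sum = Fraction(0)
    e_prod = Fraction(0)
    for white_urn in "ABC":                 # three equally likely cases (assumption)
        win_c = x if white_urn == "C" else 0
        win_b = x if white_urn == "B" else 0
        e_sum += Fraction(win_c + win_b, 3)
        e_prod += Fraction(win_c * win_b, 3)
    return e_sum, e_prod

e_sum, e_prod = urn_expectations(3)
print(e_sum, e_prod)
```

The sum rule survives the dependence (2 x 1/3 x per operation), while the product of the separate expectations, (1/3)^2 x^2, no longer equals the expectation of the product.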
17. The first of the two principles is largely employed in the practical applications of probabilities. The second principle is largely employed in the higher generalizations of the science 4 (the laws of error demonstrated in Part II.); the requisite independence of the involved probabilities being mostly of the unverified 5 species.
18. Expectation of Utility
A philosophical difficulty peculiar to expectation 6 arises when the quantity expected has not the objective character usually presupposed in the applications of mathematics. The most signal instance occurs when the expectation relates to an advantage, and that advantage is estimated subjectively by the amount of utility or satisfaction afforded to the possessor. Mathematicians have commonly adopted the assumption made by Daniel Bernoulli that a small increase in a person's material means or "physical fortune" causes an increase of satisfaction or "moral fortune," inversely proportional to the physical fortune; and accordingly that the moral fortune is equateable to the logarithm of the physical fortune. 7 The spirit in which this assumption should be employed is well expressed by Laplace when he says 8 that the expectation of subjective advantage (l'espérance morale) "depends on a thousand variable circumstances which it is almost always impossible to define and still more to submit to calculation." "One cannot give a general rule for appreciating this relative value," yet the principle above stated in "applying to the commonest cases leads to results which are often useful."
1 Below, par. 152.
2 Consider the equivalent of Laplace's second principle given at par. 9, above, and his third principle quoted at par. 10.
3 Above, par. 12.
4 In the more familiar form; that (of two independently fluctuating quantities) the mean of the product is the product of the means (cf. Czuber, Theorie der Beobachtungsfehler, p. 133).
5 Above, par. 6.
6 These peculiarities afford some justification for Laplace's restriction of the term expectation to "goods." As to the wider definition here adopted see below, par. 94 and par. 95, note.
7 Each fortune referred to is divided by a proper parameter. See below, par. 69.
8 Op. cit. liv. II. ch. xiii. No. 41. Cf. liv. II. ch. i. No. 2.
19. In this spirit we may regard the logarithm in Bernoulli's (as in Malthus's) theory as representative of a more general relation. Thus generalized the principle has been accepted by economists and utilitarian philosophers whose judgment on the relation between material goods and utility or satisfaction carries weight. Thus Professor Alfred Marshall writes: 9 "In accordance with a suggestion made by Daniel Bernoulli, we may perhaps suppose that the satisfaction which a person derives from his income may be regarded as beginning when he has enough to support life and afterwards as increasing by equal amounts with every equal successive percentage that is added to his income; and vice versa for loss of income." 10 The general principle is embodied in Bentham's utilitarian reasoning which has been widely accepted. 11 The possibility of formulating the relation between feeling and its external cause is further supported by Fechner's investigations. This branch of Probabilities also obtains support from another part of the science, the calculation sanctioned by Laplace, of the disutility incident to error of measurement. 12 Altogether it seems impossible to deny that some simple mathematical operations prescribed by the calculus of probabilities are sometimes serviceably employed to estimate prospective benefit in the subjective sense of desirable feeling.
20. Single Cases and "Series." - Analogous to the question regarding the standard of belief which arose under a former head, a question regarding the standard of action arises under the head of expectation. The former question, it may be observed, arises chiefly with respect to events which are considered as singular, not forming part of a series. There is no doubt, there is a full belief, that if we go on tossing (unloaded) dice the event which consists of obtaining either a five or a six will occur in approximately 33.3% of the trials. The important question is what is or should be our state of mind with regard to the result of a trial which is sui generis and not to be repeated, like the choice of a casket in the Merchant of Venice. 13 A similar difficulty is presented by singular events, with respect to volition. Is the chance of one to a thousand of the prize £1000 at a lottery approximately equivalent to £1 in the eyes of a person who for once, and once only, has the offer of such a stake? The question is separable from one with which it is often confounded, the one discussed in the last paragraph: what is the "moral" value of the prize? The person might be a millionaire for whom £1 and £1000 both belong to the category of small change. The stake and the prize might both be "moral." The better opinion seems that apart from a system of transactions like that which an insurance company undertakes, or at least a "cross-series" 14 of the kind which seem largely to operate in ordinary life, expectations in which the risks are very different are no longer equateable.
So De Morgan with regard to the "single case" (the solitary transaction in question) declares that the "mathematical expectation is not a sufficient approximation to the actual phenomenon of the mind when benefits depend upon very small probabilities; even when the fortune of the player forms no part of the consideration" 15 [without making allowance for the difference between "moral" and mathematical probabilities]. So Condorcet, "If one considers a single man and a single event there can be no kind of equality" 16 (between expectations with very different risks). It is only for the long run - lorsqu'on embrasse la suite indéfinie des événements - that the rule is valid. To the same effect at greater length the logicians Dr Venn 17 and von Kries. 18 Some of the mathematical writers have much to learn from their logical critics 19 on this and other questions relating to first principles.
Section II. - Calculation of Probability.
21. Object of the Section. - In the following calculations the principal object is to ascertain the number of cases favourable to an event in proportion to the total number of possible cases. 20
9 Principles of Economics, book III., ch. vi. § 6, p. 209, ed. 4.
10 Cf. below, par. 71.
11 Some further references bearing on the subject are given in a paper by the present writer on the "Pure Theory of Taxation," No. III. Economic Journ. (1897), vii. 550-551.
12 Below, par. 131.
13 Above, par. 14.
14 Above, par. 5.
15 Article on "Probabilities" (Encyc. Metrop.), § 40.
16 Essai (1785), pp. 142 et seq.
17 Logic of Chance, ch. vi. §§ 24-28.
18 Wahrscheinlichkeitsrechnung, pp. 184 seq.
19 The relations of recent logicians to the older mathematical writers on Probabilities may be illustrated by the relations of modern "historical" economists to their more abstract predecessors.
20 Of the two properties which have been found to characterize probability (above, par. 5) - proportionate (1) number of (equally) favourable cases and (2) frequency of observed occurrence - the former especially pertains to the data and quaesita of this section.
[[[Methods Of Calculation]] "The difficulty consists in the enumeration of the cases," as Lagrange says. Sometimes summation is the only mathematical operation employed; but very commonly it is necessary to apply the theory of permutations and combinations involving multiplication. 1
22. Fundamental Theorem. - One of the simplest problems of this sort is one of the most important. Given a mélange of things consisting of two species, if n things are taken at random what is the probability that s out of these n things will be of a certain species? For example, the mélange might be a well-shuffled pack of cards, and the species black and red; the quaesitum, what is the probability that if n cards are dealt, s of them will be black? There are two varieties of the problem: either after each card is dealt it is returned to the pack, which is reshuffled, or all the n cards are dealt (as in ordinary games of cards) without replacement. The first variety of the problem deserves its place as being not only the simpler, but also the more important, of the two.
23. At the first deal there are 26 cases favourable to black, 26 to red. When two deals have been made (in the manner prescribed), out of $52^2$ cases formed by combinations between a card turned up at the first deal and a card turned up at the second, $26 \times 26$ cases are combinations of two blacks, $26 \times 26$ are combinations of two reds, and the remainder $2(26 \times 26)$ are made up of combinations between one black and one red; $26 \times 26$ cases of black at the first deal and red at the second, and $26 \times 26$ cases of red at the first and black at the second deal. The number of cases favourable to each alternative is evidently given by the several terms in the expansion of $(26+26)^2$. The corresponding probabilities are given by dividing each term by the total number of cases, viz. $52^2$. Similarly, when we go on to a third deal, the respective probabilities of the four possible cases, three blacks, two blacks and one red, two reds and one black, three reds, are given by the successive terms in the binomial expansion of $(26+26)^3$, each divided by $52^3$, and so on. The reasoning is quite general. Thus for the event which consists of dealing either clubs or spades (black) we might substitute an event of which the probability at a single trial is not $\tfrac{1}{2}$, e.g. dealing hearts. Generally, if p and 1 - p are the respective probabilities of the event occurring or not occurring at a single trial, the respective probabilities that in n trials the event will occur n times, n - 1 times ... twice, once or not at all, are given by the successive terms in the expansion of $[p + (1-p)]^n$; of which expansion the general term is $\frac{n!}{s!\,(n-s)!}\,p^s (1-p)^{n-s}$.
24. The probability may also be calculated as follows. Taking for example the case in which the event consists of dealing hearts; consider any particular arrangement of the n cards, of which s are hearts, e.g. the arrangement in which the s cards first dealt are hearts and the following n - s all belong to other suits. The probability of the first s cards being all hearts is $(\tfrac{1}{4})^s$; the probability that none of the last (n - s) cards are hearts is $(\tfrac{3}{4})^{n-s}$. Hence the probability of that particular arrangement occurring is $(\tfrac{1}{4})^s(\tfrac{3}{4})^{n-s}$. But this arrangement is but one of many, e.g. that in which the s hearts are the last dealt, which are equally likely to occur. There are as many different arrangements of this type as there are combinations of n things taken together s times, that is $n!/\{s!\,(n-s)!\}$. The probability thus calculated, the product of these two quantities, agrees with the preceding result.
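The general term of the binomial expansion is easily evaluated exactly. The sketch below is illustrative; the function name `binomial_probability` is an assumption, and p is passed as a `Fraction` so that the terms can be compared without rounding.

```python
from fractions import Fraction
from math import comb

def binomial_probability(n, s, p):
    """Probability that an event of single-trial probability p occurs
    exactly s times in n independent trials (dealing with replacement)."""
    return comb(n, s) * p**s * (1 - p)**(n - s)

half = Fraction(1, 2)
quarter = Fraction(1, 4)

# Three deals, black or red: terms of (26 + 26)^3 divided by 52^3.
print([binomial_probability(3, s, half) for s in range(4)])
# Hearts in five deals with replacement.
print([binomial_probability(5, s, quarter) for s in range(6)])
```

Summing the terms over s from 0 to n returns unity, as the expansion of [p + (1 - p)]^n requires.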
25. It follows from the law of expansion for $[p+(1-p)]^n$ that as n is increased, the value of the fractions which form the terms at either extremity diminishes. When n becomes very large, the terms which are in the neighbourhood of the greatest term of the expansion overbalance the sum total of the remaining terms. 2 Thus in the example above given, if we go on and on dealing cards (with replacement) the ratio of the red cards dealt to all the cards dealt tends to become more and more nearly approximate to the limit $\tfrac{1}{2}$. These statements are comprised in the theorem known as James Bernoulli's. Stated in its simplest form - that "in the long run all events will tend to occur with a relative frequency proportional to their objective probabilities" 3 - this theorem has been regarded as tautological or circular. Yet the proofs of the theorem which have been given by great mathematicians may deserve attention as at least showing the consistency of first principles. 4 Moreover, as usually stated, James Bernoulli's theorem imports something more than the first axiom of probabilities. 5
1 Cf. Bertrand's distinction between "Probabilités totales," and "Probabilités composées," Calcul des probabilités, ch. ii. arts. 23, 24.
2 Cf. Todhunter, History ... of Probability, p. 360, and other statements of James Bernoulli's Theorem, referred to in the index.
3 Venn, op. cit. p. 91.
4 Some of these proofs are adduced, and a new and elegant one added by Bertrand, op. cit. ch. v.
5 When the degree in which a certain range of central terms tends to preponderate over the residue of the series is formulated with precision, as in the statement given by Todhunter (op. cit. p. 548) when he is interpreting Laplace, then James Bernoulli's theorem presents a particular case of the law of error - the case considered below in par. 103.
26. The generalization of the Binomial Theorem which is called the "Multinomial Theorem" 6 gives the rule when there are more than two alternatives at each trial. For instance, if there are three alternatives, hearts, diamonds or a card belonging to a black suit, the probability that if n cards are dealt there will occur s hearts, t diamonds, and n - s - t cards that are either clubs or spades is $\frac{n!}{s!\,t!\,(n-s-t)!}(\tfrac{1}{4})^s(\tfrac{1}{4})^t(\tfrac{1}{2})^{n-s-t}$.
27. Applications of Fundamental Theorem. - The peculiar interest of the problem which is here placed first is that its solution represents a law of almost universal application: the law assigning the frequency with which different values are assumed by a quantity which, like most of the quantities with which statistics has to do, depends upon several independent agencies. It is remarkable that the problem in probabilities which historically was almost the first belongs to the kind which is first in interest. Of this character is a question which occupied Galileo and before him Cardan, and an even earlier writer: what are the chances that, when two or three dice are thrown, the sum of the points or pips turned up should amount to a certain number? A particular case of this problem is presented by the old game of "passedix": what is the probability that if three dice are thrown the sum of the pips should exceed ten? 7 The answer is obtained by considering the number of combinations that are favourable to each of the different alternatives, 18 pips, 17, 16 ... 11 pips, which make up the event in question. Thus out of the total of 216 ($6^3$) combinations, one is favourable to 18, three to 17, and so on. There are twenty-five chances, as we may call the permutations, in favour of 12, twenty-seven in favour of 11. 8 The sum of all these being 108, we have for the event in question 108/216, an even chance. More generally it may be inquired: what is the probability that, if n dice are thrown, the number of points turned up will be exactly s?
By an extension of the reasoning which was employed in the first problem it is seen that the required probability is the coefficient of the term of which the index is s in the expansion of the expression $[\tfrac{1}{6}x + \tfrac{1}{6}x^2 + \ldots + \tfrac{1}{6}x^6]^n$. The calculation may be simplified by writing this expression in the form $(\tfrac{1}{6})^n x^n [1 - x^6]^n [1 - x]^{-n}$. The successive terms of the expansion give the respective probabilities that the number in question should be n, n+1 ... 6n, comprising all the possible numbers among which s is presumably included (otherwise the answer is zero). Of course we are not limited to six alternatives; instead of a die we may have a teetotum with any number of sides. The series expressing the probabilities of the different sums can be written out in general terms, as Laplace and others have done; but it seems to be of less interest than the approximate formula which will be given later. 9
28. Variant of the Fundamental Theorem. - The second variety of our first problem may next be considered. Suppose that after each trial the card dealt (ball drawn, &c.) is not replaced in statu quo ante. For instance, if r cards are dealt in the ordinary way from a shuffled pack, what is the probability that s of them will be hearts ($s \le 13$)? Consider any particular arrangement of the r cards, whereof s are hearts, e.g. that in which the s cards first dealt are all hearts, the remaining r - s belonging to other suits. The probability of the first card being a heart is $\tfrac{13}{52}$; the probability that, the first having been a heart, the second should be a heart is $\tfrac{12}{51}$ (since a heart having been removed there are now 12 favourable cases out of a total of 51 cases). And so on. Likewise the probability of the (s+1)th card being not a heart, all the preceding s having been hearts, is 39/(52 - s); the probability of the (s+2)th card being not a heart is similarly reckoned. And thus the probability of the particular arrangement considered is found to be
$$\frac{13\cdot 12\cdots\{13-(s-1)\}\,\cdot\,39\cdot 38\cdots\{39-(r-s-1)\}}{52\cdot 51\cdots\{52-(r-1)\}}.$$
Now consider any other arrangement of the r cards, e.g. t of the s hearts to occur first and the remaining s - t last. The denominator in the above expression will remain the same; and in the numerator only the order of the factors will be altered. The probability of the second arrangement is therefore the same as that of the first; and the probability that some one or other of the arrangements will occur is given by multiplying the probability of any one arrangement and the number of different arrangements, which, as in the simpler case of the problem, 10 is the same as the number of combinations formed by r things taken together s times, that is r!/s!(r - s)!. The formula thus obtained may be generalized by substituting n for 52, pn for 13, qn for 39 (where p + q = 1; pn and qn are integers). A formula thus generalized is proposed by Professor Karl Pearson 1 as proper to represent the frequency with which different values are assumed by a quantity depending on causes which are not independent.
6 See Chrystal, Algebra, ch. xxiii. § or other textbook of algebra.
7 See Todhunter, History ... of Probability, art. 8; Bertrand, Calcul des probabilités, p. vii., or the original documents.
8 As Galileo discerned. A friend of his had observed that 11 occurred 1080 times to 1000 times of 12.
9 The law of error given below, par. 104.
10 Above, par. 24.
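The rule for dealing without replacement can be followed step by step: form the probability of one particular arrangement as a continued product, then multiply by the number of arrangements. The sketch below does so with exact fractions; the function name and its default arguments (13 hearts, 39 other cards) are chosen for illustration.

```python
from fractions import Fraction
from math import comb

def hearts_without_replacement(r, s, hearts=13, others=39):
    """Probability that exactly s of r cards dealt without replacement are hearts."""
    total = hearts + others
    arrangement = Fraction(1)
    for i in range(s):                      # the s hearts dealt first
        arrangement *= Fraction(hearts - i, total - i)
    for j in range(r - s):                  # then the r - s cards of other suits
        arrangement *= Fraction(others - j, total - s - j)
    return comb(r, s) * arrangement         # number of equally likely arrangements

print(hearts_without_replacement(5, 2))
print(sum(hearts_without_replacement(5, s) for s in range(6)))
```

The result agrees with the familiar form C(13, s) C(39, r - s) / C(52, r), which is the same quantity with the factors regrouped.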
29. Miscellaneous Examples: Games of Chance. - The majority of the problems under this heading cannot, like the preceding two, be regarded as conducing directly to statistical methods which are required in investigating some parts of nature. They are at best elegant exercises in a kind of mathematical reasoning which is required in most of such methods. Games of chance present some of the best examples. We may begin with one of the oldest, the problem which the Chevalier de Méré put to Pascal when he questioned: How many times must a pair of dice be thrown in order that it may be an even chance that double six - the event called sonnez - may occur at least once? 2 The answer may be obtained by finding a general expression for the probability that the event will occur at least once in n trials; and then determining n so that this expression = $\tfrac{1}{2}$. The probability of the event occurring is the difference between unity and the probability of its failing. Now the probability of "sonnez" failing at a single throw (of two dice) is $\tfrac{35}{36}$. Therefore the probability of its failing in n throws is $(\tfrac{35}{36})^n$. Whence we obtain, to determine n, the equation $1 - (\tfrac{35}{36})^n = \tfrac{1}{2}$, which gives n = 24.605 nearly.
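The equation for n is solved by taking logarithms. The following sketch (function name assumed) reproduces the figure quoted and shows that 25 is the least whole number of throws giving better than an even chance.

```python
import math

def throws_for_even_chance(p_single):
    """Solve 1 - (1 - p_single)**n = 1/2 for n (not necessarily an integer)."""
    return math.log(0.5) / math.log(1 - p_single)

n = throws_for_even_chance(1 / 36)
print(round(n, 3))
print(1 - (35 / 36) ** 24, 1 - (35 / 36) ** 25)
```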
30. In the preceding problem the quaesitum was (unity minus) the probability that out of all the possible events an assigned one ("sonnez") should fail to occur in the course of n trials. In the following problem the quaesitum is the probability that out of all the possible events one or other should fail - that they should not all be represented in the course of n trials. A die being thrown n times, what is the probability that all three of the following events will not be represented (that one or other of the three will not occur at least once); viz. (a) either ace or deuce turning up, (b) either 3 or 4, (c) either 5 or 6. The number of cases in which one at least of these events fails to occur is equal to the number of cases in which (a) fails, plus the number in which (b) fails, plus the number in which (c) fails, minus the number of cases in which two of the events fail concurrently (which cases without this subtraction would be counted twice). 3 Now the number of cases in which (a) fails to occur in the course of the n trials is $(\tfrac{2}{3})^n$ of all the possible cases numbering $3^n$. Like propositions are true of (b) and (c). The number of cases in which both (a) and (b) fail is $(\tfrac{1}{3})^n$ of the total; 4 and the like is true of the cases in which both (a) and (c) fail and the cases in which both (b) and (c) fail. Accordingly the probability that one at least of the events will fail to occur in the course of n trials is $3(\tfrac{2}{3})^n - 3(\tfrac{1}{3})^n$.
31. One more step is required by the following problem: If n cards are dealt from a pack, each card after it has been dealt being returned to the pack, which is then reshuffled, what is the probability that one or other of the four suits will not be represented? The probability that hearts will fail to occur in the course of the n deals is $(\tfrac{3}{4})^n$; and the like is true of the three other suits. From the sum of these probabilities is to be subtracted the sum of the probabilities that there will be concurrent failures of any two suits; but from this subtrahend is to be subtracted the proportional number of cases in which there are concurrent failures of any three suits (otherwise cases such as that in which e.g. hearts, diamonds and clubs concurrently failed 5 would not be represented at all). Now the probability of any assigned two suits failing is $(\tfrac{2}{4})^n$; the probability of any assigned three suits failing is $(\tfrac{1}{4})^n$. Accordingly the required probability is $4(\tfrac{3}{4})^n - 6(\tfrac{2}{4})^n + 4(\tfrac{1}{4})^n$. The analogy of the Binomial Theorem supplies the clue to the solution of the general problem of which the following is an example.
1 Trans. Roy. Soc. (1895). See below, par. 165.
2 Todhunter, History ... of Probability, and Bertrand, Calcul des probabilités, p. 9.
3 All three events cannot fail.
4 (c) occurring n times.
5 The reasoning may be illustrated by using the area of a circle to represent the frequency with which hearts fail, another (equal) circle for diamonds; for the case in which both hearts and diamonds fail the area common to the circles interlapping, and so on.
If a die is thrown n times the probability that every face will have turned up at least once is $1 - 6(\tfrac{5}{6})^n + 15(\tfrac{4}{6})^n - 20(\tfrac{3}{6})^n + 15(\tfrac{2}{6})^n - 6(\tfrac{1}{6})^n$.
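The alternating sums of paragraphs 30 and 31 and of the die example share one pattern, with binomial coefficients as multipliers. The sketch below states that pattern for k equally likely alternatives and compares it with a direct count; both function names are illustrative assumptions.

```python
from fractions import Fraction
from itertools import product
from math import comb

def prob_all_represented(k, n):
    """Probability that each of k equally likely alternatives occurs
    at least once in n trials, by the alternating (inclusion-exclusion) sum."""
    return sum((-1) ** j * comb(k, j) * Fraction(k - j, k) ** n for j in range(k + 1))

def brute_force(k, n):
    """The same probability by counting all k**n sequences directly."""
    hits = sum(1 for seq in product(range(k), repeat=n) if len(set(seq)) == k)
    return Fraction(hits, k ** n)

print(1 - prob_all_represented(3, 4))   # par. 30: some one of (a), (b), (c) fails
print(1 - prob_all_represented(4, 6))   # par. 31: some suit fails in six deals
print(prob_all_represented(6, 8))       # every face of a die in eight throws
```

The brute-force count is practicable only for small k and n; it is included solely as a check on the formula.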
32. If in the (first) problem stated in paragraph 31 the cards are dealt in the ordinary way (without replacement), we must substitute for $(\tfrac{3}{4})^n$ the continued product $\frac{39}{52}\cdot\frac{38}{51}\cdots\frac{39-(n-1)}{52-(n-1)}$; for $(\tfrac{2}{4})^n$ the continued product $\frac{26}{52}\cdot\frac{25}{51}\cdots\frac{26-(n-1)}{52-(n-1)}$; and so on.
33. Still considering miscellaneous examples relating to games of chance let us inquire what is the probability that at whist each of the two parties should have two honours? 7 If the turned-up card is an honour, the probability that of the three other honours an assigned one is among the twenty-five which are in the hands of the dealer or his partner, while the remaining two honours are in the hands of the other party, is $\frac{25}{51}\cdot\frac{26}{50}\cdot\frac{25}{49}$. But the assigned card may with equal probability be any one of three honours; and accordingly the above written probability is to be multiplied by 3. If the turned-up card is not an honour then the probability that an assigned pair of honours is in the hands of the dealer or his partner, while the remaining two honours are in the hands of their adversaries, is $\frac{25}{51}\cdot\frac{24}{50}\cdot\frac{26}{49}\cdot\frac{25}{48}$; this probability is to be multiplied by six, as the assigned pair may be any of the six binary combinations formed by the four honours. Now the probability of the alternative first considered - the turned-up card being an honour - is $\tfrac{4}{13}$; and the probability of the second alternative, $\tfrac{9}{13}$. Accordingly the required probability is $\frac{4}{13}\cdot 3\cdot\frac{25\cdot 26\cdot 25}{51\cdot 50\cdot 49} + \frac{9}{13}\cdot 6\cdot\frac{25\cdot 24\cdot 26\cdot 25}{51\cdot 50\cdot 49\cdot 48} = \frac{325}{833}$.
34. The probability that each of the four players should have an honour may be calculated thus. 8 If the card turned up is an honour then ipso facto the dealer has one honour and the probability that the remaining players have each an assigned one of the three remaining honours is $\frac{13}{51}\cdot\frac{13}{50}\cdot\frac{13}{49}$. Which probability is to be multiplied by 3!, as there are that number of ways in which the three cards may be assigned. If the card turned up is not an honour the probability that each player has an assigned honour is $\frac{12}{51}\cdot\frac{13}{50}\cdot\frac{13}{49}\cdot\frac{13}{48}$. Which probability is to be multiplied by 4!. Accordingly the required probability is $\frac{4}{13}\cdot\frac{3!\,13^3}{51\cdot 50\cdot 49} + \frac{9}{13}\cdot\frac{4!\,12\cdot 13^3}{51\cdot 50\cdot 49\cdot 48} = \frac{6\cdot 13^3}{51\cdot 50\cdot 49}$ (the chance not being affected by the character of the card turned up).
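The two whist results can be verified with exact arithmetic. In this sketch the variable names are illustrative; the fractions are those of the two alternatives described (turned-up card an honour, or not), weighted by 4/13 and 9/13.

```python
from fractions import Fraction

p_honour_turned = Fraction(4, 13)        # turned-up card is an honour
p_other_turned = 1 - p_honour_turned     # turned-up card is not an honour

# Each party holds two honours (par. 33).
two_each = (p_honour_turned * 3 * Fraction(25 * 26 * 25, 51 * 50 * 49)
            + p_other_turned * 6 * Fraction(25 * 24 * 26 * 25, 51 * 50 * 49 * 48))

# Each of the four players holds one honour (par. 34).
one_each = (p_honour_turned * 6 * Fraction(13 ** 3, 51 * 50 * 49)
            + p_other_turned * 24 * Fraction(12 * 13 ** 3, 51 * 50 * 49 * 48))

print(two_each)
print(one_each, one_each == Fraction(6 * 13 ** 3, 51 * 50 * 49))
```

In the second case both alternatives give the same conditional probability, which is why the turned-up card does not affect the chance.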
35. The probability of all the trumps being held by the dealer is $\frac{12!}{51\cdot 50\cdots 41\cdot 40}$, or $\frac{12!\,39!}{51!}$, which being calculated by means of tables for (logarithms of) factorials 9 or directly, 10 is 1/158,753,389,900.
36. There is a set of dominoes which goes from double blank to double nine (each domino presenting either a combination - which occurs only once - of two digits, or a repetition of the same digit). What is the probability that a domino drawn from the set will prove to be one assigned beforehand? The probability is the reciprocal of the number of dominoes: which is $10 \times 9/2$ (the number of combinations of different digits) + 10 (the number of doubles) = 55.
37. Choice and Chance
When we leave the sphere of games of chance and frame questions relating to ordinary life there is a danger of assuming distributions of probability which are far from probable. For example, let this be the question. The House of Commons formerly consisting of 489 English members, 60 Scottish and 103 Irish, what was the probability that a committee of three members should represent the three nationalities? An assumption of indifference where it does not exist is involved in the answer that the required probability is the ratio of the number of favourable triplets, viz. $489 \times 60 \times 103$, to the total number of triplets, viz. $652 \times 651 \times 650 / 3!$. A similar absence of selection is postulated by the ordinary treatment of a question like the following. There being s candidates at an examination and n optional subjects from which each candidate chooses one (n > s), what is the probability that no two candidates should choose the same subject? If the candidates be arranged in any order, the probability that the second candidate should not choose the same subject as the first candidate is (n - 1)/n. The probability that the third candidate will not choose either of the two subjects taken by the aforesaid candidates is (n - 2)/n, and so on. Thus the required probability is $n(n-1)(n-2)\cdots\{n-(s-1)\}/n^s$.
6 See Whitworth, Exercises in Choice and Chance, No. 502 (p. 125); referring to prop. xiv. of the same author's Choice and Chance.
7 Cf. Whitworth, Choice and Chance, question 143, p. 183, ed. 4.
8 Ibid.
9 There is such a table at the end of De Morgan's article in the Calculus of Probabilities in the Ency. Brit. "Pure Sciences," vol. ii.
10 Cancelling factors common to the numerator and denominator.
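Both answers rest on the assumption of indifference which the text warns against: every triplet of members, and every choice of subject, is treated as equally likely. Under that assumption only, they may be computed as follows (function names are illustrative).

```python
from fractions import Fraction
from math import comb

def committee_probability(english=489, scottish=60, irish=103):
    """Chance that three members drawn at random represent all three nationalities."""
    total = english + scottish + irish
    return Fraction(english * scottish * irish, comb(total, 3))

def prob_all_distinct(n, s):
    """Chance that s candidates, each choosing one of n subjects at random,
    all choose different subjects: n(n-1)...(n-s+1) / n**s."""
    prob = Fraction(1)
    for i in range(s):
        prob *= Fraction(n - i, n)
    return prob

print(float(committee_probability()))
print(prob_all_distinct(5, 3))
```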
38. When as in these cases the interest of the problem lies chiefly in the application of the theory of combinations, or permutations, there is a propriety in Whitworth's enunciation of the questions under the head of choice rather than chance. It comes to the same whether we say that there are x ways in which an event may happen, or that the probability of its happening in an assigned one of those ways is 1/x. For example, suppose that there are n couples waltzing at a ball; if the names of the men are arranged in alphabetical order, what is the probability that the names of their partners will also be in alphabetical order? The probability that the man who is first in alphabetical order should have for partner the lady who is first in that order is 1/n. The probability that the man who is second in alphabetical order should have for partner the lady who is second in that order is 1/(n - 1), and so on. Therefore the required probability is 1/n!. Or it may be easier to say that the number of ways, each consisting of a set of couples in which the party can be arranged, is n!; of which only one is favourable.
39. The same principle governs the following question. For how many days can a family of 10 continue to sit down to dinner in a different order each day; it not being indifferent who sits at the head of the table - what is the absolute, as well as the relative, position of the members? The number of permutations, viz. 10!, is the answer. If we are to attend to the relative position only - as would be natural if the question related to 10 children turning round a flypole - the number of different arrangements would be only 9!
40. Method of Equations in Finite Differences
The last question may serve to introduce a method which Laplace has applied with great éclat to problems in probabilities. Let $y_n$ be the number of ways in which n men can take their places at a round table, without respect to their absolute position; and consider how the number will be increased by introducing an additional man. From every particular arrangement of the original n men can now be obtained n different arrangements of the n + 1 men (since the additional man may sit between any two of the party of n). Hence $y_{n+1} = n\,y_n$, an equation of differences of which the solution is $y_n = C\,(n-1)!$. The constant may be determined by considering the case in which n is 2.
41. The following example is not quite so simple. If a coin is thrown n times, what is the chance that head occurs at least twice running? Calling each sequence of n throws a "case," consider the number of cases in which head never occurs twice running; let $u_n$ be this number, then $2^n - u_n$ must be the number of cases when head occurs at least twice successively. Consider the value of $u_{n+2}$; if the last or (n+2)th throw be tail, $u_{n+2}$ includes all the cases ($u_{n+1}$) of the n + 1 preceding throws which gave no succession of heads; and if the last be head the last but one must be tail, and these two may be preceded by any one of the $u_n$ favourable cases for the first n throws. Consequently, $u_{n+2} = u_{n+1} + u_n$.
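The equation of differences just obtained can be run forward from the two smallest cases and checked against a direct count. The starting values $u_1 = 2$ and $u_2 = 3$ are not stated in the text; they are supplied here by inspection, and the function names are illustrative.

```python
from itertools import product

def no_double_head_counts(n_max):
    """u[n] = number of sequences of n throws in which head never occurs twice running."""
    u = {1: 2, 2: 3}                      # by inspection: H, T and HT, TH, TT
    for n in range(1, n_max - 1):
        u[n + 2] = u[n + 1] + u[n]        # the equation of differences
    return u

def brute_force(n):
    """Direct count over all 2**n sequences."""
    return sum(1 for seq in product("HT", repeat=n) if "HH" not in "".join(seq))

u = no_double_head_counts(10)
print(u[10], 1 - u[10] / 2 ** 10)         # chance of head at least twice running in 10 throws
```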
If α, β are the roots of the quadratic $x^2 - x - 1 = 0$, this equation gives $u_n = A\alpha^n + B\beta^n$, the constants A and B being fixed by the conditions $u_1 = 2$, $u_2 = 3$.¹ The probability that head never turns up twice running is found by dividing this by $2^n$, the whole number of cases. This probability, of course, becomes smaller and smaller as the number of trials (n) is increased. This is a particular case of a more general problem solved by Laplace² as to the occurrence i times running of an event of which the probability at one trial is p.
42. In such problems where we now employ the calculus of finite differences Laplace employed his method of generating functions. A distinguished instance is afforded by the problem of points which was put by the Chevalier de Méré to Pascal and has exercised generations of mathematicians. It is thus stated by Laplace.³ Two players of equal skill have staked equal sums; the stakes to belong to the player who shall have won a certain number of games. Suppose they agree to leave off playing when one player, A, wants x "points" (games to be won) in order to complete the assigned number, while the second player wants x' points: how ought they to divide the stakes? This is a question in Expectation, but its difficulty consists in determining the probability that one of the players, say A, shall win the stakes. Let that probability be $y_{x,x'}$. Then, after the next game, if A has won, the probability of his winning the stakes will be $y_{x-1,x'}$. But if A loses, B winning, the probability will be $y_{x,x'-1}$. But these alternatives are equally likely. Accordingly the probability of A winning the stakes may be written $\tfrac12 y_{x-1,x'} + \tfrac12 y_{x,x'-1}$. This is the same probability as that which was before written $y_{x,x'}$. Equating the two expressions we have, for the function y, an equation of finite differences involving two variables, of which the solution is
$y_{x,x'} = \dfrac{1}{2^{x}}\left\{1 + \dfrac{x}{1}\,\dfrac{1}{2} + \dfrac{x(x+1)}{1\cdot 2}\,\dfrac{1}{2^{2}} + \dots + \dfrac{x(x+1)\cdots(x+x'-2)}{1\cdot 2\cdots(x'-1)}\,\dfrac{1}{2^{x'-1}}\right\}$.⁴
43. The problem of points is to be distinguished from another classical problem, relating to a contest in which the winner has not simply to win a certain number of games, but to win a certain number of counters from his opponent.⁵ Space does not admit even the enunciation of other complicated problems to which Laplace has applied the method of generating functions.
1 Cf. Boole's Finite Differences, ch. vii. § 5.
2 Op. cit. liv. II. ch. ii., No. 12.
3 Op. cit. liv. II. ch. ii., No. 8.
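The solution of the problem of points in par. 42 can be compared with the difference equation from which it is derived. The sketch below is an illustration under stated assumptions: the function names are hypothetical, the boundary values (A certain to win when he wants no points, certain to lose when B wants none) are supplied here, and the series form is written as a sum of binomial coefficients.

```python
from fractions import Fraction
from functools import lru_cache
from math import comb


@lru_cache(maxsize=None)
def win_prob_recursive(x: int, x_prime: int) -> Fraction:
    """Probability that A wins when A wants x points and B wants x_prime."""
    if x == 0:
        return Fraction(1)  # assumed boundary: A has already completed his number
    if x_prime == 0:
        return Fraction(0)  # assumed boundary: B has already completed his
    half = Fraction(1, 2)
    return half * win_prob_recursive(x - 1, x_prime) + half * win_prob_recursive(x, x_prime - 1)


def win_prob_series(x: int, x_prime: int) -> Fraction:
    """Series solution: (1/2^x) * sum over k < x' of C(x+k-1, k) / 2^k, for x >= 1."""
    return sum(Fraction(comb(x + k - 1, k), 2 ** (x + k)) for k in range(x_prime))


# Classical case: A wants 2 points, B wants 3.
print(win_prob_recursive(2, 3), win_prob_series(2, 3))
```

The stakes are then divided in the ratio of this probability to its complement.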
44. Probability of Causes Deduced from Observed Events. - Problems relating to the probability of alternative causes, deduced from observed effects, are usually placed in the separate category of "inverse" probability, though, as above remarked,⁶ they do not necessarily involve different principles. The difference principally consists in the need of evidence, other than that which is afforded by the observed event, as to the probability of the alternative causes existing and operating. The following is an example free from the difficulty incident to unverified a priori probabilities, which commonly besets this kind of problem. A digit having been taken at random from mathematical tables (or the expansion of an endless constant such as π), a second digit is obtained by taking from a random succession of digits one that added to the first digit makes a sum greater than 9. Given a result thus formed, what are the respective probabilities that the second digit should have been 0, 1, 2, ... 8 or 9? In the long run the first digit assumes with equal frequency the values 0, 1, 2 ... 8, 9. Accordingly the second digit can never be 0. There is only one chance of its being 1, namely when the first digit is 9. If the second digit is 2, and the first either 8 or 9, the observed effect will be produced. And so on. If the second digit is 9, the effect may occur in nine ways. Accordingly in the long run of pairs thus formed it will occur that the cases or causes which are defined by the circumstances that the second digit is 0, 1, 2, ... 8, 9, respectively, will occur with frequencies in the following ratios 0 : 1 : 2 ... 8 : 9. The probability of the observed event having been caused by a particular (second) digit, e.g. 7, is 7/(0 + 1 + 2 + ... + 9) = 7/45.
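The ratios for the second digit can be recovered by counting pairs of digits. This is a minimal sketch; the function name and the `threshold` parameter are hypothetical, and both digits are assumed equally likely a priori, as the article supposes.

```python
from collections import Counter
from fractions import Fraction


def second_digit_posterior(threshold: int = 9) -> dict:
    """Probability of each second digit, given that first + second > threshold."""
    counts = Counter()
    for first in range(10):
        for second in range(10):
            if first + second > threshold:
                counts[second] += 1  # one equally likely way of producing the effect
    total = sum(counts.values())
    return {digit: Fraction(counts[digit], total) for digit in range(10)}


posterior = second_digit_posterior()
print(posterior[7])  # 7/45
```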
45. The following example taken from Laplace⁷ is of a more familiar type. An urn is known to contain three balls made up of white and black balls in some unknown proportion. From this urn a ball is extracted m times (being each time replaced after extraction). If a white ball is drawn every time, what are the respective probabilities that the number of white balls in the urn is 3, 2, 1 or 0? By parity of reasoning it appears that in the first case the result is certain, its probability 1, in the second case the probability of the observed event occurring is $(\tfrac23)^m$, in the third case that probability is $(\tfrac13)^m$, in the fourth case zero. Accordingly the respective inverse probabilities are in the ratios $1 : (\tfrac23)^m : (\tfrac13)^m : 0$; provided that (as in the preceding example, with respect to the second digits) the alternative causes, the four possible constitutions of the urn, are (a priori) equally probable. This is rather a bold assumption with respect to the contents of concrete urns⁸ and similar groupings; but with regard to things in general may perhaps be justified on the principle of cross-series.⁹
46. Often in the investigation of causes we are not thrown back on unverified a priori probabilities. We have some specific evidence though of a very rough character. An example has been cited from Mill in a preceding paragraph.¹⁰ Against the improbabilities calculated by the methods of the present section there has often to be balanced an improbability evidenced by common sense, which does not admit of mathematical calculation. Bertrand¹¹ puts the following case. The manager of a gambling house has purchased a roulette table which is found to give red 5300 times, black 4700 times, out of 10,000 trials. The purchaser claims an indemnity from the maker. What can the calculus tell us as to the justice of the claim? Nothing precise, yet something worth knowing. The a priori improbability of the maker's inaccuracy must be very great to overcome the improbability of such an event occurring by chance if the machine is accurately made (accuracy being defined, say, by the condition that the ratio of red to [red + black] would prove to be in the indefinitely long run of trials between 0.499 and 0.501). The odds against the so defined event occurring are found to be some millions to one.¹
47. The difficulty recurs in more practical problems: for instance, certain symptoms having been observed, to find the probability that they are produced by a particular disease. Such concrete applications of probabilities are often open to the sort of objections which have been urged against the classical use of the calculus to determine the probability that witnesses are true, or judges just.
4 A clear and corrected version of Laplace's reasoning is given by Todhunter, History ... of Probability, art. 973, p. 528, with reference to the more general cases in which the "skills" of each party - their chances of winning a single game - are not equal but respectively p and q (p + q = 1). See also Czuber, Wahrscheinlichkeitstheorie, pp. 30 seq.
5 See Todhunter, op. cit. art. 107, and other articles referring to duration of play. See also Boole, Finite Differences, ch. xiv., art. 7, ex. 6.
6 Above, par. 13.
7 Op. cit. liv. II. ch. i. No. 1.
8 Cf. Bertrand, op. cit. § 118.
9 Above, par. 5.
10 Par. 13.
11 Op. cit. § 134.
The solution of the difference equation of par. 41 is $u_n = A\alpha^n + B\beta^n$. Here A and B are easily found from the conditions $u_1 = 2$, $u_2 = 3$; viz. $A = \dfrac{\alpha^2}{\alpha-\beta}$, $B = \dfrac{\beta^2}{\beta-\alpha}$; whence $u_n = \dfrac{n+2}{2^{n+1}}\left\{1 + \dfrac{(n+1)\,n}{1\cdot 2\cdot 3}\,5 + \&c.\right\}$.
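The inverse probabilities for the urn of par. 45 can be computed directly from the stated ratios. This is a sketch under the article's own assumption that the four constitutions of the urn are a priori equally probable; the function name `urn_posterior` and its parameters are hypothetical.

```python
from fractions import Fraction


def urn_posterior(m: int, balls: int = 3) -> dict:
    """Posterior probability of each number of white balls after m white draws."""
    white_counts = range(balls, -1, -1)  # 3, 2, 1, 0 for the urn of the article
    # Probability of drawing white m times running, for each constitution.
    likelihoods = [Fraction(w, balls) ** m for w in white_counts]
    total = sum(likelihoods)
    return {w: lik / total for w, lik in zip(white_counts, likelihoods)}


for m in (1, 2, 5):
    print(m, urn_posterior(m))
```

As m grows the posterior concentrates on the urn that is wholly white, which is the behaviour the ratios $1 : (\tfrac23)^m : (\tfrac13)^m : 0$ imply.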
48. Probability of Testimony. - The application of probabilities to testimony proceeds upon two assumptions: (1) that to each witness there pertains a coefficient of probability representing the average frequency with which he speaks the truth or untruth, (2) that the statements of witnesses are independent in the sense proper to probabilities. Thus if two witnesses concur in making a statement which must be either true or false, their agreement is a circumstance which is only to be accounted for by one of two alternatives: either that they are both speaking the truth, or both false. If the average truthfulness - the credibility - of one witness is p, that of the other p', then the probabilities of the two alternative explanations are to each other in the ratio pp' : (1 - p)(1 - p'); the probability that the statement is true is pp'/{pp' + (1 - p)(1 - p')}. So far no account is taken of the a priori probability of the statement. This evidence may be treated as an independent witness. Thus, if a person whose credibility is p asserts that he has seen at whist a hand consisting entirely of trumps dealt from a well-shuffled pack of cards, there are two alternative explanations of his assertion, with probabilities in the ratio p × 0.000,000,000,006,3 : (1 - p) × 0.999,999,999,993,7. The truthfulness of the witness must be very great to outweigh the a priori improbability of the fact.² These formulae are easily extended to the case of three or more witnesses. The probability of a statement made by three witnesses of respective credibilities p, p', p'' is pp'p''/{pp'p'' + (1 - p)(1 - p')(1 - p'')}. For r witnesses we have $p_1 p_2 \dots p_r / \{p_1 p_2 \dots p_r + (1-p_1)(1-p_2)\dots(1-p_r)\}$. Dividing both the numerator and the denominator by $p_1 p_2 \dots p_r$, we see that the probability of the statement increases with the number of the witnesses, provided that for every witness (1 - p)/p is a proper fraction, and accordingly p > 1/2.
As an example of several witnesses, let us inquire how many witnesses to a fact such as a hand at whist consisting entirely of trumps would be required in order to make it an even chance that the fact occurred, supposing the credibility of each witness to be 9/10.³ Let x be the required number of witnesses. We have the equation $1/\{1 + (\tfrac19)^x / 0.000{,}000{,}000{,}006\} = \tfrac12$, or x log 9 = 12.2. Whence, if x is 13, it is more than an even chance that the statement is true.
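The formulae for concurring witnesses can be put into a short function. This is a sketch only: the name `prob_true` and the `prior` parameter are hypothetical, and the a priori probability is treated, as the article suggests, like one more independent witness (a prior of 1/2 reproduces the formulae in which no a priori probability is considered).

```python
from fractions import Fraction
from math import prod


def prob_true(credibilities, prior=Fraction(1, 2)) -> Fraction:
    """Probability that a statement is true when all the witnesses assert it."""
    # Explanation 1: the statement is true and every witness speaks truly.
    all_truthful = prior * prod(credibilities)
    # Explanation 2: the statement is false and every witness speaks falsely.
    all_false = (1 - prior) * prod(1 - p for p in credibilities)
    return all_truthful / (all_truthful + all_false)


nine_tenths = Fraction(9, 10)
print(prob_true([nine_tenths, nine_tenths]))          # two witnesses, no prior
print(prob_true([nine_tenths], Fraction(1, 1000)))    # one witness, improbable fact
```

Exact fractions are used so that very small a priori probabilities are not lost to rounding.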
49. When an event may occur in two or more ways equally probable a priori, the formulae show that the probability of the statement will depend on the credibility of the witnesses; and accordingly the explicit consideration of a priori probabilities may, as in our first instance, be omitted. One who reports the number of a ticket obtained at a lottery ordinarily makes a statement against which there is no a priori improbability; but if the number is one which had been predicted, there is an a priori improbability 1 - 1/n that an assigned ticket should be drawn out of a mélange of n tickets. Similar reasoning is applicable to the probability that the decisions of judges, the verdicts of juries, are right.
50. The assumptions upon which all this reasoning is based are open to serious criticisms. The postulated independence of witnesses and judges is frequently not realized. The revolutionary tribunal which condemned Condorcet was affected by an identity of illusions and passions which that mathematician had not taken into account when he calculated "that the probability of a decision being conformable to truth will increase indefinitely as the number of voters is increased."⁴
51. The use of coefficients based on the average truthfulness or justice of each witness and judge involves the neglect of particulars which ought to influence our estimate of probability, such as the consistency of a witness's statements and the relation of the case to the interests, prejudices and capacities of the witness or the judge.⁵ Thus even in so simple a case as the alleged occurrence of an extraordinary hand at whist, the "truthfulness" of the witness in the general sense of the term may not adequately represent his liability to have made a mistake about the shuffling.⁶ A neglect of particulars, however, is sometimes practised with success in the applications of statistics (insurance, for instance). Perhaps there are broad results and general rules to which the mathematical theory may be applicable. Perhaps the laborious researches of Poisson on the "probability of judgments" are not, as they have been called by an eminent mathematician, absolument rien.⁷ More than mathematical interest may attach to Laplace's investigation of a rule appropriate to cases like the following. An event (suppose the death of a certain person) must have proceeded from one of n causes A, B, C, &c., and a tribunal has to pronounce on which is the most probable. Professor Morgan Crofton's original proof of Laplace's rule is here reproduced.⁸
1 By a calculation based on the fundamental theorem (above, par. 23; cf. below, par. 203).
2 But see below, par. 51.
3 Morgan Crofton, loc. cit. p. 778, par.
4 Essai, p. 6 (there is postulated a proviso analogous to that which has been stated in par. 49 above, with reference to witnesses: that the probability of any one voter being right is > 1/2).
5 See Mill's forcible remarks on this use of probabilities, which he places among the "misapplications of the calculus which have made it the real opprobrium of mathematics" (Logic, Book III, ch. xviii. § 3). Cf. Bertrand, Calcul des probabilités; Venn, Logic of Chance, ch. xvi. §§ 5-7; v. Kries, Principien der Wahrscheinlichkeitsrechnung, ch. ix., preface, § v., and ch. xiii. §§ 12, 13; Laplace's general reflections on this matter seem more valuable than his calculations: "Tant de passions et d'intérêts particuliers y mêlent si souvent leur influence qu'il est impossible de soumettre au calcul cette probabilité," op. cit. Introduction (Des Choix et décisions des assemblées).
6 As to the possibility of mistake in this respect, see Proctor, How to play Whist, p. 121.
7 Bertrand, loc. cit.
8 Loc. cit. § 43.
52. Let each member of the tribunal arrange the causes in the order of their probability according to his judgment, after weighing the evidence. To compare the presumption thus afforded by any one judge in favour of a specified cause with that afforded by the other judges, we must assign a value to the probability of the cause derived solely from its being, say, the rth on his list. As he is supposed to be unable to pronounce any closer to the truth than to say (suppose) H is more likely than D, D more likely than L, &c., the probability of any cause will be the average value of all those which that probability can have, given simply that it always occupies the same place on the list of the probabilities arranged in order of magnitude. As the sum of the n probabilities is always 1, the question reduces to this: Any whole (such as the number 1) is divided at random into n parts, and the parts are arranged in the order of their magnitude - least, second, third, ... greatest; this is repeated for the same whole a great number of times; required the mean value of the least, of the second, &c., parts, up to that of the greatest.
[Figure: the line AB, with the small increment Bb added at the end B.]
Let the whole in question be represented by a line AB = a, and let it be divided at random into n parts by taking n - 1 points indiscriminately on it. Let the required mean values be $\lambda_1 a, \lambda_2 a, \lambda_3 a, \dots, \lambda_n a$, where $\lambda_1, \lambda_2, \dots$ must be constant fractions. As a great number of positions is taken in AB for each of the n - 1 points, we may take a as representing that number; and the whole number N of cases will be $N = a^{n-1}$.
The sum of the least parts, in every case, will be $S_1 = N\lambda_1 a = \lambda_1 a^{n}$.
Let a small increment, Bb = δa, be added on to the line AB at the end B; the increase in this sum is $\delta S_1 = n\lambda_1 a^{n-1}\,\delta a$. But, in dividing the new line Ab, either the n - 1 points all fall on AB as before, or n - 2 fall on AB and 1 on Bb (the cases where 2 or more fall on Bb are so few we may neglect them). If all fall on AB, the least part is always the same as before except when it is the last, at the end B of the line, and then it is greater than before by δa; as it falls last in $n^{-1}$ of the whole number of trials, the increase in $S_1$ is $n^{-1}a^{n-1}\,\delta a$. But if one point of division falls on Bb, the number of new cases introduced is $(n-1)a^{n-2}\,\delta a$; but, the least part being now an infinitesimal, the sum $S_1$ is not affected; we have therefore $\delta S_1 = n\lambda_1 a^{n-1}\,\delta a = n^{-1}a^{n-1}\,\delta a$; ∴ $\lambda_1 = n^{-2}$.
To find $\lambda_2$, reasoning exactly in the same way, we find that where one point falls on Bb and n - 2 on AB, as the least part is infinitesimal, the second least part is the least of the n - 1 parts made by the n - 2 points; consequently, if we put $\lambda_1'$ for the value of $\lambda_1$ when there are n - 1 parts only, instead of n, $\delta S_2 = n\lambda_2 a^{n-1}\,\delta a = n^{-1}a^{n-1}\,\delta a + (n-1)a^{n-2}\lambda_1' a\,\delta a$, ∴ $n\lambda_2 = n^{-1} + (n-1)\lambda_1'$; but $\lambda_1' = (n-1)^{-2}$; ∴ $n\lambda_2 = n^{-1} + (n-1)^{-1}$. In the same way we can show generally that $n\lambda_r = n^{-1} + (n-1)\lambda'_{r-1}$; and thus the required mean value of the rth part is $\lambda_r a = a\,n^{-1}\{n^{-1} + (n-1)^{-1} + (n-2)^{-1} + \dots + (n-r+1)^{-1}\}$.
Thus each judge implicitly assigns the probabilities $\dfrac{1}{n^2}$, $\dfrac1n\left(\dfrac1n + \dfrac{1}{n-1}\right)$, $\dfrac1n\left(\dfrac1n + \dfrac{1}{n-1} + \dfrac{1}{n-2}\right)$, ... to the causes as they stand on his list, beginning from the lowest. The values assigned for the probability of each alternative cause may be treated as so many equally authoritative observations representing a quantity which it is required to determine. According to a general rule given below¹ the observations are to be added and divided by their number; but here if we are concerned only with the relative magnitudes of the probabilities in favour of each alternative it suffices to compare the sums of the observations. We thus arrive at Laplace's rule. Add the numbers found on the different lists for the cause A, for the cause B, and so on; that cause which has the greatest sum is the most probable.
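The values assigned by each judge, and the rule built on them, can be sketched as follows. The function names are hypothetical and the three lists of causes are invented purely for illustration; each list is assumed to be ordered from the least to the most probable cause.

```python
from fractions import Fraction


def mean_ordered_parts(n: int) -> list:
    """Mean of the r-th smallest of n random parts of a unit whole, r = 1..n."""
    return [
        Fraction(1, n) * sum(Fraction(1, n - j) for j in range(r))
        for r in range(1, n + 1)
    ]


def laplace_rule(rankings):
    """Sum, for each cause, the value attached to its place on every list."""
    weights = mean_ordered_parts(len(rankings[0]))
    totals = {}
    for ranking in rankings:  # each ranking runs from least to most probable
        for cause, weight in zip(ranking, weights):
            totals[cause] = totals.get(cause, 0) + weight
    return max(totals, key=totals.get), totals


print(mean_ordered_parts(3))  # [1/9, 5/18, 11/18]
print(laplace_rule([["A", "B", "C"], ["B", "A", "C"], ["A", "C", "B"]]))
```

The values for any n sum to unity, as probabilities of exhaustive alternatives must.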
53. Probability of Future Effects deduced from Causes. - Another class of problems which it is usual to place in a separate category are those which require that, having ascended from an observed event to probable causes, we should descend to the probability of collateral effects. But no new principle is involved in such problems. The reason may be illustrated by the following modification of the problem about digits which was above set² to illustrate the method of deducing the probability of alternative causes. What is the probability that if to the second digit which contributed to the effect there described there is added a third digit taken at random, the sum of the second and third will be greater than 10 (or any other assigned figure)? The probabilities - the a posteriori probabilities derived from the observed event (that the sum of the first and second digit exceeds 9) - each multiplied by 45, of the alternatives constituted by the different values 0, 1, 2, ... 8, 9 of the second figure are written in the first of the subjoined rows.
0   1   2   3   4   5   6   7   8   9
0   0   1   2   3   4   5   6   7   8
0   0   2   6  12  20  30  42  56  72
Below each of these probabilities is written the probability, × 10, that if the corresponding cause existed the effect under consideration would result. The product of the two probabilities pertaining to each alternative way of producing the event gives the probability of the event occurring in that way. The sum of these products, which are written in the third row, divided by 45 × 10, viz. 240/450 = 8/15, is the required probability. It may be expected that actual trial would verify this result.
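The three rows of the scheme, and the resulting probability, can be reproduced by a short computation. This is an illustrative sketch; the function name and the `limit` parameter are hypothetical.

```python
from fractions import Fraction


def prob_second_plus_third_exceeds(limit: int = 10) -> Fraction:
    """Probability that second + third digit > limit, given first + second > 9."""
    total = Fraction(0)
    for second in range(10):
        posterior = Fraction(second, 45)  # first row of the scheme, divided by 45
        # Second row of the scheme, divided by 10: third digits that succeed.
        favourable = sum(1 for third in range(10) if second + third > limit)
        total += posterior * Fraction(favourable, 10)
    return total


print(prob_second_plus_third_exceeds())  # 8/15
```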
54. "Rule of Succession." - One case of inferred future effects, sometimes called the "rule of succession," claims special notice as having been thought to furnish a test for the cogency of induction. A white ball has been extracted (with replacement after extraction) n times from an immense number of black and white balls mixed in some unknown proportion; what is the probability that at the (n+1)th trial a white ball will be drawn? It is assumed that each constitution of the mélange³ formed by the proportion of white balls (the probability of drawing a white ball), say p, is a priori as likely to have any one value as another of the series Δp, 2Δp, 3Δp, ... 1 - 2Δp, 1 - Δp, 1.
Whence a posteriori the probability of any particular value of p as the cause of the observed recurrence is $p^n / \sum p^n$, where p in the denominator receives every value from Δp to 1. The probability that this cause, if it exists, will produce the effect in question, the extraction of a white ball at the (n+1)th trial, is p. The probability of the event, obtained by summing the probabilities of all the different ways in which it may occur, is accordingly $\sum p^{n+1} / \sum p^n$, where p both in the numerator and the denominator is to receive all possible values between Δp and 1. In the limit we have $\int_0^1 p^{n+1}\,dp \big/ \int_0^1 p^n\,dp = (n+1)/(n+2)$. In particular if n = 1, the probability that an event which has been observed once will recur on a second trial is 2/3. These results are perhaps not so absurd as they have seemed to some critics, when the principle of "cross-series"⁴ is taken into account. Among authorities who seem to attach importance to the rule of succession, in addition to the classical writers on Probabilities, may be mentioned Lotze⁵ and Karl Pearson.⁶
1 Below, pars. 135, 136. A difficulty raised by Cournot with respect to the determination of several quantities which are connected by an equation does not here arise. The system of values determined for the several causes fulfils by construction the condition that the sum of the values should be equal to unity.
2 Above, par. 44.
3 It comes to the same to suppose the total number of balls in the mixture to be N; and to assume that the number of white balls is a priori equally likely to have any one of the values 1, 2, ... N - 1, N.
4 Above, par. 5.
5 Logic, bk. ii. ch. ix. § 5.
6 Grammar of Science, ch. iv. § 16. Cf. the article in Mind above referred to, ix. 234.
Section III. - Calculation of Expectation.
55. Analogues of Preceding Problems. - This section presents problems analogous to the preceding. If n balls are extracted from an urn containing black and white balls mixed up in the proportions p : (1 - p), each ball being replaced after extraction, the expected number of white balls in the set of n is by definition np.⁷ It may be instructive to verify the consistency of first principles by demonstrating this axiomatic proposition.⁸ Consider the respective probabilities that in the series of n trials there will occur no white balls, exactly one white ball, exactly two white balls, and so on, as shown in the following scheme:
No. of white balls: 0, 1, 2, ..., n
Corresponding probability: $(1-p)^n$, $\dfrac{n!}{1!\,(n-1)!}(1-p)^{n-1}p$, $\dfrac{n!}{2!\,(n-2)!}(1-p)^{n-2}p^2$, ..., $p^n$
To calculate the expectation of white balls it is proper to multiply 1 by the probability that exactly one white ball will occur, 2 by the probability of two white balls, and so on. We have thus for the required expectation $\sum_{r=1}^{n} r\,\dfrac{n!}{r!\,(n-r)!}(1-p)^{n-r}p^{r} = np\,[(1-p) + p]^{n-1} = np$. The expectation in the case where the balls are not replaced - not similarly axiomatic - may be found by approximative formulae.⁹
56. Games of Chance. - With reference to the topic which occurred next under the head of probabilities, a distinction must be drawn between the number of trials which make it an even chance that all the faces of a die will have turned up at least once, and the number of trials which are made on an average before that event occurs. We may pass from the probability to expectation in such cases by means of the following theorem. If s is the number of trials in which on an average success (such as turning up every face of a die at least once) is obtained, then $s = 1 + f_1 + f_2 + \dots$; where $f_r$ denotes the probability of failing in the first r trials. For the required expectation is equal to 1 × probability of succeeding at the first trial + 2 × probability of succeeding at the second trial + &c. Now the probability of succeeding at the first trial is $1 - f_1$; the probability of succeeding at the second trial (after failing at the first) is $f_1 - f_2$; the probability of succeeding at the third trial is similarly $f_2 - f_3$, and so on. Substituting these values for the expression for the expectation, we have the proposition which was to be proved. In the proposed problem $f_n = 6(\tfrac56)^n - 15(\tfrac46)^n + 20(\tfrac36)^n - 15(\tfrac26)^n + 6(\tfrac16)^n$. Assigning to n in each of these terms every value from 1 to ∞ we have $6\cdot\tfrac56 \big/ (1 - \tfrac56) = 30$ for the sum of the first set, with corresponding expressions for the sets formed from the following terms. Whence s = 1 + 30 - 30 + 20 - 7.5 + 1.2 = 14.7. By parity of reasoning it is proved that on an average $7\tfrac{419}{630}$ cards¹⁰ must be dealt before at least one card of every suit has turned up.¹¹
57. Dominoes are taken at random (with replacement after each extraction) from the set of the kind described in a preceding paragraph.¹² What is the difference (irrespective of sign) to be expected between the two numbers on each domino? The digit 9, according as it is combined with itself, or any smaller digit, gives the sum of differences 0 + 1 + 2 + ... + 9.
The digit 8 combined with itself or any smaller digit gives the sum of differences 0 + 1 + 2 + ... + 8, and so on. The sum of the differences is $\sum \tfrac12\,r(r+1)$, where r has every integer value from 1 to 9 inclusive, $= \tfrac16\cdot 9\,(9+1)(9+2) = 165$. And the number of the differences is 10 + 9 + 8 + ... + 2 + 1 = 55. Therefore the required expectation is 165/55 = 3.
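The expectations derived in pars. 55 to 57 can be verified exactly. The sketch below is illustrative only; the three function names are hypothetical, and the die calculation sums each geometric series of $f_n$ in closed form, which is one way (not necessarily the article's) of carrying out the summation.

```python
from fractions import Fraction
from math import comb


def binomial_expectation(n: int, p: Fraction) -> Fraction:
    """Sum of r times the probability of exactly r white balls in n draws."""
    return sum(r * comb(n, r) * (1 - p) ** (n - r) * p ** r for r in range(n + 1))


def expected_throws_all_faces(faces: int = 6) -> Fraction:
    """s = 1 + f_1 + f_2 + ..., with f_n given by inclusion and exclusion."""
    total = Fraction(1)
    for k in range(1, faces):
        ratio = Fraction(faces - k, faces)  # chance that k named faces are avoided
        # Each term of f_n is a geometric series in n, summed from n = 1.
        total += (-1) ** (k + 1) * comb(faces, k) * ratio / (1 - ratio)
    return total


def domino_expected_difference(top: int = 9) -> Fraction:
    """Mean difference between the two numbers on a domino of the set 0..top."""
    diffs = [high - low for high in range(top + 1) for low in range(high + 1)]
    return Fraction(sum(diffs), len(diffs))


print(binomial_expectation(10, Fraction(3, 10)))  # 3
print(expected_throws_all_faces())                # 147/10
print(domino_expected_difference())               # 3
```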
58. Digits taken at Random. - The last question is to be distinguished from the following. What is the difference (irrespective of sign) to be expected between two digits, taken at random from mathematical tables, or the expansion of an endless constant like π? The combinations of different digits will now occur twice as often as the repetitions of the same digit. The sum of the differences may now be obtained from the consideration that the sum of the positive differences must be equal to the sum of the negative differences when the null differences are distributed equally between the positive and the negative set. The sum of the positive set is, as before, 165. But the denominator for this numerator is not the same as before, but less by half the number of null differences, that is 5. We thus obtain for the required expectation 165/50 = 3.3.
7 See the introductory remarks headed "Description and Division of the Subject."
8 Cf. above, par. 25.
9 See Pearson, Phil. Trans. (1895), A.
10 Whitworth, Exercises, No. 502.
11 Ibid. No. 504, cf. above, par. 29.
12 Ibid. par. 36.
The expectation of par. 55, written at length, is $\dfrac{n!}{1!\,(n-1)!}(1-p)^{n-1}p + 2\,\dfrac{n!}{2!\,(n-2)!}(1-p)^{n-2}p^2 + \dots + n\,p^n = np\left[(1-p)^{n-1} + \dfrac{(n-1)!}{1!\,(n-2)!}(1-p)^{n-2}p + \dots + \dfrac{(n-1)!}{(r-1)!\,(n-r)!}(1-p)^{n-r}p^{r-1} + \dots + p^{n-1}\right]$.
59. A simple verification of this prediction may thus be obtained. In a table of logarithms note any two digits so situated as to afford no presumption of close correlation; for instance, in the last place of the logarithm of 10009 the digit 7, and in the last place of the logarithm of 10019 the digit 4, and take the difference between these two, viz. 3, irrespective of sign. Proceed similarly with the similarly situated pair which form the last places of the logarithms of 10029 and 10039; for which the difference is 1, and so on. The mean of the differences thus found ought to be approximately 3.3. Experimenting thus on the last digits of logarithms, in Hutton's tables extending to seven places, from the logarithm of 10009 to the logarithm of 10909, the writer has found for the mean of 250 differences, 3.2.
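Both the predicted value and an experiment of the kind described can be sketched in a few lines. This is an illustration under assumptions: the function names are hypothetical, seven-place logarithms are obtained by rounding a high-precision common logarithm (which is assumed, not known, to agree with Hutton's printed tables), and the pairs are taken at intervals of 20 starting from 10009.

```python
from decimal import Decimal, getcontext
from fractions import Fraction


def mean_abs_digit_difference() -> Fraction:
    """Exact mean difference between two independent random digits."""
    diffs = [abs(a - b) for a in range(10) for b in range(10)]
    return Fraction(sum(diffs), len(diffs))


def last_digit_of_log(number: int, places: int = 7) -> int:
    """Last digit of the common logarithm of number, rounded to `places` places."""
    getcontext().prec = 30
    rounded = Decimal(number).log10().quantize(Decimal(1).scaleb(-places))
    return rounded.as_tuple().digits[-1]


def mean_difference_from_logs(start: int = 10009, pairs: int = 250) -> Fraction:
    """Mean difference of last digits for pairs (start, start + 10), (start + 20, ...)."""
    total = 0
    for k in range(pairs):
        first = start + 20 * k
        total += abs(last_digit_of_log(first) - last_digit_of_log(first + 10))
    return Fraction(total, pairs)


print(mean_abs_digit_difference())         # 33/10
print(float(mean_difference_from_logs()))  # an empirical mean, to compare with 3.3
```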
60. Points taken at Random. - By parity of reasoning it may be shown that if two different milestones are taken at random on a road n miles long (there being a stone at the starting-point) their average distance apart is (n+2)/3.
61. If instead of finite differences as in the last two problems the intervals between the numbers or degrees which may be selected are indefinitely small, we have the theorem that the mean distance between two points taken at random on a finite straight line is a third of the length of that straight line.
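The milestone result, and its passage to the limit of one third of the line, can be checked by enumeration. A minimal sketch follows; the function name is hypothetical.

```python
from fractions import Fraction
from itertools import combinations


def mean_milestone_distance(n: int) -> Fraction:
    """Mean distance between two different milestones on a road n miles long."""
    stones = range(n + 1)  # a stone at the starting-point and at every mile
    pairs = list(combinations(stones, 2))
    return Fraction(sum(b - a for a, b in pairs), len(pairs))


for n in (1, 5, 10, 100):
    mean = mean_milestone_distance(n)
    # The ratio of the mean distance to the length approaches 1/3 as n grows.
    print(n, mean, float(mean / n))
```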
62. The fortuitous division of a straight line is happily employed by Professor Morgan Crofton to exhibit Laplace's method of determining the worth of several candidates by combining the votes of electors. There is a close relation between this method and the method above given for determining the probabilities of several alternatives by combining the judgments of different judges.¹ But there is this difference - that the several estimates of worth, unlike those of probability, are not subject to the condition that their sum should be equal to a constant quantity (unity). The quaesita are now expectations, not probabilities. Professor Morgan Crofton's version² of the argument is as follows. Suppose there are n candidates for an office; each elector is to arrange them in what he believes to be the order of merit; and we have first to find the numerical value of the merit he thus implicitly attributes to each candidate. Fixing on some limit a as the maximum of merit, n arbitrary values less than a are taken and then arranged in order of magnitude - least, second, third, ... greatest; to find the mean value of each.
[Figure: the line AB, with the points X, Y, Z marked between A and B.]
Take a line AB = a, and set off n arbitrary lengths AX, AY, AZ ... beginning at A; that is, n points are taken at random in AB. Now the mean values of AX, XY, YZ, ... are all equal; for if a new point P be taken at random, it is equally likely to be 1st, 2nd, 3rd, &c., in order beginning from A, because out of n+1 points the chance of an assigned one being 1st is $(n+1)^{-1}$; of its being 2nd $(n+1)^{-1}$; and so on. But the chance of P being 1st is equal to the mean value of AX divided by AB; of its being 2nd M(XY) ÷ AB; and so on. Hence the mean value of AX is $AB\,(n+1)^{-1}$; that of AY is $2AB\,(n+1)^{-1}$; and so on. Thus the mean merit assigned to the several candidates is $a(n+1)^{-1}$, $2a(n+1)^{-1}$, $3a(n+1)^{-1}$, ..., $na(n+1)^{-1}$. Thus the relative merits may be estimated by writing under the names of the candidates the numbers 1, 2, 3, ... n. The same being done by each elector, the probability will be in favour of the candidate who has the greatest sum.
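The mean values of the ordered points, and the resulting rule of summing the numbers 1, 2, 3, ... n, can be illustrated as follows. This is a sketch only: the function names, the seed and the three ballots are hypothetical, the maximum of merit a is taken as unity, and the means are estimated by simulation rather than derived.

```python
import random


def simulated_mean_order_statistics(n: int, trials: int = 20000, seed: int = 0) -> list:
    """Estimate the mean of the k-th smallest of n random points on a unit line."""
    rng = random.Random(seed)
    sums = [0.0] * n
    for _ in range(trials):
        for rank, value in enumerate(sorted(rng.random() for _ in range(n))):
            sums[rank] += value
    return [s / trials for s in sums]


def rank_sum_winner(ballots):
    """Each ballot lists candidates from least to greatest merit; sum the places."""
    totals = {}
    for ballot in ballots:
        for score, candidate in enumerate(ballot, start=1):
            totals[candidate] = totals.get(candidate, 0) + score
    return max(totals, key=totals.get), totals


# The estimates should lie near 1/5, 2/5, 3/5, 4/5 for n = 4.
print(simulated_mean_order_statistics(4))
print(rank_sum_winner([["X", "Y", "Z"], ["Y", "X", "Z"], ["X", "Z", "Y"]]))
```

Because the means are proportional to 1, 2, ... n, only the sums of the places matter in comparing candidates.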
Practically it is to be feared that this plan would not succeed, because, as Laplace observes, not only are electors swayed by many considerations independent of the merit of the candidates, but they would often place low down in their list any candidate whom they judged a formidable competitor to the one they preferred, thus giving an unfair advantage to candidates of mediocre merit.
63. This objection is less appropriate to competitive examinations, to which the method may seem applicable. But there is a more fundamental objection in this case, if not indeed in every case, to the reasoning on which the method rests: viz. that there is supposed an a priori distribution of values which is in general not supposable; viz. that the several estimates of worth, the marks given to different candidates by the same examiner, are likely to cover evenly the whole of the tract between the minimum and maximum, e.g. between 0 and 100. Experience, fortified by theory, shows that very generally such estimates are not thus indifferently disposed, but rather in an order which will presently be described as the normal law of error.³ The theorem governing the case would therefore seem to be not that which is applied by Laplace and Morgan Crofton, but that which has been investigated by Karl Pearson,⁴ a theorem which does not lend itself so readily to the purpose in hand.⁵
1 Above, par. 52.
2 Loc. cit. § 45.
3 See Edgeworth, "Elements of Chance in Examinations," Journ. Stat. Soc. (1890). Cf. below, par. 124.
4 Biometrika, i. 390.
5 Moore, of Columbia University, New York, has attempted to
64. Expectation of Advantage. - The general examples of expectation which have been given may be supplemented by some appropriate to that special use of the term which Laplace has sanctioned when he considers the subject of expectation as a "good"; in particular money, or that for the sake of which money is desired, "moral" advantage, in more modern phrase utility or satisfaction.
65. Pecuniary Advantage. - The most important calculations of pecuniary expectation relate to annuities and insurance; based largely on life tables from which the expectation of life itself, as well as of money value at the end, or at any period, of life is predicted. The reader is referred to these heads for practical exemplifications of the calculus. It must suffice here to point out how the calculations are facilitated by the adoption of a law of frequency, the Gompertz or the Gompertz-Makeham law, which on the one hand can hardly be ranked with hypotheses resting on a vera causa, yet on the other hand is not purely empirical, but is recommended, as germane to the subject-matter, by colourable suppositions.⁶
66. There is space here only for one or two simple examples of money as the subject of expectation. Two persons A and B throw a die alternately, A beginning, with the understanding that the one who first throws an ace is to receive a prize of £1. What are their respective expectations?⁷ The chance that the prize should be won at the first throw is 1/6, the chance that it should be won at the second throw is (5/6)(1/6); at the third throw $(\tfrac56)^2\,\tfrac16$, at the fourth throw $(\tfrac56)^3\,\tfrac16$, and so on. Accordingly the expectation of A = £1 × $\tfrac16\{1 + (\tfrac56)^2 + (\tfrac56)^4 + \dots\}$; the expectation of B = £1 × $\tfrac16\cdot\tfrac56\{1 + (\tfrac56)^2 + (\tfrac56)^4 + \dots\}$.
Thus A's expectation is to B's as 1 : 5/6. But their expectations must together amount to £1. Therefore A's expectation is 6/11 of a pound, B's 5/11.
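The two series can be summed exactly. A minimal sketch, with a hypothetical function name and with the chance of an ace and the prize left as parameters:

```python
from fractions import Fraction


def alternate_throw_expectations(p=Fraction(1, 6), prize=Fraction(1)):
    """Expectations of A (who begins) and B when the first success takes the prize."""
    q = 1 - p
    # A wins at throws 1, 3, 5, ...; B at throws 2, 4, 6, ...; both series are
    # geometric with common ratio q^2.
    a_share = p / (1 - q * q)
    b_share = q * p / (1 - q * q)
    return a_share * prize, b_share * prize


print(alternate_throw_expectations())  # (Fraction(6, 11), Fraction(5, 11))
```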
67. There are n tickets in a bag, numbered 1, 2, 3, ... n. A man draws two tickets at once, and is to receive a number of sovereigns equal to the product of the numbers drawn. What is his expectation?⁸ It is the number of pounds given by a fraction of which the denominator is the number of possible products, $\tfrac12 n(n-1)$, and the numerator is the sum of all possible products $= \tfrac12\{(1 + 2 + 3 + \dots + n)^2 - (1^2 + 2^2 + \dots + n^2)\}$. Whence the required number (of pounds) is found to be $\tfrac{1}{12}(n+1)(3n+2)$. The result may be contrasted with what it would be if the two tickets were not to be drawn at once, but the second after replacement of the first. On this supposition the expectation in respect of one of the tickets separately is $\tfrac12(n+1)$. Therefore, as the two events are now independent, the expectation of the product,⁹ being the product of the expectations, is $\{\tfrac12(n+1)\}^2$.
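Both expectations can be confirmed by enumerating the possible draws. The sketch below is illustrative; the function names are hypothetical.

```python
from fractions import Fraction
from itertools import combinations


def expected_product_without_replacement(n: int) -> Fraction:
    """Mean product of two tickets drawn at once from tickets 1..n."""
    pairs = list(combinations(range(1, n + 1), 2))
    return Fraction(sum(a * b for a, b in pairs), len(pairs))


def expected_product_with_replacement(n: int) -> Fraction:
    """Mean product when the first ticket is replaced before the second draw."""
    tickets = range(1, n + 1)
    return Fraction(sum(a * b for a in tickets for b in tickets), n * n)


for n in (2, 5, 10):
    print(n, expected_product_without_replacement(n), expected_product_with_replacement(n))
```

Drawing at once excludes the squares, which is why the first expectation falls short of the second.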
68. Peter throws three coins, Paul two. The one who obtains the greater number of heads wins £1. If the number of heads are equal, they play again, and so on, until one or other obtains a greater number of heads. What are their respective expectations? 10 At the first trial there are three alternatives: ($\alpha$) Peter obtains more heads than Paul, ($\beta$) an equal number, ($\gamma$) fewer. The cases in favour of $\alpha$ are (1) Peter obtains three heads, (2) Peter, two heads, while Paul one or none, (3) Peter one head, Paul none. The cases in favour of $\beta$ are (1) two heads for both, or (2) one head, or (3) none, for both. The remaining case favours $\gamma$. The probability of $\alpha$ is $\tfrac{1}{8}+\tfrac{3}{8}\cdot\tfrac{3}{4}+\tfrac{3}{8}\cdot\tfrac{1}{4}=\tfrac{1}{2}$. The probability of $\beta$ is $\tfrac{3}{8}\cdot\tfrac{1}{4}+\tfrac{3}{8}\cdot\tfrac{2}{4}+\tfrac{1}{8}\cdot\tfrac{1}{4}=\tfrac{5}{16}$. The probability of $\gamma$ is $1-\tfrac{1}{2}-\tfrac{5}{16}=\tfrac{3}{16}$. Alternative $\beta$ is to be split up into three, $\alpha'$, $\beta'$, $\gamma'$, of which the probabilities (when $\beta$ has occurred) are as before, $\tfrac{1}{2}$, $\tfrac{5}{16}$, $\tfrac{3}{16}$. $\beta'$ is similarly split up, and so on. Thus Peter's expectation is $\tfrac{1}{2}\{1+\tfrac{5}{16}+(\tfrac{5}{16})^2+\dots\}$ £1 $=\tfrac{8}{11}$ £1. Paul's expectation is $\tfrac{3}{11}$ £1.
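The three probabilities of par. 68 and the resulting expectations can be recomputed from the binomial distribution of heads. The sketch below assumes fair coins; the helper names are illustrative only.

```python
from fractions import Fraction
from math import comb

def heads_distribution(coins):
    """Chance of 0, 1, ..., coins heads when that many fair coins are thrown."""
    return [Fraction(comb(coins, k), 2 ** coins) for k in range(coins + 1)]

def single_round(peter_coins=3, paul_coins=2):
    """Chances that Peter has more heads, the same number, or fewer heads than Paul."""
    win = tie = lose = Fraction(0)
    for i, p_i in enumerate(heads_distribution(peter_coins)):
        for j, p_j in enumerate(heads_distribution(paul_coins)):
            if i > j:
                win += p_i * p_j
            elif i == j:
                tie += p_i * p_j
            else:
                lose += p_i * p_j
    return win, tie, lose

win, tie, lose = single_round()
# A tie merely restarts the game, so the series 1 + tie + tie^2 + ... sums to 1 / (1 - tie).
peter_expectation = win / (1 - tie)
paul_expectation = lose / (1 - tie)
print(win, tie, lose)                       # 1/2 5/16 3/16
print(peter_expectation, paul_expectation)  # 8/11 3/11
```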
An urn contains m balls marked 1, 2, 3, ... m. Paul extracts successively the m balls, under an agreement to give Peter a shilling every time that a ball comes out in its proper order. What is Peter's expectation? The expectation with respect to any one ball is $\tfrac{1}{m}$ (shilling), and therefore the expectation with respect to all is 1 (shilling). 11

trace Karl Pearson's theory in the statistics relating to the efficiency of wages (Economic Journal, Dec. 1907; and Journ. Dec. 1907).
6 Cf. below, par. 169.
7 Whitworth, Choice and Chance, question 126.
8 Whitworth, Exercises, No. 567.
9 According to the principle above enounced, par. 15.
10 Bertrand, id. § 44, prob. xlvii.
11 Bertrand, id. § 39, prob. xliii. It is not to be objected that the probabilities on which the several expectations are calculated are not independent (above, par. 16).

69. Advantage subjectively estimated. - Elaborate calculations are paradoxically employed by Laplace and other mathematicians to determine the expectation of subjective advantage in various cases of risk. The calculation is based on Daniel Bernoulli's formula, which may be written thus: if x denote a man's physical fortune, and y the corresponding moral fortune, $y = k\log(x/h)$, k, h being constants. x and y are always positive, and $x > h$; for every man must possess some fortune, or its equivalent, in order to live. To estimate now the value of a moral expectation. Suppose a person whose fortune is a to have the chance p of obtaining a sum $\alpha$, q of obtaining $\beta$, r of obtaining $\gamma$, &c., and let $p+q+r+\dots=1$, only one of the events being possible. Now his moral expectation from the first chance - that is, the increment of his moral fortune multiplied by the chance - is $pk\{\log\frac{a+\alpha}{h}-\log\frac{a}{h}\} = pk\log(a+\alpha)-pk\log a$. Hence his whole moral expectation 1 is $E = kp\log(a+\alpha)+kq\log(a+\beta)+kr\log(a+\gamma)+\dots-k\log a$; and, if Y stands for his moral fortune including this expectation, that is, $k\log(a/h)+E$, we have $Y = kp\log(a+\alpha)+kq\log(a+\beta)+\dots-k\log h$. To find X, the physical fortune corresponding to this moral one, we have $Y = k\log X-k\log h$. Hence $X = (a+\alpha)^p(a+\beta)^q(a+\gamma)^r\dots$, and $X-a$ will be the actual or physical increase of fortune which is of the same value to him as his expectation, and which he may reasonably accept in lieu of it. The mathematical value of the same expectation is 2 $p\alpha+q\beta+r\gamma+\dots$

70. Gambling and Insurance. - These formulae are employed, often with the aid of refined mathematical theorems, to demonstrate received propositions of great practical importance: that in general gambling is disadvantageous, insurance beneficial, and that in speculative operations it is better to subdivide risks - not to "have all your eggs in one basket."

71. These propositions may be deduced by the use of a formula which perhaps keeps closer to the facts: viz.
that utility or satisfaction is a function of material goods not definitely ascertainable, defined only by the conditions that the function continually increases with the increase of the variable, but at a continually decreasing rate (and some additional postulate as to the lower limit of the variable), say $y=\psi(x)$ (if x as before denotes physical fortune, and y the corresponding utility or satisfaction); where all that is known in general of $\psi$ is that $\psi'(x)$ is positive, $\psi''(x)$ is negative; and $\psi(x)$ is never less, x is always greater, than zero. Suppose a gambler whose (physical) fortune is a, to have the chance p of obtaining a sum $\alpha$ and the chance $q\,(=1-p)$ of losing the sum $\beta$. If the game is fair in the usual sense of the term, $p\alpha=q\beta$. Accordingly the prospective psychical advantage of the party is $p\psi(a+\alpha)+q\psi(a-\beta)=p\psi(a+\alpha)+q\psi\{a-(p/q)\alpha\}$, say $y_\alpha$. When $\alpha$ is zero the expression reduces to the first state of the man, $\psi(a)$, say $y_0$. To compare this state with what it becomes by the gambling transaction, let $\alpha$ receive continually small increments $\Delta\alpha$. When $\alpha$ is zero the first differential coefficient of $(y_\alpha-y_0)$, viz. $p\psi'(a)-p\psi'(a)$, $=0$. Also the second differential coefficient, viz. $p\psi''(a)+\frac{p^2}{q}\psi''(a)$, is negative, since by hypothesis $\psi''$ is continually negative. And as $\alpha$ continues to increase from zero, the second differential coefficient of $(y_\alpha-y_0)$, viz. $p\psi''(a+\alpha)+\frac{p^2}{q}\psi''(a-\frac{p}{q}\alpha)$, continues to be negative. Therefore the increments received by the first differential coefficient of $(y_\alpha-y_0)$ are continually negative; and therefore $(y_\alpha-y_0)$ is continually negative; $y_\alpha<y_0$ 3 for finite values of $\alpha$ (not exceeding $qa/p$). 4

72. To show the advantage of insurance, let us suppose with Morgan Crofton 5 that a merchant, whose fortune is represented by 1, will realize a sum $\epsilon$ if a certain vessel arrives safely. Let the probability of this be p. To make up exactly for the risk run by the insurance company, he should pay them a sum $(1-p)\epsilon$.
If he does, his moral fortune becomes, according to the formula now proposed, $\psi(1+p\epsilon)$, since his physical fortune is increased by the secured sum $\epsilon$, minus the payment $(1-p)\epsilon$; while if he does not insure it will be $p\psi(1+\epsilon)+(1-p)\psi(1)$. We have then to compare $\psi(1+p\epsilon)$, say $y_1$, with $p\psi(1+\epsilon)+(1-p)\psi(1)$, say $y_2$. By reasoning analogous to that of the preceding paragraph it appears that $(y_2-y_1)$ is zero when $\epsilon=0$ and continually diminishes as $\epsilon$ increases up to any assigned finite (admissible) value. Similarly it may be shown that it is better to expose one's fortune in a number of separate sums to risks independent of each other than to expose the whole to the same danger. Suppose a merchant, having a fortune 1, has besides a sum $\epsilon$ which he must receive if a ship arrives in safety. Then, if the chance of the ship arriving $=p$, and $q=1-p$, his prospective advantage is $p\psi(1+\epsilon)+q\psi(1)$. Now instead of exposing the lump sum $\epsilon$ to a single risk, let him subdivide $\epsilon$ into n equal parts, each exposed to an independent equal risk (q) of being lost. As n is made larger 6 it becomes more and more nearly a certainty that he will realize $p\epsilon$ out of the total $\epsilon$ exposed to risk. Therefore his condition (in respect of the sort of advantage which is under consideration) will be approximately $\psi(1+p\epsilon)$. Then we have to compare $\psi(1+p\epsilon)$, say $y_1$, with $p\psi(1+\epsilon)+q\psi(1)$, say $y_2$. By reasoning analogous to that which has been above employed - observing that $(p-p^2)\psi''(1)$ is negative for all possible values of p - we conclude that $y_2<y_1$.

1 It is important to remark that we should be wrong in thus adding the expectations if the events were not mutually exclusive. For the mathematical expectations it is not so.
2 This paragraph is taken from Morgan Crofton's article on "Probability," in the 9th edition of the Ency. Brit.
3 Cf. Marshall, Principles of Economics, Mathematical Appendix, note ix.
4 Or should we rather say, not exceeding the limit at which $\psi(a-p\alpha/q)$ becomes 0? (The value of $\psi(0)$ may be regarded as $-\infty$.) Neither of the proposed limitations materially affects the validity of the theorem.
5 Loc. cit. par. 25.
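The propositions of pars. 69-72 can be illustrated numerically with one admissible utility function. The sketch below assumes $\psi=\log$ (Bernoulli's choice, with the constants k and h dropped); any increasing function with negative second derivative would serve, and all function names are hypothetical.

```python
import math

def certainty_equivalent(a, outcomes):
    """Bernoulli's X = (a + alpha)^p (a + beta)^q ...; outcomes is [(gain, chance), ...]."""
    return math.prod((a + gain) ** chance for gain, chance in outcomes)

def fair_gamble_utility(a, alpha, p, psi=math.log):
    """p*psi(a + alpha) + q*psi(a - p*alpha/q): a fair game in which p*alpha = q*beta."""
    q = 1 - p
    return p * psi(a + alpha) + q * psi(a - p * alpha / q)

def insured_utility(eps, p, psi=math.log):
    """Fortune 1, sum eps secured by paying the fair premium (1 - p)*eps."""
    return psi(1 + p * eps)

def uninsured_utility(eps, p, psi=math.log):
    """Fortune 1, sum eps received only with chance p."""
    return p * psi(1 + eps) + (1 - p) * psi(1)

# A fair gamble lowers the prospective advantage: y_alpha < y_0.
print(fair_gamble_utility(100, 50, 0.5) < math.log(100))    # True
# Insurance at the fair premium raises it: y_2 < y_1.
print(uninsured_utility(1, 0.9) < insured_utility(1, 0.9))  # True
# An even chance of gaining 100 is worth less than its mathematical value of 50.
print(certainty_equivalent(100, [(100, 0.5), (0, 0.5)]) - 100 < 50)  # True
```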
73. The Petersburg Problem
The doctrine of "moral fortune" was first formulated by Daniel Bernoulli 7 with reference to the celebrated "Petersburg Problem," which is thus stated by Todhunter 8: "A throws a coin in the air: if head appears at the first throw he is to receive a shilling from B, if head does not appear until the second throw he is to receive 2s., if head does not appear until the third throw he is to receive 4s., and so on, required the expectation of A." So many lessons are presented by this problem that there has been room for disputing what is the lesson. Laplace and other high authorities follow Daniel Bernoulli. Poisson finds the explanation in the fact that B could not be expected to pay up so large a sum. Whitworth, who regards the disadvantage of gambling as consisting mainly in the danger of becoming "cleaned out," 9 finds this moral in the Petersburg problem. All have not noticed what some regard as the principal lesson to be obtained from the paradox: viz. that a transaction which cannot be regarded as one of a series - at least a "cross-series" 10 - is not subject to the general rule for expectations of advantage whether material or moral. 11

Section IV. - Geometrical Applications.

74. Under this head occur some interesting illustrations of principles employed in the preceding sections; in particular of a priori probabilities and of the relation between probability and expectation.
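As a numerical aside on the Petersburg problem of par. 73: each possible throw contributes half a shilling to the mathematical expectation, so the sum grows without limit, whereas Bernoulli's moral estimate stays small. The sketch below uses Bernoulli's formula $X=\prod(a+\alpha_i)^{p_i}$ from par. 69; the function names and the cut-off of 200 throws are assumptions made for illustration.

```python
import math
from fractions import Fraction

def truncated_expectation(max_throws):
    """Mathematical expectation, in shillings, if play stops after max_throws throws."""
    return sum(Fraction(1, 2 ** n) * 2 ** (n - 1) for n in range(1, max_throws + 1))

def moral_value(fortune, max_throws=200):
    """Bernoulli's physical equivalent X - a, with gain 2^(n-1) at chance 2^-n."""
    log_x = sum(
        math.log(fortune + 2 ** (n - 1)) / 2 ** n for n in range(1, max_throws + 1)
    )
    # The remaining chance (no head within max_throws throws) is counted as no gain.
    log_x += math.log(fortune) / 2 ** max_throws
    return math.exp(log_x) - fortune

print(truncated_expectation(10))   # 5
print(truncated_expectation(100))  # 50
print(moral_value(100))            # a few shillings only, for a fortune of 100
```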
75. Illustrations of a priori Probabilities
The assumption which has been made under preceding heads that the probability of certain alternatives is approximately equal appears to rest on evidence of much the same character as the assumption which is made under this head that one point in a line, plane or volume is as likely to occur as another, under certain circumstances. Thus consider the proposition: if a given area S is included within a given area A, the chance of a point P, taken at random on A, falling on S is S/A. In a great variety of circumstances such a size can be assigned to the spaces, and "taking at random" can be so defined that the proposition is more or less directly based on experience. The fact that the points of incidence are equally distributed in space is observed, or connected by inference with observation, in many cases, e.g. raindrops and molecules. There is a solid substratum of evidence for the premiss employed in the solution of problems like the following: On a chess-board, on which the side of every square is a, there is thrown a coin of diameter b ($b<a$) so as to be entirely on the board, which may be supposed to have no border. What is the probability that the coin is entirely on one square? 12 The area on which the centre of the coin can fall is $(8a-b)^2$. The portion of the area which is favourable to the event is $64(a-b)^2$. Therefore the required probability is $(a-b)^2/(a-\tfrac{1}{8}b)^2$.

76. Random Lines. - Speculative difficulties recur when we have to define a straight line taken at random in a plane; for instance, in the following problem proposed by Buffon. 13 A floor is ruled with equidistant parallel lines; a rod, shorter than the distance between each pair, being thrown at random on the floor, to find the chance of its falling on one of the lines.
The problem is usually solved as follows: Let x be the distance of the centre of the rod from the nearest line, $\theta$ the inclination of the rod to a perpendicular to the parallels, 2a the common distance of the parallels, 2c the length of rod; then, as all values of x and $\theta$ between their extreme limits are equally probable, the whole number of cases will be represented by $\int_0^a\int_{-\pi/2}^{\pi/2}dx\,d\theta=\pi a$.
6 See above, par. 25 (James Bernoulli's theorem).
7 Specimen theoriae novae de mensura sortis (1738), translated (into German) with notes by Pringsheim (1906).
8 Op. cit. art. 389.
9 Choice and Chance, pp. 211, 232. The danger of a party to a game of chance being " ruined " (by losing more than his whole fortune), which forms a separate chapter in some treatises, is readily deducible from the theory of deviations from an average which will be stated in pt. ii.
10 Above, par. 5.
11 Above, par. 20.
12 Whitworth, Exercises, No. 500.
13 Cf. Morgan Crofton, loc. cit.

Now if the rod crosses one of the lines we must have $c > x/\cos\theta$; so that the favourable cases will be measured by $\int_{-\pi/2}^{\pi/2}d\theta\int_0^{c\cos\theta}dx=2c$. Thus the probability required is $p=2c/\pi a$. It may be asked - why should we take the centre of the rod as the point whose distance from the nearest line has all its values equally probable? Why not one extremity of the line, or some other point suited to the circumstances of projection? Fortunately it makes no difference in the result to what point in the rod we assign this pre-eminence.
77. The legitimacy of the assumption obtains some verification from the success of a test suggested by Laplace. If a rod is actually thrown, as supposed in the problem, a great number of times, and the frequency with which it falls on one of the parallels is observed, the proportionate number thus found, say p, furnishes a value for the constant $\pi$. For $\pi$ ought to equal $2c/pa$. The experiment has been made by Professor Wolf of Frankfort. Having thrown a needle of length 36 mm. on a plane ruled with parallel lines at a distance from each other of 45 mm. 5000 times, he observed that the needle crossed a parallel 2532 times. Whence the value of $\pi$ is deduced 3.1596, with a probable error 1 of .05.
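Laplace's test can be reproduced from the figures quoted for Wolf's experiment, and imitated by simulation. The following is a sketch only: the function names are assumptions, and the simulation samples x and $\theta$ uniformly exactly as in the solution of par. 76.

```python
import math
import random

def pi_from_counts(rod_length, spacing, throws, crossings):
    """pi estimated as 2c/(p*a), with 2c the rod length and 2a the spacing of the lines."""
    c = rod_length / 2
    a = spacing / 2
    p = crossings / throws
    return 2 * c / (p * a)

def simulate_crossings(rod_length, spacing, throws, seed=0):
    """Count the throws in which the rod crosses a line (x < c*cos(theta))."""
    rng = random.Random(seed)
    c, a = rod_length / 2, spacing / 2
    hits = 0
    for _ in range(throws):
        x = rng.uniform(0, a)                           # centre to nearest line
        theta = rng.uniform(-math.pi / 2, math.pi / 2)  # inclination to the perpendicular
        if x < c * math.cos(theta):
            hits += 1
    return hits

print(round(pi_from_counts(36, 45, 5000, 2532), 4))  # 3.1596
```

A simulated count from `simulate_crossings(36, 45, 5000)` can be passed to `pi_from_counts` in the same way; the estimate then fluctuates from seed to seed by a few hundredths.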
78. More hesitation may be felt when we have to define a random chord of a circle, 2 for instance, with reference to the question, what is the probability that a chord taken at random will be greater than the side of an equilateral triangle? For some purposes it would no doubt be proper to assume that the chord is constructed by taking any point on the circumference and joining it to another point on the circumference, the points from which one is taken at random being distributed at equal intervals around the circumference. On this understanding the probability in question would be $\tfrac{1}{3}$. But in other connexions, for instance, if the chord is obtained by the intersection with the circle of a rod thrown in random fashion, it seems preferable to consider the chord as a case of a straight line falling at random on a plane. Morgan Crofton 3 himself gives the following definition of such a line: If an infinite number of straight lines be drawn at random in a plane, there will be as many parallel to any given direction as to any other, all directions being equally probable; also those having any given direction will be disposed with equal frequency all over the plane. Hence, if a line be determined by the co-ordinates p, $\omega$, the perpendicular on it from a fixed origin O, and the inclination of that perpendicular to a fixed axis, then, if p, $\omega$ be made to vary by equal infinitesimal increments, the series of lines so given will represent the entire series of random straight lines. Thus the number of lines for which p falls between p and $p+dp$, and $\omega$ between $\omega$ and $\omega+d\omega$, will be measured by $dp\,d\omega$, and the integral $\iint dp\,d\omega$, between any limits, measures the number of lines within those limits.
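The dependence of the answer on the definition of "random chord" can be made concrete. The sketch below compares, for a unit circle, (i) chords joining two points spread evenly round the circumference with (ii) Crofton's lines, for which the perpendicular p is the equiprobable variable. An even grid of values stands in for the equiprobable variable; the function name and step count are assumptions.

```python
import math

def chord_probabilities(steps=100_000):
    """Chance that a chord of the unit circle exceeds the side of the inscribed triangle."""
    side = math.sqrt(3)
    # (i) the second end point lies at an angle d = 2*pi*(k + 0.5)/steps from the first;
    #     the chord is then 2*sin(d/2).
    by_endpoints = sum(
        2 * math.sin(math.pi * (k + 0.5) / steps) > side for k in range(steps)
    ) / steps
    # (ii) the perpendicular p = (k + 0.5)/steps from the centre is evenly spread on (0, 1);
    #      the chord is then 2*sqrt(1 - p^2).
    by_distance = sum(
        2 * math.sqrt(1 - ((k + 0.5) / steps) ** 2) > side for k in range(steps)
    ) / steps
    return by_endpoints, by_distance

by_endpoints, by_distance = chord_probabilities()
print(round(by_endpoints, 3), round(by_distance, 3))  # 0.333 0.5
```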
79. Authoritative and useful as this definition is, it is not entirely free from difficulty. It amounts to this, that if we write the equation of the random line $x\cos\alpha+y\sin\alpha-p=0$, we ought to take $\alpha$ and p as those variables, of which the equicrescent values are equally probable - the equiprobable variables, as we may say. But might we not also write the equation in either of the following forms, (1) $x/a+y/b-1=0$, (2) $ax+by-1=0$, and take a and b in either system as the equiprobable variables? To be sure, if the equal distribution of probabilities is extended to infinity we shall be landed in the absurdity that of the random lines passing through any point on the axis of y a proportion differing infinitesimally from unity - 100% - are either (1) parallel or (2) perpendicular to the axis of x. But the admission of infinite values will render any scheme for the equal distribution of probabilities absurd. If Professor Crofton's constant p, for example, becomes infinite, the origin being thus placed at an infinite distance, all the random chords intersecting a finite circle would be parallel!
80. However this may be, Professor Crofton's conception has the distinction of leading to a series of interesting propositions, of which specimens are here subjoined. 4 The number of random lines which meet any closed convex contour of length L is measured by L. For, taking O inside the contour, and integrating first for p, from 0 to p, the perpendicular on the tangent to the contour, we have $\int p\,d\omega$; taking this through four right angles for $\omega$, we have by Legendre's theorem on rectification, N being the measure of the number of lines, $N=\int_0^{2\pi}p\,d\omega=L$. 5 Thus, if a random line meet a given contour, of length L, the chance of its meeting another convex contour, of length l, internal to the former is $p=l/L$. If the given contour be not convex, or not closed, N will evidently be the length of an endless string, drawn tight around the contour.

1 As recorded by Czuber, Geometrische Wahrscheinlichkeiten, p. 90.
2 Cf. Bertrand, Calcul des probabilités, pp. 4 seq. The matter has been much discussed in the Educational Times. See Mathematical Questions ... from the Educational Times [a reprint], xxix. 17-20, containing references to earlier discussions, e.g. x. 33 (by Woolhouse).
3 Loc. cit. § 75.
4 The whole of p. 787 of Morgan Crofton's article is often referred to, and parts of pp. 786, 788 are transferred here.
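The measure L of par. 80 can be checked numerically for a polygon. The sketch below assumes an equivalent parametrization (signed p, with $\omega$ ranging over two right angles instead of four), under which the lines of direction $\omega$ that meet the figure fill an interval of p equal to its breadth in that direction; the function name and step count are illustrative.

```python
import math

def lines_meeting_polygon(vertices, steps=3600):
    """Measure (integral of dp d-omega) of the lines meeting a convex polygon."""
    total = 0.0
    for k in range(steps):
        w = math.pi * (k + 0.5) / steps  # direction of the perpendicular, 0..pi
        proj = [x * math.cos(w) + y * math.sin(w) for x, y in vertices]
        # lines with this direction meet the polygon for p between min(proj) and max(proj)
        total += (max(proj) - min(proj)) * math.pi / steps
    return total

unit_square = [(0.5, 0.5), (-0.5, 0.5), (-0.5, -0.5), (0.5, -0.5)]
measure = lines_meeting_polygon(unit_square)
print(round(measure, 3))  # 4.0, the perimeter of the square

# Chance that a random line meeting a circle of radius 1 also meets the square inside it:
print(round(measure / (2 * math.pi), 4))  # l/L = 4/(2*pi), about 0.6366
```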
81. If a random line meet a closed convex contour of length L, the chance of it meeting another such contour, external to the former, is $p=(X-Y)/L$, where X is the length of an endless band enveloping both contours, and crossing between them, and Y that of a band also enveloping both, but not crossing. This may be shown by means of Legendre's integral above; or as follows: Call, for shortness, N(A) the number of lines meeting an area A; N(A, A') the number which meet both A and A'; then (fig. 1) N(SROQPH) + N(S'Q'OR'P'H') = N(SROQPH + S'Q'OR'P'H') + N(SROQPH, S'Q'OR'P'H'), since in the first member each line meeting both areas is counted twice. But the number of lines meeting the non-convex figure consisting of OQPHSR and OQ'S'H'P'R' is equal to the band Y, and the number meeting both these areas is identical with that of those meeting the given areas $\Omega$, $\Omega'$; hence $X=Y+N(\Omega,\Omega')$. Thus the number meeting both the given areas is measured by $X-Y$. Hence the theorem follows.
82. Two random chords cross a given convex boundary, of length L, and area $\Omega$; to find the chance that their intersection falls inside the boundary.
Consider the first chord in any position; let C be its length; considering it as a closed area, the chance of the second chord meeting it is $2C/L$; and the whole chance of its co-ordinates falling in $dp$, $d\omega$ and of the second chord meeting it in that position is $\frac{2C}{L}\cdot\frac{dp\,d\omega}{\iint dp\,d\omega}=\frac{2}{L^2}\,C\,dp\,d\omega$.
But the whole chance is the sum of these chances for all its positions; $\therefore$ prob. $=2L^{-2}\iint C\,dp\,d\omega$.
Now, for a given value of $\omega$, the value of $\int C\,dp$ is evidently the area $\Omega$; then, taking $\omega$ from $\pi$ to 0, we have required probability $=2\pi\Omega L^{-2}$.
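For a circle of unit radius, $\Omega=\pi$ and $L=2\pi$, so the formula of par. 82 gives $2\pi\Omega L^{-2}=\tfrac{1}{2}$. The simulation below is a sketch under the assumption that each chord is drawn with p equally distributed on (0, 1) and $\omega$ on (0, 2$\pi$); the function name, trial count and seed are illustrative.

```python
import math
import random

def chords_cross_inside_circle(trials=20000, seed=0):
    """Fraction of pairs of random chords of the unit circle that intersect inside it."""
    rng = random.Random(seed)
    inside = 0
    for _ in range(trials):
        p1, w1 = rng.uniform(0, 1), rng.uniform(0, 2 * math.pi)
        p2, w2 = rng.uniform(0, 1), rng.uniform(0, 2 * math.pi)
        # Each line is x*cos(w) + y*sin(w) = p; solve the pair for the crossing point.
        det = math.sin(w2 - w1)
        if abs(det) < 1e-12:
            continue  # parallel lines never cross
        x = (p1 * math.sin(w2) - p2 * math.sin(w1)) / det
        y = (p2 * math.cos(w1) - p1 * math.cos(w2)) / det
        if x * x + y * y < 1:
            inside += 1
    return inside / trials

estimate = chords_cross_inside_circle()
print(estimate)  # close to 0.5
```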
The mean value of a chord drawn at random across the boundary is $M=\dfrac{\iint C\,dp\,d\omega}{\iint dp\,d\omega}=\dfrac{\pi\Omega}{L}$.

83. A straight band of breadth c being traced on a floor, and a circle of radius r thrown on it at random; to find the mean area of the band which is covered by the circle. (The cases are omitted where the circle falls outside the band.) 6 If S be the space covered, the chance of a random point on the circle falling on the band is $p=M(S)/\pi r^2$; this is the same as if the circle were fixed, and the band thrown on it at random. Now let A (fig. 2) be a position of the random point; the favourable cases are when HK, the bisector of the band, meets a circle, centre A, radius $\tfrac{1}{2}c$; and the whole number are when HK meets a circle, centre O, radius $r+\tfrac{1}{2}c$; hence the probability is $p=\dfrac{2\pi\cdot\tfrac{1}{2}c}{2\pi(r+\tfrac{1}{2}c)}=\dfrac{c}{2r+c}$. This is constant for all positions of A; hence, equating these two values of p, the mean value required is $M(S)=c(2r+c)^{-1}\pi r^2$.

FIG. 1.

5 This result also follows by considering that, if an infinite plane be covered by an infinity of lines drawn at random, it is evident that the number of these which meet a given finite straight line is proportional to its length, and is the same whatever be its position. Hence, if we take l the length of the line as the measure of this number, the number of random lines which cut any element ds of the contour is measured by ds, and the number which meet the contour is therefore measured by $\tfrac{1}{2}L$, half the length of the boundary. If we take 2l as the measure for the line, the measure for the contour will be L, as above. Of course we have to remember that each line must meet the contour twice. It would be possible to rectify any closed curve by means of this principle. Suppose it traced on the surface of a circular disk, of circumference L, and the disk thrown a great number of times on a system of parallel lines, whose distance asunder equals the diameter, if we count the number of cases in which the closed curve meets one of the parallels, the ratio of this number to the whole number of trials will be ultimately the ratio of the circumference of the curve to that of the circle. [Morgan Crofton's note.]
6 Or the floor may be supposed painted with parallel bands, at a distance asunder equal to the diameter; so that the circle must fall on one.
The mean value of the portion of the circumference which falls on the band is the same fraction c/(2r+c) of the whole circumference.
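The mean covered area $M(S)=c(2r+c)^{-1}\pi r^2$ of par. 83 can be confirmed by averaging the overlap directly. The sketch assumes the distance d of the circle's centre from the bisector of the band is equally distributed on (0, $r+\tfrac{1}{2}c$), the range over which the circle touches the band; the helper names are hypothetical.

```python
import math

def area_below(t, r):
    """Area of the part of a circle of radius r below a line at height t above its centre."""
    t = max(-r, min(r, t))
    return r * r * (math.pi / 2 + math.asin(t / r)) + t * math.sqrt(r * r - t * t)

def mean_covered_area(r, c, steps=20000):
    """Mean area of the band of breadth c covered by a circle of radius r."""
    reach = r + c / 2
    total = 0.0
    for k in range(steps):
        d = reach * (k + 0.5) / steps  # centre of the circle above the bisector of the band
        # the band occupies heights -c/2 - d .. c/2 - d relative to the centre
        total += area_below(c / 2 - d, r) - area_below(-c / 2 - d, r)
    return total / steps

r, c = 1.0, 0.5
print(round(mean_covered_area(r, c), 4))                # about 0.6283
print(round(c * math.pi * r * r / (2 * r + c), 4))      # 0.6283, the formula of the text
```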
If any convex area whose surface is $\Omega$ and circumference L be thrown on the band, instead of a circle, the mean area covered is $M(S)=\pi c\,(L+\pi c)^{-1}\,\Omega$.
For as before, fixing the random point at A, the chance of a random point in $\Omega$ falling on the band is $p=2\pi\cdot\tfrac{1}{2}c/L'$, where L' is the perimeter of a parallel curve to L, at a normal distance $\tfrac{1}{2}c$ from it. Now $L'=L+2\pi\cdot\tfrac{1}{2}c$, $\therefore\ \dfrac{M(S)}{\Omega}=\dfrac{\pi c}{L+\pi c}$.

84. Buffon's problem may be easily deduced in a similar manner. Thus, if 2r = length of line, a = distance between the parallels, and we conceive a circle (fig. 3) of diameter a with its centre at the middle O of the line, 1 rigidly attached to the latter, and thrown with it on the parallels, this circle must meet one of the parallels; if it be thrown an infinite number of times we shall thus have an infinite number of chords crossing it at random. Their number is measured by $2\pi\cdot\tfrac{1}{2}a=\pi a$, and the number which meet 2r is measured by 4r. Hence the chance that the line meets one of the parallels is $p=4r/\pi a$.
85. To investigate the probability that the inclination of the line joining any two points in a given convex area $\Omega$ shall lie within given limits. We give here a method of reducing this question to calculation, for the sake of an integral to which it leads, and which is not easy to deduce otherwise.
First let one of the points A (fig. 4) be fixed; draw through it a chord PQ = C, at an inclination $\theta$ to some fixed line; put AP = r, AQ = r'; then the number of cases for which the direction of the line joining A and B lies between $\theta$ and $\theta+d\theta$ is measured by $\tfrac{1}{2}(r^2+r'^2)\,d\theta$. Now let A range over the space between PQ and a parallel chord distant $dp$ from it, the number of cases for which A lies in this space and the direction of AB from $\theta$ to $\theta+d\theta$ is (first considering A to lie in the element $dr\,dp$) $\tfrac{1}{2}\,dp\,d\theta\int_0^C(r^2+r'^2)\,dr=\tfrac{1}{3}C^3\,dp\,d\theta$.
Let p be the perpendicular on C from a given origin O, and let $\omega$ be the inclination of p (we may put $d\omega$ for $d\theta$), C will be a given function of p, $\omega$; and, integrating first for $\omega$ constant, the whole number of cases for which $\omega$ falls between given limits $\omega'$, $\omega''$ is $\tfrac{1}{3}\int_{\omega'}^{\omega''}d\omega\int C^3\,dp$; the integral $\int C^3\,dp$ being taken for all positions of C between two tangents to the boundary parallel to PQ. The question is thus reduced to the evaluation of this double integral, which, of course, is generally difficult enough; we may, however, deduce from it a remarkable result; for, if the integral $\tfrac{1}{3}\iint C^3\,dp\,d\omega$ be extended to all possible positions of C, it gives the whole number of pairs of positions of the points A, B which lie inside the area; but this number is $\Omega^2$; hence $\iint C^3\,dp\,d\omega=3\Omega^2$, the integration extending to all possible positions of the chord C, - its length being a given function of its co-ordinates p, $\omega$. 2
Hence if L, $\Omega$ be the perimeter and area of any closed convex contour, the mean value of the cube of a chord drawn across it at random is $3\Omega^2/L$.

FIG. 4.

1 The line might be anywhere within the circle without altering the question.
2 This integral was given by Morgan Crofton in the Comptes rendus (1869), p. 1469. An analytical proof was given by Serret, Annales scient. de l'école normale (1869), p. 177.
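Both mean values, $\pi\Omega/L$ for the chord (par. 82) and $3\Omega^2/L$ for its cube (par. 85), are easily checked on a circle, where the chord at perpendicular distance p is $2\sqrt{R^2-p^2}$ and p is equally distributed on (0, R). The sketch below uses a midpoint sum; the function name and step count are assumptions.

```python
import math

def chord_moment_circle(radius, power, steps=100_000):
    """Mean of C**power over random chords of a circle, p evenly spread on (0, radius)."""
    total = 0.0
    for k in range(steps):
        p = radius * (k + 0.5) / steps
        total += (2 * math.sqrt(radius ** 2 - p ** 2)) ** power
    return total / steps

R = 2.0
area, perimeter = math.pi * R ** 2, 2 * math.pi * R
print(round(chord_moment_circle(R, 1), 3), round(math.pi * area / perimeter, 3))
print(round(chord_moment_circle(R, 3), 3), round(3 * area ** 2 / perimeter, 3))
```

Each printed pair should agree to the displayed precision: the first value is the numerical mean, the second the closed form of the text.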
86. Let there be any two convex boundaries (fig. 5) so related that a tangent at any point V to the inner cuts off a constant segment S from the outer (e.g. two concentric similar ellipses); let the annular area between them be called A; from a point X taken at random on this annulus draw tangents XA, XB to the inner. The mean value of the arc AB, $M(AB)=LS/A$, L being the whole length of the inner curve ABV.

FIG. 5.
The following lemma will first be proved: If there be any convex arc AB (fig. 6), and if $N_1$ be (the measure of) the number of random lines which meet it once, $N_2$ the number which meet it twice, 2 arc AB $=N_1+2N_2$. For draw the chord AB; the number of lines meeting the convex figure so formed is $N_1+N_2$ = arc + chord (the perimeter); but $N_1$ = number of lines meeting the chord = 2 chord; $\therefore$ 2 arc $+\,N_1=2N_1+2N_2$, $\therefore$ 2 arc $=N_1+2N_2$.

FIG. 6.
Now fix the point X, in fig. 5, and draw XA, XB. If a random line cross the boundary L, and $p_1$ be the probability that it meets the arc AB once, $p_2$ that it does so twice, $2AB/L=p_1+2p_2$; and if the point X range all over the annulus, and $p_1$, $p_2$ are the same probabilities for all positions of X, $2M(AB)/L=p_1+2p_2$.
Let now IK (fig. 7) be any position of the random line; drawing tangents at I, K, it is easy to see that it will cut the arc AB twice when X is in the space marked $\alpha$, and once when X is in either space marked $\beta$; hence, for this position of the line, $p_1+2p_2=2(\alpha+\beta)/A=2S/A$, which is constant; hence $M(AB)/L=S/A$.

FIG. 7.
Hence the mean value of the arc is the same fraction of the perimeter that the constant area S is of the annulus.
If L be not related as above to the outer boundary, $M(AB)/L=M(S)/A$, $M(S)$ being the mean area of the segment cut off by a tangent at a random point on the perimeter L.
The above result may be expressed as an integral. If s be the arc AB included by tangents from any point (x, y) on the annulus, $\iint s\,dx\,dy=LS$.
It has been shown (Phil. Trans., 1868, p. 191) that, if $\theta$ be the angle between the tangents XA, XB, $\iint\theta\,dx\,dy=\pi(A-2S)$. The mean value of the tangent XA or XB may be shown to be $M(XA)=SP/2A$, where P = perimeter of locus of centre of gravity of the segment S.
87. When we go on to space of three dimensions further speculative difficulties occur. How is a random line through a given point to be defined? Since it is usual to define a vector by two angles (viz. $\phi$ the angle made with the axis X by a vector r in the plane XY, and $\theta$ (or $\tfrac{1}{2}\pi-\theta$) the angle made by the vector $\rho$ with r in the plane containing both $\rho$ and r and the axis Z) it seems natural to treat the angles $\phi$ and $\theta$ as the equiprobable variables. In other words, if we take at random any meridian on the celestial globe and combine it with any right ascension the vector joining the centre to the point thus assigned is a random line. 3 It is possible that for some purposes this conception may be appropriate. For many purposes surely it is proper to assume a more symmetrical distribution of the terminal points on the surface of a sphere, a distribution such that each element of the surface shall contain an approximately equal number of points. Such an assumption is usually made in the kinetic theory of molecules with respect to the direction of the line joining the centres of two colliding spheres in a "molecular chaos." 4 It is safe to say with Czuber, "No discussion can remove indeterminateness." Let us hope with him that "though this branch of probability can for the present claim only a theoretic interest, in the future it will perhaps also lead to practical results." 5

88. Illustrations of probability and expectation. - The close relation between probability and expectation is well illustrated by geometrical examples. As above stated, when a given space S is included within a given space A, if p is the probability that a point
3 Cf. Bertrand, op. cit. § 135.
4 See e.g. Watson, Kinetic Theory of Gases, p. 2; Tait, Trans. Roy. Soc., Edin. (r888), xxxiii. 68.
5 Wahrscheinlichkeitstheorie, p. 64.