Review¹ of Liber De Ludo Aleae (Book on Games of Chance) by Gerolamo Cardano

1. Biographical Notes Concerning Cardano

Gerolamo Cardano (also referred to in the literature as Jerome Cardan) was born in Pavia, in present-day Italy, in 1501 and died in Rome in 1576. Educated at the universities of Pavia and Padua, Cardano practised as a medical doctor from 1524 to 1550 in the village of Sacco and in Milan. During this period he appears to have studied mathematics and other sciences. He published several works on medicine and in 1545 published a text on algebra, the Ars Magna. Among his books is the Liber de Ludo Aleae (Book on Games of Chance), written sometime in the mid 1500s, although unpublished until 1663.

2. Review of De Ludo Aleae

Cardano’s text was originally published in Latin, in 1663. An English translation by Sydney Henry Gould is provided in Professor Oystein Ore’s book Cardano, The Gambling Scholar (Princeton University Press, 1953). Professor Ore’s book provides biographical information relating to Cardano, as well as commentary on Cardano’s presentation of a probability theory relating to dice and card games.

In De Ludo Aleae Cardano provides both advice and a theoretical consideration of outcomes in dice and card games. The text (as published) is composed of 32 short chapters. The present review is concerned principally with chapters 9 to 15, illustrating aspects of the theory concerning dice.

The first eight chapters provide a brief commentary on games and gambling, offering advice to players, and suggesting both the dangers and benefits in playing. It may be of interest to quote some of Cardano’s comments regarding the playing of games of chance:

“...in times of great anxiety and grief, it is considered to be not only allowable, but even beneficial.”

“...in times of great fear or sorrow, when even the greatest minds are much disturbed, gambling is far more efficacious in counteracting anxiety than a game like chess, since there is the continual expectation of what fortune will bring.”

“In my own case, when it seemed to me after a long illness that death was close at hand, I found no little solace in playing constantly at dice.”

“However, there must be moderation in the amount of money involved; otherwise, it is certain that no one should ever play.”

“...the losses incurred include lessening of reputation, especially if one has formerly enjoyed any considerable prestige; to this is added loss of time...neglect of one’s own business, the danger it may become a settled habit, the time spent in planning after the game how one may recuperate, and in remembering how badly one has played.”

¹ Submitted for STA 4000H under the direction of Professor Jeffrey Rosenthal.

Cardano especially warns that lawyers, doctors and those in like professions should avoid gambling, which could be injurious to their reputations and business. Interestingly, he adds:

“Men of these professions incur the same judgement if they wish to practice music.”

In Chapter 6 Cardano presents what he refers to as the Fundamental Principle of Gambling:

“The most fundamental principle of all in gambling is simply equal conditions...of money, of situation...and of the dice itself. To the extent to which you depart from that equality, if it is in your opponent’s favour, you are a fool, and if in your own, you are unjust.”

What is most important for our purposes is to recognise that Cardano’s fundamental principle states that games of chance can only be fairly played when there are equiprobable outcomes. This principle is the basis for his theory relating to outcomes in games of dice.

Cardano begins to present (the results of) his theory on dice in Chapter 9: On the Cast of One Die. Given that a die has six points, he states:

“...in six casts each point should turn up once; but since some will be repeated, it follows that others will not turn up.”

We see here that his principle is at work (the symmetry of the die allows equiprobable outcomes), and also that he recognises (confirmed by experience no doubt) that the principle is an ideal, and that in practice we will not have each point turn up once in every six casts. There would appear to be an implicit understanding of a “long range relative frequency” interpretation of “in six casts each point should turn up once”. Or, in the contemporary language of probability theory, we would say that in six casts the expected number of times each point turns up is one.

In this chapter, the concepts referred to as “circuit” and “equality” are introduced:

“One-half of the total number of faces always represents equality; thus the chances are equal that a given point will turn up in three throws, for the total circuit is completed in six, or again that one of three given points will turn up in one throw. For example, I can as easily throw one, three, or five as two, four, or six.”

The “circuit” refers to the number of possible (elementary) outcomes, what in contemporary probability theory may be referred to as “the size of the sample space”. “Equality” appears to be a concept related to expectation. Since a given point on a die is expected to turn up once in six throws (the circuit), it could equally turn up in the first or second three casts. Cardano also provides a variation on this interpretation, indicating that in one throw, three given points (1,3,5) could turn up as easily as the three other points (2,4,6). Equality then can be understood as defined, that is, one-half of the circuit, or as (in contemporary terms) an event which is as likely as its complementary event (that is, an event with probability one-half).

Professor Ore suggests that the concept of equality is a consequence of Cardano having “the practical game in mind”:

“...he seems to assume that usually there are only two [players]...each will stake the same amount A so that the whole pot is P = 2A. When a player considers how much he has won or lost it is natural to relate it not to the whole pot 2A but to his own stake A. In terms of such a measure his expectation becomes

E = pP = 2pA = p_e A

[where p refers to the proportion of favourable outcomes for one player and p_e = 2p is called the equality proportion by Professor Ore]

so that the equality proportion or the double probability becomes the natural factor measuring loss or gain. …In a fair game…the number of favourable and unfavourable cases must be the same and each player has the same probability [1/2]. …This means that each player has equality in his favourable cases, so that the corresponding equality proportions are [1]. And Cardano expresses this simply by saying that ‘there is equality’.”

In Chapter 11 Cardano discusses the case of two dice. He enumerates the various possible throws:

“…there are six throws with like faces, and fifteen combinations with unlike faces, which when doubled gives thirty, so that there are thirty-six throws in all, and half of these possible results is eighteen.”

It is somewhat interesting that Cardano does not at this stage provide an illustration or table to aid his explanation. If the outcomes of the cast of two dice are represented by ordered pairs, the throws with like faces are (1,1), (2,2)…(6,6), six in all, and those with unlike faces are: (1,2), (1,3), (1,4), (1,5), (1,6), (2,3), (2,4), (2,5), (2,6), (3,4), (3,5), (3,6), (4,5), (4,6), (5,6), fifteen in number, and finally the same pairs in reverse order, (2,1)…(6,5), another fifteen. In total there are, as Cardano states, 36 possible outcomes. His use of a text-only description surely would have made it more difficult for the reader to appreciate his reasoning, unless the reader was well versed in the subject matter. A lay reader, for example, would likely ask why the unlike face combinations would have to be doubled. As it turns out, this manner of explanation is typical in Cardano’s text, although he does provide some illustrations. For this reason it would seem a reasonable conjecture that the work is intended for those persons familiar with gambling.
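Cardano's count for two dice can be checked by brute-force enumeration. The following is a minimal sketch, assuming fair six-sided dice and representing each throw as an ordered pair; the variable names are illustrative and do not come from Cardano or Ore.

```python
from itertools import product

# All ordered outcomes of casting two six-sided dice (Cardano's "circuit").
outcomes = list(product(range(1, 7), repeat=2))

# Throws with like faces: (1,1), (2,2), ..., (6,6).
like = [(a, b) for a, b in outcomes if a == b]

# Unlike faces counted once each, smaller face first: (1,2), ..., (5,6).
unlike_unordered = [(a, b) for a, b in outcomes if a < b]

print(len(like))                              # throws with like faces
print(len(unlike_unordered))                  # combinations with unlike faces
print(len(like) + 2 * len(unlike_unordered))  # doubling the unlike ones gives the circuit
print(len(outcomes) // 2)                     # half the circuit, Cardano's "equality"
```

The doubling that a lay reader might question appears here as the factor 2: each unlike combination such as (1,2) corresponds to two distinct ordered throws, (1,2) and (2,1).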

In this chapter, a result is given, comparing how likely, relative to equality, it is to get at least one die with one point (an ace) in each of two casts of two dice:

“The number of throws containing at least one ace is eleven out of the circuit of thirty-six; or somewhat more than half of equality; and in two casts of two dice the number of ways of getting at least one ace twice is more than 1/6 but less than 1/4 of equality.”

Cardano does not describe how he derived this result; however, the following reasoning seems quite possible. The number of ways of getting at least one ace in one cast of two dice is 11: (1,1), (1,2), (1,3), (1,4), (1,5), (1,6), (2,1), (3,1), (4,1), (5,1) and (6,1). As 11 is less than 12, there are fewer than 12 times 12 or 144 ways of getting at least one ace in both casts of the dice. In two casts of two dice, there are 36 times 36 or 1,296 possible outcomes (the circuit in this case). Equality is defined as half of 1,296, which is 648. 144 divided by 648 is less than 1/4. For the lower bound, the exact count is 11 times 11 or 121 ways of getting at least one ace in both casts, and 121 divided by 648 is more than 1/6 (which corresponds to 108 out of 648), so Cardano’s result is confirmed. (A cruder bound of 10 times 10 or 100 would not suffice, since 100 divided by 648 is slightly less than 1/6.) Note that the statement “more than 1/6 but less than 1/4 of equality” in terms of modern probability would be “more than 1/6 times 1/2 or 1/12 but less than 1/4 times 1/2 or 1/8”, that is, the probability of the event is between 1/12 and 1/8. Why wasn’t Cardano more exact? If the above reasoning was followed, he would have known that there were 121 possible ways in which the aces could turn up. Possibly, for his purposes, the precise fraction was unnecessary. It may have been sufficient to explain that a game in which a player wagers on the occurrence of at least one ace in each of two casts of two dice was not a fair game. The player could expect only between 1/6 and 1/4 of their own stake. Also, interestingly, the use of upper and lower bounds suggests some variation in practice, although we cannot say for sure that this is what Cardano wished to express. It may have been seen as a more aesthetic way of describing the proportion. Clearly, however, concepts and problems familiar to modern students of probability are being considered.
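The bounds Cardano states can be verified with exact fractions. This is a sketch only, assuming fair dice and independent casts; it does not claim to reproduce Cardano's own method, which he does not describe.

```python
from fractions import Fraction
from itertools import product

# One cast of two dice: the circuit of 36 ordered throws.
circuit_one_cast = list(product(range(1, 7), repeat=2))

# Throws containing at least one ace.
ways_one_cast = sum(1 for throw in circuit_one_cast if 1 in throw)

# Two independent casts: favourable ways and circuit both multiply.
ways_two_casts = ways_one_cast ** 2
circuit_two_casts = len(circuit_one_cast) ** 2
equality = circuit_two_casts // 2  # half the circuit

ratio_to_equality = Fraction(ways_two_casts, equality)
probability = Fraction(ways_two_casts, circuit_two_casts)

print(ways_one_cast, ways_two_casts, circuit_two_casts, equality)
print(Fraction(1, 6) < ratio_to_equality < Fraction(1, 4))  # Cardano's bounds
print(Fraction(1, 12) < probability < Fraction(1, 8))       # same bounds as a probability
```

Because the ratio to equality is simply twice the probability, the two comparisons printed at the end always agree.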

In Chapter 12, the casting of three dice is considered. Again the possible outcomes are enumerated, the total (circuit) being 216. While the wording is somewhat vague, Cardano appears to make an error in reporting the number of outcomes showing a given point (such as an ace) at least once:

“…out of the 216 possible results, each single face will be found in 108 and will not be found in as many.”

According to Professor Ore, his reasoning seems to be that for one cast of a die, 1/6 of a given point can be expected to turn up. In 3 throws, 3 times 1/6 or 1/2 of the point will occur. Of 216 possible outcomes, 108 would be favourable. However, if Cardano realised (as appears to be the case) that there were 11 possible outcomes showing a given point in one throw of two dice, would he make such an error? Was it an approximation only? Cardano does indicate in subsequent chapters that there are 91 favourable outcomes, which is correct.
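The corrected figure of 91 can be confirmed by enumeration. A minimal sketch, assuming three fair dice recorded as ordered triples:

```python
from itertools import product

# The circuit for one cast of three dice: all ordered triples.
circuit = list(product(range(1, 7), repeat=3))

# Throws in which a given point (here the ace) appears at least once.
with_ace = [throw for throw in circuit if 1 in throw]
without_ace = len(circuit) - len(with_ace)

print(len(circuit))    # size of the circuit
print(len(with_ace))   # favourable throws
print(without_ace)     # unfavourable throws, which equals 5 ** 3
```

The count of 108 would follow only if the three dice could never repeat the given point; throws such as (1,1,4) are counted once here, not twice, which is why the true figure is smaller.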

Chapters 13 and 14 concern outcomes of the sum of points. The theory is an extension of the earlier results, and Cardano observes:

“In the case of two dice, the points 12 and 11 can be obtained respectively as (6,6) and (6,5). The point 10 consists of (5,5) and of (6,4) but the latter can occur in two ways, so that the whole number of ways of obtaining 10 will be 1/12 of the circuit and 1/6 of equality.”

Note that Cardano uses ordered pairs to illustrate the outcomes (presuming that the translated text does not use this for simplification). Also note that the expression 1/12 of the circuit is what we would refer to as a probability of 1/12.
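Cardano's two ways of expressing the chance of a sum of 10 can be reproduced directly. This is an illustrative sketch assuming two fair dice; the variable names are not from the source.

```python
from fractions import Fraction
from itertools import product

circuit = list(product(range(1, 7), repeat=2))

# Ordered throws whose points sum to 10: (4,6), (5,5), (6,4).
tens = [throw for throw in circuit if sum(throw) == 10]

share_of_circuit = Fraction(len(tens), len(circuit))        # the modern probability
share_of_equality = Fraction(len(tens), len(circuit) // 2)  # relative to half the circuit

print(tens)
print(share_of_circuit, share_of_equality)
```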

In Chapter 14 the use of the term “odds” is found, as we would apply it today:

“If therefore, someone should say, ‘I want an ace, a deuce, or a trey,’ you know that there are 27 favourable throws, and since the circuit is 36, the rest of the throws in which these points will not turn up will be 9; the odds will therefore be 3 to 1.”
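The odds in this passage follow from counting favourable and unfavourable throws of two dice. A short sketch, assuming fair dice and reading "favourable" as at least one of the two dice showing an ace, deuce, or trey:

```python
from itertools import product
from math import gcd

circuit = list(product(range(1, 7), repeat=2))
wanted = {1, 2, 3}  # ace, deuce, or trey

# A throw is favourable if at least one die shows a wanted point.
favourable = sum(1 for throw in circuit if wanted & set(throw))
unfavourable = len(circuit) - favourable

# Reduce favourable : unfavourable to lowest terms.
divisor = gcd(favourable, unfavourable)
odds = (favourable // divisor, unfavourable // divisor)

print(favourable, unfavourable, odds)
```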

An incorrect computation of odds is found in the same chapter:

“If it is necessary for someone that he should throw at least twice, then you know that the throws favourable for it are 91 in number, and the remainder is 125; so we multiply each of these numbers by itself and get to 8,281 and 15,625, and the odds are about 2 to 1.”

Cardano realises that an error has been made, and discusses this in chapter 15:

“This reasoning seems to be false... for example, the chance of getting one of any three chosen faces in one cast of one dice is equal to the chance of getting one of the other three, but according to this reasoning there would be an even chance of getting a chosen face each time in two casts, and thus in three, and four, which is most absurd. For if a player with two dice can with equal chances throw an even and an odd number, it does not follow that he can with equal fortune throw an even number in each of three successive casts.”
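The error Cardano identifies can be made concrete by comparing the squared odds with a modern calculation. The sketch below assumes the Chapter 14 passage concerns succeeding in each of two independent casts of three dice, with 91 favourable and 125 unfavourable throws per cast; that reading is an interpretation, not something the quoted text states outright.

```python
from fractions import Fraction

favourable, unfavourable = 91, 125
circuit = favourable + unfavourable  # 216 throws of three dice

# Cardano's Chapter 14 method: square both sides of the odds.
naive_for = favourable ** 2
naive_against = unfavourable ** 2

# Modern method: square the favourable count, but take the complement
# within the squared circuit for the unfavourable count.
correct_for = favourable ** 2
correct_against = circuit ** 2 - correct_for

print(naive_for, naive_against, float(Fraction(naive_against, naive_for)))
print(correct_for, correct_against, float(Fraction(correct_against, correct_for)))

# Cardano's own counterexample: with an even chance per cast, squaring
# the odds 1:1 keeps them at 1:1, yet the chance of succeeding every
# time in n casts shrinks as (1/2) ** n.
for n in (1, 2, 3):
    print(n, Fraction(1, 2) ** n)
```

Under this reading the squared odds understate the odds against the player by a wide margin, which is the absurdity Cardano points to in Chapter 15.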


It is interesting that after having made this observation, Cardano did not correct the earlier text. Professor Ore notes that this is typical of Cardano’s presentations. The passage is also interesting for its use of the words “chance” and “fortune” relating to the possible outcomes of throws. In the following paragraph the word “probability” is used:

“In comparison where the probability is one half, as of even faces with odd, we shall multiply the number of casts by itself and subtract one from the product, and the proportion which the remainder bears to unity will be the proportion of the wagers to be staked. Thus, in 2 successive casts we shall multiply 2 by itself, which will be 4; we shall subtract 1; the remainder is 3; therefore the player will rightly wager 3 against 1...”
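The 3 to 1 wager for two casts agrees with the modern calculation of fair odds. The sketch below computes the odds against succeeding in each of n independent casts when each cast succeeds with probability one half; the function name `fair_odds_against` is hypothetical. The last column shows the literal reading of the quoted rule (the number of casts multiplied by itself, less one) purely for comparison; whether that reading reflects Cardano's intent for more than two casts is not settled by the passage quoted here.

```python
from fractions import Fraction


def fair_odds_against(n, p=Fraction(1, 2)):
    """Odds against succeeding in every one of n independent casts."""
    success = p ** n
    return (1 - success) / success


# Columns: casts, modern fair odds against (to 1), literal n * n - 1.
for n in (1, 2, 3, 4):
    print(n, fair_odds_against(n), n * n - 1)
```

For two casts both columns give 3, matching the passage; for other values of n the modern figure is 2 to the power n, less one.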

Cardano continues to discuss the computation of odds in the chapter; however, what is of principal interest is the use of the word “probability”. The above passage is quite possibly the first application of the word in written form with a meaning comparable to its use in the modern theory (based on symmetry or a long-range relative frequency definition).

It is well known that the theory of probability has its origins in questions on gambling. Why is this the case? Although people were aware of the variable and unpredictable character of everyday phenomena (such as the weather, commodity prices, etc.), games of chance lend themselves to a mathematical discussion because the universe of possibilities is (relatively) easily known and computed, at least for simple games.

Cardano’s text would appear to be the first known mathematical work on the theory of probability, although published after the more famous correspondence between Pascal and Fermat.

Review¹ of Sopra Le Scoperte dei Dadi (Concerning an Investigation on Dice)

1. Biographical Notes<br />

Galileo Galilei was born in Pisa in 1564. His early education was at the monastery of Vallombrosa, and he later attended the University of Pisa (with the original intention of studying medicine). He became a professor of mathematics at Pisa in 1589, and at Padua in 1592. He is famous for his work in astronomy and physics, including hydrostatics and dynamics (through his study of properties relating to gravitation). Works published in 1632 include support for the Copernican system. Although the publication was approved by the papal censor, it did (to some degree) contradict an edict of 1616 declaring the proposition that the sun was the centre of the solar system to be false. After an inquiry, Galileo was placed under house arrest, and died near Florence in 1642.

2. Review of Sopra Le Scoperte dei Dadi

Galileo’s brief research summary on dice is believed to have been written between 1613 and 1623². It is a response to a request for an explanation of an observation concerning the playing of three dice. While the number of possible combinations of dice sides totalling 9, 10, 11, and 12 is the same, in Galileo’s words:

“…it is known that long observation has made dice-players consider 10 and 11 to be more advantageous than 9 and 12.”

Galileo notes in the opening paragraph of his article:

“The fact that in a dice-game certain numbers are more advantageous than others has a very obvious reason, i.e. that some are more easily and more frequently made than others…”

Galileo explains the phenomenon by enumerating the possible combinations of the three numbers composing the sum, and presents a tabular summary. The principles allowing the enumeration are explained:

“…we have so far declared these three fundamental points; first, that the triples, that is the sum of three-dice throws, which are made up of three equal numbers, can only be produced in one way; second, that the triples which are made up of two equal numbers and the third different, are produced in three ways; third, that those triples which are made up of three different numbers are produced in six ways. From these fundamental points we can easily deduce in how many ways, or rather in how many different throws, all the numbers of the three dice may be formed, which will easily be understood from the following table:”

Sum:    10      9       8       7       6       5       4       3

        631 6   621 6   611 3   511 3   411 3   311 3   211 3   111 1
        622 3   531 6   521 6   421 6   321 6   221 3
        541 6   522 3   431 6   331 3   222 1
        532 6   441 3   422 3   322 3
        442 3   432 6   332 3
        433 3   333 1

Total:  27      25      21      15      10      6       3       1
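The totals in Galileo's table, and his three counting rules, can be reproduced by enumerating all throws of three dice. The following is a sketch assuming fair dice; the names `ways` and `triples` are illustrative and do not come from Galileo's article.

```python
from collections import Counter
from itertools import product

# All 216 ordered throws of three dice.
throws = list(product(range(1, 7), repeat=3))

# Number of ordered throws producing each sum (the bottom row of the table).
ways = Counter(sum(throw) for throw in throws)

# Number of ordered throws behind each unordered triple, keyed by
# (sum, faces sorted high to low), as in the body of the table.
triples = Counter((sum(t), tuple(sorted(t, reverse=True))) for t in throws)

for total in range(10, 2, -1):
    distinct = sum(1 for key in triples if key[0] == total)
    print(total, distinct, ways[total])

# The symmetry Galileo relies on: a sum s is as frequent as 21 - s.
print(all(ways[s] == ways[21 - s] for s in range(3, 19)))
```

Each triple's count is 1, 3, or 6 according to whether it has three equal faces, two equal faces, or three different faces, which is exactly Galileo's set of three fundamental points.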

1 Submitted for STA 4000H under the direction of Professor Jeffrey Rosenthal.

2 Refer to F.N. David’s Games, Gods and Gambling – A History of Probability and Statistical Ideas, Dover Publications, 1962, page 62.

The top row of the table presents the sum of the three dice. Galileo does not provide the enumeration for sums 11 to 18, indicating earlier in his article that an investigation from 3 to 10 is sufficient because:

“…what pertains to one of these numbers, will also pertain to that which is the one immediately greater.”

While the wording is awkward, he is referring to the symmetry of the problem (replacing each face x by 7 − x shows that a sum s and the sum 21 − s can be thrown in the same number of ways), but he does not explain further.

The possible triples are shown under the sums, and to the right of each is the number of distinct throws that produce the triple. The last row sums those counts.

From Galileo’s table, it can be seen that 10 will show up in 27 ways out of all possible throws (which Galileo does indicate as 216). Since 9 can be found in only 25 ways, this explains why it is at a “disadvantage” to 10 (even though each sum can be made from 6 different triples).

The article is of interest for its antiquity in the development of ideas relating to the science of probability. Although words like “chance” and “probability” are not directly used, the idea is conveyed by the application of terms such as “advantage” or “disadvantage”. Combinatorial mathematics, and an appreciation for the equipossibility of individual events (gained either by a recognition of the symmetry of the die, or observation of results) form the building material for early probability science.
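Galileo's counts can be checked by brute force. The following is a minimal sketch, not part of Galileo's article: it enumerates all 216 ordered throws of three fair dice and tallies each sum. The function name `three_dice_sum_counts` is a hypothetical helper introduced only for this illustration.

```python
from collections import Counter
from itertools import product


def three_dice_sum_counts():
    """Count, for each sum, the ordered throws of three fair dice that give it."""
    return Counter(sum(throw) for throw in product(range(1, 7), repeat=3))


counts = three_dice_sum_counts()
total = sum(counts.values())  # 6**3 equally possible ordered throws

for s in range(3, 11):
    # Sums s and 21 - s share the same count (replace each face x by 7 - x).
    print(s, counts[s], counts[21 - s], f"{counts[s]}/{total}")
```

The tally for the sums 10 down to 3 can then be compared with the last row of the table, and the count for 21 − s with the count for s.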


Review 1 of Correspondence between Pierre de Fermat and Blaise Pascal

1. Biographical Notes

Pierre de Fermat was born in 1601 at Beaumont-de-Lomagne, studied law at the University of Toulouse, and served there as a judge. He appears to have corresponded a great deal with scientists in Paris, as well as with others, including Pascal, about mathematical ideas. His interests included the theory of numbers, and he is well known for the proposition that the equation $x^n + y^n = z^n$ has no solutions in the positive integers for n > 2. He died at Castres in 1665.

Blaise Pascal was born at Clermont in 1623, and died in Paris in 1662. In addition to his contribution, along with Fermat, to the science of probability, he is well known for his work in geometry and hydrostatics. Pascal wrote the Essai pour les Coniques, and invented (and sold) a mechanical calculating machine. He may be most famous for his philosophical and religious writings, and is the author of the Pensees.

2. Review of the correspondence

This summary is primarily concerned with ideas presented in the first two letters of a collection of correspondence written over a period from 1654 to 1660. The first letter in this series is from Fermat to Pascal, and is undated, although it was likely written in June or July of 1654 (based on the dates of subsequent correspondence from Pascal). It would seem that Pascal had earlier written to Fermat, discussing the problem relating to the division of stakes in a wager on a game of dice, when the game is suspended before completion. The question appears to have been: If a player needs to get 1 point (a specific side of the die) in eight throws of the die, and after the first three throws has not obtained the required point, how much of the wager should be distributed to each player if they agree to discontinue play?

Fermat’s letter suggests that Pascal reasoned 125/1,296 of the wager should be given to the player. Fermat disagrees with this, proposing that the player should receive 1/6 of the wager. Fermat’s argument is surely based on the equal possibility of outcomes for points 1 to 6, due to the symmetry of the die.

Fermat distinguishes between an assessed value for a throw not taken, with subsequent continuation of the game, and the agreed completion of play before eight throws. His reasoning is as follows:

“If I try to make a certain score with a single die in eight throws…[and] we agree that I will not make the first throw; then, according to my theory, I must take in compensation 1/6th of the total sum…Whilst if we agree further that I will not make the second throw, I must, for compensation, get a sixth of the remainder which comes to 5/36th of the total sum…If, after this, we agree that I will not make the third throw, I must have…a sixth of the remaining sum which is 25/216th of the total…And if after that we agree that I will not make the fourth throw…I must again have a sixth of what is left, which is 125/1,296th of the total, and I agree with you that this is the value of the fourth throw, assuming that one has already settled for the previous throws.”
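Fermat's successive sixths can be reproduced with exact fractions. The sketch below is only an illustration of the arithmetic in the quotation; the function name `forgone_throw_values` and its parameters are hypothetical, chosen for this example.

```python
from fractions import Fraction


def forgone_throw_values(throws=8, p=Fraction(1, 6)):
    """Return the value of each forgone throw and the remainder left after it,
    both expressed as fractions of the total stake."""
    values, remainders = [], []
    remainder = Fraction(1)
    for _ in range(throws):
        value = p * remainder   # a sixth of what is left
        remainder -= value      # what is still at stake afterwards
        values.append(value)
        remainders.append(remainder)
    return values, remainders


values, remainders = forgone_throw_values()
for k, (v, r) in enumerate(zip(values, remainders), start=1):
    print(k, v, r)
print("accumulated:", sum(values), "left:", remainders[-1])
```

The first four values are the fractions Fermat names (1/6, 5/36, 25/216, 125/1,296), and the accumulated total over eight throws is what the player gives up the chance to win by forgoing all of them.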

The argument can be summarised in the following table:

Throw                Proportion of wager distributed    Remainder of original wager
1                    1/6                                5/6
2                    1/6 (5/6)                          25/36
3                    1/6 (25/36)                        125/216
4                    1/6 (125/216)                      625/1,296
5                    1/6 (625/1,296)                    3,125/7,776
6                    1/6 (3,125/7,776)                  15,625/46,656
7                    1/6 (15,625/46,656)                78,125/279,936
8                    1/6 (78,125/279,936)               390,625/1,679,616
Accumulated totals   1,288,991/1,679,616 (0.77)         390,625/1,679,616 (0.23)

In Fermat’s theory, the player always has a chance of obtaining the whole wager (a chance at least proportional to the ratio of the one number needed and the six sides of the die, that is, a one in six chance). The table above shows the sequence of probabilities associated with the occurrence of the needed point on the last throw (1/6, 5/6 · 1/6, 5/6 · 5/6 · 1/6 etc.). These are the probabilities of the negative binomial distribution with a single required success (the geometric distribution), having the required point occur on the last of a sequence of 1, 2, …, 8 throws.

In his response letter, dated July 29, 1654, Pascal agrees with Fermat’s reasoning, and presents solutions to two specific cases of the problem of points:

1) The case involving a player needing one more point.

2) The case in which a player has acquired the first point.

For the first case, Pascal uses a recursive process to illustrate the solution. He provides the example of two players wagering 32 pistoles (gold coins of various denominations) each, and begins by considering a dice game in which three points are needed. The players’ numbers have equal chances of turning up. The following table illustrates Pascal’s argument (the ordered pair notation (a,b) refers to the “state” of the game at some stage, with player A having thrown a points, and player B, b points; the pair {c,d} refers to the division of the wager).

State of game   Division if player A's     Division if player B's     Division if players agree
                number turns up next       number turns up next       to suspend the game
(2,1)           {64,0}                     {32,32}                    {48,16}
(2,0)           {64,0}                     {48,16}*                   {56,8}
(1,0)           {56,8}**                   {32,32}                    {44,20}

* this corresponds to the state (2,1) shown above
** this corresponds to the state (2,0) shown above

The values distributed upon suspension of the game conform to the expected values.
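Pascal's recursion can be written down directly: the fair division in any state is the average of the divisions in the two equally likely next states. The following is a minimal sketch under that assumption; `share_a` and `STAKE` are hypothetical names introduced for the illustration and do not come from the correspondence.

```python
from fractions import Fraction
from functools import lru_cache

STAKE = 32  # pistoles put up by each player, as in Pascal's example


@lru_cache(maxsize=None)
def share_a(a, b, target, pot=2 * STAKE):
    """Player A's fair share of the pot in state (a, b) of a game to `target` points."""
    if a == target:
        return Fraction(pot)   # A has already won everything
    if b == target:
        return Fraction(0)     # B has already won everything
    # Each player's number is equally likely to turn up next.
    return (share_a(a + 1, b, target, pot) + share_a(a, b + 1, target, pot)) / 2


for state in [(2, 1), (2, 0), (1, 0)]:
    a_share = share_a(*state, 3)
    print(state, f"{{{a_share}, {2 * STAKE - a_share}}}")
```

Because the recursion is just an expected value taken one play at a time, the same function can be called with `target=4` to examine a game of four points.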

The argument is an (early) example of the application of a “minimax” principle. Both players wish to maximize the amount they would receive, and minimize their losses. The “motivation” is illustrated by the following “payoff matrix”, with the expected proceeds for player A indicated under the relevant circumstances.

                                     Player B
Player A                             Agreement to        No agreement to     Row
                                     settle the wager    settle the wager    minimum
Rolls a favourable number            48                  64                  48
Does not roll a favourable number    48                  32                  32
Column maximum                       48                  64

Player A would like to maximize the row minimums, while player B wishes to minimize the column maximums. Both would settle on 48 (for player A).

After discussing his theory relating to the equitable distribution of the wager amount, Pascal presents the following rule:

“…the value (by which I mean only the value of the opponent’s money) of the last game of two is double that of the last game of three and four times the last game of four and eight times the last game of five, etc.”

Applying the recursive procedure that Pascal used for a game of three to a game of four, we would proceed as follows:

State of game   Division if player A's     Division if player B's     Division if players agree
                number turns up next       number turns up next       to suspend the game
(3,2)           {64,0}                     {32,32}                    {48,16}
(3,1)           {64,0}                     {48,16}*                   {56,8}
(3,0)           {64,0}                     {56,8}**                   {60,4}

* this corresponds to the state (3,2) shown above
** this corresponds to the state (3,1) shown above


Using Pascal’s terminology, the value of the last game of four is four: winning it raises player A’s share from 60 to 64 pistoles.

The rule for distributing a wager of 2W (each player providing W), when one of the players requires one more point and the other has none, is then that this player receives $2W - W/2^{n-1}$, where n represents the number of points needed for the game (before play commences). With W = 32 this gives 56 for n = 3 and 60 for n = 4, matching the suspended-game divisions {56,8} and {60,4} in the tables above.
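The closed form can be recovered from Pascal's recursion. The derivation below is a sketch in modern notation, not Pascal's own; the symbol $e_m$ (the share of the player who needs one more point when the opponent still needs m points) is introduced here only for the illustration.

```latex
\begin{align*}
e_1 &= W
  && \text{(both players need one point, so the pot is split equally)} \\
e_m &= \tfrac{1}{2}\,(2W) + \tfrac{1}{2}\,e_{m-1}, \quad m \ge 2
  && \text{(win the pot, or the opponent moves one point closer)} \\
2W - e_m &= \tfrac{1}{2}\,\bigl(2W - e_{m-1}\bigr) = \frac{W}{2^{\,m-1}}
  && \text{(the shortfall halves at each step, starting from } 2W - e_1 = W\text{)} \\
e_n &= 2W - \frac{W}{2^{\,n-1}}
  && \text{(opponent has no points in a game of } n\text{)}
\end{align*}
```

For W = 32 and n = 4 this gives 64 − 4 = 60, the share assigned to player A in state (3,0) of the game of four, and the successive shortfalls 16, 8, 4 are the halving "values of the last game" in Pascal's rule.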

Pascal suggests that the solution to the second class of problems is more complicated:

“…the proportion for the first game is not so easy to find…[it] can be shown, but with a great deal of trouble, by combinatorial methods…and I have not been able to demonstrate it by this other method which I have just explained to you but only by combinations.”

A rule is presented, without detailing a proof:

“Let the given number of games be, for example, 8. Take the first eight even numbers and the first eight odd numbers thus:

2, 4, 6, 8, 10, 12, 14, 16

and 1, 3, 5, 7, 9, 11, 13, 15.

Multiply the even numbers in the following way: the first by the second, the product by the third, the product by the fourth etc.; multiply the odd numbers in the same way…

The last product of the even numbers is the denominator and the last product of the odd numbers is the numerator of the fraction which expresses the value of the first one of eight games…”

If each player wagers W, then the share of the wager due to the player who wins the first throw would be

$$W + W\left(\frac{1}{2}\cdot\frac{3}{4}\cdot\frac{5}{6}\cdots\frac{2n-1}{2n}\right),$$

where n is the number of points that player still requires (after getting the first point).

How can this formula be shown to be reasonable, using elementary principles of probability? One approach is to examine the possible ways for completing various games after one player acquires the first point. The tree diagrams in Exhibit 1 (at the end of this review) illustrate three cases.

Player A has one point and needs in case a) one more point; b) two more points and c) three more points. Each connecting line between possible states of the game represents an event with probability 1/2, the probability of going from one state to the next. Using the diagrams, the probabilities for player A obtaining the required points may be assessed. For example, in a) there is a 1/2 probability of going from (1,0) to (2,0), and a 1/2 · 1/2 = 1/4 probability of going from (1,0) to (1,1) to (2,1). The probability for player A getting two points (given one point) is then 1/2 + 1/4 = 3/4. Similarly, for case b) the probability is 11/16 and for case c) 21/32.

If these probabilities are used to obtain the expected values for player A, we have:

In case a) 3/4 · 2W = (2/4 + 1/4) 2W = W + W(1/2)

In case b) 11/16 · 2W = (8/16 + 3/16) 2W = W + 3/8 · W = W + W(1/2 · 3/4)

In case c) 21/32 · 2W = (16/32 + 5/32) 2W = W + 5/16 · W = W + W(1/2 · 3/4 · 5/6)

The above sequence of expected values conforms to Pascal’s rule.
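The agreement between the tree-diagram probabilities and Pascal's product can also be checked numerically. The sketch below assumes fair, independent plays and uses the standard device of imagining that all remaining plays are carried out; `prob_a_wins` and `pascal_product` are hypothetical helper names introduced for this example.

```python
from fractions import Fraction
from math import comb


def prob_a_wins(a_needs, b_needs):
    """Probability that A gets a_needs points before B gets b_needs (fair plays)."""
    plays = a_needs + b_needs - 1  # the game is always decided within this many plays
    wins = sum(comb(plays, k) for k in range(a_needs, plays + 1))
    return Fraction(wins, 2 ** plays)


def pascal_product(n):
    """The product 1/2 * 3/4 * ... * (2n-1)/(2n)."""
    result = Fraction(1)
    for k in range(1, n + 1):
        result *= Fraction(2 * k - 1, 2 * k)
    return result


W = 1  # each player's stake, taken as the unit
for n in (1, 2, 3, 4):
    p = prob_a_wins(n, n + 1)   # A holds the first point and needs n more
    expected = p * 2 * W        # A's expected proceeds from the pot 2W
    print(n, p, expected, W + W * pascal_product(n))
```

The last two columns printed for each n are the expected value from the probabilities and the value from Pascal's rule, so they can be compared side by side.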

To illustrate, Pascal considers a game in which a player has obtained 1 point and needs 4 more. He notes that at most 8 plays would be required to complete the game (either player A throws 4 more points, or player B will throw the required 5). He observes that 1/2 of the number of combinations of 4 from 8, divided by a sum consisting of this same value, plus the combinations of 5, 6, 7 and 8 from 8, gives the same proportion as 1/2 · 3/4 · 5/6 · 7/8 = 35/128.

This is the case, since in general:

$$\frac{1}{2}\cdot\frac{3}{4}\cdot\frac{5}{6}\cdots\frac{2n-1}{2n} = \frac{(2n-1)!}{n!\,(n-1)!}\cdot\frac{1}{2^{2n-1}},$$

with

$$\frac{(2n-1)!}{n!\,(n-1)!} = \frac{1}{2}\cdot\frac{(2n)!}{n!\,n!}, \qquad 2^{2n-1} = \frac{1}{2}\,(1+1)^{2n},$$

and

$$\frac{1}{2}\,(1+1)^{2n} = \frac{1}{2}\sum_{i=0}^{2n}\binom{2n}{i}.$$

By the symmetry of the binomial coefficients, half of this sum equals half of the middle term plus all of the terms with i > n, which is the denominator Pascal describes; for n = 4 it is 35 + 56 + 28 + 8 + 1 = 128.

In the July 29th letter, Pascal also provides two tables indicating a division of wagers for games of dice suspended at different stages. The tables are not accompanied by detailed explanations. Pascal also relates the observations, and questions, of Monsieur de Mere, relating to a game of dice requiring (at least) one six to turn up in 4 throws. The odds given in favour of this event are 671 to 625. Again, the computations are not provided; however, they correspond to the probability given by:

$$\sum_{i=1}^{4}\binom{4}{i}\left(\frac{1}{6}\right)^{i}\left(\frac{5}{6}\right)^{4-i} = 1 - \left(\frac{5}{6}\right)^{4} = \frac{671}{1296}.$$

Also, it is noted that there is a “disadvantage” in betting on throwing two sixes (with a pair of dice) in 24 such plays. Using the above formula, summing the combinations of 1, 2, …, 24 out of 24, with associated probabilities 1/36 and 35/36 (to the appropriate exponents), the probability for the event can be shown to be about 0.49. That Monsieur de Mere noticed this “disadvantage” in practice is remarkable (he must have observed, and/or played, many such games).
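Both of de Mere's wagers can be evaluated with the same binomial sum. The following is a minimal sketch using exact fractions; the function name `prob_at_least_one` is a hypothetical helper, and the computation is a modern reconstruction rather than anything given in the letters.

```python
from fractions import Fraction
from math import comb


def prob_at_least_one(p, throws):
    """Sum the binomial terms for 1, 2, ..., `throws` successes out of `throws`."""
    q = 1 - p
    return sum(comb(throws, i) * p**i * q ** (throws - i) for i in range(1, throws + 1))


one_six_in_4 = prob_at_least_one(Fraction(1, 6), 4)
double_six_in_24 = prob_at_least_one(Fraction(1, 36), 24)

# Odds in favour are (favourable cases) to (unfavourable cases).
print(one_six_in_4, "odds",
      one_six_in_4.numerator, "to",
      one_six_in_4.denominator - one_six_in_4.numerator)
print(round(float(double_six_in_24), 4))
```

The first wager comes out slightly above one half and the second slightly below, which is the "advantage" and "disadvantage" that de Mere is said to have noticed.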

The remaining correspondence includes an interesting dispute over the interpretation of combinations of events, used as a means for computing equitable settlements for wagers (establishing the proportion of funds to be distributed).

However, for our purposes, at this stage, it is sufficient to appreciate that combinatorial methods, and the identification of equipossible events, are the cornerstones for the emerging theory of probability, with early applications for binomial expansions. Instead of using the terms “chance” or “probability”, our correspondents used words such as “favour” or “advantage” and “disadvantage”, which convey the same meaning, in the context of a gambling environment. We also noted the early decision theory motivation, and its influence on what we will later call expected value.

A survey of the literature does give the impression that Fermat and Pascal got the “die rolling” for the mathematical development of the science of probability.


Exhibit 1

Tree graphs of possible ways for completing two-player games of dice after one player has acquired the first point: a) game requiring two points; b) game requiring three points; c) game requiring four points.


Review 1 of Christiaan Huygens’ De Ratiociniis in Ludo Aleae (On Reasoning or Computing in Games of Chance)

1. Biographical Notes

Christiaan Huygens was born at The Hague, Netherlands, in 1629. He studied mathematics and law at the University of Leiden, and at the College of Orange in Breda. His father was a diplomat, and it would have been the normal practice for Huygens to follow in that vocation. However, he was more interested in the natural sciences, and with support from his father he was able to conduct studies and research in mathematics and physics. He is well known for his work relating to the manufacturing of lenses, which improved the quality of telescopes and microscopes. He discovered Titan, identified the rings of Saturn, and invented the first pendulum clock. He resided in Paris for some time, and made the acquaintance of persons familiar with Fermat and Pascal, and with their correspondence relating to “the problem of points” and similar concepts concerning games of chance. It is believed that Huygens died at The Hague, in 1695.

2. Review of the De Ratiociniis in Ludo Aleae

On Reasoning in Games of Chance is cited in the literature as the first published mathematical treatise on the subject of probability (note 2). The work was first printed in 1657, before the earlier correspondence between Fermat and Pascal was published, although it was clearly influenced by the content of those letters. The present review uses an English translation printed in 1714 by S. Keimer, at Fleetstreet, London.

The treatise is composed of a brief introduction, entitled The Value of Chances; the statement of a fundamental postulate; 14 propositions, a corollary and a set of five problems for the reader to consider. The development of the theory is very systematic. Introducing the subject, Huygens writes:

“Although in games depending entirely upon Fortune, the Success is always uncertain; yet it may be exactly determined at the same time how much more probability there is that [one] should lose than win”

Games of chance have outcomes that are (generally) unpredictable. At the same time, Huygens claims that it is possible to make meaningful statements, or measurements, relating to those systems. While the concept of probability, perhaps even the word itself (note 3), is observed within our earlier readings, Huygens’ association of the phenomena (games of chance) with a relative measure of chance is comparable to a modern treatment of the theory, which first defines a random system or process and then the concept of probability. Having defined the system and the measure, Huygens states his fundamental principle:

“As a Foundation to the following Proposition, I shall take Leave to lay down this Self-evident Truth: That any one Chance or Expectation to win any thing is worth just such a Sum, as would procure in the same Chance and Expectation at a fair Lay [or wager]”.

2 See for example, Ian Hacking’s The Emergence of Probability (Cambridge University Press, 1975), page 92 or William S. Peters’ Counting for Something – Statistical Principles and Personalities (Springer-Verlag, 1987), page 39.

3 Refer to Gerolamo Cardano’s De Ludo Aleae.

The wording is somewhat difficult to follow; however, he gives an example, from which it is evident that the “Expectation”, or value of a wager, is the mean of the possible proceeds:

“If any one should put 3 Shillings in one Hand, without telling me [which hand it is in], and 7 in the other, and give me Choice of either of them; I say, it is the same thing as if he should give me 5 Shillings.”

Although the word “expectation” (the Latin “expectatio” was used in the original work; see note 1) is used to refer to the value of a wager, its meaning in this example does correspond to its use in modern probability theory.

The first proposition states:

“If I expect a or b, and have an equal chance of gaining either of them, my Expectation is worth (a+b)/2.”

The expectation is the fair value of a wager. How can this value be calculated in a game where the prizes are received with equal chance? Huygens reasons as follows: suppose there is a lottery with two players, each of whom buys a ticket for x, and it is agreed that the proceeds are a and 2x-a. Each player could then just as easily receive a or 2x-a. Setting 2x-a = b, it follows that the value of the lottery ticket is x = (a+b)/2.

The second proposition extends the first to the case of three prizes, a, b and c, such that x = (a+b+c)/3 is the value of the expectation; then in the same manner to four prizes, and so on.

Proposition III states:

“If the number of Chances I have to gain a, be p, and the number of Chances I have to gain b, be q. Supposing the chances equal; my Expectation will then be worth (ap+bq)/(p+q).”

This proposition generalizes the expectation to lotteries whose prizes have different chances of being distributed. Huygens gives the following example:

“If I have 3 Expectations of 13 and 2 Expectations of 8, the value of my Expectation would by this rule be 11.”
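The rule of Proposition III can be checked numerically. The following is a minimal sketch, not Huygens' own notation: the function name `expectation` and the list-of-pairs input are assumptions made for illustration, and exact fractions are used so that no rounding occurs.

```python
from fractions import Fraction

def expectation(prizes_and_chances):
    """Value of a wager by Proposition III: sum(prize * chances) / sum(chances).

    `prizes_and_chances` is a list of (prize, number_of_chances) pairs.
    """
    total_chances = sum(chances for _, chances in prizes_and_chances)
    weighted = sum(Fraction(prize) * chances for prize, chances in prizes_and_chances)
    return weighted / total_chances

# Huygens' example: 3 chances of 13 and 2 chances of 8
huygens_example = expectation([(13, 3), (8, 2)])

# Proposition I as the special case p = q = 1 (3 shillings in one hand, 7 in the other)
two_hands = expectation([(3, 1), (7, 1)])

print(huygens_example, two_hands)  # 11 5
```

Treating Proposition I as the case p = q = 1 shows that the three propositions are instances of a single weighted mean.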

Propositions IV to IX illustrate solutions to “the problem of points”, in a manner analogous to the reasoning of Pascal, although Huygens looks at the problem more generally, without using any specific type of game as an example. Beginning with simple cases, Huygens solves more complicated problems, suggesting:

“The best way will be to begin with the most easy Cases of the Kind.”

To appreciate the form of Huygens’ arguments, consider Proposition VII:

“Suppose I want two Games, and my Adversary four.

Therefore it will either fall out, that by winning the next Game, I shall want but one more, and he four, or by losing it I shall want two, and he shall want three. So that by Schol Prop.5. and Prop.6., I shall have an equal Chance for 15/16 a or 11/16 a, which, by Prop.1 is just worth 13/16 a.”

1 See Ian Hacking, page 95.



Propositions 5 and 6 explain the expectations when one player needs 1 game and the other 4 games, and when one player needs 2 and the other 3 games. Then, applying Proposition 1, the expectations are effectively averaged to provide the relevant value. A corollary is then given:

“From whence it appears, that he who is to get two Games, before another shall get four, has a better Chance than he is to get one, before another gets two Games. For in this last Case, namely of 1 to 2 his Share by Prop. 4 is but 3/4 a, which is less than 13/16 a.”

This corollary compares the probabilities relevant to the two cases, using the expectations. This form of comparison had been made by earlier writers, using the “odds” approach 1 .
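The averaging step used in Proposition VII can be written as a short recursion. This is a sketch under the assumption that each game is an even chance between the two players; the function name `share` is hypothetical and the result is expressed as a fraction of the stake a.

```python
from fractions import Fraction
from functools import lru_cache

@lru_cache(maxsize=None)
def share(i_need, he_needs):
    """Fraction of the stake due to a player who still needs `i_need` games
    when the adversary still needs `he_needs` games."""
    if i_need == 0:
        return Fraction(1)   # I have already won the whole stake
    if he_needs == 0:
        return Fraction(0)   # the adversary has already won
    # Proposition I: equal chance of the two positions reached after the next game
    return (share(i_need - 1, he_needs) + share(i_need, he_needs - 1)) / 2

print(share(1, 4), share(2, 3), share(2, 4), share(1, 2))  # 15/16 11/16 13/16 3/4
```

The four printed values are the ones quoted from Propositions IV to VII and the corollary: 13/16 is the mean of 15/16 and 11/16, and it exceeds 3/4.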

In Proposition IX, Huygens provides a table showing the relative chances for three players in various game states.

Propositions X to XIV present solutions for problems relating to games of dice. The solutions rely upon the earlier propositions, especially Proposition 3. As an example, Proposition X relates to the rolling of a single die:

“To find how many Throws one may undertake to throw the Number 6 with a single Die.”

Huygens reasons that for the simplest case, one throw, there is 1 chance to get a six, receiving the wager proceeds a, and 5 chances to receive nothing, so that by Proposition 3, the expectation is (1 . a + 5 . 0) / (1+5) = 1/6 a. To compute the expectation for one six in two throws, it is noted that if the six turns up on the first throw, the expectation will again be a. If not, then referring to the simplest case, there is an expectation of 1/6 a. Using Proposition 3, there is one way to receive a, and 5 ways to receive the 1/6 a (the sides 1 to 5 on the die):

(1 . a + 5 . (1/6 a)) / (1+5) = 11/36 a

This corresponds to the six appearing on the first throw with probability 1/6, or on the second throw with probability (5/6) . (1/6). In Huygens’ system, expectations for simpler cases are combined using Proposition 3 to solve more complex problems.
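The way each case is built from the previous one can be sketched as a loop. This is an illustration only, assuming a fair die and taking the proceeds a as 1; the name `six_within` is hypothetical.

```python
from fractions import Fraction

def six_within(throws):
    """Expectation, as a fraction of a, of throwing at least one six
    in the given number of throws of a single die."""
    value = Fraction(0)  # with no throws left, nothing can be won
    for _ in range(throws):
        # Proposition 3: 1 chance of winning a outright,
        # 5 chances of being left with the expectation for one throw fewer
        value = (1 * Fraction(1) + 5 * value) / (1 + 5)
    return value

print(six_within(1), six_within(2))  # 1/6 11/36
```

Calling `six_within` with larger arguments continues the same chain of cases, each resting on the one before it.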

Proposition XIV uses two linear equations to derive the ratio of expected values for two players in the following case:

“If my self and another play by turns with a pair of Dice upon these Terms, That I shall win if I throw the Number 7, or he if he throw 6 soonest, and he to have the Advantage of first Throw: To find the Proportion of our Chances.”

With total proceeds set at a, Huygens uses a-x to represent the expectation of the player to throw first, and x for the second player. When it is the first player’s turn, the second player’s expectation is x. Huygens reasons that when the second player is to throw, the expectation must be higher (it is conditioned on the first player not throwing a 6). He refers to this expectation as y. On the first player’s turn, the second player’s expectation (using Proposition 3) will be (5 . 0 + 31y)/36 = 31/36 y (as there are 5 ways for the first player to get 6). It follows that 31/36 y = x (or y = 36/31 x). On the second player’s turn the expectation is (6a + 30x) / 36 (there are 6 ways to roll 7), which equals y. Then:

1 Odds are given in the text by Cardano and in the correspondence between Fermat and Pascal.



(6a + 30x) / 36 = 36/31 x,

with solution x = 31/61 a. The ratio of the expected values x to a-x is then 31:30.
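The two equations of Proposition XIV can be solved exactly as a check. A minimal sketch, taking the proceeds a as 1; the variable names are illustrative and the substitution y = 36/31 x follows the derivation above.

```python
from fractions import Fraction

a = Fraction(1)  # total proceeds, taken as the unit

# The two relations: x = (31/36) y   and   y = (6a + 30x) / 36.
# Substituting y = (36/31) x into the second gives
#   (36/31) x - (30/36) x = (6/36) a
x = (Fraction(6, 36) * a) / (Fraction(36, 31) - Fraction(30, 36))
y = Fraction(36, 31) * x
first_player = a - x  # expectation of the player who throws first

print(x, first_player, x / first_player)  # 31/61 30/61 31/30
```

The player who throws second, needing a 7, is slightly favoured despite giving up the first throw, because 7 can be thrown in 6 ways and 6 in only 5.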

The text concludes with a set of five problems for the reader to solve (also the practice in most contemporary texts in mathematics).



Review 1 of Dr. John Arbuthnott’s An Argument for Divine Providence, taken from the constant Regularity observed in the Births of both Sexes

1. Biographical Notes<br />

John Arbuthnott was born in 1667 at Kincardineshire, Scotland. The following passage from Annotated Readings in the History of Statistics, by H.A. David and A.W.F. Edwards, provides interesting biographical information:

John Arbuthnott, physician to Queen Anne, friend of Jonathan Swift and Isaac Newton…was no stranger to probability…In 1692 he had published (anonymously) a translation of Huygens’ De ratiociniis in ludo aleae (1657) as Of the Laws of Chance, adding “I believe the Calculation of the Quantity of Probability might be improved to a very useful and pleasant Speculation, and applied to a great many events which are accidental, besides those of Games.” There exists a 1694 manuscript of Arbuthnott’s which foreshadows his 1710 paper [An Argument for Divine Providence].

Dr. Arbuthnott died at London in 1735.

2. Review of An Argument for Divine Providence

Dr. Arbuthnott’s paper is the first in our series of readings to apply the developing theory of probability to phenomena other than games of chance. Earlier, Pascal did use probabilistic reasoning in his article “The Wager” to advocate a life of faith, in an “age of reason”. Interestingly, Arbuthnott’s subject is related to Pascal’s.

There are two principal arguments made in the article:

1) It is not by chance that the number of male births is about the same as the number of female births.

2) It is not by chance that there are more males born than females, and in a constant proportion.

To support the first proposition, application is made of the binomial expansion relating to a die with two sides marked M (male) and F (female). Essentially, (M+F)^n is a model for the possible combinations of male and female children born. Arbuthnott observes that as n increases, the binomial coefficient associated with the term having identical numbers of M and F becomes small compared to the sum of the other terms.

“It is visible from what has been said, that with a very great number of Dice…(supposing M to denote Male and F Female) that in the vast number of Mortals, there would be but a small part of all the possible Chances for its happening at any assignable time, an equal Number of Males and Females should be born.”

Arbuthnott is aware that in reality there is variation between the number of males and females:

“It is indeed to be confessed that this Equality of Males and Females is not Mathematical but Physical, which alters much of the foregoing Calculation; for in this Case [the number of male and female terms] …will lean to one side or the other.”

1 Submitted for STA 4000H under the direction of Professor Jeffrey Rosenthal.

However, he writes:

“But it is very improbable (if mere Chance governed) that they would never reach as far as the Extremities…”

While it would be possible to have large differences in the numbers of males and females (with binomially distributed data having probability 1/2), the probability of this becomes very small when n is large. Contrary to Arbuthnott’s argument, it could be reasoned that chance would account for the approximate equality in numbers of males and females.
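Both observations can be illustrated with the binomial model (M+F)^n. The sketch below is an illustration and not Arbuthnott's calculation: it assumes independent births with chance 1/2 for each sex, and the 5-point band around 50% is an arbitrary choice made here to stand for "approximate equality".

```python
from fractions import Fraction
from math import comb

def prob_exactly_equal(n):
    """Chance of exactly n/2 males among n births (n even)."""
    return Fraction(comb(n, n // 2), 2 ** n)

def prob_near_equal(n, percent=5):
    """Chance that the male share lies within `percent` points of 50%."""
    favourable = sum(
        comb(n, k)
        for k in range(n + 1)
        if (50 - percent) * n <= 100 * k <= (50 + percent) * n
    )
    return Fraction(favourable, 2 ** n)

for n in (10, 100, 1000):
    print(n, float(prob_exactly_equal(n)), float(prob_near_equal(n)))
```

As n grows the first column of probabilities falls, which is Arbuthnott's point about exact equality, while the second rises towards 1, which is the counter-argument that chance alone accounts for approximate equality.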

The second proposition discounts chance as the cause for the larger number of male births observed annually. The form of the argument is interesting, because it is similar to a test of significance. Arbuthnott states the Problem:

“A lays against B, that every Year there shall be born more Males than Females: To find A’s Lot, or the Value of his Expectation.”

A hypothesis is being made in the form of a wager. Arbuthnott notes that the probability that there are more males than females born must be less than 1/2 (assuming that there is an equal chance for a male or female birth). For this “test”, however, he sets the chance at 1/2 (which would result in a higher probability), and notes that for the number of males to be larger than the number of females in 82 consecutive years (for which he has data on christenings), the lot would be (1/2)^82. The lot would be even less if the numbers were to be in “constant proportion”. Since the data do not support B (in every year from 1629 to 1710, male christenings outnumber female christenings), Arbuthnott reasons:

“From whence it follows, that it is Art, not Chance, that governs.”

The hypothesis of equal probability is rejected, and Arbuthnott attributes the observed proportions to Divine Providence.
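The size of A's lot can be computed directly. A minimal sketch, assuming as Arbuthnott does for his test a chance of exactly 1/2 in each year, and additionally assuming independence between years; the variable names are illustrative.

```python
from fractions import Fraction

YEARS = 82                        # years of christening data, 1629 to 1710
chance_per_year = Fraction(1, 2)  # chance that males outnumber females in one year

# A's lot: males must outnumber females in every one of the 82 years
lot = chance_per_year ** YEARS

print(lot.denominator == 2 ** YEARS)  # True
print(float(lot))                     # roughly 2e-25
```

A value this small is what leads Arbuthnott to reject chance as the explanation for the observed run.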

The second argument has been referred to as the first published test of significance 1 .

1 Ian Hacking, The Emergence of Probability, page 168.



Review 1 of Pierre Remond de Montmort’s On the Game of Thirteen

1. Biographical Notes

According to Isaac Todhunter’s text, A History of the Mathematical Theory of Probability 2, Pierre Remond de Montmort devoted himself to religion, philosophy and mathematics. He served as cathedral canon at Notre-Dame in Paris, a position he resigned in order to marry. In 1708 he published his treatise on “chances”, Essai d’Analyse sur les Jeux de Hazard. L.E. Maistrov, in his text, Probability Theory – A Historical Sketch 3, provides the following biographical information:

“Pierre Remond de Montmort (1678-1719) was a French mathematician as well as a student of philosophy and religion. He was in correspondence with a number of prominent mathematicians (N. Bernoulli, J. Bernoulli, Leibniz, etc.) and was a well-established and authoritative member of the scientific community. In particular, Leibniz selected him as his representative at the commission set up by the Royal Society to rule on the controversy between Newton and Leibniz concerning priority in the discovery of differential and integral calculus…His basic work on probability theory was the “Essai d’Analyse sur les Jeux de Hazard”. It went through two editions, the first of which was printed in Paris in 1708…the second…appeared in 1713, although Todhunter claims [1714]. The first part contains the theory of combinatorics; the second discusses certain games of chance with cards; the third deals with games of chance with dice, the fourth part contains the solution of various problems including the five problems proposed by Huygens…”

2. Review of the article

The present review concerns an article in the second part of Montmort’s Essai d’Analyse, entitled “On the Game of Thirteen”. An English translation of that article is available in Annotated Readings in the History of Statistics by H.A. David and A.W.F. Edwards. 4

In the first section of the article, Montmort provides a description of the play of Thirteen:

“The players first draw a card to determine the banker. Let us suppose that this is Peter and that the number of players is as desired. From a complete pack of fifty-two cards, judged adequately shuffled, Peter draws one after the other, calling one as he draws the first card, two as he draws the second, three as he draws the third, and so on, until the thirteenth, calling king. Then, if in this entire sequence of cards he has not drawn any with the rank he has called, he pays what each of the players has staked and yields to the player on his right. But if in the sequence of thirteen cards, he happens to draw the card he calls, for example, drawing an ace as he calls one, or a two as he calls two, or a three as he calls three, and so on, then he takes all the stakes and begins again as before, calling one, then two, and so on…”

1 Submitted for STA 4000H under the direction of Professor Jeffrey Rosenthal.

2 A History of the Mathematical Theory of Probability, page 78, by Isaac Todhunter, Chelsea Publishing Co., New York (1965 unaltered reprint of the First Edition, Cambridge 1865).

3 Probability Theory – A Historical Sketch, page 76, by Leonid E. Maistrov, Academic Press Inc., New York 1974.

4 Annotated Readings in the History of Statistics, Springer-Verlag, New York 2001.

Provision is made in the rules for a new deck of cards should the dealer use all the cards in the first set.

The game of Thirteen provides an early example of a problem relating to coincidences or matches. Montmort describes a method for computing the chance or expectation of drawing a card matching the number called by the banker. Since the time of Cardano, questions relating to expectation have been solved by enumerating the favourable cases and the total possible events. Montmort observes:

“Let the cards with which Peter plays be represented by a, b, c, d, etc… it must be noted that these letters do not always find their place in a manner useful to the banker. For example, a, b, c produces only one to the person with the cards although each of these three letters is in its place. Likewise, b, a, c, d produces only one win for Peter, although each of the letters c and d is in its place. The difficulty of this problem is in disentangling how many times each letter is in its place useful and how many times it is useless.”

To solve the problem, Montmort first considers a game with only two cards, an ace and a two. There is only one way in which the banker can receive the proceeds of the wager: the ace has to be the first card. Montmort then computes the expectation, essentially an application of Huygens' 3rd Proposition in De Ratiociniis. If the proceeds total A, then the banker's expectation is (1 . A + 1 . 0) / 2 = 1/2 A.

The next case considered is a game with three cards, represented by the letters a, b, c. Montmort observes that of the six possible arrangements of the letters (representing the possible orders for dealing the cards), four are favourable to the banker (and two are not favourable):

"...there are two with a in first place; there is one with b in second place, a not having been in first place; and one where c is in third place, a not having been in first place and b not having been in second place."

It follows that the expectation is (4 . A + 2 . 0) / 6 = 2/3 A.
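The small cases can be confirmed by listing every possible deal. This is a brute-force sketch rather than Montmort's method; it assumes a single suit in which card k matches call k, and the name `banker_expectation` is hypothetical.

```python
from fractions import Fraction
from itertools import permutations

def banker_expectation(cards):
    """Return (favourable deals, total deals, expectation as a fraction of A)
    for a game with `cards` distinct cards numbered 1..cards."""
    deals = list(permutations(range(1, cards + 1)))
    favourable = sum(
        # a deal is favourable if some card falls in the place whose number it bears
        any(card == call for call, card in enumerate(deal, start=1))
        for deal in deals
    )
    return favourable, len(deals), Fraction(favourable, len(deals))

print(banker_expectation(2))  # (1, 2, Fraction(1, 2))
print(banker_expectation(3))  # (4, 6, Fraction(2, 3))
```

Enumeration of this kind becomes impractical well before thirteen cards, which is one reason a general rule is needed.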

Similarly, four and five card games are considered, giving expectations of 5/8 A and 19/30 A, respectively. The expectations for games with one to five cards allow Montmort to suggest a formula for computing the banker's expectation generally, in a recursive manner:

[g(p-1)+d] / p

where:

p is the number of cards;

g is the expectation when there are p-1 cards, and

d is the expectation when there are p-2 cards.



A table is given showing the expectations for games up to 13 cards. The banker's expectation for the game of Thirteen is presented as 109,339,663/172,972,800 A.
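The recursive formula [g(p-1)+d] / p can be run forward to rebuild such a table. A minimal sketch with A taken as 1: the starting values (expectation 0 with no cards and 1 with a single card) are assumptions made here to seed the recursion, and the function name is hypothetical.

```python
from fractions import Fraction

def montmort_table(max_cards):
    """Banker's expectation, as a fraction of A, for 1..max_cards cards."""
    # Seed values (assumed): no cards means no possible match; one card always matches
    values = {0: Fraction(0), 1: Fraction(1)}
    for p in range(2, max_cards + 1):
        g = values[p - 1]  # expectation with p-1 cards
        d = values[p - 2]  # expectation with p-2 cards
        values[p] = (g * (p - 1) + d) / p
    del values[0]
    return values

table = montmort_table(13)
print(table[2], table[3], table[4], table[5])        # 1/2 2/3 5/8 19/30
print(table[13] == Fraction(109339663, 172972800))  # True
```

Exact fractions are used so that the thirteen-card value can be compared with the figure quoted from the table without rounding error.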

Montmort observes that the expectations can be expressed as series in the form:

1 - 1/(1.2) + 1/(1.2.3) - 1/(1.2.3.4) +...,

with alternating positive and negative terms, consisting of numerators 1 and denominators 1…(p-2)(p-1)p, where p is the number of cards. The rapid convergence of the expectations to a value between 5/8 and 19/30 does not appear to have been noticed. An examination of the recursive formula might have indicated that as the number of cards (p) in the game increases, the value identified as g (expectation) stabilizes, since (p-1)/p gets closer to 1 and d/p becomes small. Montmort is principally interested in explaining how the expectation is computed in the game of Thirteen, and in describing the mathematical expressions from which the expectations can be computed.
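The series and its convergence can be checked in a few lines. A sketch with A taken as 1; the function name is hypothetical, and the comparison with 1 - 1/e is a standard fact about this alternating series rather than something stated in Montmort's article.

```python
from fractions import Fraction
from math import exp, factorial

def series_expectation(p):
    """Partial sum 1 - 1/(1.2) + 1/(1.2.3) - ... taken up to the term with denominator p!."""
    return sum(Fraction((-1) ** (k + 1), factorial(k)) for k in range(1, p + 1))

print(series_expectation(4), series_expectation(5))  # 5/8 19/30

limit = 1 - exp(-1)  # value the partial sums approach
for p in (5, 8, 13):
    print(p, float(series_expectation(p)), abs(float(series_expectation(p)) - limit))
```

Successive partial sums lie alternately above and below the limiting value, which is why it falls between 5/8 and 19/30.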

In addition to explaining how the expectation is derived, Montmort provides a table showing the number of possible favourable deals from specific cards in a game. For example, in a game with five cards, there are 24 ways for an ace to be dealt as the banker calls one, 18 ways for the two to be dealt as two is called, and so on. The table can be used for games having up to eight cards. Montmort completes the article with some commentary relating to patterns in that table.
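A row of such a table can be reproduced by enumeration. The sketch below assumes that a deal is credited to the first card that falls in its own place, which is the reading suggested by Montmort's three-card description ("b in second place, a not having been in first place"); the function name is hypothetical.

```python
from itertools import permutations

def first_match_counts(cards):
    """For each call 1..cards, count the deals in which that call is the
    first one to coincide with the card drawn."""
    counts = [0] * cards
    for deal in permutations(range(1, cards + 1)):
        for call, card in enumerate(deal, start=1):
            if card == call:
                counts[call - 1] += 1
                break  # only the first match is useful to the banker
    return counts

row = first_match_counts(5)
print(row[:2], sum(row))  # [24, 18] 76
```

The first two entries agree with the 24 and 18 quoted above, and the total of 76 favourable deals out of 120 gives the five-card expectation 19/30.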

Montmort’s article adds to the variety of problems considered by probability science since the time of Cardano. Montmort has clearly used Huygens’ approach, and the principle stated in Proposition IV of De Ratiociniis:

“…the best way will be to begin with the most easy Cases of the kind.”



Review 1 of James Bernoulli’s Theorem…From Artis Conjectandi

1. Biographical Notes<br />

According to W.W.R. Ball in A Short Account of the History of Mathematics 2:

“Jacob or James Bernoulli was born at Bâle on December 27, 1654; in 1687 he was appointed chair of mathematics in the university there; and occupied it until his death on August 16, 1705…In his Artis Conjectandi, published in 1713, he established the fundamental principles of the calculus of probabilities…His higher lectures were mostly on the theory of series…”

2. Review of the article

In Chapter IV of Part IV of his text on probability, Artis Conjectandi (published in 1713), James Bernoulli writes:

“Something further must be contemplated here which perhaps no one has thought of about till now. It certainly remains to be inquired whether after the number of observations has been increased, the probability is increased of attaining the true ratio between the numbers of cases in which some event can happen and in which it cannot happen, so that the probability finally exceeds any given degree of certainty…” 3

This review summarizes Bernoulli’s proposed solution to this inquiry largely in his own words, as presented in Chapter V of Part IV of Artis Conjectandi.

For the purpose of this summary, an abridged English translation of a copy available in German 4 has been prepared. The Latin, German and abridged English versions are attached for reference. It should be noted that in the German and English copies, some mathematical notation differs from the original, using instead a more contemporary system.

Bernoulli begins by presenting a set of increasingly complex lemmas which will be used to prove his proposition:

Lemma 1<br />

Given a set of natural numbers<br />

1 Submitted for STA 4000H under the direction of Professor Jeffrey Rosenthal.<br />

2 Originally published by MacMillan & Co. Ltd., London 1912, pages 366 and 367.<br />

3 Translation by Bing Sung (1966). Translations from James Bernoulli. Department of Statistics, Harvard<br />

University, Cambridge, Massachusetts.<br />

4 Electronic Research Archive for Mathematics: (Ostwald's Klassiker d. exact. Wissensch. No. 107 u. 108.)<br />

Published: (1899).<br />


…<br />

0, 1, 2, ..., r-1, r, r+1, ..., r+s<br />

continued such that the last member is a multiple of r+s, for example<br />

nr+ns, the new set is:<br />

0, 1, 2, ..., nr-n, ..., nr, ..., nr+n, ..., nr+ns.<br />

With increasing n, the number of terms between nr and nr+n (or nr-n)<br />

increases, as does the number of terms between nr+n and nr+ns, and<br />

between nr-n and 0. No matter how large n is, the number of terms greater<br />

than nr+n will not exceed s-1 times the number of terms between nr and<br />

nr+n. The number of terms below nr-n will not exceed r-1 times the<br />

number of terms between nr-n and nr.<br />

Lemma 2<br />

If r+s is raised to an exponent, then the expansion will have one more<br />

term than the exponent.<br />

Lemma 3<br />

In the expansion of the binomial r+s with exponent an integral multiple<br />

of r+s=t, for example n(r+s)=nt, then first, there is a term M, the largest<br />

value of the terms, if the number of terms before and after M are in the<br />

proportion s to r ... the closer terms to M on the left or right are larger<br />

than the more distant terms. Secondly, the ratio of M to a term closer to<br />

it is smaller than the ratio of that closer term to one more distant,<br />

provided the number of intermediate terms is the same.<br />

Lemma 4<br />

In the expansion of a binomial with exponent nt, n can be made so large<br />

that the ratio of the largest term M to other terms Ln and Rn which are n<br />

terms to the left or right from M, can be made arbitrarily large.<br />

The proofs for Lemmas 3 and 4 are detailed in L.E. Maistrov’s book Probability Theory – A Historical<br />

Sketch1. In both cases, terms of the binomial expansion are divided by one another, and algebraic<br />

simplification yields the required limiting results.<br />
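Lemma 4 can be checked numerically for small cases. The following is only a sketch under stated assumptions: the function name `lemma4_ratios` is hypothetical, the terms of the expansion are indexed by the power of s, and the largest term M is taken to sit at the index ns, as Lemma 3 describes.

```python
from math import comb

def lemma4_ratios(r, s, n):
    """Return (M / L_n, M / R_n) for the expansion of (r + s)**(n * (r + s)).

    Terms are indexed by k, the power of s: r**N, N * r**(N-1) * s, ...
    Assumption: the largest term M is the one with k = n * s (Lemma 3).
    """
    N = n * (r + s)

    def term(k):
        # k-th term of the binomial expansion (exact integer arithmetic)
        return comb(N, k) * r ** (N - k) * s ** k

    M = term(n * s)
    # L_n and R_n lie n places to the left and right of M
    return M / term(n * s - n), M / term(n * s + n)

# Both ratios grow as n increases, as Lemma 4 asserts
for n in (1, 5, 20, 50):
    print(n, lemma4_ratios(3, 2, n))
```

The choice r = 3, s = 2 is arbitrary; any positive integers can be used, although the integers involved grow quickly with n.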

Lemma 5<br />

In the expansion of a binomial with exponent nt, n may be selected so<br />

large that the ratio of the sum of all terms from the largest term M to the<br />

terms Ln and Rn to the sum of the remaining terms, may be made<br />

arbitrarily large.<br />

1 Probability Theory – A Historical Sketch, pages 72 and 73, by Leonid E. Maistrov, Academic Press Inc., New<br />

York 1974.<br />


Proof of Lemma 5<br />

According to Lemma 4, as n becomes infinitely large, M / Ln becomes<br />

infinite, then the ratios L1 / Ln+1 , L2 / Ln+2, L3 / Ln+3 become all the more<br />

infinite. So it then follows that:<br />

$$\lim_{n\to\infty} \frac{L_1 + L_2 + L_3 + \dots + L_n}{L_{n+1} + L_{n+2} + L_{n+3} + \dots + L_{2n}} = \infty$$<br />

that is, the sum of the terms between M and Ln is infinitely greater than<br />

the sum of the terms left of Ln. Since by Lemma 1 the number of terms<br />

left of Ln exceeds the terms between Ln and M by only (s-1) times (that is,<br />

a finite number of times), and then from Lemma 3 the terms become<br />

smaller more distant from Ln, then (the sum of) all the terms between Ln<br />

and M (even if M is not included) will be infinitely larger than (the sum)<br />

left of Ln. In the same way ... [for the right side]<br />

The Proposition is then stated as:<br />

Let the number of favourable cases to the number of unfavourable cases be exactly or<br />

nearly r/s, therefore to all the cases as r/(r+s) = r/t (if r+s = t); this last ratio is between<br />

(r+1)/t and (r-1)/t. We can show that so many observations can be taken that it becomes<br />

arbitrarily many times (for example, c times) more probable that the ratio of favourable to all<br />

observations lies within the range with boundaries (r+1)/t and (r-1)/t than outside it.<br />

Bernoulli observes that given a probability r/t for a favourable outcome, and a probability s/t for an<br />

unfavourable outcome, in nt trials (with t = r+s), the numbers of (possible) cases with all favourable<br />

outcomes, all but one favourable outcome, all but two favourable outcomes, etc. are<br />

$$r^{nt}, \quad \binom{nt}{1} r^{nt-1} s, \quad \binom{nt}{2} r^{nt-2} s^{2}, \quad \dots$$<br />

These correspond to the terms in the expansion of $(r+s)^{nt}$, for which a number of useful properties were<br />

established in the lemmas. First, the number of cases with nr favourable outcomes and ns unfavourable<br />

outcomes is M. Next, the number of cases with at least nr-n and at most nr+n favourable outcomes is the<br />

sum of terms between the two limits Ln and Rn defined in Lemma 4. Bernoulli can then write:<br />

Since the binomial exponent can be selected so large that the sum of<br />

terms which are between both bounds Ln and Rn is more than c times the<br />

sum of all the remaining terms outside of these bounds, (from Lemmas 4<br />

and 5), it follows then that the number of observations can be taken so<br />

large that the number of trials in which the ratio of the number of<br />

favourable cases to all cases will not cross over the bounds (nr+n)/nt<br />

and (nr-n)/nt or (r+1)/t and (r-1)/t, is more than c times the remaining<br />

cases, that is, that it is more than c times probable that the ratio of the<br />

number of favourable to all cases does not cross over the bounds (r+1)/t<br />

and (r-1)/t.<br />
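The statement can be illustrated by exact counting for small r and s. This is a sketch rather than Bernoulli's own procedure (he worked with bounds, not exact sums); the names `inside_outside_ratio` and `trials_needed` are hypothetical, and cases are indexed by the number of unfavourable outcomes.

```python
from fractions import Fraction
from math import comb

def inside_outside_ratio(r, s, n):
    """Ratio of the number of cases with between nr-n and nr+n favourable
    outcomes to the number of remaining cases, in nt = n*(r+s) trials."""
    N = n * (r + s)
    # terms[k] = number of cases with k unfavourable outcomes
    terms = [comb(N, k) * r ** (N - k) * s ** k for k in range(N + 1)]
    inside = sum(terms[n * s - n : n * s + n + 1])  # between L_n and R_n
    outside = sum(terms) - inside
    return Fraction(inside, outside)

def trials_needed(r, s, c):
    """Smallest nt (by direct search) for which the ratio exceeds c."""
    n = 1
    while inside_outside_ratio(r, s, n) <= c:
        n += 1
    return n * (r + s)

for n in (1, 2, 10, 40):
    print(n, float(inside_outside_ratio(3, 2, n)))
```

A direct search of this kind is practical only for small r, s and c, since the number of terms grows with nt.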

James Bernoulli’s solution is apparently the first proof of the Law of Large Numbers, which informally is<br />

stated as:<br />

The law which states that the larger a sample, the nearer its mean is to<br />

that of the parent population from which the sample is drawn.<br />


Review 1 of Abraham de Moivre’s A Method of approximating the Sum of Terms of the Binomial<br />

(a+b)^n … From The Doctrine of Chances<br />

1. Biographical Notes<br />

According to Isaac Todhunter in his text, A History of the Mathematical Theory of Probability2:<br />

“Abraham de Moivre was born at Vitri, in Champagne, in 1667. On<br />

account of the revocation of the edict of Nantes3, in 1685, he took shelter<br />

in England, where he supported himself by giving instruction in<br />

mathematics and answers to questions relating to chances and annuities.<br />

He died at London in 1754…De Moivre was elected a Fellow of the<br />

Royal Society in 1697…It is recorded that Newton himself, in the later<br />

years of his life, used to reply to inquirers respecting mathematics in<br />

these words: ‘Go to Mr. De Moivre, he knows these things better than I<br />

do’…”<br />

De Moivre is well known for the theorem:<br />

$$[\cos(\theta) + i\sin(\theta)]^{n} = \cos(n\theta) + i\sin(n\theta)$$<br />

2. Review of the article<br />

This review relates to a supplementary article entitled A Method of approximating the Sum of the Terms of<br />

the Binomial (a+b)^n expanded into a Series, from whence are deduced some practical Rules to estimate<br />

the Degree of Assent which is to be given to Experiments (referred to as the Approximatio), which appears<br />

in later editions (after 1733) of Abraham De Moivre’s text on probabilities, The Doctrine of Chances, first<br />

published in 1718.<br />

De Moivre’s mathematical presentation in the Approximatio begins with a discussion relating to<br />

approximating the ratio of the middle term of the binomial (1+1) raised to very large n, to the sum of all<br />

terms (2^n). It is indicated that this approximation was developed several years earlier. As a result of<br />

contributions from James Stirling, it was found that the approximate ratio could be written as<br />

$$\frac{2}{\sqrt{nc}}$$<br />

where c is the circumference of a circle with radius equal to 1. The value of c is then 2π.<br />

De Moivre next states:<br />

1 Submitted for STA 4000H under the direction of Professor Jeffrey Rosenthal.<br />

2 A History of the Mathematical Theory of Probability, page 78, by Isaac Todhunter, Chelsea Publishing Co., New<br />

York (1965 unaltered reprint of the First Edition, Cambridge 1865).<br />

3 The Edict of Nantes was a proclamation by King Henry IV of France and Navarre, guaranteeing civil and religious<br />

rights to the Huguenots.<br />


“…the Logarithm of the Ratio which the middle term of a high Power<br />

has to any Term distant from it by an Interval denoted l, would be<br />

denoted by a very near approximation, (supposing m = ½ n) by the<br />

Quantities<br />

$$(m+l-\tfrac{1}{2})\log(m+l-1) + (m-l+\tfrac{1}{2})\log(m-l+1) - 2m\log m + \log\frac{m+l}{m}.$$ ”<br />

De Moivre does not provide details on the derivation of the above formulae. Anders Hald describes their<br />

derivation in his book A History of Probability and Statistics and Their Applications before 1750 1.<br />

De Moivre then presents Corollary 1:<br />

“This being admitted, I conclude, that if m or ½ n be a Quantity<br />

infinitely great, then the Logarithm of the Ratio, which a Term distant<br />

from the middle by the Interval l, has to the middle Term, is –2ll/n.”<br />

Again, the derivation is not shown. It follows on noting that:<br />

$$(m+l-\tfrac{1}{2})\log(m+l-1) + (m-l+\tfrac{1}{2})\log(m-l+1) - 2m\log m + \log\frac{m+l}{m}$$<br />

is equivalent to<br />

$$(m+l-\tfrac{1}{2})\log(m+l-1) - (m+l-\tfrac{1}{2})\log m + (m-l+\tfrac{1}{2})\log(m-l+1) - (m-l+\tfrac{1}{2})\log m + \log\frac{m+l}{m}.$$<br />

Then approximating log(m+l-1) by log(m+l) and log(m-l+1) by log(m-l), for large m, and re-writing the<br />

above terms using the properties of logarithms, we have<br />

$$(m+l-\tfrac{1}{2})\log\frac{m+l}{m} + (m-l+\tfrac{1}{2})\log\frac{m-l}{m} + \log\frac{m+l}{m}.$$<br />

Recalling that log(1+x) = x – x²/2 + x³/3 - … when -1 < x ≤ 1, then the above can be approximated, for l<br />

less than the square root of n, by<br />

$$(m+l-\tfrac{1}{2})\left(\frac{l}{m} - \frac{l^{2}}{2m^{2}}\right) + (m-l+\tfrac{1}{2})\left(-\frac{l}{m} - \frac{l^{2}}{2m^{2}}\right) + \frac{l}{m} - \frac{l^{2}}{2m^{2}} = \frac{l^{2}}{m} - \frac{l^{2}}{2m^{2}} \approx \frac{2ll}{n},$$<br />

since m = n/2 and the term l²/2m² is negligible for large m.<br />

Then the approximation to the inverse ratio’s logarithm is –2ll/n.<br />
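The approximation in Corollary 1 can be compared with exact binomial coefficients. The sketch below assumes n is even, so that a single middle term exists; the function names are hypothetical.

```python
from math import comb, log

def log_ratio_exact(n, l):
    """Natural log of (term l places from the middle) / (middle term)
    in the expansion of (1+1)**n, for even n."""
    m = n // 2
    return log(comb(n, m + l) / comb(n, m))

def log_ratio_approx(n, l):
    """De Moivre's approximation -2ll/n."""
    return -2 * l * l / n

n = 10_000
for l in (10, 50, 100):
    print(l, log_ratio_exact(n, l), log_ratio_approx(n, l))
```

For n = 10,000 the two values agree closely when l is of the order of √n, which is the range the derivation assumes.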

In Corollary 2, De Moivre notes that the number with “hyperbolic logarithm” (natural logarithm) -2ll/n is:<br />

$$1 - \frac{2ll}{n} + \frac{4l^{4}}{2nn} - \frac{8l^{6}}{6n^{3}} + \dots$$<br />

This is the series for $e^{-2ll/n}$, which then approximates the ratio of a term l terms distant from the middle<br />

term, to the middle term. If we represent the middle term by T0, and terms 1, 2, …, l places distant by T1,<br />

T2, … Tl, then to this point De Moivre has obtained:<br />

$$T_0 \approx \frac{2}{\sqrt{2\pi n}} \cdot 2^{n}$$<br />

1 Published by John Wiley & Sons, 1990, pages 473 to 476.<br />


and<br />

$$\log\left(\frac{T_l}{T_0}\right) \approx -\frac{2ll}{n}$$<br />

Then $T_l = T_0 e^{-2ll/n}$, or as De Moivre would write:<br />

$$T_l = T_0\left(1 - \frac{2ll}{n} + \frac{4l^{4}}{2nn} - \frac{8l^{6}}{6n^{3}} + \dots\right)$$<br />

Note that<br />

$$T_0 \approx \frac{2}{\sqrt{2\pi n}} \cdot 2^{n}$$<br />

for a binomial (1+1) to the exponent n very large. If we consider the binomial (1/2+1/2)^n, which is 1^n = 1,<br />

the middle term would be<br />

$$\binom{n}{n/2}\left(\frac{1}{2}\right)^{n}$$<br />

Since in the binomial (1+1)^n, the middle term is<br />

$$\binom{n}{n/2}$$<br />

then this is equivalent to T0, and we can write<br />

$$\frac{2}{\sqrt{2\pi n}} \cdot 2^{n} \cdot 2^{-n} = \frac{2}{\sqrt{2\pi n}}$$<br />

for the middle term of the binomial (1/2+1/2)^n.<br />

Using the previously defined symbols T0, T1,…, Tl, then sums of terms between the middle term and one l<br />

places distant can be obtained as<br />

$$T_0 + T_1 + T_2 + \dots + T_l = T_0\left(1 + e^{-2\cdot 1^{2}/n} + e^{-2\cdot 2^{2}/n} + \dots + e^{-2\cdot l^{2}/n}\right)$$<br />
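Both approximations above, for the middle term and for the sum of terms, can be set against exact values for the binomial (1/2+1/2)^n. This is a sketch assuming even n; all function names are hypothetical.

```python
from math import comb, exp, pi, sqrt

def middle_term_exact(n):
    """Middle term of (1/2 + 1/2)**n for even n."""
    return comb(n, n // 2) / 2 ** n

def middle_term_approx(n):
    """The approximation 2 / sqrt(2*pi*n)."""
    return 2 / sqrt(2 * pi * n)

def sum_exact(n, l):
    """T_0 + T_1 + ... + T_l, computed from exact binomial coefficients."""
    m = n // 2
    return sum(comb(n, m + k) for k in range(l + 1)) / 2 ** n

def sum_approx(n, l):
    """T_0 * (1 + exp(-2*1^2/n) + ... + exp(-2*l^2/n)) with approximate T_0."""
    return middle_term_approx(n) * sum(exp(-2 * k * k / n) for k in range(l + 1))

n = 1000
print(middle_term_exact(n), middle_term_approx(n))
print(sum_exact(n, 15), sum_approx(n, 15))
```

The choices n = 1000 and l = 15 (close to ½√n) are illustrative only.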


This is essentially what De Moivre does in Corollary 2. Using the “hyperbolic logarithm” series<br />

expansions for each of the terms, he develops a sum of the binomial terms from the middle, to a term l<br />

places distant for the case of a binomial (1/2 + 1/2)^n:<br />

$$\frac{2}{\sqrt{2\pi n}} \cdot \left(l - \frac{2l^{3}}{1\cdot 3n} + \frac{4l^{5}}{2\cdot 5n^{2}} - \dots\right)$$<br />

Setting l = s√n, with s = 1/2, he gets<br />

$$\frac{2}{\sqrt{2\pi}} \cdot \left(\frac{1}{2} - \frac{1}{3\cdot 4} + \frac{1}{2\cdot 5\cdot 8} - \dots\right)$$<br />

which he observes converges very quickly, and after using a few terms obtains the estimate 0.341344 for<br />

the sum of terms from the middle term to a term which is about (1/2)√n terms distant. Having obtained<br />

this result, De Moivre states in Corollary 3:<br />

“And therefore, if it was possible to take an infinite number of<br />

Experiments, the Probability that an Event which has an equal number of<br />

Chances to happen or fail, shall neither appear more frequently than<br />

½ n + ½ √n times, nor more rarely than ½ n – ½ √n times, will be<br />

expressed by the double Sum of the number exhibited in the second<br />

Corollary, that is by 0.682688…”<br />
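The figures 0.341344 and 0.682688 can be reproduced by summing the series of Corollary 2 with s = 1/2. The sketch below assumes the general term of the series is (-2)^k s^(2k+1) / (k! (2k+1)), which matches the three terms shown; the function name is hypothetical. The doubled sum is also compared with the modern value erf(1/√2), the probability that a normal variable lies within one standard deviation of its mean.

```python
from math import erf, factorial, pi, sqrt

def de_moivre_series(s=0.5, terms=10):
    """2/sqrt(2*pi) * (s - 2s^3/(1*3) + 4s^5/(2*5) - ...).

    Assumed general term: (-2)**k * s**(2k+1) / (k! * (2k+1)).
    """
    total = sum(
        (-2) ** k * s ** (2 * k + 1) / (factorial(k) * (2 * k + 1))
        for k in range(terms)
    )
    return 2 / sqrt(2 * pi) * total

half = de_moivre_series()
print(half)                 # compare with De Moivre's 0.341344
print(2 * half)             # compare with De Moivre's 0.682688
print(erf(1 / sqrt(2)))     # modern one-standard-deviation probability
```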

We have here, in 1733, a result about what would later be called a “normal distribution”. In his book, The<br />

Life and Times of the Central Limit Theorem1, William Adams states:<br />

“De Moivre did not name √n/2, which is what we would today call<br />

standard deviation within the context considered, but in Corollary 6 he<br />

referred to √n as the Modulus by which we are to regulate our<br />

estimation.”<br />

De Moivre generalizes the results for the binomial (a+b)^n, obtaining, as William Adams indicates in<br />

modern notation,<br />

$$T_l = T_0\, e^{-(a+b)^{2} l^{2} / 2abn}$$<br />

using the symbols Tl and T0 defined earlier.<br />
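The general formula can be compared with exact terms of (a+b)^n. In this sketch the function names are hypothetical, T0 is taken to be the largest term, and n is assumed to be a multiple of a+b so that the largest term has an integer index.

```python
from math import comb, exp

def term_ratio_exact(a, b, n, l):
    """T_l / T_0 in the expansion of (a+b)**n, with T_0 the largest term.

    Assumption: n is a multiple of a+b, so the largest term is the one
    containing b to the power n*b/(a+b).
    """
    k0 = n * b // (a + b)

    def term(k):
        return comb(n, k) * a ** (n - k) * b ** k

    return term(k0 + l) / term(k0)

def term_ratio_approx(a, b, n, l):
    """exp(-(a+b)^2 l^2 / (2abn)), the formula in modern notation."""
    return exp(-((a + b) ** 2) * l * l / (2 * a * b * n))

print(term_ratio_exact(3, 2, 5000, 30), term_ratio_approx(3, 2, 5000, 30))
print(term_ratio_exact(1, 1, 5000, 30), term_ratio_approx(1, 1, 5000, 30))
```

With a = b = 1 the exponent reduces to -2ll/n, the symmetric case treated earlier.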

De Moivre’s intention was to develop a method for approximating binomial sums or probabilities when<br />

the number of trials was very large. He was able to do this with the aid of mathematics relating to series<br />

expansions for logarithms and exponentials, as well as approximation methods for factorials. The<br />

approximation itself is a normal distribution. In addition to being an early occurrence of this distribution,<br />

for practical application, the approximation is an early illustration of a central limit theorem. For large n,<br />

the terms around the middle term (average value) follow a normal distribution, and deviations from the<br />

middle are measured on a scale proportional to √n.<br />

1 Kaedmon Publishing Company, New York 1974, page 24.<br />


Review 1 of Friedrich Robert Helmert’s The Calculation of the Probable Error from the Squares of<br />

the Adjusted Direct Observations of Equal Precision and Fechner’s Formula2<br />

1. Biographical Notes<br />

Friedrich Robert Helmert was born at Freiberg in 1843. He studied engineering sciences at the technical<br />

university in Dresden. He was a lecturer and professor of geodesy at the technical university in<br />

Aachen, where he wrote "The Mathematical and Physical Theories of Higher Geodesy" (2 volumes,<br />

Leipzig 1884). In 1886 Helmert became the director of the Prussian Geodetic Institute and the<br />

International Earth Measurement Central Office, as well as professor at the university in Berlin. He died<br />

at Potsdam in 1917.<br />

2. Review of The Calculation of Probable Error…<br />

In this 1876 article by F.R. Helmert, there is apparently for the first time 3 a demonstration that, given<br />

X1,…,Xn independent N(µ,σ²) random variables, then<br />

$$\sum_{i=1}^{n} \frac{(X_i - \bar{X})^{2}}{\sigma^{2}}$$<br />

is distributed as (what would be called) a chi-square distribution with n-1 degrees of freedom. The<br />

present review is concerned primarily with how Helmert effectively shows this, although his purpose was<br />

to apply the result to the calculation of the mean squared error for σ̂, in order to estimate “the probable<br />

error” 4 .<br />
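The distributional claim can be illustrated by simulation. The sketch below is not Helmert's argument: it only checks that the simulated mean and variance of the statistic are close to n-1 and 2(n-1), the mean and variance of a chi-square distribution with n-1 degrees of freedom. The function names, the seed and the parameter values are arbitrary assumptions.

```python
import random
from statistics import fmean, pvariance

def scaled_sum_of_squares(n, mu, sigma, rng):
    """One draw of sum((X_i - Xbar)^2) / sigma^2 for n independent N(mu, sigma^2)."""
    xs = [rng.gauss(mu, sigma) for _ in range(n)]
    xbar = fmean(xs)
    return sum((x - xbar) ** 2 for x in xs) / sigma ** 2

def simulate(n=5, mu=10.0, sigma=2.0, reps=20_000, seed=1):
    """Return (mean, variance) of the statistic over `reps` simulated samples."""
    rng = random.Random(seed)
    draws = [scaled_sum_of_squares(n, mu, sigma, rng) for _ in range(reps)]
    return fmean(draws), pvariance(draws)

mean, var = simulate()
print(mean, var)  # to be compared with n - 1 = 4 and 2(n - 1) = 8
```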

To begin, Helmert considers e1,…,en, the “true errors” of a set of observations X1,…,Xn. The true errors<br />

are defined as ei = Xi – µ, where µ is the (true) mean for the population from which the Xi are observed.<br />

The joint probability “volume” (referred to as the “future probability” 5 ) of the ei, given that Xi ~ N(µ,σ²)<br />

is presented as:<br />

$$\left[\frac{h}{\sqrt{\pi}}\right]^{n} e^{-h^{2}[ee]}\, de_1 \cdots de_n$$<br />

where h (referred to as “the precision”) is equal to 1/(σ√2) and [ee] is the sum of squares of the ei:<br />

$e_1^{2} + e_2^{2} + \dots + e_n^{2}$.<br />

1 Submitted for STA 4000H under the direction of Professor Jeffrey Rosenthal.<br />

2 An abridged version of the article is found in Annotated Readings in the History of Statistics pages 109 to 113, by<br />

H.A. David and A.W.F. Edwards (Springer-Verlag, New York 2001). The section relating to Fechner’s formula has<br />

been deleted.<br />

3 Refer to Annotated Readings in the History of Statistics page 103, by H.A. David and A.W.F. Edwards<br />

(Springer-Verlag, New York 2001).<br />

4 Defined as $\sigma\Phi^{-1}(0.75)$, where Φ is the standard normal distribution function. See Annotated Readings, page 103.<br />

5 The likelihood function from the set of observations.<br />


Helmert notes that [ee] is not known, since the parameter µ is not known (assuming that only sampling of<br />

the population is possible or feasible).<br />

As the true mean can only be estimated by X̄, the true errors are estimated by Xi – X̄, written λi by<br />

Helmert, and referred to as the “deviations” (from the arithmetic mean of the sample). Noting that λ1 + λ2<br />

+ … + λn = 0, then λn = -λ1 – λ2 - … - λn-1, and with ē = X̄ - µ, the true errors are related to the deviations<br />

as:<br />

e1 = λ1 + ē,<br />

e2 = λ2 + ē,<br />

…<br />

en-1 = λn-1 + ē,<br />

en = -λ1-…- λn-1 + ē<br />

Helmert identifies the following matrix with the above transformations:<br />

$$\begin{pmatrix}
1 & 0 & 0 & \cdots & 0 & 1 \\
0 & 1 & 0 & \cdots & 0 & 1 \\
0 & 0 & 1 & \cdots & 0 & 1 \\
\vdots & & & \ddots & & \vdots \\
0 & 0 & 0 & \cdots & 1 & 1 \\
-1 & -1 & -1 & \cdots & -1 & 1
\end{pmatrix}$$<br />

which will be referred to as H. In matrix equation form, the transformation may be written:<br />

$$\begin{pmatrix} e_1 \\ e_2 \\ e_3 \\ \vdots \\ e_{n-1} \\ e_n \end{pmatrix} =
\begin{pmatrix}
1 & 0 & 0 & \cdots & 0 & 1 \\
0 & 1 & 0 & \cdots & 0 & 1 \\
0 & 0 & 1 & \cdots & 0 & 1 \\
\vdots & & & \ddots & & \vdots \\
0 & 0 & 0 & \cdots & 1 & 1 \\
-1 & -1 & -1 & \cdots & -1 & 1
\end{pmatrix}
\begin{pmatrix} \lambda_1 \\ \lambda_2 \\ \lambda_3 \\ \vdots \\ \lambda_{n-1} \\ \bar{e} \end{pmatrix}$$<br />

The determinant of the matrix H is n. This can be shown using two properties relating to determinants:<br />

1) If a matrix M is formed by adding a multiple of one column to another<br />

column in H, then the determinant of M equals that of H.<br />

2) The determinant of a triangular matrix is equal to the product of<br />

diagonal elements.<br />

New matrices can be formed by consecutively adding –1 times columns 1 to n-1, to the last column,<br />

resulting in:<br />

$$\begin{pmatrix}
1 & 0 & 0 & \cdots & 0 & 0 \\
0 & 1 & 0 & \cdots & 0 & 0 \\
0 & 0 & 1 & \cdots & 0 & 0 \\
\vdots & & & \ddots & & \vdots \\
0 & 0 & 0 & \cdots & 1 & 0 \\
-1 & -1 & -1 & \cdots & -1 & n
\end{pmatrix}$$<br />

The determinant is then 1 · 1 · … · n = n.<br />
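The value of the determinant can be confirmed for small n by exact elimination. This sketch uses ordinary Gaussian elimination over fractions rather than the column operations described above; the function names `helmert_error_matrix` and `determinant` are hypothetical.

```python
from fractions import Fraction

def helmert_error_matrix(n):
    """The n x n matrix H: identity rows with a final column of ones,
    and a last row of -1, ..., -1, 1."""
    rows = [[1 if j == i else 0 for j in range(n - 1)] + [1] for i in range(n - 1)]
    rows.append([-1] * (n - 1) + [1])
    return rows

def determinant(matrix):
    """Exact determinant by Gaussian elimination with row swaps."""
    a = [[Fraction(x) for x in row] for row in matrix]
    size = len(a)
    det = Fraction(1)
    for c in range(size):
        # find a row at or below the diagonal with a non-zero entry in column c
        pivot = next((r for r in range(c, size) if a[r][c] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != c:
            a[c], a[pivot] = a[pivot], a[c]
            det = -det  # a row swap changes the sign
        det *= a[c][c]
        for r in range(c + 1, size):
            factor = a[r][c] / a[c][c]
            for k in range(c, size):
                a[r][k] -= factor * a[c][k]
    return det

for n in range(2, 8):
    print(n, determinant(helmert_error_matrix(n)))
```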

Observing that the Jacobian of the transformation of variables is equivalent to the determinant of H, the<br />

joint probability for the change of variables becomes:<br />

$$n\left[\frac{h}{\sqrt{\pi}}\right]^{n} e^{-h^{2}[\lambda\lambda] - h^{2} n \bar{e}^{2}}\, d\lambda_1 \cdots d\lambda_{n-1}\, d\bar{e}$$<br />

where [λλ] is the sum of squares λ1² + … + λn², and [ee] = [λλ] + nē² because the λi sum to zero.<br />
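The change of variables rests on splitting the sum of squared true errors into the sum of squared deviations plus n times the squared error of the mean. A small exact check of that identity is sketched below; the function name and the sample values are arbitrary assumptions.

```python
from fractions import Fraction

def decompose(xs, mu):
    """Return ([ee], [λλ], n * ē²) for observations xs and true mean mu."""
    n = len(xs)
    xbar = sum(xs) / n
    e = [x - mu for x in xs]          # true errors
    lam = [x - xbar for x in xs]      # deviations from the sample mean
    ebar = xbar - mu                  # error of the sample mean
    ee = sum(v * v for v in e)
    ll = sum(v * v for v in lam)
    return ee, ll, n * ebar ** 2

ee, ll, nebar2 = decompose([Fraction(1), Fraction(4), Fraction(6), Fraction(9)], Fraction(3))
print(ee, ll, nebar2, ee == ll + nebar2)
```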

Helmert then notes that integrating the above expression over all possible values of ē results in the<br />

probability of the set λ1,…,λn :<br />

$$\sqrt{n}\left[\frac{h}{\sqrt{\pi}}\right]^{n-1} e^{-h^{2}[\lambda\lambda]}\, d\lambda_1 \cdots d\lambda_{n-1}$$<br />

since the integral of $e^{-h^{2} n \bar{e}^{2}}$ over all ē equals $\sqrt{\pi}/(h\sqrt{n})$. Then<br />

$$\sqrt{n}\left[\frac{h}{\sqrt{\pi}}\right]^{n-1} \int\!\cdots\!\int e^{-h^{2}[\lambda\lambda]}\, d\lambda_1 \cdots d\lambda_{n-1}$$<br />

is the probability that [λλ] lies between values u and u+du, the integral being taken over all λ1,…,λn-1 for<br />

which [λλ] lies in that interval.<br />

Next, a transformation is devised for n-1 new variables \(t_i\), i = 1,…,n-1, such that [tt] is equivalent to the sum of squares of n-1 true errors. The transformation, in matrix form, is given by:<br />

\[
\begin{pmatrix} t_1 \\ t_2 \\ t_3 \\ \vdots \\ t_{n-1} \end{pmatrix}
=
\begin{pmatrix}
\sqrt{2} & \frac{\sqrt{2}}{2} & \frac{\sqrt{2}}{2} & \cdots & \frac{\sqrt{2}}{2} \\
0 & \sqrt{\frac{3}{2}} & \sqrt{\frac{3}{2}} \cdot \frac{1}{3} & \cdots & \sqrt{\frac{3}{2}} \cdot \frac{1}{3} \\
0 & 0 & \sqrt{\frac{4}{3}} & \cdots & \sqrt{\frac{4}{3}} \cdot \frac{1}{4} \\
\vdots & & & \ddots & \vdots \\
0 & 0 & 0 & \cdots & \sqrt{\frac{n}{n-1}}
\end{pmatrix}
\begin{pmatrix} \lambda_1 \\ \lambda_2 \\ \lambda_3 \\ \vdots \\ \lambda_{n-1} \end{pmatrix}
\]

To illustrate this transformation, consider the two variable case, using:<br />

\[
t_1 = \sqrt{2} \left( \lambda_1 + \tfrac{1}{2} \lambda_2 \right), \qquad t_2 = \sqrt{\tfrac{3}{2}} \, \lambda_2
\]

Then<br />

\[
t_1^2 + t_2^2 = \lambda_1^2 + \lambda_2^2 + \lambda_1^2 + 2\lambda_1\lambda_2 + \lambda_2^2 = \lambda_1^2 + \lambda_2^2 + \lambda_3^2 ,
\]

since \(\lambda_3 = -\lambda_1 - \lambda_2\).<br />
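The identity [tt] = [λλ] can also be checked numerically for more than two variables. The sketch below assumes that row i of the matrix is t_i = √((i+1)/i) · (λ_i + (λ_{i+1} + … + λ_{n-1})/(i+1)), which is how the garbled matrix has been read here; `helmert_t` and `sum_sq` are hypothetical helper names.

```python
import math
import random


def helmert_t(lam):
    """Map lambda_1..lambda_{n-1} to t_1..t_{n-1}.

    Assumed row i (1-based):
    t_i = sqrt((i+1)/i) * (lambda_i + (lambda_{i+1} + ... + lambda_{n-1}) / (i+1)).
    """
    t = []
    for i in range(len(lam)):
        k = i + 2                       # k = (1-based row index) + 1
        tail = sum(lam[i + 1:]) / k     # remaining lambdas, each weighted 1/k
        t.append(math.sqrt(k / (k - 1)) * (lam[i] + tail))
    return t


def sum_sq(values):
    """Sum of squares, written [vv] in Gauss's bracket notation."""
    return sum(v * v for v in values)


rng = random.Random(0)
lam = [rng.uniform(-1.0, 1.0) for _ in range(6)]   # lambda_1..lambda_6, so n = 7
lam_n = -sum(lam)                                  # the n deviations sum to zero

tt = sum_sq(helmert_t(lam))                        # [tt], n-1 terms
ll = sum_sq(lam + [lam_n])                         # [lambda lambda], n terms
print(tt, ll, math.isclose(tt, ll, rel_tol=1e-9))
```

The n-1 transformed variables carry the same sum of squares as all n deviations, which is what lets the result for n-1 true errors be applied to [λλ].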

The determinant of the transformation is √n, noting that the associated matrix is upper triangular, with product of the diagonal terms:<br />

\[
\sqrt{2} \cdot \sqrt{\frac{3}{2}} \cdot \sqrt{\frac{4}{3}} \cdots \sqrt{\frac{n}{n-1}}
= \sqrt{2} \cdot \frac{1}{\sqrt{2}} \cdot \sqrt{3} \cdot \frac{1}{\sqrt{3}} \cdots \sqrt{n-1} \cdot \frac{1}{\sqrt{n-1}} \cdot \sqrt{n}
= \sqrt{n}
\]
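The telescoping product can be confirmed for a few values of n. This is a minimal check, with `diagonal_product` a hypothetical helper name.

```python
import math


def diagonal_product(n):
    """Product of sqrt(k / (k - 1)) for k = 2..n, the diagonal of the matrix."""
    prod = 1.0
    for k in range(2, n + 1):
        prod *= math.sqrt(k / (k - 1))
    return prod


for n in (2, 5, 10, 100):
    print(n, diagonal_product(n), math.sqrt(n))
```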

Then the probability that [tt] is between u and u + du is given by<br />

\[
\left[ \frac{h}{\sqrt{\pi}} \right]^{n-1} \int \cdots \int e^{-h^{2}[tt]} \, dt_{1} \cdots dt_{n-1}
\]

Helmert then refers to a result he obtained in 1875: the probability that the sum of squares [tt] of n-1 true errors equals u is given by:<br />

\[
\frac{h^{n-1}}{\Gamma\left( \frac{n-1}{2} \right)} \cdot u^{\frac{n-3}{2}} \cdot e^{-h^{2}u} \, du
\]


where again, h is “the precision”. Since [tt] = [λλ], the density applies to the sum of squares of n deviations. The above density is Gamma(\(\frac{n-1}{2}\), h) for variable hu. This can be seen by recalling that if a random variable v ~ Gamma(α,β), then the probability density function can be written as:<br />

\[
\frac{\beta^{\alpha}}{\Gamma(\alpha)} \, v^{\alpha-1} e^{-\beta v}
\]

Substituting hu for v, \(\frac{n-1}{2}\) for α, and h for β:<br />

\[
\begin{aligned}
\frac{h^{\frac{n-1}{2}}}{\Gamma\left( \frac{n-1}{2} \right)} \cdot (hu)^{\frac{n-3}{2}} \cdot e^{-h^{2}u}
&= \frac{h^{\left( \frac{n-1}{2} + \frac{n-3}{2} \right)}}{\Gamma\left( \frac{n-1}{2} \right)} \cdot u^{\frac{n-3}{2}} \cdot e^{-h^{2}u} \\
&= \frac{h^{n-2}}{\Gamma\left( \frac{n-1}{2} \right)} \cdot u^{\frac{n-3}{2}} \cdot e^{-h^{2}u}
\end{aligned}
\]

Then the probability associated with volume d(hu) is:<br />

\[
\frac{h^{n-2}}{\Gamma\left( \frac{n-1}{2} \right)} \cdot u^{\frac{n-3}{2}} \cdot e^{-h^{2}u} \, d(hu)
= \frac{h^{n-1}}{\Gamma\left( \frac{n-1}{2} \right)} \cdot u^{\frac{n-3}{2}} \cdot e^{-h^{2}u} \, du
\]

Since hu ~ Gamma(\(\frac{n-1}{2}\), h), u is distributed as a Gamma(\(\frac{n-1}{2}\), h) variable divided by h. For \(h = \frac{1}{\sigma\sqrt{2}}\), it can be shown that the probability density function for u is:<br />

\[
\frac{1}{2^{\frac{n-1}{2}} \, \Gamma\left( \frac{n-1}{2} \right) \sigma^{n-1}} \cdot u^{\frac{n-1}{2} - 1} \, e^{-\frac{u}{2\sigma^{2}}}
\]

Then \(\frac{u}{\sigma^{2}} \sim \chi^{2}_{n-1}\), so that \(u \sim \sigma^{2} \chi^{2}_{n-1}\). Recalling that u = [λλ], and that \([\lambda\lambda] = \sum_{i=1}^{n} (X_i - \bar{X})^{2}\), we have the result:<br />

\[
\frac{\sum_{i=1}^{n} (X_i - \bar{X})^{2}}{\sigma^{2}} \sim \chi^{2}_{n-1}
\]

Helmert does not assign a name to the distribution of the sum of squares [λλ]. His objective is to use the result to estimate the probable error.<br />

Also of interest in the article is the estimation related to the precision h (and thereby σ) in a maximum likelihood manner in section 2, such that:<br />

\[
\frac{1}{2h^{2}} = \sigma^{2} = \frac{[\lambda\lambda]}{n-1}
\]
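The result for the sum of squared deviations can be illustrated by simulation. The sketch below is not part of Helmert's argument. It draws normal samples with arbitrarily chosen values of n, µ and σ and compares the simulated mean and variance of Σ(X_i − X̄)²/σ² with n−1 and 2(n−1), the mean and variance of a chi-square variable with n−1 degrees of freedom. The function names are hypothetical.

```python
import random
import statistics


def scaled_sum_of_squares(n, mu, sigma, rng):
    """One draw of sum((X_i - X_bar)^2) / sigma^2 for a normal sample of size n."""
    xs = [rng.gauss(mu, sigma) for _ in range(n)]
    xbar = sum(xs) / n
    return sum((x - xbar) ** 2 for x in xs) / sigma ** 2


def simulate(n=5, mu=10.0, sigma=2.0, trials=20000, seed=1):
    """Return the simulated mean and variance of the scaled sum of squares."""
    rng = random.Random(seed)
    draws = [scaled_sum_of_squares(n, mu, sigma, rng) for _ in range(trials)]
    return statistics.mean(draws), statistics.variance(draws)


mean_u, var_u = simulate()
print("simulated mean:", mean_u, "expected:", 5 - 1)
print("simulated variance:", var_u, "expected:", 2 * (5 - 1))
```

Dividing the simulated mean by n−1 gives a value near 1, which agrees with [λλ]/(n−1) as an estimate of σ².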


Review 1 of R.A. Fisher’s Inverse Probability<br />

1. Biographical Notes<br />

Ronald Aylmer Fisher was born in London in 1890. He received scholarships to study mathematics, statistical mechanics and quantum theory at Cambridge University, where he also studied evolutionary theory and biometrics. After graduation, he worked for an investment company and taught mathematics and physics at public schools from 1915 to 1919. From 1919 to 1943, he was associated with the Rothamsted (Agricultural) Experimental Station, contributing to experimental design theory and the development of a Statistics Department. In 1943 he became professor of genetics at Cambridge, remaining there until his retirement in 1957. He died at Adelaide in Australia, in 1962.<br />

2. Review of Inverse Probability<br />

In this article, published in 1930, R.A. Fisher cautions against the application of prior probability densities for parameter estimation using inverse probability 2, when a priori knowledge of the distribution of the parameters is not available (e.g., from known frequency distributions). Fisher indicates in the first paragraph that the subject had been controversial for some time, suggesting that:<br />

“Bayes, who seems to have first attempted to apply the notion of probability, not only to effects in relation to their causes but also to causes in relation to their effects, invented a theory 3, and evidently doubted its soundness, for he did not publish it during his life.”<br />

Fisher describes the manner in which a (known) prior density can be used in the calculation of probabilities:<br />

“Suppose that we know that a population from which our observations were drawn had itself been drawn at random from a super-population…that the probability that \(\theta_1, \theta_2, \theta_3, \ldots\) shall lie in any defined infinitesimal range \(d\theta_1 \, d\theta_2 \, d\theta_3 \ldots\) is given by<br />

\[
dF = \Psi(\theta_1, \theta_2, \theta_3, \ldots) \, d\theta_1 \, d\theta_2 \, d\theta_3 \ldots,
\]

then the probability of successive events (a) drawing from the super-population a population with parameters having the particular values \(\theta_1, \theta_2, \theta_3, \ldots\) and (b) drawing from such a population the sample values \(x_1, \ldots, x_n\), will have a joint probability<br />

\[
\Psi(\theta_1, \theta_2, \theta_3, \ldots) \, d\theta_1 \, d\theta_2 \, d\theta_3 \ldots \times \prod_{p=1}^{n} \left\{ \varphi(x_p, \theta_1, \theta_2, \theta_3, \ldots) \, dx_p \right\}.
\]

If we integrate this over all possible values of \(\theta_1, \theta_2, \theta_3, \ldots\) and divide the original expression by the integral we shall then have a perfectly definite value for the probability…that \(\theta_1, \theta_2, \theta_3, \ldots\) shall lie in any assigned limits.”<br />

1 Submitted for STA 4000H under the direction of Professor Jeffrey Rosenthal.<br />

2 According to H.A. David and A.W.F. Edwards in Annotated Readings in the History of Statistics, Springer-Verlag 2001, page 189:<br />

“…“Inverse probability”…would not necessarily have been taken to refer exclusively to the Bayesian method (which in the paper Fisher calls “inverse probability strictly speaking”) but to the general problem of arguing “inversely” from sample to parameter…”<br />

3 In the mid-1700s.<br />

It is noted that this is a direct argument, which provides the frequency distribution of the population parameters θ. Fisher’s caution relates to cases in which the function Ψ is not known, and is then taken to be constant. He argues that this assumption is as arbitrary as any other, and leads to inconsistent results. While an example is not given in the Inverse Probability paper, it is helpful to consider an illustration provided elsewhere by Fisher, related by Anders Hald in A History of Mathematical Statistics From 1750 to 1930 1.<br />

Consider the posterior probability element:<br />

\[
P(\theta \mid a, n) \, d\theta \propto \theta^{a} (1-\theta)^{n-a} \, d\theta, \qquad 0 \le \theta \le 1
\]

Then if the parameter ς is defined by<br />

\[
\sin \varsigma = 2\theta - 1, \qquad -\tfrac{1}{2}\pi \le \varsigma \le \tfrac{1}{2}\pi
\]

such that<br />

\[
\varsigma = \arcsin(2\theta - 1)
\]

and ς is assumed to be uniformly distributed, the posterior probability element becomes:<br />

\[
P(\varsigma \mid a, n) \, d\varsigma \propto (1 + \sin \varsigma)^{a} (1 - \sin \varsigma)^{n-a} \, d\varsigma
\]

Since<br />

\[
\begin{aligned}
\frac{d\varsigma}{d\theta} &= \frac{d \arcsin(2\theta - 1)}{d\theta} \\
&= \frac{1}{\sqrt{1 - (2\theta - 1)^{2}}} \cdot \frac{d(2\theta - 1)}{d\theta} \\
&= \theta^{-\frac{1}{2}} (1-\theta)^{-\frac{1}{2}}
\end{aligned}
\]

it follows that:<br />

\[
d\varsigma = \theta^{-\frac{1}{2}} (1-\theta)^{-\frac{1}{2}} \, d\theta,
\]

\[
P(\varsigma \mid a, n) \, d\varsigma \propto \theta^{a - \frac{1}{2}} (1-\theta)^{n - a - \frac{1}{2}} \, d\theta .
\]

However, since under the assumptions we can show that \(P(\theta \mid a, n) \, d\theta \propto P(\varsigma \mid a, n) \, d\varsigma\), then<br />

\[
P(\varsigma \mid a, n) \, d\varsigma \propto \theta^{a - \frac{1}{2}} (1-\theta)^{n - a - \frac{1}{2}} \, d\theta
\]

is inconsistent with<br />

\[
P(\theta \mid a, n) \, d\theta \propto \theta^{a} (1-\theta)^{n-a} \, d\theta .
\]

1 A History of Mathematical Statistics From 1750 to 1930, John Wiley and Sons Inc., 1998, page 277.<br />
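The inconsistency can be made concrete with numbers. The sketch below is an illustration only: the values a = 3 and n = 10 are arbitrary, and `posterior_mean` is a hypothetical helper that integrates by the midpoint rule. It compares the posterior mean of θ under a uniform prior on θ with the posterior mean under a uniform prior on ς.

```python
def posterior_mean(a, n, shift, steps=20000):
    """Mean of a density proportional to theta^(a-shift) * (1-theta)^(n-a-shift)
    on [0, 1], by the midpoint rule.

    shift = 0.0 corresponds to a uniform prior on theta;
    shift = 0.5 corresponds to a uniform prior on the transformed parameter.
    """
    num = 0.0
    den = 0.0
    for k in range(steps):
        theta = (k + 0.5) / steps
        weight = theta ** (a - shift) * (1.0 - theta) ** (n - a - shift)
        num += theta * weight
        den += weight
    return num / den


a, n = 3, 10
mean_uniform_theta = posterior_mean(a, n, shift=0.0)
mean_uniform_sigma = posterior_mean(a, n, shift=0.5)
print("uniform prior on theta:", mean_uniform_theta, "closed form:", (a + 1) / (n + 2))
print("uniform prior on the transformed parameter:", mean_uniform_sigma,
      "closed form:", (a + 0.5) / (n + 1))
```

The same data give two different posterior summaries, depending only on which parameterization is declared uniform. The closed forms are the means of the corresponding Beta distributions.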

To Fisher, the use of prior densities (not based on known frequencies) implies that nothing can be known about the parameters, regardless of the amount of information available in the observations. What is needed, according to Fisher, is a “rational theory of learning by experience”.<br />

It is noted that (for continuous distributions) the likelihood (function) is not a probability; it is, however, a measure of “rational belief”. He writes:<br />

“Knowing the population we can express our incomplete knowledge of, or expectation of, the sample in terms of probability; knowing the sample we can express our incomplete knowledge of the population in terms of likelihood.”<br />

Next the concept of fiducial distribution is introduced, which is an example of the confidence concept described by H.A. David and A.W.F. Edwards as:<br />

“…the idea that a probability statement may be made about an unknown parameter (such as limits between which it lies, or a value which it exceeds) will be correct under repeated sampling from the same population.” 1<br />

Fisher writes:<br />

“In many cases the random sampling distribution of a statistic, T, calculable directly from the observations, is expressible solely in terms of a single parameter, of which T is the estimate found by the method of maximum likelihood. If T is a statistic of continuous variation, and P the probability that T should be less than any specified value, we have then a relation of the form<br />

P = F(T,θ)<br />

If we now give to P any particular value such as .95, we have a relationship between the statistic T and the parameter θ, such that T is the 95 per cent. value corresponding to a given θ…”<br />

In Principles of Statistics, by M.G. Bulmer 2, an illustration from a 1935 paper by Fisher is described, with sampling from a normal distribution. If a sample of size n is taken, the quantity<br />

\[
\frac{\bar{x} - \mu}{s / \sqrt{n}}
\]

(with notation as usually defined in contemporary statistics) follows a t distribution with n-1 degrees of freedom. Then 100P per cent of those values would be expected to be less than \(t_P\), or the probability that<br />

\[
\frac{\bar{x} - \mu}{s / \sqrt{n}} \le t_P
\]

is equal to P.<br />

1 Annotated Readings in the History of Statistics, page 187.<br />

2 Principles of Statistics by M.G. Bulmer, Dover Publications, Inc., New York 1979, page 177.<br />

Fisher notes that the above inequality is equivalent to<br />

\[
\mu \ge \bar{x} - s \, t_P / \sqrt{n}
\]

and reasons that the probability that \(\mu \ge \bar{x} - s \, t_P / \sqrt{n}\) is also P. In this case, µ is a random variable and \(\bar{x}\) and s are constants. By varying \(t_P\), the probability that µ is greater than specific values may be obtained, establishing a fiducial distribution for µ, from which fiducial intervals may be constructed. Such intervals would correspond to the confidence intervals (as defined in contemporary statistics). The interpretation, however, is different 1. In his paper, Fisher provides a table, associated with correlations derived from four pairs of observations.<br />
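The repeated-sampling reading of the inequality can be illustrated by simulation. The sketch below takes the confidence view (µ fixed, the sample random), not Fisher's fiducial view. It assumes n = 4 and uses 2.353 as the 95th percentile of the t distribution with 3 degrees of freedom, a value taken from standard tables and not from the review; `coverage` is a hypothetical helper name.

```python
import math
import random
import statistics

# Assumed table value: 95th percentile of Student's t with 3 degrees of freedom.
T_95_DF3 = 2.353


def coverage(mu=0.0, sigma=1.0, n=4, t_p=T_95_DF3, trials=20000, seed=7):
    """Fraction of repeated normal samples for which mu >= x_bar - s * t_p / sqrt(n)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        xs = [rng.gauss(mu, sigma) for _ in range(n)]
        xbar = statistics.mean(xs)
        s = statistics.stdev(xs)          # sample standard deviation (n - 1 divisor)
        if mu >= xbar - s * t_p / math.sqrt(n):
            hits += 1
    return hits / trials


print("observed frequency:", coverage(), "nominal P:", 0.95)
```

The observed frequency should be close to P whatever values of µ and σ are chosen, since the t statistic does not depend on them.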

H.A. David and A.W.F. Edwards suggest that Inverse Probability is the first paper clearly identifying the confidence concept (although similar approximate constructs such as “probable error” had been in use for some time). It is also suggested that Student (W.S. Gosset) first expressed the notion (in an exact way), remarking in his 1908 paper:<br />

“…if two observations have been made and we have no other information, it is an even chance that the mean of the (normal) population will lie between them.” 2<br />

1 In a confidence interval, µ is a constant, with a certain probability of being contained in a random interval.<br />

2 Annotated Readings…page 187.<br />
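Student's remark about two observations can be checked in the repeated-sampling sense with a short simulation. This is only an illustration with arbitrarily chosen µ and σ, and `mean_between_two` is a hypothetical helper name.

```python
import random


def mean_between_two(mu=0.0, sigma=1.0, trials=20000, seed=3):
    """Fraction of pairs of normal observations that have mu lying between them."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        x1 = rng.gauss(mu, sigma)
        x2 = rng.gauss(mu, sigma)
        if min(x1, x2) < mu < max(x1, x2):
            hits += 1
    return hits / trials


print("observed frequency:", mean_between_two())
```

The mean is missed only when both observations fall on the same side of it, which happens with probability 1/4 + 1/4, leaving an even chance that it lies between them.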
