## Wednesday, 26 November 2008

### The Birthday Problems (Yes, there is more than one)

"Happy Birthday to YOU!!!" ... OK, it's my mom's birthday, and I was thinking about that old birthday problem. You've probably seen it: how many people must be in a room before the probability of at least two sharing a birth date (just month and day) is greater than 1/2? Yeah, that one.

It is one of those problems that is best attacked by answering a different question (there seem to be a lot of those in probability): what is the probability that NO two have a common birth date? Think of people entering a room one at a time, and calculate the probability that no two have a common birth date. The probability after the nth person enters is then given by P(n) = P(n-1)*(366-n)/365. We are assuming a 365-day year here, so P(1) = 1 (if you are the only one in the room, there is no match).
Now the next guy has a 364/365 probability of preserving the NO MATCH, and so on for each new body. After each calculation we check to see if the probability of NO MATCH is less than 1/2, in which case the probability of a match would be more than 1/2. It turns out that at n=23 the probability of at least one match is slightly over 1/2, about .507.
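The recursion above is easy to hand to a computer. Here is a minimal Python sketch of it; the function names `no_match_probability` and `smallest_group` are just illustrative choices, and the 365-day year is the same assumption made in the text.

```python
def no_match_probability(n, days=365):
    """P(no shared birthday among n people), built up one arrival at a time."""
    p = 1.0
    for k in range(1, n + 1):
        # The k-th person to enter must avoid the k-1 birthdays already inside,
        # which is the (366 - k)/365 factor when days is 365.
        p *= (days - (k - 1)) / days
    return p


def smallest_group(days=365, target=0.5):
    """Smallest n for which P(at least one match) exceeds target."""
    n = 1
    while 1.0 - no_match_probability(n, days) <= target:
        n += 1
    return n


if __name__ == "__main__":
    n = smallest_group()
    print(n, round(1.0 - no_match_probability(n), 3))
```

Run as a script, this should report 23 and a match probability of about 0.507, in line with the figures above.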

Actually, it only seems like the problem has been around forever. I saw a claim that it was created in 1932 by Richard von Mises, the Austrian-born scientist whose work spans several areas of engineering science. However, the earliest citation I can find is Mises, R. von. "Über Aufteilungs- und Besetzungs-Wahrscheinlichkeiten." Revue de la Faculté des Sciences de l'Université d'Istanbul, N. S. 4, 145-163, 1939, so you decide which is right.

After you play with that for a little while, you might want to tackle one of several variations that can take it well beyond a high school classroom exercise. For example, the classical problem asks when there is at least one match, but if you ask for the number that makes the probability of exactly one matched pair equal to 1/2, it gets a little tougher (OK, a lot tougher). With modern technology, it can be examined by students without some of the formal probability training if they are good at simulations. I gave this problem to one class and asked them to describe a way to simulate the event using a graphing calculator. The best answer I got was this one: a) Put n (for example 23) random integers from 1 to 365 in list 1. b) Sort the list in ascending order. c) Copy this to list 2, then replace the last number with zero and re-sort just this list (effectively this moves the zero to the top and each number down one in the list from its natural position). d) Now do a logical L1=L2 and store it in L3. This produces a list with a one wherever the nth term in the sorted list is equal to the (n+1)st term, i.e., for each match. Sum this list to get the number of matches. e) Repeat like crazy.

I thought it was a great way to solve it, although if you have Fathom or other good statistical software for simulations, it can be done much more easily; but sometimes the software hides some of the understanding in the early stages of learning simulations. When I tried it (with Fathom), using 23 for the sample size and 5000 trials of the experiment, there were no matches in 2431 or 48.6% of them, there was exactly one match in 1829 or 36.6% of them, and to my surprise, there were 604 or 12% that had either two separate matches or three of a kind... and if you do the arithmetic, that leaves another 136 or almost 3% that had multiple matches beyond these (I used the unique value function of the program, so I only knew there were fewer than 21 unique numbers in those cases). With the student method, adjacent ones on the third list would identify triplets etc., while separated ones would indicate multiple pairs.
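For anyone without a graphing calculator or Fathom handy, the student method translates almost line for line into Python. This is only a sketch under my own naming (`count_adjacent_matches`, `simulate`); it uses the standard `random` module rather than any calculator function, and the counts will wobble from run to run just as they would in class.

```python
import random
from collections import Counter


def count_adjacent_matches(birthdays):
    """Sort the list and count positions where a value equals its neighbour."""
    ordered = sorted(birthdays)
    # Comparing each entry with the next one plays the role of the shifted
    # copy (list 2) in the calculator version.
    return sum(1 for a, b in zip(ordered, ordered[1:]) if a == b)


def simulate(group_size=23, trials=5000, days=365, seed=None):
    """Tally how many trials produce 0, 1, 2, ... adjacent matches."""
    rng = random.Random(seed)
    tally = Counter()
    for _ in range(trials):
        birthdays = [rng.randint(1, days) for _ in range(group_size)]
        tally[count_adjacent_matches(birthdays)] += 1
    return tally


if __name__ == "__main__":
    results = simulate()
    for matches in sorted(results):
        print(matches, results[matches])
```

A tally of 0 means no shared birthdays, 1 means exactly one matched pair, and 2 covers both two separate pairs and three of a kind, since the count is just the group size minus the number of unique birthdays.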

A couple of examples you might want to explore, and I will try to get back to solutions if they are not submitted sooner:
1) What happens to the birthday problem if you switch to a different number of days in the year (say 400 or 1000), or what if we use weeks or months? In short, is there a general formula for the number of people needed to reach p=1/2 that there is at least one match?

2) what is the number of people needed to make the probability that EVERYONE has a match equal to or greater than 1/2?

3) and what if we only wanted to have two people who came close? What is the number of people that must be present to make the probability at least 1/2 that at least two have birthdays no more than one day apart?

4) and finally, back to the original problem, suppose a group (several hundred) are entering an auditorium, and you know that the first person to enter having a birthday that matches someone already inside will win a prize. Ok, you know not to go first, and if you go too late, it will already be gone.... so where in the line do you insert yourself to have the highest probability of being the first to enter and have a birthday match in the room?
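If you want to poke at questions 1 and 4 numerically before hunting for formulas, here is a small Python sketch. The function names are my own, and the modelling assumptions are the usual ones: every day equally likely, birthdays independent. For comparison it also prints the commonly quoted approximation that the break-even group size grows like the square root of 2·d·ln 2 for a d-day year.

```python
import math


def no_match_probability(n, days=365):
    """P(no shared birthday among n people)."""
    p = 1.0
    for k in range(n):
        p *= (days - k) / days
    return p


def smallest_group(days, target=0.5):
    """Question 1: smallest group with P(at least one match) > target."""
    n = 1
    while 1.0 - no_match_probability(n, days) <= target:
        n += 1
    return n


def first_match_probability(position, days=365):
    """Question 4: P(the person in this position is the first to match)."""
    others = position - 1
    # Everyone already inside must be distinct, and the newcomer must hit
    # one of those `others` birthdays.
    return no_match_probability(others, days) * others / days


def best_position(days=365):
    """Position in line that maximises the chance of being the first match."""
    return max(range(1, days + 2),
               key=lambda n: first_match_probability(n, days))


if __name__ == "__main__":
    for d in (12, 52, 365, 400, 1000):
        print(d, smallest_group(d), round(math.sqrt(2 * d * math.log(2)), 2))
    print("best place in line:", best_position())
```

Questions 2 and 3 need different counting arguments (or a simulation), so they are left alone here.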

More on all of these later..good luck, and have fun..

## Sunday, 23 November 2008

### Whither the Schoolhouse

Recently reading Andrew Robinson's The Last Man Who Knew Everything, about the polymath Thomas Young, I came across a quote he wrote to his brother, "Although I have readily fallen in with the idea of assisting you in your learning, yet (there) is in reality very little that a person who is seriously and industriously disposed to improve may not obtain from books with more advantage than from a living instructor....Masters and mistresses are very necessary to compensate for want of inclination and exertion, but whoever would arrive at excellence must be self taught."

Yikes!!! I am a teacher, one who struggles to "compensate for want of inclination and exertion," and I am finding I agree with him. I mean, REALLY strongly agree with him. And in the age of the internet, is not information more available than ever before to those who have the "inclination"? When you can independently peruse many of the great courses at MIT online for free, is ignorance not a self-inflicted predicament?

All of which leads me to wonder what the role of the typical school will be in the future. In many areas they have already been reduced to day-care for those intended to be held out of the workforce until a later date. My daughter-in-law teaches language and literature classes with more students than she has desks or textbooks, and this in a moderately wealthy community just outside Detroit. What educational atrocities will she face with the impending economic crisis in that area? Can the schools descend to such a state that home-schooling becomes the predominant choice of college-bound students?

I can almost imagine a rebirth of that approach that was common here in England, when knowledgeable people would support themselves by setting up small instruction programs in a single topic where students would come and pay for instruction until they reached the level of competence they desired. The instruction would consist mostly of independent study, guided by the mentor, with a few choice insights of the "masters." And if the public schools do not fade away completely, they will serve a purpose much like the gladiatorial combats of ancient Rome, to entertain the festering masses and keep them mostly off the streets while the talented few, with inclination and exertion, begin to widen the gap ever more greatly between the educational (and probably financial) haves and have-nots. I think the system will prevail, at some level, for my grandchildren, but I wonder at the education of their children.

Opinions are welcomed.

## Saturday, 22 November 2008

### Harmony, and Harmonic Problems

Things happen in threes according to the old myth, and in this case it was true. I was doing some research on the early history of a mathematical problem often called the "cistern" problem. You probably know the type: "If one pipe can fill a cistern in 6 hours and another can fill it in four hours, how long would it take both pipes working together?" While I was working on that, I got a nice article sent to me on the first proof that the harmonic series diverges... and then, I was reading a blog by Dave Marain, Math Notations, in which he posed a problem that asked, in its general form, given a square inscribed in a right triangle (with one corner at the right angle of the triangle), what is the length of a side of the square in terms of the legs of the triangle?
So what do all these have in common with each other? Dare I say, what makes them in "harmony"?... The answer is Harmony, or at least the mathematical relationship of the harmonic mean.
To the early Greeks, if Nicomachus can be believed, all the means were descriptive of musical relations. Much is often made of the Harmonic Mean in relation to a musical sense, but this may not represent the Greek view. Euclid used the word enarmozein to describe a segment that just fits in a given circle. The word is a form of the word Harmozein, which the more competent Greek scholars tell me means to join or to fit together. Jeff Miller's Web site on the first use of mathematical terms contains a reference to the very early origin of the harmonic mean: "A surviving fragment of the work of Archytas of Tarentum (ca. 350 BC) states, 'There are three means in music: one is the arithmetic, the second is the geometric, and the third is the subcontrary, which they call harmonic.' The term harmonic mean was also used by Aristotle."
My search for the early roots of the cistern problem had taken me back to Heron's Metreseis, around the year fifty of the common era. The problem became a staple in arithmetics and problem books and was used by Alcuin (775) and appears in the Lilavati of Bhaskara (1150). I found the illustration I used on the blog for The First Illustrated Arithmetic a few days ago, from the 1492 arithmetic, Trattato di aritmetica by Filippo Calandri.
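The solution to a cistern problem comes from the harmonic mean of the times taken by each pipe. For example, one problem asks, "If one pipe can fill a cistern in three hours, and a second can fill it in five hours, how long will the two pipes take to fill the cistern if both are opened at once?" The harmonic mean of three and five is three and three-quarter hours: a pair of identical pipes, each needing that long alone, would fill the cistern exactly as fast as the original pair. Working together, the two pipes therefore take half of the harmonic mean, one and seven-eighths hours.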
The harmonic mean is the reciprocal of the mean of the reciprocals of the values, so for values a and b, the harmonic mean is given by $\frac{1}{\frac{1}{2}\left(\frac{1}{a}+\frac{1}{b}\right)}$, which can be simplified to the more economical $\frac{2ab}{a+b}$.
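A quick way to keep the definition straight is to compute it with exact fractions. The sketch below is only illustrative (the names `harmonic_mean` and `combined_fill_time` are mine); note that the time for the pipes working together, 1/(1/a + 1/b), comes out to half the harmonic mean of the two individual times.

```python
from fractions import Fraction


def harmonic_mean(*values):
    """Reciprocal of the mean of the reciprocals, kept as an exact fraction."""
    reciprocals = [Fraction(1) / Fraction(v) for v in values]
    return len(values) / sum(reciprocals)


def combined_fill_time(*times):
    """Time for pipes working together: their filling rates (1/time) add."""
    total_rate = sum(Fraction(1) / Fraction(t) for t in times)
    return 1 / total_rate


if __name__ == "__main__":
    print(harmonic_mean(3, 5))        # harmonic mean of three and five hours
    print(combined_fill_time(3, 5))   # both pipes opened at once
    print(combined_fill_time(6, 4))   # the six-hour and four-hour pipes
```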
Heron might have been the first recorded example of a cistern problem, but a problem calling on the solver to use the harmonic mean occurs even earlier in the Rhind Mathematical Papyrus, now located in the British Museum, in problem 76. The problem involves making loaves of bread with different qualities, but the solution is still the harmonic mean. (I have learned from David Singmaster's Chronology of Recreational Mathematics that the cistern problem appeared in China, perhaps 300 years before Heron's use, in the Chiu Chang Suan Shu, around 150 BC.)
The series formed by the reciprocals of the positive integers is a common torment for college students in their first introduction to analysis. Because each term is smaller than the one before, the sum seems to creep very slowly toward some upper limit. Even after adding 250,000,000 terms, the sum is still less than twenty, and yet... in the mid 1300's, Nicole Oresme showed that it will eventually pass any value you can name. In short, it diverges, slowly, very, very slowly, to infinity. Even when warned, students seem to want to believe it converges. A well-known anecdote about a teacher trying to get students to remember that it diverges goes,
"Today I said to the calculus students, “I know, you’re looking at this series and you don’t see what I’m warning you about. You look at it and you think, ‘I trust this series. I would take candy from this series. I would get in a car with this series.’ But I’m going to warn you, this series is out to get you. Always remember: The harmonic series diverges. Never forget it.”"
By the way, each number in the harmonic series is the harmonic mean of the numbers on each side of it, and in fact, of any numbers equally spaced away from it.
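Both claims about the harmonic series can be checked without adding a quarter of a billion terms by hand. The sketch below uses the standard estimate that the nth partial sum is about ln(n) plus the Euler-Mascheroni constant (roughly 0.5772) plus 1/(2n); that estimate, and all the function names, are my additions rather than anything from the sources above.

```python
import math
from fractions import Fraction

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant


def harmonic_exact(n):
    """Exact partial sum 1 + 1/2 + ... + 1/n (fine for modest n)."""
    return sum(Fraction(1, k) for k in range(1, n + 1))


def harmonic_estimate(n):
    """Approximate partial sum using ln(n) + gamma + 1/(2n)."""
    return math.log(n) + EULER_GAMMA + 1.0 / (2 * n)


def is_harmonic_mean_of_neighbours(n, gap=1):
    """Check that 1/n is the harmonic mean of 1/(n-gap) and 1/(n+gap)."""
    left = Fraction(1, n - gap)
    right = Fraction(1, n + gap)
    return 2 / (1 / left + 1 / right) == Fraction(1, n)


if __name__ == "__main__":
    print(harmonic_estimate(250_000_000))
    print(is_harmonic_mean_of_neighbours(10), is_harmonic_mean_of_neighbours(10, gap=4))
```

The neighbour check works because the reciprocals of 1/(n-gap) and 1/(n+gap) are n-gap and n+gap, whose ordinary average is n.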
And then, I came across that little problem of a square inscribed in a right triangle. If the two legs are a and b, then the side of the square will have a length equal to half the harmonic mean of a and b, that is, ab/(a+b).
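For anyone who wants to see where the square's side comes from, here is one short derivation. The coordinate setup is my own choice: put the right angle at the origin with the legs a and b along the axes, and let s be the side of the square. The corner of the square opposite the right angle is the point (s, s), and it must lie on the hypotenuse.

$$
\frac{x}{a}+\frac{y}{b}=1
\quad\Longrightarrow\quad
\frac{s}{a}+\frac{s}{b}=1
\quad\Longrightarrow\quad
s=\frac{1}{\frac{1}{a}+\frac{1}{b}}=\frac{ab}{a+b}=\frac{1}{2}\cdot\frac{2ab}{a+b}
$$

So the side is half the harmonic mean of the legs, the same relationship as the combined time in the cistern problem.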
So I guess things do come in threes, unless I come across another one, but whether it comes in threes or fours, it all seems to work together, in perfect harmony.

## Wednesday, 19 November 2008

### Problem Solving?

Here is a neat little inspirational video my beautiful wife forwarded to me. You can draw your own meaning. For me it was a reminder of what I tell kids is the first rule of problem solving; "When you don't know what to do, do something!". Enjoy.

## Monday, 17 November 2008

### The First Illustrated Arithmetic

I was researching problems related to the harmonic mean (more of which I hope to share in a later blog or blogs) when I came across a note in David E. Smith's "History of Mathematics" (There are actually used copies for a nickel!) about Filippo Calandri's 1492 arithmetic, Trattato di aritmetica. Smith cites it as the first "illustrated" arithmetic, and checking around, David Singmaster seems to agree.

An actual copy is in the Metropolitan Museum of Art in New York, and they have some images from the woodcuts in the book posted here. The cut above was the one of interest to me, as it depicts a "cistern problem," one of the common recreational problems since the First Century, and one of the problems I was researching when I came across this. The book has another first: it seems to have been the first book to publish an example of long division essentially as we now know it.

Here are some additional notes from my web notes on division that pertain to the long division algorithm:

..... is the true ancestor of the method most used for long division in schools today, and was called a danda, "by giving". In his Capitalism and Arithmetic, Frank J Swetz gives “The rationale for this term was explained by Cataneo (1546), who noted that during the division process, after each subtraction of partial products, another figure from the dividend is ‘given’ to the remainder.” He also says that the first appearance in print of this method was in an arithmetic book by Calandri in 1491. The method was frequently called “the Italian method” even into the 20th century (Public School Arithmetic, by Baker and Bourne, 1961) although sometimes the term “Italian method” was used to describe a form of long division in which the partial products are omitted by doing the multiplication and subtraction in one step.
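The "giving" in a danda is easy to see if the steps are written out mechanically. Here is a minimal Python sketch of ordinary long division that records each step; it is my own illustration of the idea Swetz describes, not a reconstruction of Calandri's layout, and the name `long_division_steps` is hypothetical.

```python
def long_division_steps(dividend, divisor):
    """Long division, one dividend figure at a time.

    Returns (quotient, remainder, steps), where each step is a tuple
    (working number, quotient digit, partial product, new remainder).
    """
    if divisor <= 0 or dividend < 0:
        raise ValueError("expects a non-negative dividend and a positive divisor")
    remainder = 0
    quotient_digits = []
    steps = []
    for figure in str(dividend):
        # "Give" the next figure of the dividend to the running remainder.
        working = remainder * 10 + int(figure)
        digit = working // divisor
        partial_product = digit * divisor
        # Subtract the partial product to get the new remainder.
        remainder = working - partial_product
        quotient_digits.append(str(digit))
        steps.append((working, digit, partial_product, remainder))
    return int("".join(quotient_digits)), remainder, steps


if __name__ == "__main__":
    quotient, remainder, steps = long_division_steps(9876, 32)
    for step in steps:
        print(step)
    print(quotient, remainder)
```

The form that omits the partial products would do the multiplication and subtraction in one mental step and write down only the last entry of each tuple.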

The early uses of this method tend to have the divisor on one side of the dividend, and the quotient on the other as the work is finished, as shown in the image below taken from the 1822 "The Common School Arithmetic : prepared for the use of academies and common schools in the United States" by Charles Davies. Swetz suggests that it remained on the right by custom after the galley method gave way to “the Italian method” in the 17th century. It was only with the advent of decimal division, he says, and the greater need for alignment of decimal places, that the quotient was moved above the number to be divided.


I recently found a site called The Algorithm Collection Project, where the authors have tried to collect the long division process as used by different cultures around the world. Very few of the ones I saw actually put the quotient on top as American students are usually taught. In one interesting note, a respondent from Norway showed one method, then explained that s/he had been taught another way, and then demonstrated the common American algorithm, but added a note that says, “but ‘no one’ is using this algorithm in Norway anymore.” I might point out that the colon, ":", seems to be the division symbol of choice, if this sample can be generalized, as it was used in Norway, Germany, Italy, and Denmark. The Spanish example uses the obelus, and the other three use a modification of the "a danda" long division process. The method labeled "Catalan" is like the "Italian Method" shown above where the partial products are omitted.

## Friday, 14 November 2008

### The Math Check Joke

My loving little sister from Ft. Worth got this one in the e-mail and forwarded it to me. It seems to have been going around for awhile under the title "How to Tell if you Pi**ed off a Mathematician", and mostly, with a mis-answer for the amount. The copy my sister sent to me had this explanation for the check amount:

Unfortunately, if you look closely at the amount, it is NOT $e^{2\pi}$, which would give the above amount (more or less). The exponent of e, as most mathematicians would expect, is in fact $i\pi$, where i is the imaginary unit. Leonhard Euler showed (although Cotes did a lot of the spade work for this) that $e^{i\pi}$ is actually equal to -1. It is often called the most beautiful theorem in mathematics, and it is certainly one of the more useful, as it allows us to tie the real and complex values together. I used the theorem not too long ago in a blog, I Don't Get It!, about a tongue-in-cheek quote from De Morgan.
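The difference between the two readings of the exponent is easy to see numerically. This is just a sanity check with Python's standard `math` and `cmath` modules (the function names are mine); floating point leaves a tiny imaginary crumb on Euler's value, which is rounding error and not a flaw in the theorem.

```python
import cmath
import math


def e_to_the_two_pi():
    """The misreading: e raised to the real number 2*pi."""
    return math.exp(2 * math.pi)


def e_to_the_i_pi():
    """Euler's version: e raised to the imaginary number i*pi."""
    return cmath.exp(1j * math.pi)


if __name__ == "__main__":
    print(e_to_the_two_pi())   # a bit over 535
    print(e_to_the_i_pi())     # -1, up to rounding error
```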

Using the correct expression, the check turns out to be written for \$0.002, which is 2 mills, or two tenths of a cent. To me that makes the whole problem a little funnier and a little more interesting, but I'm not sure part of that is not a little intellectual snootiness (I get the joke and you probably won't), which I also spoke of, not quite as recently, in Prime Time Fun.

## Thursday, 13 November 2008

### Some additional History on the Problem of Points

After my recent post on the probability history I got a nice note from Jim Kiernan advising me about an article he posted in the March, 2001 issue of The Mathematics Teacher. He is (or at least was according to the article) a teacher at the Edward R. Murrow High School in Brooklyn, NY. His interests were listed as math history, and in particular, the origins of probability and statistics.

The entire article is well worth the read, and if you are also interested in math/stats history, it is worth the trip just for the references. In particular, and I suspect that a stats-history buff like Jim already knows this, one of the references, A. W. F. (Tony) Edwards, not only wrote a really neat book, "Pascal's Arithmetical Triangle, The Story of an Idea", but he was also the last student under the tutelage of the great statistician R. A. Fisher at Cambridge. My personal appreciation to Professor Edwards includes thanks for a guided tour around the great hall at Gonville and Caius (pronounced "keys") to view the stained glass tribute to John Venn (and several other math/science people), and for giving me directions to the (totally hidden in vines at the time) grave of John Venn (photo at top). In his book on Pascal, Professor Edwards points out that it was the problem of points that prompted Pascal to write his famous treatise.

I have taken the liberty of copying a few key remarks from Mr. Kiernan's article that complement the previous post, and am grateful to Jim for the note.

As early as the twelfth century, the Arabs were acquainted with the binomial triangle and used it to solve problems that involved combinations. Islamic tradition also deals with problems of dividing inheritances. Tartaglia repeated the tradition that Leonardo of Pisa (ca. 1200), commonly known as Fibonacci, was responsible for bringing the practice of algebra to Italy from Arabia. Although no mention of the "problem" is attributed to Leonardo, its origins apparently also lie with the Arabs. Oystein Ore refers to an Italian manuscript, dating from approximately 1380, that is probably of Arab origin and that contains the "problem." The Arabs seem to have had all the right tools, but no record of a solution exists.

Tartaglia and Cardano both tried (and failed) to solve the problem, and Jim includes their wrong answers in the article. Then, several other futile attempts were made to solve the problem before it fell into obscurity. Galileo wrote about probability, but no extant version of the problem appears in his papers. Widespread knowledge of the binomial triangle existed throughout Europe. It appears in the works of Cardano, Tartaglia, and Mersenne. Yet no record exists of anyone's applying it to the problem. Finally, during the summer of 1654, the problem was solved in three different ways as the result of a correspondence between two of the greatest French mathematicians of the seventeenth century: Blaise Pascal and Pierre de Fermat.

The correspondence began in response to a pair of questions submitted by the Chevalier de Mere. The second of these problems would be the catalyst for the founding of probability theory. The first letter from Fermat "on division" is missing, but Pascal (1952, p. 475) responded on 29 July that the "method is very reliable and is the first that had occurred to me." Pascal claims to have found a "different method much shorter and simpler." The letter ends with the heartwarming observation that "truth is the same at Toulouse and at Paris."

Pascal's first method can best be explained using the ideas of recursion and weighted averages. When a total of three games is required to win, he considers three cases: (2, 1), (2, 0), and (1, 0). The first case, (2, 1), is a simple example; the second, (2, 0), gives the answer to Pacioli's problem; and the third, (1, 0), gives the answer to de Mere's problem. In each case, a total of 32 pistoles is wagered by each player. This number seems to have been selected so that the solution would be a simple ratio.

Analyzing the simple case, "they now play a game ... if the first player wins, he wins all the money ... if the second player wins each should withdraw his own stake" (Pascal 1952, p. 475). The result is a split of either [64; 0] or [32; 32]. The first player is "sure of having 32... as for the other 32 ... let us share equally." So if the game is interrupted before the next round, the correct split should be [48; 16]. The second case reverts to the first case when the second player wins the next game. If the game is interrupted at (2, 0), the player who has two games should get 48 pistoles plus half of 16. So the correct split for this case and Pacioli's problem would be [56; 8], or 7 : 1 in simplest form.

De Mere's problem requires finding "the value ... when two players are playing for three games and ... one player has only one game and the other none" (Pascal 1952, p. 475). Using the process of recursion developed so far brings the situation back to the previous case. If the first player wins, the status becomes (2, 0), which entitles him to 56 pistoles. If the first player loses, the status is even, (1, 1), which entitles him to 32. So in the case of an interruption, the first player should get 32 plus half of (56 - 32). The correct split is [44; 20].

Pierre de Fermat's solution, dated 24 August, depended on determining the number of games required to declare a winner. If player 1 needs m games more and player 2 needs n games more to win, then a winner must be declared after m + n - 1 more games. Fermat then listed all possible outcomes for four more games and formed the ratio of wins by each player, where a is a win for player 1 and b is a win for player 2:

| Outcome | Winner | Outcome | Winner | Outcome | Winner | Outcome | Winner |
|---|---|---|---|---|---|---|---|
| aaaa | 1 | abaa | 1 | baaa | 1 | bbaa | 1 |
| aaab | 1 | abab | 1 | baab | 1 | bbab | 2 |
| aaba | 1 | abba | 1 | baba | 1 | bbba | 2 |
| aabb | 1 | abbb | 2 | babb | 2 | bbbb | 2 |

"Therefore, they must share the sum in the ratio of 11 to 5" (Pascal 1952, p. 475).

This result is equivalent to Pascal's solution [44; 20]. Gilles de Roberval, a member of Pascal's intellectual circle in Paris, was not pleased with this means of listing outcomes. He criticized the use of four games when two or three would determine a win.
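The two methods described in the article can be put side by side in a few lines of Python. This is a sketch under my own naming (`fermat_share`, `pascal_value`), with the state described by how many more games each player still needs; it assumes, as the article does, that each game is an even chance and that the pot is 64 pistoles.

```python
from fractions import Fraction
from itertools import product


def fermat_share(needs_1, needs_2):
    """Fermat: list every way the next m + n - 1 games could go.

    Player 1 takes the match in exactly those listings that contain at
    least needs_1 of his wins (written 'a').
    """
    games = needs_1 + needs_2 - 1
    wins_1 = sum(1 for outcome in product("ab", repeat=games)
                 if outcome.count("a") >= needs_1)
    return Fraction(wins_1, 2 ** games)


def pascal_value(needs_1, needs_2, pot=64):
    """Pascal: player 1's fair share is the average of the two next states."""
    if needs_1 == 0:
        return Fraction(pot)   # player 1 has already won everything
    if needs_2 == 0:
        return Fraction(0)     # player 2 has already won everything
    return (pascal_value(needs_1 - 1, needs_2, pot)
            + pascal_value(needs_1, needs_2 - 1, pot)) / 2


if __name__ == "__main__":
    # De Mere's case: playing to three with the score (1, 0), so player 1
    # needs two more games and player 2 needs three.
    print(fermat_share(2, 3), 64 * fermat_share(2, 3))
    print(pascal_value(2, 3))
```

In this notation the article's cases (2, 1), (2, 0), and (1, 0) become `pascal_value(1, 2)`, `pascal_value(1, 3)`, and `pascal_value(2, 3)`.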

The last of the three methods used to solve this problem is contained in Pascal's Treatise on the Arithmetic Triangle, which was written in 1654 but not published until 1665.

Thanks again Jim...

## Tuesday, 11 November 2008

### White Rabbit Mathematics

One of the things that amazes me, and I think most people who are attracted to math, is the mysterious way that different parts of math come together in unexpected ways. I tried to explain this to someone once using a literary analogy..."It is as if you were reading along in some great drama, or trying to understand the message in some grand poem, and suddenly the White Rabbit from Alice in Wonderland comes running through muttering, 'Oh dear! Oh dear! I shall be too late!'"

It is not the White Rabbit you see in math, but the effect is the same. Euler must have felt that feeling after he struggled to find the value of the series $1 + \frac{1}{4} + \frac{1}{9} + \frac{1}{16} + \cdots$ and found that it turns out to be $\frac{\pi^2}{6}$. Wait.... Pi is the ratio of the circumference to the diameter of a circle, but there are no circles in the sum of the squares of the reciprocals of the integers; and yet, there it is, the mathematical white rabbit coming seemingly from nowhere. Certainly none of the many mathematicians of great repute who had worked on the problem found (or expected) Pi to appear.

The normal distribution is another example; De Moivre takes the binomial probability distribution for flipping a coin and generalizes it toward an infinite number of flips, and POW, the normal or bell-shaped curve that is ubiquitous in intro stats. And what happens? Right there in the middle, the height of the normal curve at Z=0 is .39894... No, NO, NO, NOT JUST .39894.. but the .39894... that is exactly equal to $\frac{1}{\sqrt{2\pi}}$.
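Both rabbits can be coaxed out numerically with a few lines of Python. This is only a sanity check, with function names of my own choosing; the partial sums of the reciprocal squares close in on the limit quite slowly, so the agreement is to a few decimal places and nothing like a proof.

```python
import math


def basel_partial_sum(terms):
    """Sum of 1/k^2 for k = 1 .. terms."""
    return sum(1.0 / (k * k) for k in range(1, terms + 1))


def normal_density(z):
    """Height of the standard normal curve at z."""
    return math.exp(-z * z / 2.0) / math.sqrt(2.0 * math.pi)


if __name__ == "__main__":
    print(basel_partial_sum(100_000), math.pi ** 2 / 6)
    print(normal_density(0.0), 1.0 / math.sqrt(2.0 * math.pi))
```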

Ok, so what brought this sudden rebirth of excitement about mathematical interrelationships? Well recently I came across a blog that referred to another blog that (as these things sometimes do) led me to a paper on just such a mathematical "white rabbit". The paper was about partitions of numbers as powers of two (1, 2, 4, 8, 16, etc..)

It began with a simple question: what is the number of ways to write a number n as a sum of powers of two if each value can be used no more than two times? For example, we could express 4 as 4, or as 2+2, or as 2 + 1 + 1, since each value is a power of two, and none appears more than twice. You couldn't use 1+1+1+1 since the 1 appears more than twice. For n = 4 it turns out that the number of partitions, as shown above, is three. If we assume that there is one way to express zero, and one way to express one, and figure out the others, we get a string like this:

1, 1, 2, 1, 3, 2, 3, 1, 4, 3, 5, 2, 5, 3, 4, 1, 5, 4, 7,..

Ok, you don't see a white rabbit yet... but then someone asks you a different question. Is it possible to write out ALL the positive rational numbers in simplified form without repeating any of them? The answer is "Yes, of course, see the list above."

"What?", you ask, "How?", but there it is... The sequence of rational numbers is formed by taking each of the numbers to be the numerator, and using the number that follows it as the denominator. 1/1; 1/2; 2/1; 1/3; 3/2; ... and you never get a repeat, never get an unsimplified form, and you eventually get them ALL, the entire Infinite Set.....

No way you would expect that partitions of powers of two should give you the rational numbers in their entirety... there is (it would seem) nothing to relate the two questions... and yet... there it is. I think that is what makes math the most exciting area of study in the world.
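You can watch the rabbit appear with a short Python sketch. The recurrence used here follows from looking at how many 1s a partition can contain (an odd number must use exactly one 1, an even number uses none or two); the function names are my own, and checking the first thousand fractions is of course evidence and not the proof.

```python
from fractions import Fraction
from functools import lru_cache
from math import gcd


@lru_cache(maxsize=None)
def hyperbinary_count(n):
    """Ways to write n as a sum of powers of two, each used at most twice."""
    if n == 0:
        return 1
    if n % 2 == 1:
        # Exactly one 1 is used; halve what remains.
        return hyperbinary_count((n - 1) // 2)
    # Either no 1s are used, or exactly two of them are.
    return hyperbinary_count(n // 2) + hyperbinary_count(n // 2 - 1)


def rationals(count):
    """First `count` fractions: each term over the term that follows it."""
    return [(hyperbinary_count(n), hyperbinary_count(n + 1))
            for n in range(count)]


if __name__ == "__main__":
    print([hyperbinary_count(n) for n in range(19)])
    print(["%d/%d" % pair for pair in rationals(10)])
    pairs = rationals(1000)
    print(all(gcd(a, b) == 1 for a, b in pairs),
          len({Fraction(a, b) for a, b in pairs}) == len(pairs))
```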

Prove it, you say? Nope. In truth I ain't man enough, but you can find the entire paper, Recounting the Rationals, by Neil Calkin and Herb Wilf. Read their proof and enjoy.

## Saturday, 8 November 2008

### Some Early Probability History Notes

So we make a fair bet: I roll one die, you roll the other, and whoever rolls higher scores a point. If we tie, we just redo the roll, and the first one to five points wins. Easy enough, but then, when the score is three to one in my favor, you get an emergency phone call and have to leave. How should we distribute the stakes?

It was just such a problem that formed the foundation of early probability, and when it was solved, it sparked a rapid development of problems, and applications of probability.

I would tell you more, but I just read a neat blog by Keith Devlin that covered just such a development, so here, in part, are the words of a master:

"The Unfinished Game,
The problem of the unfinished game, also known as the problem of the points, was described in a book on arithmetic and geometry written by the Italian mathematician Luca Pacioli in 1494, [The text was Summa de arithmetica, geometrica, proportioni et proportionalita, and you can view it here PAT] though it is known to predate that mention. It asks how the pot should be fairly divided when a multi-round tournament has to be abandoned before it is finished. For instance, suppose two players are rolling a pair of dice and agree to play a best of five rounds tournament. Three rounds are played, leaving one player ahead 2 to 1, at which point they must abandon the game. How should they divide the pot?

Pacioli was unable to solve this problem. So too were a number of other mathematicians (and gamblers) who tried, including Girolamo Cardano, Niccolo Tartaglia, and Lorenzo Forstani. The consensus was that the problem could not be solved.

Then, early in 1654, a gambler by the name of Antoine Gombaud, more often referred to in modern history books by his French nobleman's title of the Chevalier de Mere, asked his friend the mathematician Blaise Pascal. Pascal produced a complicated argument that can be made to work, but was not happy with it, so at a friend's urging he wrote to Fermat about it. Fermat quickly found a simple solution.

There are two rounds left unplayed, argued Fermat. In each round, either player can win, so there are in all four different ways the game could continue to its five-round completion. The player who has won one round to the other's two must win both those final rounds in order to win the contest; in the other three possible endings, the player who is ahead after three rounds will win. Therefore, said Fermat, the player who is ahead when the game is abandoned should take 3/4 of the pot, with the other player taking 1/4.

To anyone who sees this solution today, it seems simple enough. (The solution assumes the tournament is thought of as a "best-of-five" rounds, as opposed to a "first-to-three". You need a slightly more complicated argument in the latter case, but the answer is the same, a 3 to 1 division of the pot.) But no one before Fermat saw it, including Cardano who did work out all of the basic rules we use today to combine probabilities. Moreover, when he did see Fermat's solution, Pascal could not accept it, and nor could various of his colleagues he showed it to. What was their problem?

Since the computation is trivial, indeed no different from the calculation of the odds in any game of chance (and actually much simpler than many), the only thing that could be holding everyone back was the fact that what Fermat was counting were "possible futures." Something that two thousand years of received wisdom said was not possible.

Once word got out about Fermat's breakthrough, however - presumably through the highly mobile network of gambling European noblemen - it did not take long for others to jump into the "future prediction" act. Within a single lifespan, modern future prediction and risk management were in place.

The speed of developments that followed the solution to the problem of the unfinished game is staggering.

1657. Christian Huyghens writes a 16-page paper that lays out pretty well all of modern probability theory, including the notion of expectation, which he introduces. [This one is LIBELLUS DE RATIOCINIIS IN LUDO ALEAE and can be found here]

1662. John Graunt, an English haberdasher, publishes an analysis of the London mortality tables, and in so doing establishes the beginnings of modern statistical inference.

1669. Huyghens uses his new probability theory to re-compute Graunt's mortality tables with greater precision.

1709. Nikolas Bernoulli writes a book describing applications of the new methods in the law. One problem he shows how to solve is how long must elapse after an individual goes missing before the court can declare him dead and allow his estate to be divided among his heirs.

1713. Jakob Bernoulli writes a book showing how the new probability theory can be used to predict the future in the everyday world. This is the first time the word "probability" is used in the precise, mathematical sense we use it today. He also proves the law of large numbers, of which more in a moment.

1732. The first American insurance company begins in Charleston, S.C., restricted to fire insurance.

1732. Edward Lloyd starts the precursor of what in 1734 becomes Lloyd's List, and eventually gives birth to the insurance company Lloyd's of London.

1733. Abraham de Moivre discovers the bell curve, the icon of modern data collection.

1738. Daniel Bernoulli introduces the concept of utility to try to get a better handle on human decision making under uncertainty.

1760s. The first life insurance companies begin.

"

A pretty concise History for one blog... If you have additional notes to offer, please do.

## Tuesday, 4 November 2008

### Use Your Math Education, Count Fish!

Earn Big Bucks, Count Fish

Ok, maybe not BIG bucks, but it seems they need more people in the fisheries and wildlife areas to count things, and they are looking for mathematicians who can fill the bill. From an article on "Counting Fish," by Karen Kaplan. Nature, 16 October 2008.

"The Fisheries Service of the U.S. National Oceanic and Atmospheric Administration (NOAA) is looking for a few good fish counters." So begins this brief article in the Career View section of Nature. Scientists with a background in mathematics, computer science, and/or conservation are needed---and are in short supply---to fill positions as "stock-assessment scientists." Reports say that U.S. institutions will graduate only 160 such scientists to fill at least 340 positions. Stock assessment scientists "gather data on species populations, on the basis of catches and aerial surveys. The data inform mathematical models that help design monitoring programs and predict populations under different management scenarios. This in turn helps regulators to set quotas." One such scientist, Larry Alade, says "he's now in a job he loves---contributing to sustainability."

## Monday, 3 November 2008

### more on margins of error from NPR

Keith Devlin did a thing on NPR explaining how polling margins of error work (the accepted view) but messed up on a couple of points in the history, as pointed out by Peggie Lewis:
"Keith Devlin conflated two events when describing the polling errors in the Dewey-Truman race in 1948. The main problem with that year's polling was that it stopped several weeks before the election and missed the late swing to Truman (see Scholastic News, 2008). The previous major polling error was made in 1936 and reported by the Literary Digest on the FDR-Alf Landon race. The Digest mailed out straw ballots drawing names from telephone books and DMV records. Among many other things wrong with their polling methods was the biased sample this reflected (in the depression year of 1936, a sample drawn from those who owned autos and/or who had phones was necessarily a biased sample)."

Thanks to Shelli Temple from the AP Stats group for bringing this to everyone's attention.

## Sunday, 2 November 2008

### Political Polls and Margins of Error????

Well, we are almost to the election, which means an end, finally, to the interminable projection polls. Ok, I actually like statistics, but I'm not sure I accept that political polls are not playing a little fast and loose with the assumptions that are needed to compute confidence intervals. I love it when the election goes the wrong way and they have to come up with scenarios for WHY they blew it. Of course with so many of them out there making 95% confidence intervals, about five percent of the ones you hear SHOULD be wrong... but I think there is more to the problem than just that.
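If you want to see that "about five percent should be wrong" claim in action, here is a rough simulation sketch. It assumes the ideal case: every poll is a true simple random sample of 1000 from a population where the real proportion is 0.55, and each poll builds the usual large-sample 95% interval. The function name, the defaults, and the seed are all my own choices for illustration.

```python
import math
import random


def fraction_of_polls_that_miss(true_p=0.55, n=1000, polls=2000, z=1.96, seed=1):
    """Simulate many ideal polls; return the share whose interval misses true_p."""
    rng = random.Random(seed)
    misses = 0
    for _ in range(polls):
        # One poll: n independent respondents, each "blue" with probability true_p.
        hits = sum(rng.random() < true_p for _ in range(n))
        p_hat = hits / n
        margin = z * math.sqrt(p_hat * (1 - p_hat) / n)
        if abs(p_hat - true_p) > margin:
            misses += 1
    return misses / polls


print(fraction_of_polls_that_miss())
```

Under those ideal assumptions the miss rate should land somewhere near 0.05. Real polls are another matter, which is the point of the rest of this post.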

I came across a blog from Iowahawk (I didn't provide a link because my students come here and some of his language is not the sort of thing I display for my students... they know all the words anyway, but they won't hear them from me) that had a nice expression of what I felt, so I stole parts of it shamelessly...

Statisticians love balls and urns. A typical Stats 101 midterm, for example, usually includes a question along these lines:

"You take a simple random sample of 1000 balls from an urn containing 120,000,000 red and blue balls, and your sample shows 450 red balls and 550 blue balls. Construct a 95% confidence interval for the true proportion of blue balls in the urn."
From this the typical intro stats student can deduce that they are 95% certain the real proportion of blue balls in that urn is 55%, plus or minus 3.1%.
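For anyone who wants to check the arithmetic, here is a minimal sketch of the standard large-sample interval for a proportion, p_hat plus or minus z * sqrt(p_hat * (1 - p_hat) / n), with z = 1.96 for 95% confidence. I am assuming this textbook formula is the one intended; the function name `proportion_ci` is made up.

```python
import math


def proportion_ci(successes, n, z=1.96):
    """Return (sample proportion, margin of error) for a large-sample interval."""
    p_hat = successes / n
    margin = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat, margin


# 550 blue balls in a simple random sample of 1000.
p_hat, margin = proportion_ci(550, 1000)
print(f"{p_hat:.3f} +/- {margin:.3f}")
```

The margin comes out at about 0.031, which is the 3.1% quoted above.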

"This is, for all intents and purposes, how political pollsters compute the mysterious "margin of error," which has everything to do (and only to do) with pure mathematical sampling error. If you look at the formula above and round it just a smidge, you get a simple rule of thumb for the margin of error of a sampled probability:
Margin of Error = 1 / sqrt(n)

So if the sample size is 400, the margin of error is 1/20 = 5%; if the sample size is 625 the margin of error is 1/25 = 4%; if the sample size is 1000, it's about 3%.
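To see how much "rounding just a smidge" is involved, this sketch compares the 1/sqrt(n) rule of thumb with the 95% margin at the worst case p = 0.5, which works out to 1.96 * sqrt(0.25 / n), or 0.98/sqrt(n). The two function names are mine, and picking p = 0.5 as the comparison point is an assumption about where the rule comes from.

```python
import math


def rule_of_thumb(n):
    """Quick margin of error: 1 / sqrt(n)."""
    return 1 / math.sqrt(n)


def worst_case_margin(n, z=1.96):
    """95% margin of error when the true proportion is 0.5 (the widest case)."""
    return z * math.sqrt(0.25 / n)


for n in (400, 625, 1000):
    print(n, round(rule_of_thumb(n), 4), round(worst_case_margin(n), 4))
```

The rule of thumb always runs a hair larger than the worst-case margin, so it errs on the cautious side.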

"It works pretty well if you're interested in hypothetical colored balls in hypothetical urns, or survival rates of plants in a controlled experiment, or defects in a batch of factory products. It may even work well if you're interested in blind cola taste tests. But what if the thing you are studying doesn't quite fit the balls & urns template?"

What if 40% of the balls have personally chosen to live in an urn that you legally can't stick your hand into?

What if 50% of the balls who live in the legal urn explicitly refuse to let you select them?

What if the balls inside the urn are constantly interacting and talking and arguing with each other, and can decide to change their color on a whim?

What if you have to rely on the balls to report their own color, and some unknown number are probably lying to you?

What if you've been hired to count balls by a company who has endorsed blue as their favorite color?

What if you have outsourced the urn-ball counting to part-time temp balls, most of whom happen to be blue?

What if the balls inside the urn are listening to you counting out there, and it affects whether they want to be counted, and/or which color they want to be?

If one or more of the above statements are true, then the formula for margin of error simplifies to

Margin of Error = Who the heck knows?