Interview with Persi Diaconis

A most unusual book by Persi Diaconis and Ron Graham has recently appeared, titled Magical Mathematics: The Mathematical Ideas That Animate Great Magic Tricks. It relates some series of card manipulations and tricks with deep mathematics, of different kinds, but with a minimal degree of technicity, and beautifully shows how the two domains really intersect, at least for both authors. Magical Mathematics is a wonderful book, potentially enjoyable by a large range of individuals and not only mathematicians. The following interview took place on the phone on February 22, 2012.

Christian Robert: Persi, I was very glad to read this book because I had always wondered about this side of yours, mysterious as it should be because it’s magic, so I set to read the book and I really loved it. My first question is, how did you find the time to write a general audience book in addition to doing research in mathematics?

Persi Diaconis: Well, why did I write the book, I don’t really know. My interaction between magic and the academic side has always been to keep the magic side out of the academic side. Magic was something I used to do and now I’m an academic. On the other hand, magic has done a lot of good for me and every once in a while you have to give something back and I somehow was persuaded that it would be a good idea. I have hopes that it’ll reach young people, like I was, who have some interest in magic, but could be swayed toward mathematics or stochastic things, and best thing for me would be if over the years some 16 year old would find it in a library or a bookstore and says, hey, that’s cool!

The writing took forever. We signed a contract with Princeton University more than 25 years ago and they paid a big healthy advance and they were very, very nice about it. Every once in a while they would say, how’s the book going? Once a year I’d get a call and I’d say, great, we’re working on it. And then my coauthor Ron Graham said, you know, maybe it’s time we should finish it. We worked pretty hard for a couple of years, intensely flying up and down and getting photos and stuff, and when you decide to finish things then you’re in finishing mode and you can finish it.

Of course that entails taking all kinds of projects that we hope to work on and, deciding we’re not going to do that in this book or in this life and here’s what we have. How it came to be written is just deciding, okay, I’m going to do it. And that is a decision, because it does mean a lot of work focused in the direction of writing for the public instead of doing our technical work and/or analyzing data or doing teaching. So, I’m happy it’s done and onto the next!

Christian Robert: What is next?

Persi Diaconis: Well, I have a long-time project, even longer overdue than this one, about coincidences. I’ve written a little bit about it with Fred Mosteller. I promised Fred that I would do such a book and I’m headed in that direction. I have a public talk, which is about how to think about some of the weird things that happen and then I stopped giving it because I just got tired. Now when I’m asked to give public talks, which is plenty, I just sign up and say, well I’ll give this coincidence talk, I hope to trick myself into getting down to work now I see that it can be done. There’s a real tension between contributing to our field and writing for the public and I have shied away from writing for the public. I did promise Fred I’d do it and if I wait too long, I won’t do it.

So, I think that that will be the next project but I wouldn’t hold my breath. It’s been more than 30 years. Fred Mosteller, who was a marvelous statistician, was fascinated by coincidences and collected in a kind of focused way, articles and clippings and weird stories. When he’d read something, like Jesuits die in three, there would be a story in the paper that somebody noticed the triples—he would make a little back of the envelope calculation or model and then say, it’s not so unlikely or that is unlikely and can I get the data and see if it’s really true? Over the years, he had many giant loose leaf binders filled with these kinds of things, and we wrote an article, it was his presidential address for the American Statistical Association, on methods for studying coincidences.

The New York Times did a front-page article about it, and one thing is, if you get on the front page of The New York Times as professors study coincidences, you get a lot of weird letters. So, we got hundreds of stories. “This happened to me, how can you guys explain it? I’m haunted by that …” We saved every single one of them and made little calculations. All of that material is around, sitting in my office and taunting me. I hope to make some sense out of it. It’s not easy, in the sense that, if you want to figure out what’s really going on, a lot of it is human psychology; why people are surprised by things that aren’t actually so surprising and that’s not something that we’re so trained to do or understand.

Sam Behseta: Persi, one gets the impression that there is a decent number of mathematicians involved with magic.

Persi Diaconis: I wouldn’t say so many. I mean it doesn’t seem to be more than in the population of academics, for example. The most well-known one is Raymond Smullyan who writes about philosophical paradoxes and chess problems and books called, “What is the name of this book.” He was a very serious amateurish magician and a good musician. I don’t know many mathematicians who are interested in magic. Nor statisticians, it’s a handful. One of the things that happen with the book having published is that people come out of the woodwork and send you their tricks. I just returned from a trip so I got a bunch of letters.

I received a letter saying, “here’s a trick with correcting codes and I wrote an article but nobody will publish it. Maybe we could write it up together since you seem to know how to get things published.” I haven’t looked at it yet. Probably it’s an awful trick. One of the problems is that mathematical tricks are usually deadly. We did have a trick that I had hoped to put in the book which was based on error correcting codes like the Hamming code where there’s an array of cards, face up on the table, say 48 cards in 6 rows of 8. Somebody thinks of a card and then points to the row the card is in and calls out the colors and lies when he comes to his card. So you hear something like red, red, red, black, red. Because he has lied, there will be an error in that, and it’ll be an error on the card he is thinking of and because it is an error correcting code you know when he lied and therefore you know what the card is. It’s not a great trick, the way I described it, but it is a start. That’s a trick that didn’t make into the book but it could. Maybe volume two.

Christian Robert: Back to what you were saying about this tension between academic publishing and writing for the general public: Aren’t you afraid of getting too much pressure from writing this book and for giving talks to the general public?

Persi Diaconis: I’m acutely aware of it. If one of our colleagues always has his name in the newspapers … people know that in order to communicate with the public, there is some extent to which you have to lie. If you say the absolute truth with all the weasel words that make your conclusions carefully true, nobody will listen to you. So you do need to sort of simplify things and shortcut the truth and if somebody has always have their name in the paper, I find, at least for myself, why is this person always has his name in front of the public? What is wrong? You trust them less.

I think you can get publicity about every two or three years, and still be a respected academic. You have to keep your head down in some way and, that is another thing. I am pretty good at saying no to giving talks even though. If you ask my wife, Susan Holmes, she says, “you’re always saying yes.” But I say no, about three times for every time I say yes. The end result is still I wind up giving about 50 outside talks a year and I would say about a quarter of them are talks for the general public like the talk on coincidences or what’s the nature of randomness or mathematics and magic tricks. That has increased a little bit because of the publicity.

But I also say no. I had a call from some national organization for something like the Boy Scouts and some earnest person says “our kids don’t get to see any science and you can communicate and would you come to our national meeting?” It is an opportunity of a sort but not for me. On the other hand, I am currently answering emails from having been away for two weeks and two of them are to give big public lectures, I think I said yes to those two. It has been like this for 35 years and it cannot get much worse.

Sam Behseta: I really liked that chapter that you titled “Is this stuff actually good for anything?”

Persi Diaconis: Thank you. I was trying to explain to people what we do a little bit, once I had their attention with those tricks, to say it is a pretty good life, what we do. You know, we get to talk to each other and find a few other people who care about the silly things we do and they get interested and then the progress gets made so I did try to communicate that doing research in mathematics or statistics is an exciting activity.

Sam Behseta: Especially there’s a part that you talk about genetic applications, you mention de Finetti’s exchangeability, Bayes, and reversible Markov chains and all of that, and I found it fascinating, but you kept it at the minimum technical explanation.

Persi Diaconis: Well it wasn’t for you, it was for the public. But for this interview I could say it more carefully. That development keeps going. That was David Freedman’s thesis actually. It is a funny story: David Freedman was a kind of cantankerous guy, a little more honest than most of us and a difficult guy. He was a graduate student at Princeton. He had come to work with John Tukey but he couldn’t figure out what he was talking about. John Tukey spoke his own language and he wouldn’t translate, so one day in 1960, David walked into Feller’s office and said “Professor Feller, I’ve decided I would like to try to do a thesis in probability.” Feller, without looking up, said, all right young man, prove de Finetti’s theorem for Markov chains and then looked back down at his papers and David walked out and went and found out, what is de Finetti’s theorem, what is a Markov chain. He figured out a way of detecting when a stochastic process is a mixture of Markov chains and developed a notion of partial exchangeability.

In order to make his theorem go, he needed to assume that the process was stationary so that the Markov chain started in its stationary distribution. When I came out to the West Coast, I knew that you did not need that assumption that you could let the Markov chain start at any place and still have the theorem. Also, I knew how to make finite versions of it. I met David and Lester Dubins at one of these Berkeley-Stanford meetings—four times a year we have a joint meeting—we started talking and I said I knew about his thesis work. He then invited me to give a talk at Berkeley and I remember Dubins being very hostile, “Who is this kid? He’s not going to tell us anything.” This is 1975. I was a starting assistant professor, but Friedman realized there was something there and we started to work together and the condition of partial exchangeability all worked out. But when we went to making a finite version of the theorem without assuming that the process could be extended to infinity, we had to develop some new theory.

Suppose I give you a sequence, say 0–1 sequence and you can record the transition counts in a 2 by 2 matrix: how many times did it go from 0 to 0, 0 to 1, etc. If I give you such a matrix, you can ask conversely, how many sequences give rise to such a matrix? We know the answer to that by combinatorics. Then, how do I generate a uniform sequence of fixed length with a given transition count matrix? That involves some funny work about trees and Eulerian paths and tricky combinatorics but we figured it out. That allowed us to have a finite version of the de Finetti’s theorem for Markov chains.

About eight years later, a very similar problem came up in thinking about quantifying DNA string matching algorithms. As you know, in DNA, there are strings such as A-C-G-T, or often just 0s and 1s. People might have two strings and then ask: are these two strings close together? There are various moves you are allowed to make: leading, reversing, blocks and rather complicated sets of moves that would allow you to define a metric between strings. Any of the algorithms such as BLAST do that. Suppose you give it two strings. One is a string from your data and another is a string from some library of strings, and then the algorithm finds the distance between your string and the closest matching string in the catalog to be say, 135. Well is that big or small? How do you calibrate that? What the biologists wanted to do is to generate those strings at random, fixing the transition counts or the higher order transition counts, not just 0 to 1 but maybe the previous five symbols to 1. As we know, that just can be seen as a Markov chain on a larger state-space, and the way they were doing it was by naïve simulation: just generate strings at random until you find one that had the right transition counts. But because we had done our math, we could give a closed form algorithm that was exactly generating strings with a given transition count matrix.

That became part of the program BLAST which is used millions of times every day when somebody has a string and wants to go to the big database and find another string that is close. The program also gives you the detail on how sure we are about it. That is a thing I did for a philosophy problem that is used millions of times a day in practice. It doesn’t happen all the time. I have certainly done a lot of things that are not used a million times a day. This work we did on de Finetti’s theorem for Markov chains now has a very healthy development. Nice story, Christian will like: some chemists are studying protein folding and this is the part of the IBM Blue Gene project which is IBM’s big new effort. They had Deep Blue, which played chess and now they are trying to do protein folding.

They are actually doing the molecular dynamics very carefully. They have a molecule with 40 atoms and then 10,000 water molecules around it. Then they all interact with quantum mechanical forces. The problem is, it’s a very, very high dimensional state space. To describe an atom is, to describe anything with 12 numbers: position, velocity, angular position, rotation, and three angular variables. It’s a 100,000 dimensional dynamical system. They wanted to coarse-grain space and time and break it into something like 5000 to 50,000 boxes. Then there are images that the dynamical system will kind of get random within the little boxes and it’s the transitions from box to box that matters.

They are modeling that by a Markov chain. It’s a standard thing to do in chemistry, but then they are Bayesians, hooray! it came as a surprise to me. They said—sort of sheepishly—we are Bayesians and I said, great. And they wanted to do a Bayesian analysis but they were very insistent that they did not just want to put a prior on the Markov transition matrix, but because the laws of mechanics and physics are time-reversible, they were sure that the Markov chain would be reversible so they wanted to put a prior on reversible Markov chains. Well, that is not an off-the-shelf distribution because it is a curved subfamily of the big family.

But because I had done this work along de Finetti’s theorem for Markov chains I knew how to do it. I had written a paper in the Annals of Statistics with Silke Rolles a couple years before on how to do that. There is something like a conjugate prior which updates reasonably neatly and does exactly what they wanted. So they had one of their graduate students working on it as his thesis. Anytime theory meets practice, practice has all kinds of demands that the theory didn’t think about. In our theory, you have a long run of this Markov chain. What they had is many short runs of the Markov chain and the theory needed to be changed and then they wanted to adapt it for Markov chains which were in some states Markovian but in other states second-order Markovian. A student, a marvelous young man named Sergio Bacalado, became interested, did his thesis on the subject, and has two papers in the Annals that extended the work that Freedman and I did and now it is being used: that is the nice thing. It is not just somebody writing papers in the Annals, they are actually doing the inference and they say it makes a big difference. So there is a little leaf out there growing which comes from that so it’s alive today. And of course those sorts of stories are what keep us going. I wish it would happen more often but it is nice that it happens every once in a while that somebody actually read the paper or cares about it.

Christian Robert: Is that related with the paper you wrote on importance sampling for contingency tables with fixed marginals?

Persi Diaconis: No, this is different because in that paper we did not know how to sample. We needed to use this fancy algebra in order to derive Markov chain Monte Carlo to sample from contingency tables with given margins. Here, we have a string, say an ACGT string and the simulation task is the following: fix the transition matrix, how many times did it go A to A, A to G, A to C, etc. Then I want to generate uniformly, at random, a string with that given transition matrix. We do not need to run an MCMC for that. The combinatorics is an analog of perfect sampling, but without iterating or going backwards. It is like sampling from the hypergeometric or something you can just do it. You don’t have to run an MCMC.

So we had a closed form algorithm which we did in order to be able to write down something that we could do calculus on to do our asymptotics. Let me take the two state case: I have a two by two transition matrix, I fixed those numbers and there should be at least one string that is compatible with them and then I want to generate a random string. Well here is what you do. You take two urns and on the outside of one of the urns you write a giant 0 and on the outside of the other urn you write a giant 1. Then if you want to do A many 0 to 0 transitions, you put A balls labeled 0 in the 0 urn and if you want to do B many 0 to 1 transitions you put B balls labeled 1 in the 0 urn and so on. Same thing for the 1 urn.

Then you just do sampling without replacement in the following way: say you start at 0, you reach into the 0 urn, you pull out a ball, you write it down and you throw it away and then if that ball is a 0 you reach into the 0 urn again. If the ball is a 1, the next time you reach into the 1 urn. So you are doing some analog of sampling without replacement but you will balance out and get the transition counts that you want. Now one problem that occurs is if you just do it naïvely, you could get stuck. You could reach into a 0 urn and it is empty, and the math came in figuring out what to do, how to adjust the totals a little bit, so you never get stuck, in that it works. But the adjustment didn’t foul things up, and we had our sharp bounds on sampling without replacement then that enabled us to do the asymptotics, but it needed—going back to the magic book because that’s where it started—the math of the growing sequences and deBruijn, Ehrenfest, Smith, Tutte theorem and if I had not done it earlier because of the magic, I wouldn’t have known that math. So it’s a nice circle.

Persi Diaconis: One other thing that might be an interesting story on the connections between statistics and magic and me is the way that I got to go to graduate school. I guess it is told in the book but I’m not positive. Fred Mosteller who was a famous old statistician, great man, was this very serious amateur magician and, Martin Gardner wrote to him and that was my letter of reference. I had poor grades, Martin Gardner wrote to him, saying “of the ten best card tricks invented in the last 10 years, this kid invented two of them and maybe you should give him a break.”

Fred was on the committee and said, “yeah, let’s take this kid.” And so that was a way in which magic made a big impact on my career. I got to be a graduate student at Harvard because of it. One of the things that you might think, is that magic might enliven, teaching. Fred tried it. He used to do something where he would be lecturing for a class, he’d be writing a lecture and he’d put the chalk down on the ledge and then he’d talk to the kids for a second and then he’s start writing and he had chalk in his hands again. Or he’d be writing on the blackboard and all of a sudden he’d underline something but it would appear in red or something like that. What he said he found was, it was a disaster, the kids thought he wasn’t a professor but some actor or magician and they stopped listening. You know, because they do take us with a certain kind of respect. He found that mixing show-business to liven things up, killed the academic beliefs of the kids. If we hear a lecture from some senior person we listen. Even we listen with a sort of rapt attention. As if the truth is coming from on high or something. Some of that interaction between students and faculty and entertainment part killed it so he stopped doing it. I don’t do it either; I mean I don’t mix my worlds.

I am going to have a problem this next quarter starting in April. I’m going to teach a course out of the book and I put a cap on it at 40 students and it is already completely filled and the students are saying, “I came to Stanford because I wanted to take this course.” I don’t know what I am going to do about it, but, I am going to do a course that mixes mathematics and magic. There are probability tricks that did not get in the book. There are tricks such that people don’t do calculations so well and there are tricks that, you know, work 90% of the time. When they work, folks say, “that’s amazing” and when they don’t work you have to say, “you have to concentrate and believe.”

Christian Robert: You don’t associate directly these card tricks with statistics because there is this level of uncertainty that you wouldn’t want with card tricks?

Persi Diaconis: It’s true. The magicians did a survey: they put out a call in their journals saying, “Would everybody ask 10 people to name a card?” and then they recorded what cards people named and people named very few cards. They are not good at randomizing and they just name the ace of spades or the king of hearts, and so if you know those ten cards, well you could be prepared for that, and appear to do a miracle, whereas people think they just told me any card. Anyway, it is limited, but that was one of the places where statistics touched magic.

Sam Behseta: So are there tricks with coins? The reason I am asking is because you have done this wonderful work in which you studied if there is such a thing as a fair toss and ended up showing that 51% of times it lands on what you started with.

Persi Diaconis: Of course there are coin tricks, lots and lots of them but they’re mostly sleight of hand. There are some mathematical coin tricks which use parity, that are not very good. I’ll answer with a recommendation for a good novel which is called American Gods and it’s a slightly strange book but very readable and, uh –

Christian Robert: From Neil Gaiman?

Persi Diaconis: Yes, that’s right. It’s filled with coin tricks; the guy who is the protagonist does coin magic on the side. I just asked a friend, a magician friend who is in Hollywood if he knows the author and he said, yeah, he is a magic buff.

I don’t know a single magic trick that works using calculus, and there must be some tricks that use calculus, so I am hoping somebody gets annoyed at me and says, what about this?

The book has had some very nice reviews which is nice. It wasn’t just something that we knocked off. We worked very hard at it over a very long time. It’s a lifetime of mine being in magic that is woven into it. I recommend the chapter on juggling, too. Ron Graham has taught thousands of people to juggle. He is a very good mathematician and often in his talks, after the talk, he will say, if anybody brought three balls—and hundreds of people will have brought three balls—he will give a public juggling lesson. Well he has really learned how to teach people how to juggle three balls and all of that is woven into that chapter. I don’t juggle, and the reason is because all my friends who juggle are really good at it.

Sam Behseta: Thank you Persi for the wonderful conversation.

Persi Diaconis: Sure, what a pleasure. It’s nice to hear your voice, both of you. Nice to meet you, Sam, and I think doing what you’re doing and keeping CHANCE lively and visible is great for the field so I’m all in favor.

Christian Robert: Thank you, Persi.

Persi Diaconis: Okay, a pleasure and hope I see you both soon.

Visit Christian Robert’s blog to read a review of Magical Mathematics.

Back to Top

Tagged as: , , , , ,


  1. Nice interview! Readers interested in connecting magic to their statistics teaching may also enjoy this paper available online at:

    Lesser, L. M. & Glickman, Mark E. (2009). Using Magic in the Teaching of Probability and Statistics. Model Assisted Statistics and Applications, 4(4), 265-274.

  2. Doubtless Persi’s interest in magic explains his keen ability to scrutinize experiments in PSI.
    I was disappointed, though, that Persi didn’t show me any magic tricks, either at that conference or others, as I’m a big magic fan. True, Freedman warned me not to ask Persi to do a trick, yet I did.