Of Wranglers and Bankers

Reflections on applied statistics from a career in finance

Stephen Blyth

As an undergraduate at Cambridge University, I was fascinated by the renown of wranglers, students who graduated with first-class honors in the mathematical tripos. Until the early 20th century, wranglers were publicly ranked with the senior wrangler announced amid much fanfare. They included some of the great quantitative minds of the Victorian era: physicists such as George Stokes (senior wrangler, 1841), Lord Rayleigh (senior wrangler, 1865), James Maxwell (second, 1854), and J.J. Thomson (second, 1880); logicians such as Bertrand Russell (seventh, 1893); and economists like John Maynard Keynes (twelfth, 1904). Indeed, these wranglers form a cohort that, in a different era, would have been a prime target for recruitment by Wall Street firms and hedge funds.

My favorite wrangler story, probably apocryphal, concerns the brilliant mathematician William Thomson, later Lord Kelvin. Kelvin was so confident he would be senior wrangler in 1845 that he sent a friend to the Senate House, where the results were read, to find out who was second. On his return, the friend reported, “you, sir.” A further embellishment to the story is that a question on the exam involved a result Kelvin himself had derived but not memorized. He had to spend time re-deriving the proof, whilst Stephen Parkinson (himself later FRS) had memorized the proof and was able to regurgitate it by rote, and so became senior wrangler.

I tell this story for two reasons. First, there is a lesson here about the difference between quantitative talent on the one hand and mechanical, procedural skills on the other—a theme I will revisit later. Second, it allows me to quote Kelvin: “There cannot be a greater mistake than that of looking superciliously upon the practical applications of science. The life and soul of science is its practical application.”

The life and soul of statistics also surely is its practical application. Three decades after Kelvin, the third wrangler at Cambridge in 1879 was a certain Karl Pearson. I wrote in the 1994 International Statistical Review article “Karl Pearson and the Correlation Curve” about Pearson’s contributions to the nascent field of statistics and the development of correlation in particular, and it could well be argued that Pearson is the founder of applied statistics, having established the first such department at University College London in 1911.

I too was a wrangler, but part of a generation for whom a career in quantitative finance, rather than science, was an attractive option. My career, which has taken me from Wall Street to the City of London and the Harvard endowment, has informed my views about the practice of applied statistics, a discipline started by a fellow wrangler 100 years ago. In this article, I summarize some conclusions I have drawn from two decades in finance about the “life and soul of statistics,” the practical application of quantitative reasoning. These observations may, I believe, translate across other areas and provide insight into the practice of applied statistics in other fields of endeavor.

Applied Statistical Reasoning

I have been greatly influenced by my graduate statistical training at Harvard, where I worked with professor Arthur Dempster. Art himself was a student of John Tukey, and my experience in finance has confirmed to me that many of Tukey’s writings about applied statistical reasoning are still relevant.

First, subject matter knowledge is essential. Finance is a multifaceted world full of subtle definitions, jargon, detail, and complexity, and it is remarkable how many people have tried to bring complex probabilistic modeling to bear without understanding the basics that render their analysis immediately meaningless. One has to speak the language.

Early in my career, I was told the cautionary tale of a senior, quantitatively strong trader who was attempting to exploit the mispricing between bonds and bond futures, a non-trivial mathematical problem since a complicated delivery option underlies the bond futures contract. However, while his mathematical modeling of the option was sophisticated, he had not verified the notional size of the bond futures contract (which is $100,000, not $1 million per contract). You can imagine what happened: He bought $1 billion of bonds against selling 1,000 bond futures, which hedged only $100 million and left an unhedged net position of $900 million.
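The arithmetic behind this mistake can be spelled out in a few lines. What follows is a minimal sketch using only the figures in the anecdote; the function name `futures_notional` is illustrative, and a real hedge ratio would also involve conversion factors and durations, which are ignored here.

```python
def futures_notional(contracts: int, contract_size: int) -> int:
    """Total notional (in dollars) covered by a number of futures contracts."""
    return contracts * contract_size


bond_position = 1_000_000_000   # $1 billion of bonds bought
contracts_sold = 1_000          # bond futures sold as the hedge

# What the trader believed he had sold, at $1 million per contract
assumed_hedge = futures_notional(contracts_sold, 1_000_000)
# What he had actually sold, at the true $100,000 per contract
actual_hedge = futures_notional(contracts_sold, 100_000)

unhedged = bond_position - actual_hedge

print(f"hedge the trader assumed: ${assumed_hedge:,}")
print(f"hedge actually in place:  ${actual_hedge:,}")
print(f"unhedged net position:    ${unhedged:,}")
```

Under the assumed contract size the position looks fully hedged; under the true contract size, nine-tenths of it is not.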

Second, expertise and ability in the quantitative techniques themselves are essential. Die-hard applied statisticians may sometimes cast aspersions on those with a purely theoretical training. However, as we prepare students for an applied or interdisciplinary career, we should not abdicate the requirement to provide them with a proper, rigorous foundation. Indeed, I think there is a greater responsibility to provide the appropriate technical training for students who will use statistics in other disciplines.

When I worked at Morgan Stanley in the late 1990s, the firm had developed a highly sophisticated simulation engine, one of the earliest implementations of what would become known as the HJM and BGM interest rate modeling frameworks. There was a large room full of Silicon Graphics supercomputers, reputedly one of the most powerful clusters outside the Department of Defense, running vast multivariate simulations of complex portfolios of derivative contracts (contracts whose value is a function of, or derives from, the value of another financial variable). The time steps for these simulations were weekly out to 30 years, and whenever one needed a daily setting of a particular rate in between time steps, linear interpolation was used.

This procedure gives the correct marginal distribution of the interest rate for that date. However, upon closer inspection, a probabilist can readily deduce that the conditional distribution of a midweek rate, conditional for example on the prior and subsequent weekly fixes, is badly mis-specified. Any derivative whose price depends, say, on the distribution of the difference between the value of the rate on a Monday and a Wednesday was being mispriced. The solution was not prohibitively complex—one needed to implement a Brownian bridge to correct the problem—yet this example shows that, without sufficient quantitative adeptness, costly errors can be made. Finance is an unforgiving field: One of its challenges (or attractions) is that mistakes are almost always exposed and frequently exploited. It was inevitable that the errors in the modeling framework were discovered after mispricing just such an intra-week derivative.
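The effect can be illustrated with a small simulation. This is only a sketch under loud assumptions, not the firm's engine: the rate is modeled as a driftless Brownian motion with volatility `sigma` per square-root day, the weekly grid points sit at day 0 and day 5, and Monday and Wednesday are taken to be days 1 and 3. All names are hypothetical. Under these assumptions the true variance of the Wednesday-minus-Monday change is 2σ²; linear interpolation yields 0.8σ², while sampling from a Brownian bridge between the weekly fixes restores 2σ².

```python
import numpy as np


def bridge_sample(rng, t, t_left, x_left, t_right, x_right, sigma):
    """Draw x(t) for a driftless Brownian motion pinned at both end points."""
    w = (t - t_left) / (t_right - t_left)
    mean = x_left + w * (x_right - x_left)  # same as linear interpolation
    var = sigma**2 * (t - t_left) * (t_right - t) / (t_right - t_left)
    return mean + np.sqrt(var) * rng.standard_normal(np.shape(x_left))


rng = np.random.default_rng(0)
n_paths, sigma = 200_000, 1.0

# Weekly simulation step: rate at day 0 and at day 5
r0 = np.zeros(n_paths)
r5 = r0 + sigma * np.sqrt(5.0) * rng.standard_normal(n_paths)

# Scheme 1: linear interpolation between the weekly fixes
lin_mon = r0 + (1.0 / 5.0) * (r5 - r0)
lin_wed = r0 + (3.0 / 5.0) * (r5 - r0)

# Scheme 2: Brownian bridge, filled in sequentially (Monday, then Wednesday)
br_mon = bridge_sample(rng, 1.0, 0.0, r0, 5.0, r5, sigma)
br_wed = bridge_sample(rng, 3.0, 1.0, br_mon, 5.0, r5, sigma)

var_linear = np.var(lin_wed - lin_mon)
var_bridge = np.var(br_wed - br_mon)

print(f"variance of Wed - Mon, linear interpolation: {var_linear:.3f}")
print(f"variance of Wed - Mon, Brownian bridge:      {var_bridge:.3f}")
```

Given the two weekly fixes, the interpolated midweek rate has no randomness left at all, which is why any contract paying on an intra-week difference sees too little volatility under the first scheme.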

At Morgan Stanley, I also worked with a young Harvard mathematics graduate, who would no doubt have been a wrangler in a different life. Working on another problem involving interest rate options, he discovered, using detailed subject matter knowledge and quantitative skills, that the price of a non-standard swaption, a particular type of interest rate derivative, depended on the second-order term of a Taylor series, and that other traders were ignoring this term. The difference in price was only six basis points (six one-hundredths of one percent), but on $1 billion notional, that is $600,000. This example illustrates the unforgiving dynamism of finance: Within hours of his discovering this opportunity, we had executed more than $2 billion of trades.

Third, and perhaps most important, good judgment is essential. Dempster drew the distinction between procedural statistics—mechanical application of tests and techniques such as t-tests—and logicist statistics involving reasoning about the problem under investigation. Proceduralism (perhaps unfairly represented by the example of Parkinson above), the rote application of techniques—especially mathematically sophisticated ones—can lead to serious problems in finance just as it can in other areas of applied statistics.

For example, there has been much work on stochastic volatility models in finance, where the standard deviation of the asset price process is itself a random process. Practitioners were excited when they developed a tractable model that could be fit to observed option prices (which fully determine the terminal distribution of the asset price). The model was widely adopted. However, deeper analysis showed that certain calibrations led to distributions with no higher moments and that this model would assign infinite value to another financial contract called the constant maturity swap. This was clearly absurd.

More broadly, finance in the 2000s was rife with the curse of proceduralism, the poor reasoning and lack of judgment that led to illogical conclusions. There was, in particular, a procedural tendency for practitioners to adopt what one might call the minimally sufficient model, the simplest model that could produce a reasonable price for a particular product. For example, traders might use a Black-Scholes pricing model with a skew matrix of volatilities for different strikes to price European options, then add in correlation parameters to price options on the spread between two rates, then construct a one-factor lattice to price American options with early exercise decisions, and so on. Lost in this pragmatic approach is that one is constructing a modeling foundation with inconsistent dynamics, implicitly allowing different models for the evolution of the asset to hold simultaneously. One statement about asset price modeling that one knows for sure is that there can only be one underlying dynamic for the asset price, even if one does not know what that is. Surely a logical quantitative approach must demand such consistency.

The Financial Crisis

Since 2010, I have been teaching an undergraduate course at Harvard, and one of the texts I use is Mark Joshi’s book, Concepts and Practice of Mathematical Finance. On Page 1 is the following passage:

“There is of course a possibility that the government will renege on its promise to pay (i.e., default). But if we pick the right government this possibility is sufficiently remote that we can for practical purposes neglect it. If this seems unreasonable, consider that if the British, American or German government reached such straits, the world’s financial system would be in such a mess that there would be precious few banks left to employ financial mathematicians.”

There is wonderful reductio ad absurdum logic to this statement. Of course, the possibility of government default is now decidedly non-ignorable, being at the heart of recent financial turmoil. This quote begins to give an idea of the depth of the fundamental upheavals that have shaken the foundations of quantitative finance since the financial crisis and the bankruptcy of Lehman Brothers. I’ve written recently on this subject in a series of papers under the theme of “The Quant Delusion.” The edifice of quantitative finance built by the generation of wranglers and other ex-scientists over the past 20 years, and the logically sound fundamental arguments underpinning this edifice, have been challenged in deep ways by empirical price data during and since the financial crisis.

For one example, consider the conundrum of collateralized versus unsecured funding. One can immediately assert that unsecured borrowing between banks (which takes place at the infamous LIBOR rate) must be at a higher rate than repo financing between the same institutions, which is collateralized by U.S. government bonds. If not, the borrower would simply borrow on an unsecured basis at the lower rate and keep hold of their bond collateral. This has indeed always been the case: The difference between short-dated LIBOR and repo rates has always been positive.

The spread tends to widen at times of financial stress, and reached record levels soon after the Lehman default; so far, nothing untoward. However, a few further steps of financial engineering (in particular, the construction of a trade involving a bond purchase, a repo transaction to finance the bond, and a fixed-rate interest rate swap) led practitioners to conclude that a quantity known as the swap spread must also be positive. The swap spread is the difference between the swap rate and the Treasury yield, essentially a measure of the long-term average of unsecured less collateralized rates. If it were not positive, a portfolio of positive cash flows could be constructed at minimal cost.

In the United States, the 30-year swap spread was indeed always positive—until the aftermath of the Lehman bankruptcy, when it gapped down dramatically to become negative. The swap spread has remained negative from early 2009 to this day, and continues to puzzle Harvard Business School students as they work through the case study on this and related phenomena.

The second example is even starker. Basic orderings of derivative prices that had been taken for granted since the inception of markets were shown not to hold in certain situations. In particular, the prices of three derivative contracts became dislocated to such a degree that bounds based simply on the triangle inequality did not hold.

To be more precise, consider three contracts with the same maturity date whose payouts are max{X-K₁, 0}, max{Y-K₂, 0}, and max{X+Y-K₁-K₂, 0} respectively, where X and Y are the prices at the maturity date of two financial assets and K₁ and K₂ are constants. It is clear upon inspection that if one owns the first two contracts, then the payout one receives at maturity must be at least as great as that of the third contract, whatever values X and Y take.
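The payout ordering follows from the elementary inequality max{a, 0} + max{b, 0} ≥ max{a+b, 0}, applied with a = X-K₁ and b = Y-K₂. A quick numerical check makes this concrete. It is only a sketch: the strikes, the normal distributions used to generate scenarios for X and Y, and all variable names are hypothetical choices, not market data.

```python
import numpy as np


def call_payoff(price, strike):
    """Payout max{price - strike, 0} of a call-style contract at maturity."""
    return np.maximum(price - strike, 0.0)


rng = np.random.default_rng(1)
n_scenarios = 100_000
k1, k2 = 3.0, 4.0                      # hypothetical strikes K1 and K2
x = rng.normal(3.0, 2.0, n_scenarios)  # hypothetical scenarios for X
y = rng.normal(4.0, 2.0, n_scenarios)  # hypothetical scenarios for Y

first_two = call_payoff(x, k1) + call_payoff(y, k2)
third = call_payoff(x + y, k1 + k2)

# Excess payout of holding the first two contracts over the third
excess = first_two - third

# Allow a tiny tolerance for floating-point rounding
never_worse = bool(excess.min() >= -1e-12)
share_strictly_better = float(np.mean(excess > 1e-12))

print("first two contracts never pay less:", never_worse)
print("share of scenarios where they pay strictly more:", share_strictly_better)
```

Because the ordering holds scenario by scenario, standard no-arbitrage reasoning implies that the combined price of the first two contracts should be at least the price of the third.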

However, during the financial crisis, there were cases in which the sum of the prices of the first two contracts was lower than the price of the third contract. (The financial quantities in these actual cases were euro interest rates.) Such price behavior stunned seasoned practitioners. How can one find bounds for derivative prices or order such prices when even the triangle inequality no longer holds?

I find an analogy with mathematics to be illuminating here. The axiom of choice is an axiom underpinning mathematical logic and set theory. Assuming this axiom holds, one can then develop a coherent approach to set theory and establish, for example, the well-ordering of the real numbers. Without the axiom, it is fair to say that mathematics is doable but becomes far messier. The financial crisis represented the financial equivalent of mathematics without the axiom of choice. Orderings of derivative prices that seemed inviolate no longer held. Sound logical arguments were shown to be false. In such an environment, finance is still doable (indeed, the derivative products that have caused such puzzlement still exist), yet the practice of it becomes far messier.

Thus, judgment, the ability to reason about uncertainty, and the ability to combine subject matter knowledge and technical expertise with experience are ever more important. In finance, such judgment can represent the difference between success (solvency and calmness) and failure (bankruptcy and crisis).