Thursday, March 20, 2008

Meet the thinkers: The curious aviary of Dr. Taleb

Cygnus atratus

We also know there are known unknowns; that is to say we know there are some things we do not know. But there are also unknown unknowns - the ones we don't know we don't know.
- Donald Rumsfeld



Some of us have been waiting for something like this book for a long time, and its Levantine author has come a long way - all the way from the hills of northern Lebanon and the Syro-Greek Orthodox town of Amyoun. The book is The Black Swan: The Impact of the Highly Improbable, and the author is Nassim Nicholas Taleb, former financial trader and now extraordinary professor of the inexact sciences at the University of Massachusetts, Amherst, etc., etc. - essentially, the Dean's pet, and they don't know where to put him. The Black Swan is one of the most important science books for a non-science audience in many years. Like the best chaos and complexity books of a decade or two ago, The Black Swan deals with scientific questions arising from the stuff of everyday life, not far-off galaxies and times long ago.

The core of Taleb's point is the impact of what we don't know, the improbable, and how "randomness" is really another name for our ignorance. But Taleb has a larger target: a whole book was needed to attack and dismantle the legitimacy of bell curve statistics, "Mediocristan" methods wrongly applied to "Extremistan," and explain why so much of the world doesn't follow the "middle of the road" behavior prescribed by the Gaussian-normal distribution and its cousins, such as the binomial or Poisson distributions.

The book is rich with fallacies exploded:
  • The Ludic Fallacy. This is the fallacy we pick up when we learn probability based on tightly constrained assumptions, "rules of the game," that make understanding statistical methods based on them as easy as an elementary cookbook. (Ludus is Latin for "game" or "fun.") Real life often presents us with situations of limited knowledge, where probabilistic thinking is appropriate, but where we don't know the "rules of the game," at least not all of them. Many trained in probability and statistics apply the cookbook methods anyway, for lack of anything better. They capture risk - the known unknowns - but not true uncertainty - the unknown unknowns.

  • The Narrative Fallacy. This is a biggie, practiced on an industrial scale by the news media, every day. We draw connections between dots where the real connections are different, or don't exist, or where there are no dots to be found. The news media do it to keep our attention with frequently made-up stories, or "narratives," to use the post-modern jargon, that seem better than no story, or a different one.

  • The Narrative Fallacy supports a related fallacy, one of Misattributed or Reified Intentionality, the fallacy that human society is collectively a result of human intentions or consciousness. In fact, most of it is not, and attempts to force it to be so have led to one disaster after another. Our minds are too limited and possess too narrow a scope of awareness to make this possible. Human society is mostly made behind our backs, so to speak. Taleb's developed views on this question end up very close to the views of the famous Austrian school of economics and sociology.

Taleb has an outrageously funny time explaining what went wrong with statistics and the social sciences in the 19th century, when they were invaded by the concept of the Average Man, and everything was reduced to bell curves, means, and small variations.* All this would hold if our world were Mediocristan. But much of our world is not.

Who's Stan, and what's the difference? Mediocristan is tightly constrained by "fixed totals," or what physicists call "conservation laws." We've already met these and seen what they do. They force the collective behavior into highly restricted patterns, with "equipartitions" of energy, or number, or volume. This certainly is an aspect of our world, and not just in thermodynamics. Height and weight, for example, both of them strongly constrained by gravity and metabolism, are distributed in a way close to the bell curve. But then again, consider the distribution of weights in aquatic animals, and you can already see: without gravity, the maximum size is much bigger (think of whales and octopi).

The key to Mediocristan is the Central Limit Theorem. If an attribute is built up as the sum of many independent instances, each with well-defined moments (weightings), in particular a finite variance, then its distribution approaches the bell curve in the limit of "large numbers." The presence of "fixed totals" guarantees well-defined distribution weights (moments).
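
A quick way to watch the Central Limit Theorem at work is to simulate it. The sketch below is mine, not Taleb's: it assumes Uniform(0, 1) draws as the independent instances, and the function names, sample sizes, and seed are arbitrary choices made for illustration.

```python
import random

def standardized_sample_means(n_terms, n_trials, seed=0):
    """Return n_trials standardized means of n_terms Uniform(0, 1) draws."""
    rng = random.Random(seed)
    mu = 0.5                      # mean of Uniform(0, 1)
    sigma = (1.0 / 12.0) ** 0.5   # standard deviation of Uniform(0, 1)
    scale = sigma / n_terms ** 0.5  # standard deviation of the sample mean
    means = []
    for _ in range(n_trials):
        m = sum(rng.random() for _ in range(n_terms)) / n_terms
        means.append((m - mu) / scale)
    return means

def fraction_within(values, k):
    """Fraction of values lying within k standard deviations of zero."""
    return sum(1 for v in values if abs(v) <= k) / len(values)

z = standardized_sample_means(n_terms=50, n_trials=20000)
print(fraction_within(z, 1.0))  # the bell curve predicts about 0.683
print(fraction_within(z, 2.0))  # the bell curve predicts about 0.954
```

The individual draws are flat, not bell-shaped, yet their averages should land close to the Gaussian fractions. That is Mediocristan in miniature: independence plus finite variance.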

But in many, perhaps the majority of, cases, it fails. The instances are not independent of one another, not distributed with well-defined weights, or both. The distribution then has much less reason to clump near the mean. In fact, in such cases, many of our usual statistical clichés (means, variances, medians, etc.) fail to capture what's going on.

This is the world Taleb calls Extremistan.** If there's no "fixed total" of something being distributed (like economic wealth, or the total number of books sold by a single author, say), there's no reason to think that the total will be broken up in a roughly even way among instances. Here is the key to understanding much of our world - economic markets, wealth, and income in particular. Many days on markets are boring. Some are interesting. A few are extraordinary - and it is these days, the black swans of the financial world, that end up dominating the cumulative history of the market. Just look at the last few months' newspapers.
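
To make the contrast concrete, here is a rough sketch that asks how much of a total the single largest instance accounts for. The two distributions are my own stand-ins, not anything from the book: a Gaussian with an assumed mean of 170 and standard deviation of 10 plays the part of heights in centimeters, and a Pareto distribution with an assumed tail exponent of 1.1 plays the part of wealth. The helper name `largest_share` is hypothetical.

```python
import random

def largest_share(samples):
    """Fraction of the total contributed by the single largest observation."""
    return max(samples) / sum(samples)

rng = random.Random(42)
n = 10_000

# Mediocristan stand-in: heights in cm, Gaussian (assumed mean 170, sd 10).
heights = [rng.gauss(170.0, 10.0) for _ in range(n)]

# Extremistan stand-in: wealth, Pareto with an assumed tail exponent of 1.1.
wealth = [rng.paretovariate(1.1) for _ in range(n)]

gauss_share = largest_share(heights)
pareto_share = largest_share(wealth)
print(f"largest height / total height: {gauss_share:.6f}")
print(f"largest fortune / total wealth: {pareto_share:.6f}")
```

In a run like this, the tallest person should account for a negligible sliver of the total height, while the single largest fortune takes a far larger share of all the wealth. The exact figures depend on the seed and on the assumed parameters.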

We encounter similar truths in biological evolution, in contrast to the anodyne but wrong gradualism still dominantly taught. Most of the cumulative change in biological evolution is due to a small number of extraordinary turns of events that have outsized impacts echoing through the millennia. Ditto for human history.

And of course, on Taleb's home ground of financial markets, the reality of black swans, and fractal or fat-tailed distributions, is of intense interest. The disastrous application of bell curve-based statistical methods to quantitative finance in the last generation has not made markets better-behaved or investment strategies sounder. On the contrary: the 1998 Long Term Capital Management and 2008 mortgage crises make clear just how wrong these methods are. They're "state-of-the-art" in some sociological sense, but it's a mistake to call them an art, much less a science.
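
One way to see why the bell curve misleads here is to ask how fast the odds fall off as an event gets bigger. The sketch below compares a standard normal tail with a power-law tail; the tail exponent of 3 is an assumption picked for illustration, not a fitted market value, and the function names are hypothetical.

```python
import math

def gaussian_tail(k):
    """P(Z > k) for a standard normal variable Z."""
    return 0.5 * math.erfc(k / math.sqrt(2.0))

def power_law_tail_ratio(alpha):
    """P(X > 2k) / P(X > k) when P(X > x) is proportional to x**(-alpha)."""
    return 2.0 ** (-alpha)

# Under the bell curve, doubling the size of a move makes it astronomically rarer.
for k in (2, 4, 8):
    ratio = gaussian_tail(2 * k) / gaussian_tail(k)
    print(k, gaussian_tail(k), ratio)

# Under a power law, doubling the size costs the same fixed factor at every scale.
print(power_law_tail_ratio(3.0))  # 0.125 regardless of k
```

Under the Gaussian, each doubling makes the event rarer by a factor that itself keeps shrinking, so very large moves are effectively ruled out. Under a power law the penalty for doubling stays constant, so large moves stay on the table.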

We've met these strange birds already: Taleb's black swans are the stream of unique events of chaos. His grey swans are those occasional, semi-tamable events at the low frequency end of the spectrum.

Plato in Nerdistan. As the book develops in its middle, Taleb wanders through the thickets of epistemology, how we know what we know. This part is somewhat weaker than the book's earlier and last parts, because the argument goes too far afield and loses a bit of focus. Taleb over-blurs the distinction between event (his specialty) and entity. Before European explorers reached Australia, Europeans believed that all swans were white. The whiteness was not an essential part of the definition of "swan," nor was the belief obviously false. It was a contingent statement about two different properties of things: "swanness" and "whiteness." This supposed connection met its end when the explorers encountered the black swans of Australia. A deeper lesson took longer to sink in, and some still resist it: disproving something is much easier than proving it. Proving something requires understanding its nature more deeply and thoroughly than our knowledge often runs.

Even this middle part is rich with deserving targets. Taleb calls them "Platonified abstractions," the stuff of academic knowledge. They're thrown around confidently by people who don't know what they don't know. This might almost be a definition of nerdity: what you know fits into cut-and-dried abstractions, and you confuse these with the actual world only known to us very imperfectly. Nerds stand in counterpoise to Taleb's foil, the Fat Tonys, the proverbial cabdrivers of the world who know better and who understand that when it comes to Platonicity, you can take it or leave it.

What do you know, and how do you know it? Exact human knowledge is coined under laboratory control or by precise logic. Most of the knowledge we use in everyday life is approximate knowledge in well-defined, if not controlled, conditions. At the edges of what we know is amorphous knowledge, often mixed in with a lot of prejudice and guessing. And if we want more and better knowledge, we face the reality of trade-offs. I can be sure something will happen today, but I don't know its significance. I can also be sure something significant will happen in the next year, but I don't know when.

Modern science is not based on induction, contrary to common belief. It's based on a mixture of hypothesis, deduction, controlled experiment, and controlled mathematics. It's not because scientists are dogmatists that they live by deduction. It's because deduction allows one's reasoning to be kept under precise control, with all the assumptions on the table and the steps clear. Induction (like statistical correlation) can certainly be strongly suggestive of hypotheses, and it's essential for developing logical definitions. But you can't prove anything with it. One counterexample - the black swan - destroys it. Silent evidence is always lurking to upset the induction cart.

The essence of probability. Coping with limited knowledge means falling back on probabilistic arguments, and this is in fact the origin of statistics. Its modern founders (Pascal, Bayes, Laplace, Gauss) all identified probability with a greater or lesser sense of certainty about something, not its frequency. This distinction fueled a great 19th-century debate between Bayesians and frequentists. Until the 1920s, the frequentists had the upper hand. But modern mathematics has abandoned frequentism, except as an approximation in carefully circumscribed situations where the Ludus isn't a Fallacy (like sports or gambling, for example). With frequentism came many long-unexamined false assumptions; for example, that "noise" and "randomness" are "theory-free" concepts. In fact, few things are more loaded down with theoretical assumptions than "randomness," if taken as a metaphysical category. Taking it as a statement about the limits of human knowledge, OTOH, makes it almost a truism. In most cases, the Ludic Fallacy will come back to bite us: we often don't know all of the rules of the game.
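
Probability as degree of certainty can be illustrated with Laplace's rule of succession. The sketch below assumes a uniform prior over the unknown proportion of white swans, which is my assumption for the example and not something the book prescribes; the function name is hypothetical. After seeing n swans, all of them white, the rule gives (n + 1) / (n + 2) as the probability that the next swan is white.

```python
from fractions import Fraction

def rule_of_succession(successes, trials):
    """Laplace's estimate of the next success, assuming a uniform prior."""
    return Fraction(successes + 1, trials + 2)

# Every swan seen so far has been white.
for n in (10, 1000, 1000000):
    p = rule_of_succession(n, n)
    print(n, float(p))
```

However many white swans pile up, the number creeps toward 1 without ever reaching it. And the rule only speaks to the rules of the game we have assumed: it says nothing about a continent we have not visited yet.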

Unfortunately, the frequentist approach to statistics is still taught because it's cookbook. Even in situations where a canned approach is not appropriate, a recipe feels comforting, relieving people of having to think. I might even call this the Cookbook Fallacy: having a wrong recipe is better than no recipe. Actually, no recipe is better than a bad one - at least it's honest and doesn't force us into wrong assumptions.

Taleb in his garden. Along with his skeptical empiricism, Taleb exhibits other exquisitely refined scientific tastes, paralleling his capacious gourmand tastes in literature and food. This might seem an affectation, but it points to an important truth.

Richard Feynman, another man of powerful scientific intuition, said: to do good science, you gotta have taste! Science, like the arts, has its forms of kitsch: rules mechanically applied without the imagination and drive for the fully worked-out development, but without indulging in useless repetition. Science is still and will always remain partly an art. To have taste is to avoid weak arguments and rationalizing, and to avoid applying methods and concepts where and when they are not valid. It is to think that, if there's no deep fundamental principle that prevents something, then why not? What would the world look like if it were so? Maybe you lack the imagination to see that that is our world. Taste is seeing that not taking obvious things for granted is a true royal road to discovery. It is paying attention to the silent evidence, to the dog that didn't bark, and to the pious who prayed and drowned anyway: survivor bias.

Taste in science also requires revisiting fundamental issues, ones never completely resolved. The progress of science has solved many problems defined more narrowly. But deep issues remain, even if transformed. Science has its classics and its literature, history, and philosophy; progress doesn't erase their importance. Read them and avoid being a cultural philistine.†

Taleb reminds us that what we don't know can hurt us, and that what we don't know is often more important than what we do. The Black Swan is a fine book. Buy, read, and enjoy it, patiently and slowly. And if nothing else, be charmed by the bittersweet tale of Yevgenia and her unknown masterpiece.
---
* Hayek attacked much the same in his The Counter-Revolution of Science, laying out the 19th-century origins of Platonified pseudo-knowledge in the social sciences and the pretensions of social planning that often went with it. Plenty of perceptive people, like our old friend Poincaré, resisted this development, this misapplication of inappropriate mathematical methods to society. But the Tyranny of the Cookbook is an unrelenting one.

** Not to be confused with Wackistan. That's where Ahmadinejad lives.

† Taleb uses the German term, Bildungsphilister, just to show, I suppose, that he isn't one.

Be alert to a real affectation, indulging in philosophical problems isolated from anything real. As Taleb points out, most philosophical issues worth bothering with are suggested by something outside philosophy.
