Posts tagged with quantification

Random portfolios have the power to revolutionize fund management.

There is no convincing evidence that more than a handful of funds have consistently outperformed. This should tell every active fund manager on the planet that the present form of performance measurement is inadequate.

Performance measurement via a benchmark is hopelessly noisy — it takes decades to get a real answer.

A fund manager that can outperform should do better when the tracking error constraint is removed. Much better to use random portfolios to measure the performance of active funds to see if they are adding value. Funds should be judged with minimum tracking error constraints. It is in the investor’s best interest for the active funds they invest in to be as uncorrelated as possible with the indices that they invest in passively. That means a large tracking error.

Patrick Burns

(edited and amalgamated by me, without adding anything substantial)
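
A rough sketch of what "measure against random portfolios" could look like, under assumptions that are mine and not Burns's: long-only, fully invested portfolios, a cap on any single weight as the only constraint, and a single period of synthetic returns. The function names and all numbers are made up for illustration. Normalising uniform draws is a shortcut and does not sample the constraint set evenly.

```python
import random

def random_portfolio(n_assets, max_weight, rng):
    """Draw long-only weights summing to 1, rejecting draws that break the cap."""
    while True:
        raw = [rng.random() for _ in range(n_assets)]
        total = sum(raw)
        weights = [x / total for x in raw]
        if max(weights) <= max_weight:
            return weights

def portfolio_return(weights, asset_returns):
    return sum(w * r for w, r in zip(weights, asset_returns))

def percentile_vs_random(fund_return, asset_returns,
                         n_random=2000, max_weight=0.3, seed=0):
    """Fraction of random portfolios whose return the fund beat."""
    rng = random.Random(seed)
    beaten = 0
    for _ in range(n_random):
        w = random_portfolio(len(asset_returns), max_weight, rng)
        if fund_return > portfolio_return(w, asset_returns):
            beaten += 1
    return beaten / n_random

# Synthetic one-period returns for ten hypothetical assets.
asset_returns = [0.04, -0.02, 0.10, 0.01, 0.07, -0.05, 0.03, 0.00, 0.06, 0.02]
fund_return = 0.05  # hypothetical fund result over the same period
score = percentile_vs_random(fund_return, asset_returns)
print(f"fund beat {score:.0%} of random portfolios")
```

The point of the comparison is that the random portfolios obey the same constraints as the fund, so the score is not hostage to whatever the benchmark happened to do.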


The Audacity of Despair

by David Simon (creator of The Wire)

  • arch cynicism about the public purpose of television
  • The Wire is not hyperbolic about our inability to solve our own problems.
  • The news media buries and forgets relevant information.
  • New Orleans was not destroyed by Hurricane Katrina. An untethered barge breached the retaining wall, destroying the Ninth Ward.
  • Three years later during Hurricane Gustav, another barge was unsecured in the same canal.
  • The Wire is not about sinister people doing sinister things. There’s no fun in that. There’s no drag in writing a show about bad guys and good guys. First of all, it’s not credible. And second of all, it’s not where the real evil lurks.
  • As a reporter: “Every time someone dragged out a statistic, I immediately distrusted it as [probably fabricated or] dubious [method]”
  • Management: No sooner does someone invent a useful measure of institutional progress, than someone else begins to game it to the point that the measure becomes useless.
  • "In my city [Baltimore], every single effort to quantify progress was an effort by somebody to advance themselves.”
  • People are promoted or leave to another job before anyone figures out what they got was dross.
  • Cops retire with a pension despite making zero progress in 40 years in the war on drugs.
  • Why? is the only one of the 5W’s+1H that matters. That could have made journalism “a game for grown-ups”.
  • Bulls∗∗t US government claims about progress in Vietnam.
  • More profitable for Chicago Tribune Company’s shareholders to stop asking Why?—and lay off reporters.
  • This was due to their monopoly: they didn’t need top-quality journalism to compete. But the drop in quality, if efficient at the time, made the papers soft targets when the Web became big half a decade later.
  • He thinks Internet reporting is less magazine-like and more frothy. I contend ∃ both.
  • "Crime wasn’t going down anymore. So robberies became larcenies. Aggravated assaults became common assaults. Felonies were leached down to misdemeanors.” Robberies in southwest Baltimore went down 70%. The commander was promoted to head of CID. Next boss went in, crime went up 70%, he took the flak.
  • "40% decline in crime, but the murder rate stayed constant. [red flag] The only thing that that says rationally is that they’ve opened up a gun range in West Baltimore and they’re better shots.”
  • Any reporter who had any sense of his beat would know this was a huge red flag, would dig deeper into the data and call the complainants.
  • "How is it that we’re able to talk about this in an entertainment medium—television—but not in journalism?”
  • Curfew for Blacks in Baltimore (fallacious arrests). ACLU tries to sue, but by the time it wends its way through the courts the practice has stopped; the Mayor has become Governor.
  • "If you walk into The Other America and ask people how they feel about certain things, you’re likely to hear how they feel.”
  • "We stole facts from real life, but thematically the people we stole the most from were Euripides, Aeschylus, and Sophocles."


The Bechdel Test

Does a film contain

  • two named females
  • who talk to each other
  • about something other than a man?

This seemingly low bar for female inclusion fails for a surprisingly high fraction of media. Even some excellent films, like The Godfather, fail it.
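
The three criteria are mechanical enough to write down. Here is a minimal sketch, assuming someone has already hand-labelled each conversation with who takes part and whether the topic is a man. The `Character` and `Conversation` types and the toy films are invented for illustration and do not describe any real movie.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class Character:
    name: Optional[str]  # None means the character is never named
    female: bool

@dataclass(frozen=True)
class Conversation:
    speakers: tuple      # the Characters taking part
    about_a_man: bool    # hand-labelled: is the topic a man?

def passes_bechdel(conversations):
    """True if some conversation has two named women and is not about a man."""
    for conv in conversations:
        named_women = {c.name for c in conv.speakers if c.female and c.name}
        if len(named_women) >= 2 and not conv.about_a_man:
            return True
    return False

# Toy, invented data; not a transcript of any real film.
ann, bea = Character("Ann", True), Character("Bea", True)
carl = Character("Carl", False)
waitress = Character(None, True)

film_a = [Conversation((ann, carl), False), Conversation((ann, bea), True)]
film_b = film_a + [Conversation((ann, bea), False)]
film_c = [Conversation((ann, waitress), False)]

print(passes_bechdel(film_a), passes_bechdel(film_b), passes_bechdel(film_c))
```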

The Godfather. Can you remember what the actual final shot of the film is? Some people are surprised that the POV is on Kay from inside Michael’s room - her anguished face is blocked out by the closing door. The preceding shot is her looking in as men greet the new godfather. Ultimately the final shot is saying we are with Michael now in that room, and he/we are part of the closing of the door. A great, symbolic final shot. The first of three perfect final shots in the series.

(You could argue that female exclusion is a theme of The Godfather, but still wouldn’t it have been interesting to view some of the wives’ and daughters’ thoughts to each other about the boys’ mobster behaviour? This isn’t asking for the movie to be about women, just to feature their speech.)


The Bechdel test is interesting mathematically because it is a global non-local test. Not every movie needs to pass for “things to be good” but if too many movies fail then things are not good.


You could also view the Bechdel test as a vague or smudged boundary condition. Like in sensitivity analysis (in linear programming) where you nudge the boundary planes with a slack vector to see how the system responds. We could perturb the definition of the test, and as we change the criteria or interpretation more or fewer movies will pass. But the test makes its point whether we interpret it loosely or stringently, so we could consider it a suite of boundaries rather than a single, crisp boundary.
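
To make "a suite of boundaries" concrete, here is a small sketch that nudges two knobs of the test (whether the women must be named, and how many lines they must exchange) and recomputes the pass rate for each variant. The knobs are my own choice of perturbation, and the five films are invented, so the particular rates mean nothing; only the shape of the exercise matters.

```python
from itertools import product

# Each toy film is a list of woman-to-woman conversations:
# (both_named, lines_exchanged, about_a_man). Invented data.
films = {
    "film_1": [(True, 12, False)],
    "film_2": [(True, 2, False)],
    "film_3": [(False, 8, False)],
    "film_4": [(True, 20, True)],
    "film_5": [],
}

def passes(convs, require_names, min_lines):
    """One variant of the test, set by the two knobs."""
    return any(
        (named or not require_names) and lines >= min_lines and not about_man
        for named, lines, about_man in convs
    )

def pass_rate(require_names, min_lines):
    n = sum(passes(c, require_names, min_lines) for c in films.values())
    return n / len(films)

# Sweep the boundary: loose to strict on both knobs.
rates = {
    (require_names, min_lines): pass_rate(require_names, min_lines)
    for require_names, min_lines in product([False, True], [1, 5, 10])
}
for variant, rate in sorted(rates.items()):
    print(variant, rate)
```

Tightening either knob can only lower the pass rate, which is what you would expect from moving a boundary inward.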

Individual playwrights can write whatever they want. Blue Lagoon with two boys? Be my guest. An all-white cast in a story set in rural Sweden circa 1320? Makes sense. Nju Bao (in 炮打双灯) isolated without female counsel in a man’s world? Appropriate. But when the Bechdel test fails en masse something insidious is going on. Which focus group told film investors that audiences hate seeing women talk to each other? Who went through all the scripts and changed all the female names to male ones? I’m guessing no-one.


Sexism, racism, and so on are often discussed on a case-by-case basis. Was this or that action sexist|racist|etc on its own? But not every property can be observed at a zoomed-in level. Some properties are only visible at a systemic or macro level.

As a side note, the frequent failure of the Bechdel test also argues, via modus tollens, against a certain kind of “markets will fix things” logic. I would think that economic forces would incent film producers away from being so exclusionary. Aren’t Hollywood executives leaving massive amounts of money on the table by working so assiduously to make sure women are only faces, bodies, and tropes? But yet, count the number of movies that fail this basic inclusivity test. Even though movies are a $X billion industry (therefore locking in a few percent of audience is worth a lot in absolute terms and ∴ worth the time to look at), they still frequently exclude minority perspectives.

Here are some stories that fail the Bechdel test:

  • Blade Runner
  • Red Firecracker Green Firecracker (炮打双灯)
  • Amélie
  • The Graduate
  • King of California
  • The Last Emperor
  • The Godfather
  • The Quiet American [fails for women and for Vietnamese]
  • The Wrestler
  • Dr Strangelove

and here are some that pass the Bechdel test:

  • Star Wars: Clone Wars (both)
  • Firefly
  • Scream
  • Magnolia
  • A Streetcar Named Desire
  • Kill Bill


Topology gets appropriate for qualitative rather than quantitative properties, since it deals with closeness and not distance.

It is also appropriate where distances exist, but are ill-motivated.

These approaches have already been used successfully, for analyzing:

  • physiological properties in Diabetes patients
  • neural firing patterns in the visual cortex of Macaques
  • dense regions in ℝ⁹ of 3×3 pixel patches from natural [black-and-white] images
  • screening for CO₂ adsorbative materials

Michi Johanssons (@michiexile)


An interesting story about industrial rail in the United States. About 20 mins. From The Economist.

commercial railways in the United States

  • Europe has an impressive and growing network of high-speed passenger links
  • America’s freight railways are one of the unsung transport successes of the past 30 years.
  • Before deregulation America’s railways were going bust. … By 1980 a fifth of rail mileage was owned by bankrupt firms.
  • Since 1981 productivity has risen by 172%, after years of stagnation. Adjusted for inflation, rates are down by 55% 
  • Coal is the biggest single cargo, accounting for 45% by volume and 23% by value.
  • since 1990 the average horsepower of their fleet has risen by 72%
  • [since 1990] the number of ton-miles per (American) gallon of fuel [rose] from 332 to 457—an improvement of 38%
  • But the fastest-growing part of rail freight has been "intermodal" traffic: containers or truck trailers loaded on to flat railcars. The number of such shipments rose from 3m in 1980 to 12.3m in 2006, before the downturn caused a slight falling back.
  • one freight train can carry as much as 280 lorries can
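
A quick arithmetic check on two of the figures above. The fuel-efficiency percentage is quoted in the list; the intermodal percentage is not quoted anywhere and is simply what the two shipment counts imply.

```python
def pct_change(old, new):
    """Percentage change from old to new."""
    return (new - old) / old * 100

fuel = pct_change(332, 457)         # ton-miles per gallon since 1990
intermodal = pct_change(3.0, 12.3)  # millions of shipments, 1980 to 2006
print(round(fuel), round(intermodal))
```

So the quoted 38% is consistent with 332 to 457, and the intermodal counts imply roughly a fourfold rise.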


Counting generates from the programmer’s successor function ++ and the number one. (You might argue that to get out to infinity requires also repetition. Well every category comes with composition by default, which includes composition of ƒ∘ƒ∘ƒ∘….)
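
A minimal sketch of that claim: start from one, take the successor function, and let composition supply the repetition. The helper names `compose` and `iterate` are mine.

```python
from functools import reduce

def succ(n):
    return n + 1  # the programmer's ++

def compose(f, g):
    return lambda x: f(g(x))

def iterate(f, times):
    """f composed with itself `times` times; zero times is the identity."""
    return reduce(compose, [f] * times, lambda x: x)

ONE = 1
five = iterate(succ, 4)(ONE)  # one, then four applications of succ
first_ten = [iterate(succ, k)(ONE) for k in range(10)]
print(five, first_ten)
```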

But getting to one is nontrivial. Besides the mystical implications of 1, it’s not always easy to draw a boundary around “one thing”. Looking at snow (without the advantage of modern optical science) I couldn’t find “one snow”. Even where it is cut off by a plowed street it’s still from the same snowfall.
a larger ‘thing’ with holes in it ... like the snow has ‘road holes’ in it
And if you got around on skis a lot of your life you wouldn’t care about one snow-flake (a reductive way to define “one” snow), at least not for transport, because one flake amounts to zero ability to travel anywhere. Could we talk about one inch of snow? One hour of snow? One night of snow?

Speaking of the cold, how about temperature? It has no inherent units; all of our human scales pick endpoints and define a continuum in between. That’s the same as in measure theory which gave (along with martingales) at least an illusion of technical respectability to the science of chances. If you use Kolmogorov’s axioms then the difficult (impossible?) questions—what the “likelihood” of a one-shot event (like a US presidential election) actually means or how you could measure it—can be swept under the rug whilst one computes random walks on trees or Gaussian copulæ. Meanwhile the sum-total of everything that could possibly happen Ω is called 1.
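
For what it's worth, here is how little the axioms ask for. This sketch checks them on a three-outcome Ω with arbitrary weights that I made up (on a finite space, countable additivity reduces to finite additivity). Nothing in the check says what the numbers mean, which is the point.

```python
from fractions import Fraction
from itertools import chain, combinations

omega = frozenset({"a", "b", "c"})
# Arbitrary toy weights; any non-negative weights summing to 1 would do.
weight = {"a": Fraction(1, 2), "b": Fraction(1, 3), "c": Fraction(1, 6)}

def P(event):
    """Measure of an event (a subset of omega)."""
    return sum((weight[x] for x in event), Fraction(0))

def events(space):
    """All subsets of the space."""
    items = list(space)
    return [frozenset(s) for s in chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1))]

all_events = events(omega)
nonnegative = all(P(e) >= 0 for e in all_events)
total_is_one = P(omega) == 1
additive = all(P(a | b) == P(a) + P(b)
               for a in all_events for b in all_events if not (a & b))
print(nonnegative, total_is_one, additive)
```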

With water or other liquids as well. Or gases. You can have one grain of powder or grain (granular solids can flow like a fluid) but you can’t have one gas or one water. (Well, again you can with modern science—but with even more moderner science you can’t, because you just find a QCD dynamical field balancing (see video) and anyway none of the “one” things are strictly local.)

And in my more favourite realm, the realm of ideas. I have a really hard time figuring out where I can break off one idea for a blogpost. These paragraphs were a stalactite growth off a blobular self-rant that keeps jackhammering away inside my head on the topic of mathematical modelling and equivalence classes. I’ve been trying to write something called “To equivalence class” and I’ve also been trying to write something called “Statistics for People Who Program Computers” and as I was talking this out to myself, another rant squeezed out between my fingers and I knew that if I dropped the other two I could pull One off and sculpt it into a readable microtract. Leaving “To Equivalence Class”, like so many of the harder-to-write things, in the refrigerator—to marinate or to mould, I don’t know which.

But notice that I couldn’t fully disconnect this one from other shared-or-not-shared referents. (Shared being English language and maybe a lot of unspoken assumptions we both hold. Unshared being my own personal jargon—some of which I’ve tried to share in this space—and rants that continually obsess me such as the fallaciousness of probabilistic statements and of certain economic debates.) This is why I like writing on the Web: I can plug in a picture from Wikipedia or point back to somewhere else I’ve talked on the other tangent so I don’t ride off on the connecting track and end up away from where I tried to head.

The difficulty of drawing a firm boundary of "one" to begin the process of counting may be an inverse of the "full" paradox or it may be that certain things (like liquid) don’t lend themselves to counting in an obvious way—in jargon, they don’t map nicely onto the natural numbers (the simplest kind of number). If that’s a motivation to move from discrete things to continuous when necessary, then I feel a similar motivation to move from Euclidean to Hausdorff, or from line to poset. Not that the simpler things don’t deserve as well a place at the table.

We thinkers are fairly free to look at things in different ways—to quotient and equivalence-class creatively or at varying scales. And that’s also a truth of mathematical modelling. Even if maths seems one-right-answer from the classroom, the same piece of reality can bear multiple models—some refining each other, some partially overlapping, some mutually disjoint.

Forestry is the province of variability.

From a spatial point of view this variability ranges from within-tree variation (e.g. modeling wood properties) to billions of trees growing in millions of hectares (e.g. forest inventory).

From a temporal point of view we can deal with daily variation in a physiological model to many decades in an empirical growth and yield model.

Luis Apiolaza


One example of a total ordering is the “hotness” scale from 1 to 10.

Because of the widespread disagreement about the meaning of the numbers, the only thing one can infer based on a man’s rating of a woman is that she is more attractive than those who score below her.
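
A small sketch of "order without distance": two hypothetical raters use the numbers very differently, yet agree on the ordering, and the ordering is all that survives. The scores are invented and `ranking` is my own helper.

```python
def ranking(scores):
    """Names ordered from highest score to lowest (assumes no ties)."""
    return sorted(scores, key=scores.get, reverse=True)

def gaps(scores):
    """Score differences between neighbours in the ranking."""
    order = ranking(scores)
    return [scores[a] - scores[b] for a, b in zip(order, order[1:])]

# Invented scores from two hypothetical raters.
rater_1 = {"P": 9, "Q": 7, "R": 6, "S": 2}
rater_2 = {"P": 6, "Q": 5.5, "R": 5, "S": 4}

same_order = ranking(rater_1) == ranking(rater_2)
print(same_order, gaps(rater_1), gaps(rater_2))
```

The rankings match while the gaps do not, so comparing "a 7" across raters tells you nothing beyond position in each rater's own list.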


The Hotness Scale derives, I think, from a need to explain one’s tastes to peers and hear them justified.

It typically surfaces in sleepover conversations like this:

  • Chris (secretly likes Kelly Russell): Who do you think is hotter: Liz Jones, or Kelly Russell?
  • Dave: Are you kidding?! Liz Jones is waaaay hotter.
  • Chris: Oh, yeah. I mean, obviously. I was just checking. I just meant, you know, that I think Kelly Russell is like maybe a 7.
  • Dave: Are you crazy?! She’s like a 2.
  • Chris: Come on, 2 is like people who have skin grafts. 2 is people who were burned in fires.
  • Dave: Whatever. Maybe.
    (sheepish retreat to celebrity hotness)
    But man, Jane March is the hottest woman on the planet.
  • Chris: No way, Jenny McCarthy is hotter.

I’m a little embarrassed to admit (though I’ll still admit it) that when my best friend and I started using the hotness scale, we scored girls in different categories, like

  1. tan
  2. boobs
  3. personality
  4. legs
  5. I forget what else
  6. overall score

Yeah, we were really cool. (Also we were really twelve.)


Fast forward to college. We guys were joking about using the ten-point scale, which by then was passé (although I did once use the phrase “a Bloomington 6 is a hometown 9”). We were trying to answer, what is the difference between a 6 and a 7 anyway? And is the distance between 6 and 7 greater or less than the distance between a 9 and a 10?

Everybody had taken calculus by this point so statements involving derivatives were bandied about (even though none of us meant to use real numbers … it was calculus as metaphor).

One guy proposed that each number should correspond to a decile —

  • 1 to the ugliest decile,
  • 10 to the hottest decile,
  • and so on.

 Someone else said that one’s initial reaction put a girl either in

  • the >4 (most of the time) or
  • <5 — but that since no one would ever hit on someone in the latter category,
  • in fact 1≈2≈3≈4.

Another said that he never assigned a 10 to anybody because that would mean he had met his wife. Um, yeah … we were still super cool.

Also contentious was whether each of us accepted the truth of whatever our own numerical ranking was. All I know is that whatever I said my score was, I secretly hoped it was 2 points higher.

There was a lot of inconsistency to the scores, which is why I’m bringing this up under the topic of rank without distance measure. Although I would wager that transitivity is violated, so perhaps this scale does not have a rational basis.
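
If one wanted to test that wager, a sketch might look like this: collect pairwise "x is hotter than y" judgments and search for triples where a beats b and b beats c, yet a is not judged above c. The judgments below are invented.

```python
from itertools import permutations

def intransitive_triples(prefers):
    """Triples (a, b, c) with a > b and b > c but not a > c."""
    items = {x for pair in prefers for x in pair}
    return [
        (a, b, c)
        for a, b, c in permutations(items, 3)
        if (a, b) in prefers and (b, c) in prefers and (a, c) not in prefers
    ]

# Invented pairwise judgments: (x, y) means "x was rated above y".
consistent = {("A", "B"), ("B", "C"), ("A", "C")}
cyclic = {("A", "B"), ("B", "C"), ("C", "A")}

print(intransitive_triples(consistent))
print(sorted(intransitive_triples(cyclic)))
```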


As I’m writing all of this I desperately want to jump ahead to partial orderings. But I haven’t defined them yet and I refuse to link to Wikipedia, so I’ll have to put that topic off.

Suffice to say that attraction is a perfect jumping-off point for one further generalization I want to make in order to get mathematics into bed with human experience.

Not everyone can be ranked side-by-side against everyone else. People can be attractive for different reasons (multiple >'s) and some people just aren't comparable. All of these reasons and more are great justifications for switching to posets.
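
One simple way to get "multiple >'s" is the product order: score on several criteria and say x ≥ y only when x is at least as high on every one of them. The two criteria and the scores below are invented, and this is just one poset among many you could pick.

```python
def dominates(x, y):
    """x >= y in the product order: at least as high on every criterion."""
    return all(a >= b for a, b in zip(x, y))

def comparable(x, y):
    return dominates(x, y) or dominates(y, x)

# Invented (wit, looks) pairs for three hypothetical people.
people = {"U": (9, 4), "V": (5, 8), "W": (4, 3)}

print(comparable(people["U"], people["W"]))  # U is higher on both criteria
print(comparable(people["U"], people["V"]))  # each wins on one criterion
```

U and V are the incomparable pair here: neither sits above the other, and no single number would say so honestly.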


I’d be interested to hear other interpretations of the hotness scale, or other scales of attractiveness, and how they evolved over time. What measure, if any, do you give guys when you’re in your 30’s or 40’s?

PS When I was 24, my girlfriend told me that “24 is a very hot age”. Ha ha.

PPS Tim Ferriss claims to be able to quantify the difference between a 6 and a 9. Tell that to Jimi Hendrix.